Haskell Logging Made Complicated

Nathan Collins

14 February 2014

Motivating Spire Example

check (id (?w1 x) id (?w2 x) x) <= Bool
+ infer (id (?w1 x) id (?w2 x) x)
+ | infer (id (?w1 x) id (?w2 x))
+ | + infer (id (?w1 x) id)
+ | + | infer (id (?w1 x))
+ | + | + infer id => Pi T:*. T -> T
+ | + | + forcePi Pi T:*. T -> T ~>
+ | + | + check (?w1 x) <= *
+ | + | + | infer (?w1 x)
+ | + | + | + infer ?w1 => ?w1T
+ | + | + | + forcePi ?w1T ~> Pi xπ : ?w1TAπ . ?w1TBπ
+ | + | + | + check x <= ?w1TAπ
+ | + | + | + | infer x => Bool
+ | + | + | + | Bool =u= ?w1TAπ
+ | + | + | > ?w1TBπ x
+ | + | + | ?w1TBπ x =u= *
+ | + | > (?w1 x) -> (?w1 x)         // Via '(T -> T)[T |-> (?w1 x)]'.
+ | + | forcePi (?w1 x) -> (?w1 x) ~>
+ | + | check id <= (?w1 x)
+ | + | + infer id => Pi T:*. T -> T
+ | + | + Pi T:*. T -> T =u= (?w1 x) // This solves '(?w1 x)' ...
+ | + > (?w1 x)                      // ... but we don't return the solution! So,
+ | + forcePi (?w1 x) ~> Pi xπ : ?w1Aπ x . ?w1Bπ x xπ
+ | + check (?w2 x) <= ?w1Aπ x       // instead of a trivial 'forcePi', we start
+ | + | infer (?w2 x)                // introducing problems we can't solve ...
+ | + | + infer ?w2 => ?w2T
+ | + | + forcePi ?w2T ~> Pi xπ : ?w2TAπ . ?w2TBπ xπ
+ | + | + check x <= ?w2TAπ
+ | + | + | infer x => Bool
+ | + | + | Bool =u= ?w2TAπ
+ | + | > ?w2TBπ x                   // Via '[xπ |-> x]'.
+ | + | ?w2TBπ x =u= ?w1Aπ x
+ | > ?w1Bπ x (?w2 x)                // Not linear!
+ | forcePi ?w1Bπ x (?w2 x) ~> Pi xπ : ?w1BπAπ x (?w2 x) . ?w1BπBπ x (?w2 x) xπ
+ | check x <= ?w1BπAπ x (?w2 x)
+ | + infer x => Bool
+ | + Bool =u= ?w1BπAπ x (?w2 x)     // Can't solve this!
+ > ?w1BπBπ x (?w2 x) x              // Via '[xπ |-> x]'.
+ ?w1BπBπ x (?w2 x) x =u= Bool       // Can't solve this!

Comparison with Python

Python's "decorator" idiom:

def memoize(f):               |  def fib(n):
  cache = dict()              |    if n <= 1:
  def memoized(*args):        |      return n
    if args not in cache:     |    else:
      cache[args] = f(*args)  |      return fib(n-1) + fib(n-2)
    return cache[args]        |
  return memoized             |  fib = memoize(fib)

Before fib = memoize(fib):

%prun fib(25)
  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
242785/1    0.481    0.000    0.481    0.481 <example>:4(fib)

After:

%prun fib(25)
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  49/1    0.000    0.000    0.000    0.000 <example>:20(memoized)
  26/1    0.000    0.000    0.000    0.000 <example>:4(fib)

Comparison with Python (Continued)

Logging decorator:

LOG_INDENT = 0                        |  def fib(n):                   
def log(f):                           |    if n <= 1:                  
  def logged(*args):                  |      return n                  
    global LOG_INDENT                 |    else:                       
    print '| '*LOG_INDENT + \         |      return fib(n-1) + fib(n-2)
          '%s%s'%(f.__name__ , args)  |
    LOG_INDENT += 1                   |  fib = log(fib)
    r = f(*args)                      |
    LOG_INDENT -= 1                   |
    print '| '*LOG_INDENT + '%s'%r    |
    return r                          |
  return logged                       |

# fib = log(fib):                     |  # fib = log(memoize(fib))
fib(4,)                               |  memoized(4,)       
| fib(3,)                             |  | memoized(3,)     
| | fib(2,)                           |  | | memoized(2,)   
| | | fib(1,)                         |  | | | memoized(1,) 
| | | 1                               |  | | | 1            
| | | fib(0,)                         |  | | | memoized(0,) 
| | | 0                               |  | | | 0            
| | 1                                 |  | | 1              
| | fib(1,)                           |  | | memoized(1,)   
| | 1                                 |  | | 1              
| 2                                   |  | 2                
| fib(2,)                             |  | memoized(2,)     
| | fib(1,)                           |  | 1                
| | 1                                 |  3                  
| | fib(0,)                           |                     
| | 0                                 |  memoized(4,)       
| 1                                   |  3                  
3                                     |

Characterizing the decorator pattern

Like a fixed point with a shim in the knot: replace \[fib = fix ~f\] with \[ fib = fix ~(log ~f) ~, \] where \[ \begin{aligned} f~r~x = &if~x \le 1 ~then ~x \\ &else ~r ~(x-1) + r ~(x-2) \end{aligned} \] and \(log\) is the decorator "shim".

Actual approach we take: \[ \begin{aligned} fib = &log ~fib' \\ fib'~x = &if~x \le 1 ~then ~x \\ &else ~fib ~(x-1) + fib ~(x-2) \end{aligned} \]

STLC Example

Log an STLC type checker and format as \(\LaTeX\) proof tree:

\infer[ \to \text{I} ]{
  \cdot \vdash \lambda x {:} P . \lambda y {:} Q . x : P {\to} Q {\to} P
}{ \infer[ \to \text{I} ]{
     x {:} P \vdash \lambda y {:} Q . x : Q {\to} P
   }{ \infer[ \texts    c{Axiom} ]{
        x {:} P , y {:} Q \vdash x : P }{  } } }

Typing of K combinator — Typing of `K` combinator

Failed typing of Y combinator — Failed typing of `Y` combinator

STLC Continued

type Stream = LogStream ProofTree
type M a = ErrorT String (ReaderT Ctx (Writer Stream)) a

type InferTy = Tm -> M Ty
infer , infer' :: InferTy
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'

infer' (TmVar x) = maybe err pure . lookup x =<< ask where
  err = throwError $ "Variable " ++ x ++ " not in context!"
infer' (Lam x t e) = (t :->:) <$> (local (++ [(x,t)]) . infer $ e)
infer' (e :@: e1) = do
  t <- infer e
  t1 <- infer e1
  case t of
    t1' :->: t2 | t1' == t1 -> pure t2
    _ -> throwError $ "Can't apply " ++ show t ++ " to " ++ show t1 ++ "!"

STLC Continued

User defined:

instance ProofTree (Proxy (SimpleCall "infer" Ctx InferTy ())) where
  callAndReturn t = conclusion ctx tm (Right ty) where
    (tm , ()) = _arg t
    ty = _ret t
    ctx = _before t
  callAndError t = conclusion ctx tm (Left error) where
    ...
    error = maybe "\\,!" (const "\\,\\uparrow") how

conclusion :: Ctx -> Tm -> Either String Ty -> (String , String)
conclusion ctx tm e = (judgment , rule tm) where
  rule (TmVar _) = "\\textsc{Axiom}"
  rule (Lam {}) = "\\to \\text{I}"
  rule (_ :@: _) = "\\to \\text{E}"

  tyOrError = either id pp e

  judgment = pp ctx ++ " \\vdash " ++ pp tm ++ " : " ++ tyOrError

Library:

proofTree :: Ex2T (LogTree ProofTree) -> String
proofTree (Ex2T t@(CallAndReturn {})) =
  "\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++
    " }{ " ++ intercalate " & " premises ++ " }"
  where
    (conclusion , rule) = callAndReturn t
    premises = map proofTree (_children t)
proofTree (Ex2T t@(CallAndError {})) = ...

Digging Deeper

Things to understand:

How traces are collected:
- Event streams and trees.
- Logger shim.
- Curry/uncurry.
- Continuations.
How traces are post processed:
- Processor classes (constraints).
- Constraint coercion
- Constraint implication.
- Uniform vs call-specific instances.

Collecting traces: Events

A type call may describe a call signature:

class Signature call where
  name :: call -> String
  type Before call
  type Arg call
  type Ret call
  type After call

Before and After are collected state.

Log events:

type SigWith (c :: * -> Constraint) call =
  (Signature call , c call)

data LogEvent (c :: * -> Constraint)
  = forall call. SigWith c call =>
    BeginCall call (Before call) (Arg call)
  | forall call. SigWith c call =>
    EndCall call (Before call) (Arg call) (Ret call) (After call)
type LogStream c = [LogEvent c]

Think of BeginCall and EndCall as decorated parentheses.

Collecting traces: Logging

Recall STLC example:

type InferTy = Tm -> M Ty
infer , infer' :: InferTy
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'

Want:

infer arg = do
  before <- ask
  logEvent (BeginCall <'infer' call> before arg)
  ret <- infer' arg
  after <- return ()
  logEvent (EndCall   <'infer' call> before arg ret after)
  return ret

Solution:

simpleLogger (p::Proxy tag) mBefore mAfter (f::t) = collectAndCallCont k f where
  k :: (GetArg t , GetMonad t (GetRet t)) -> GetMonad t (GetRet t)
  k (arg , mRet) = do
    let call = Proxy::Proxy (SimpleCall tag before t after)
    before <- mBefore
    logEvent (BeginCall call before arg)
    ret <- mRet
    after <- mAfter
    logEvent (EndCall call before arg ret after)
    return ret

Collecting traces: uncurrying

Example: the type (a -> b -> IO c) is in the UncurryM class with

GetArg (a -> b -> IO c) ~ (a , (b , ()))
GetRet (a -> b -> IO c) ~ c
GetMonad (a -> b -> IO c) ~ IO

UncurryM instances:

-- Left out the classes because they're boring.

instance UncurryM b => UncurryM (a -> b) where
  type GetArg   (a -> b) = (a , GetArg b)
  type GetRet   (a -> b) = GetRet b
  type GetMonad (a -> b) = GetMonad b

instance (Monad m , Monad (t m)) => UncurryM (t m r) where
  type GetArg   (t m r) = ()
  type GetRet   (t m r) = r
  type GetMonad (t m r) = (t m)

instance UncurryM (IO r) where
  type GetArg   (IO r) = ()
  type GetRet   (IO r) = r
  type GetMonad (IO r) = IO

Carefully chosen base cases to avoid overlap!

Collecting traces: continuations

UncurryM argument tuple is backwards.

E.g.

GetArg (a -> b -> IO c)    ~    (a , (b , ()))

but collecting arguments in order gives:

    ((() , a) , b)

Solution is continuations:

class UncurryM c => CollectAndCallCont c where
  collectAndCallCont ::
    ((GetArg c , GetMonad c (GetRet c)) -> d) ->
    c -> GetArg c `Curried` d

instance CollectAndCallCont (IO r) where
  collectAndCallCont k tmr = k (() , tmr)

instance (Monad m , Monad (t m)) => CollectAndCallCont (t m r) where
  collectAndCallCont k tmr = k (() , tmr)

instance CollectAndCallCont b => CollectAndCallCont (a -> b) where
  collectAndCallCont k f x =
    collectAndCallCont (\(xs , tmr) -> k ((x , xs) , tmr)) (f x)

where e.g.

(a , (b , ())) `Curried` d    ~    a -> b -> d

Collecting traces: Trees

Recall logger intuition:

infer arg = do
  before <- ask
  logEvent (BeginCall <'infer' call> before arg)
  ret <- infer' arg
  after <- return ()
  logEvent (EndCall   <'infer' call> before arg ret after)
  return ret

We parse BeginCall / EndCall stream into forest:

data LogTree (c :: * -> Constraint) call (name :: Symbol) where
  CallAndReturn :: SigWith c call =>
    { _call     :: call
    , _before   :: Before call
    , _arg      :: Arg call
    , _children :: LogForest c
    , _ret      :: Ret call
    , _after    :: After call
    } -> LogTree c call "CallAndReturn"
  CallAndError :: SigWith c call =>
    { _call'     :: call
    , _before'   :: Before call
    , _arg'      :: Arg call
    , _children' :: LogForest c
    , _how       :: Maybe (Ex2T (LogTree c))
    } -> LogTree c call "CallAndError"

type LogForest c = [Ex2T (LogTree c)]

data Ex2T f where
  Ex2T :: f a b -> Ex2T f

Note that call is no longer existential.

Using traces: Post-processor classes

Class from the STLC example:

class ProofTree call where
  callAndReturn :: LogTree ProofTree call "CallAndReturn" ->
    (String , String)
  callAndError :: LogTree ProofTree call "CallAndError" ->
    (String , String)

Note the "recursive" occurrence of the ProofTree class!

Generate proof tree for LogTree with all nodes in ProofTree:

proofTree :: Ex2T (LogTree ProofTree) -> String
proofTree (Ex2T t@(CallAndReturn {})) =
  "\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++
    " }{ " ++ intercalate " & " premises ++ " }"
  where
    (conclusion , rule) = callAndReturn t
    premises = map proofTree (_children t)
proofTree (Ex2T t@(CallAndError {})) = ...

Using traces: multiple post-processors?

Recall the monad from the STLC example:

type Stream = LogStream ProofTree
type M a = ErrorT String (ReaderT Ctx (Writer Stream)) a

type InferTy = Tm -> M Ty
infer , infer' :: InferTy
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'

We specify the post-processor at stream creation time!

What if we want to use multiple post-processors?

Using traces: constraint conjunction

Constraint conjunction:

infixr :&&:
class (c1 t , c2 t) => (c1 :&&: c2) t
instance (c1 t , c2 t) => (c1 :&&: c2) t

Example usage:

type Stream = LogStream (ProofTree :&&: UnixTree)
type M a = ErrorT String (ReaderT Ctx (Writer Stream)) a

Using traces: constraint implication?

Recall the ProofTree class:

class ProofTree call where
  callAndReturn :: LogTree ProofTree call "CallAndReturn" ->
    (String , String)
  callAndError :: LogTree ProofTree call "CallAndError" ->
    (String , String)

Doesn't make sense to use ProofTree :&&: UnixTree in place of ProofTree!

Morally have ProoTree :&&: UnixTree "implies" ProofTree and UnixTree.

Need to make this precise!

Using traces: constraint implication!

Make constraints first class:

data Reify c a where
  Reify :: c a => Reify c a

Then constraint implication is coercion of reification:

type Implies c1 c2 = forall a. Reify c1 a -> Reify c2 a

Note that all witnesses for Implies c1 c2 are

\case Reify -> Reify

Using traces: constraint coercion

Why we want implication:

coerceLogTree :: forall c1 c2.
  Implies c1 c2 -> Ex2T (LogTree c1) -> Ex2T (LogTree c2)

Complicated identity function:

coerceLogTree impl (Ex2T (CallAndReturn (call::call) before arg children ret after)) =
  case impl (Reify::Reify c1 call) of
    Reify -> Ex2T $ CallAndReturn call before arg children' ret after
  where
    children' = map (coerceLogTree impl) children
coerceLogTree impl (Ex2T (CallAndError (call::call) before arg children who)) =
  case impl (Reify::Reify c1 call) of
    Reify -> Ex2T $ CallAndError call before arg children' who'
  where
    children' = map (coerceLogTree impl) children
    who' = fmap (coerceLogTree impl) who

Aside: alternate implication definition

Contra-variant premise version:

type Implies' c1 c2 = forall a b. (c2 a => b) -> (c1 a => b)

All witnesses are \x -> x!

Equivalence:

implies1 :: forall c1 c2. Implies c1 c2 -> Implies' c1 c2
implies2 :: forall c1 c2. Implies' c1 c2 -> Implies c1 c2

Proof of equivalence:

implies1 = f where
  f :: forall c1 c2 a b. Implies c1 c2 -> (c2 a => b) -> (c1 a => b)
  f impl x = case impl (Reify::Reify c1 a) of Reify -> x

implies2 impl = f impl where
  f :: forall c1 c2 a
     . ((c2 a => Reify c2 a) -> (c1 a => Reify c2 a))
    -> Reify c1 a -> Reify c2 a
  f impl x = case x of
    Reify -> impl (Reify::c2 a => Reify c2 a)

Using traces: uniform post-processor instances

The interface is instance-per-signature:

instance UnixTree (Proxy (SimpleCall "f" () FTy ())) where
  ...
instance UnixTree (Proxy (SimpleCall "g" () GTy ())) where
  ...
instance UnixTree (Proxy (SimpleCall "h" () HTy ())) where
  ...

Annoying if all the instances are basically the same :P

Instead:

instance UnixTree (HetCall Show "default:bracket") where
  ...

Can have many "default" implementations:

instance UnixTree (HetCall Show "default:header") where
  ...
instance UnixTree (HetCall Show "default:oneline") where
  ...

Uniform post-processors: examples

┬ ()               ┬ ()                           ─ <()> h Nothing = <error: here>
├ h Nothing        └ h Nothing = <error: here>
╘ <error: here>

┬ ()               ┬ ()                           ┬ <()> h (Just 6) = <error: there>
├ h (Just 6)       ├ h (Just 6) = <error: there>  └─┬ <()> f 6 = <error: there>
├─┬ ()             └─┬ ()                           └─┬ <()> f 5 = <error: here>
│ ├ f 6              ├ f 6 = <error: there>           └─┬ <()> f 2 = "XX Y" <()>
│ ├─┬ ()             └─┬ ()                             ├─┬ <()> f 1 = "X Y" <()>
│ │ ├ f 5              ├ f 5 = <error: here>            │ ├── <()> f 0 = " Y" <()>
│ │ ├─┬ ()             └─┬ ()                           │ └── <()> g "X" " Y" = "X Y" <()>
│ │ │ ├ f 2              ├ f 2 = "XX Y"                 └── <()> g "X" "X Y" = "XX Y" <()>
│ │ │ ├─┬ ()             ├ ()
│ │ │ │ ├ f 1            ├─┬ ()
│ │ │ │ ├─┬ ()           │ ├ f 1 = "X Y"
│ │ │ │ │ ├ f 0          │ ├ ()
│ │ │ │ │ ╞ " Y"         │ ├─┬ ()
│ │ │ │ │ ╘ ()           │ │ ├ f 0 = " Y"
│ │ │ │ ├─┬ ()           │ │ └ ()
│ │ │ │ │ ├ g "X" " Y"   │ └─┬ ()
│ │ │ │ │ ╞ "X Y"        │   ├ g "X" " Y" = "X Y"
│ │ │ │ │ ╘ ()           │   └ ()
│ │ │ │ ╞ "X Y"          └─┬ ()
│ │ │ │ ╘ ()               ├ g "X" "X Y" = "XX Y"
│ │ │ ├─┬ ()               └ ()
│ │ │ │ ├ g "X" "X Y"
│ │ │ │ ╞ "XX Y"
│ │ │ │ ╘ ()
│ │ │ ╞ "XX Y"
│ │ │ ╘ ()
│ │ ╘ <error: here>
│ ╘ <error: there>
╘ <error: there>

Uniform post-processors: heterogeneous signatures

Recall:

instance UnixTree (HetCall Show "default:bracket") where ...

HetCall c means all data satisfies c:

data H c where
  H :: c a => a -> H c

unH :: (forall a. c a => a -> b) -> H c -> b
unH f (H x) = f x

data HetCall (c:: * -> Constraint) tag = HetCall String

instance Signature (HetCall c tag) where
  name (HetCall s) = s
  type Before (HetCall c tag) = H c
  -- The argument tuple is unfolded into a list of hets.
  type Arg    (HetCall c tag) = [H c]
  type Ret    (HetCall c tag) = H c
  type After  (HetCall c tag) = H c

Uniform post-processors: heterogeneous coercion

Can't invert, so wait until needed:

heterogenize :: forall c' c tag. c' (HetCall c tag) =>
  Proxy c -> Proxy tag -> Ex2T (LogTree (SigAll c)) -> Ex2T (LogTree c')

class ( Signature call
      , c (Before call)
      , HFold c (Arg call)
      , c (Ret call)
      , c (After call) )
      => SigAll c call

class HFold c t where
  hfoldr :: Proxy c -> (forall t'. c t' => t' -> a -> a) -> a -> t -> a

hmap :: HFold c t => Proxy c -> (forall t'. c t' => t' -> a) -> t -> [a]
hmap p f = hfoldr p (\x as -> f x : as) []

Implementation:

heterogenize p1 p2 (Ex2T (CallAndReturn {..})) =
  Ex2T (CallAndReturn (HetCall (name _call)::HetCall c tag)
                      (H _before)
                      (hmap (Proxy::Proxy c) H _arg)
                      (map (heterogenize p1 p2) _children)
                      (H _ret)
                      (H _after))
heterogenize p1 p2 (Ex2T (CallAndError {..})) = ...

Future work / conclusions

Future work:

Memoization
Hash-consing
Tree-exploration debugging

Conclusions:

Type system put up a good fight
Constraint-kinds extension is useful
Heterogeneous/existential types are tricky
Decorator pattern is useful