Internal Representation of Hakaru terms
The Hakaru AST can be found defined in haskell/Language/Hakaru/Syntax/AST.hs. It is made up of several parts which this section and the next one will explain.
We should note, this datatype makes use of Abstract Binding Trees which we discuss in more detail in the next section. ABTs can be understood as a way to abstract the use of variables in the AST. The advantage of this is it allows all variable substitution and manipulation logic to live in one place and not be specific to a particular AST.
Datakind
The AST is typed using the Hakaru kind, defined in haskell/Language/Types/DataKind.hs. All Hakaru types are defined in terms of the primitives in this datakind.
-- | The universe\/kind of Hakaru types.
data Hakaru
= HNat -- ^ The natural numbers; aka, the non-negative integers.
-- | The integers.
| HInt
-- | Non-negative real numbers. Unlike what you might expect,
-- this is /not/ restructed to the @[0,1]@ interval!
| HProb
-- | The affinely extended real number line. That is, the real
-- numbers extended with positive and negative infinities.
| HReal
-- | The measure monad
| HMeasure !Hakaru
-- | The built-in type for uniform arrays.
| HArray !Hakaru
-- | The type of Hakaru functions.
| !Hakaru :-> !Hakaru
-- | A user-defined polynomial datatype. Each such type is
-- specified by a \"tag\" (the @HakaruCon@) which names the type, and a sum-of-product representation of the type itself.
| HData !HakaruCon [[HakaruFun]]
Please read Datakind.hs for more details.
Term
The Term datatype includes all the syntactic constructions for the Hakaru language.
For all those where we know the number of arguments we expect that language construct
to get, we define the (:$)
constructor, which takes SCons
and SArgs
datatypes
as arguments.
-- | The generating functor for Hakaru ASTs. This type is given in
-- open-recursive form, where the first type argument gives the
-- recursive form. The recursive form @abt@ does not have exactly
-- the same kind as @Term abt@ because every 'Term' represents a
-- locally-closed term whereas the underlying @abt@ may bind some
-- variables.
data Term :: ([Hakaru] -> Hakaru -> *) -> Hakaru -> * where
-- Simple syntactic forms (i.e., generalized quantifiers)
(:$) :: !(SCon args a) -> !(SArgs abt args) -> Term abt a
-- N-ary operators
NaryOp_ :: !(NaryOp a) -> !(Seq (abt '[] a)) -> Term abt a
-- Literal\/Constant values
Literal_ :: !(Literal a) -> Term abt a
Empty_ :: !(Sing ('HArray a)) -> Term abt ('HArray a)
Array_
:: !(abt '[] 'HNat)
-> !(abt '[ 'HNat ] a)
-> Term abt ('HArray a)
-- -- User-defined data types
-- A data constructor applied to some expressions. N.B., this
-- definition only accounts for data constructors which are
-- fully saturated. Unsaturated constructors will need to be
-- eta-expanded.
Datum_ :: !(Datum (abt '[]) (HData' t)) -> Term abt (HData' t)
-- Generic case-analysis (via ABTs and Structural Focalization).
Case_ :: !(abt '[] a) -> [Branch a abt b] -> Term abt b
-- Linear combinations of measures.
Superpose_
:: L.NonEmpty (abt '[] 'HProb, abt '[] ('HMeasure a))
-> Term abt ('HMeasure a)
Reject_ :: !(Sing ('HMeasure a)) -> Term abt ('HMeasure a)
SCons and SArgs
When using (:$)
we have a way to describe primitives where we
know the number of arguments they should get. In that regard,
SArgs is a typed list of abt terms indexed by its size.
-- | The arguments to a @(':$')@ node in the 'Term'; that is, a list
-- of ASTs, where the whole list is indexed by a (type-level) list
-- of the indices of each element.
data SArgs :: ([Hakaru] -> Hakaru -> *) -> [([Hakaru], Hakaru)] -> *
where
End :: SArgs abt '[]
(:*) :: !(abt vars a)
-> !(SArgs abt args)
-> SArgs abt ( '(vars, a) ': args)
These are combined with SCons which describes the constructor, and
the types it expects for its arguments. For example suppose we had
an AST for a function f
and it’s argument x
, we could construct
a Term for applying f
to x
by writing App_:$ f :* x :* End
.
-- | The constructor of a @(':$')@ node in the 'Term'. Each of these
-- constructors denotes a \"normal\/standard\/basic\" syntactic
-- form (i.e., a generalized quantifier). In the literature, these
-- syntactic forms are sometimes called \"operators\", but we avoid
-- calling them that so as not to introduce confusion vs 'PrimOp'
-- etc. Instead we use the term \"operator\" to refer to any primitive
-- function or constant; that is, non-binding syntactic forms. Also
-- in the literature, the 'SCon' type itself is usually called the
-- \"signature\" of the term language. However, we avoid calling
-- it that since our 'Term' has constructors other than just @(:$)@,
-- so 'SCon' does not give a complete signature for our terms.
--
-- The main reason for breaking this type out and using it in
-- conjunction with @(':$')@ and 'SArgs' is so that we can easily
-- pattern match on /fully saturated/ nodes. For example, we want
-- to be able to match @MeasureOp_ Uniform :$ lo :* hi :* End@
-- without needing to deal with 'App_' nodes nor 'viewABT'.
data SCon :: [([Hakaru], Hakaru)] -> Hakaru -> * where
Lam_ :: SCon '[ '( '[ a ], b ) ] (a ':-> b)
App_ :: SCon '[ LC (a ':-> b ), LC a ] b
Let_ :: SCon '[ LC a, '( '[ a ], b ) ] b
CoerceTo_ :: !(Coercion a b) -> SCon '[ LC a ] b
UnsafeFrom_ :: !(Coercion a b) -> SCon '[ LC b ] a
PrimOp_
:: (typs ~ UnLCs args, args ~ LCs typs)
=> !(PrimOp typs a) -> SCon args a
ArrayOp_
:: (typs ~ UnLCs args, args ~ LCs typs)
=> !(ArrayOp typs a) -> SCon args a
MeasureOp_
:: (typs ~ UnLCs args, args ~ LCs typs)
=> !(MeasureOp typs a) -> SCon args ('HMeasure a)
Dirac :: SCon '[ LC a ] ('HMeasure a)
MBind :: SCon
'[ LC ('HMeasure a)
, '( '[ a ], 'HMeasure b)
] ('HMeasure b)
Plate :: SCon
'[ LC 'HNat
, '( '[ 'HNat ], 'HMeasure a)
] ('HMeasure ('HArray a))
Chain :: SCon
'[ LC 'HNat, LC s
, '( '[ s ], 'HMeasure (HPair a s))
] ('HMeasure (HPair ('HArray a) s))
Integrate
:: SCon '[ LC 'HReal, LC 'HReal, '( '[ 'HReal ], 'HProb) ] 'HProb
Summate
:: HDiscrete a
-> HSemiring b
-> SCon '[ LC a, LC a, '( '[ a ], b) ] b
Product
:: HDiscrete a
-> HSemiring b
-> SCon '[ LC a, LC a, '( '[ a ], b) ] b
Expect :: SCon '[ LC ('HMeasure a), '( '[ a ], 'HProb) ] 'HProb
Observe :: SCon '[ LC ('HMeasure a), LC a ] ('HMeasure a)
You’ll notice in SCon
there are definitions for PrimOp, MeasureOp, and ArrayOp
these are done more organizational purposes and have constructions for the
different categories of primitives.
MeasureOp
Primitives of type measure are defined in MeasureOp.
-- | Primitive operators to produce, consume, or transform
-- distributions\/measures. This corresponds to the old @Mochastic@
-- class, except that 'MBind' and 'Superpose_' are handled elsewhere
-- since they are not simple operators. (Also 'Dirac' is handled
-- elsewhere since it naturally fits with 'MBind', even though it
-- is a siple operator.)
data MeasureOp :: [Hakaru] -> Hakaru -> * where
Lebesgue :: MeasureOp '[ 'HReal, 'HReal ] 'HReal
Counting :: MeasureOp '[] 'HInt
Categorical :: MeasureOp '[ 'HArray 'HProb ] 'HNat
Uniform :: MeasureOp '[ 'HReal, 'HReal ] 'HReal
Normal :: MeasureOp '[ 'HReal, 'HProb ] 'HReal
Poisson :: MeasureOp '[ 'HProb ] 'HNat
Gamma :: MeasureOp '[ 'HProb, 'HProb ] 'HProb
Beta :: MeasureOp '[ 'HProb, 'HProb ] 'HProb
ArrayOp
Primitives that involve manipulating value of type array, end up in ArrayOp.
-- | Primitive operators for consuming or transforming arrays.
data ArrayOp :: [Hakaru] -> Hakaru -> * where
Index :: !(Sing a) -> ArrayOp '[ 'HArray a, 'HNat ] a
Size :: !(Sing a) -> ArrayOp '[ 'HArray a ] 'HNat
Reduce :: !(Sing a) -> ArrayOp '[ a ':-> a ':-> a, a, 'HArray a ] a
PrimOp
All primitive operations which don’t return something of type array or measure are placed in PrimOp
-- | Simple primitive functions, and constants.
data PrimOp :: [Hakaru] -> Hakaru -> * where
-- -- -- Here we have /monomorphic/ operators
-- -- The Boolean operators
Not :: PrimOp '[ HBool ] HBool
-- And, Or, Xor, Iff
Impl :: PrimOp '[ HBool, HBool ] HBool
-- Impl x y == Or (Not x) y
Diff :: PrimOp '[ HBool, HBool ] HBool
-- Diff x y == Not (Impl x y)
Nand :: PrimOp '[ HBool, HBool ] HBool
-- Nand aka Alternative Denial, Sheffer stroke
Nor :: PrimOp '[ HBool, HBool ] HBool
-- Nor aka Joint Denial, aka Quine dagger, aka Pierce arrow
-- -- Trigonometry operators
Pi :: PrimOp '[] 'HProb
Sin :: PrimOp '[ 'HReal ] 'HReal
Cos :: PrimOp '[ 'HReal ] 'HReal
Tan :: PrimOp '[ 'HReal ] 'HReal
Asin :: PrimOp '[ 'HReal ] 'HReal
Acos :: PrimOp '[ 'HReal ] 'HReal
Atan :: PrimOp '[ 'HReal ] 'HReal
Sinh :: PrimOp '[ 'HReal ] 'HReal
Cosh :: PrimOp '[ 'HReal ] 'HReal
Tanh :: PrimOp '[ 'HReal ] 'HReal
Asinh :: PrimOp '[ 'HReal ] 'HReal
Acosh :: PrimOp '[ 'HReal ] 'HReal
Atanh :: PrimOp '[ 'HReal ] 'HReal
-- -- Other Real\/Prob-valued operators
RealPow :: PrimOp '[ 'HProb, 'HReal ] 'HProb
Exp :: PrimOp '[ 'HReal ] 'HProb
Log :: PrimOp '[ 'HProb ] 'HReal
Infinity :: HIntegrable a -> PrimOp '[] a
GammaFunc :: PrimOp '[ 'HReal ] 'HProb
BetaFunc :: PrimOp '[ 'HProb, 'HProb ] 'HProb
-- -- -- Here we have the /polymorphic/ operators
-- -- HEq and HOrd operators
Equal :: !(HEq a) -> PrimOp '[ a, a ] HBool
Less :: !(HOrd a) -> PrimOp '[ a, a ] HBool
-- -- HSemiring operators (the non-n-ary ones)
NatPow :: !(HSemiring a) -> PrimOp '[ a, 'HNat ] a
-- -- HRing operators
Negate :: !(HRing a) -> PrimOp '[ a ] a
Abs :: !(HRing a) -> PrimOp '[ a ] (NonNegative a)
Signum :: !(HRing a) -> PrimOp '[ a ] a
-- -- HFractional operators
Recip :: !(HFractional a) -> PrimOp '[ a ] a
-- -- HRadical operators
NatRoot :: !(HRadical a) -> PrimOp '[ a, 'HNat ] a
-- -- HContinuous operators
Erf :: !(HContinuous a) -> PrimOp '[ a ] a