# Examples

## Gaussian Mixture Model
Below is a Gaussian mixture model (GMM). It can be seen as a Bayesian version of K-means clustering.
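Written out, the generative process that the program below encodes is the following, with $K$ clusters, $N$ data points, and distribution parameters listed in the same order as in the Hakaru code:

$$
\begin{aligned}
\pi &\sim \operatorname{Dirichlet}(1, \ldots, 1) \\
\mu_k &\sim \operatorname{Normal}(0,\ 5 \times 10^{-9}), &
\tau_k &\sim \operatorname{Gamma}(2,\ 0.05), & k &= 1, \ldots, K \\
z_n &\sim \operatorname{Categorical}(\pi), &
x_n &\sim \operatorname{Normal}(\mu_{z_n},\ \tau_{z_n}), & n &= 1, \ldots, N
\end{aligned}
$$

Since `dirichlet` is defined in a prelude rather than used as a primitive, the program begins by building it up from `beta`.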
```
# Prelude to define dirichlet
def add(a prob, b prob):
    a + b

def sum(a array(prob)):
    reduce(add, 0, a)

def normalize(x array(prob)):
    total = sum(x)
    array i of size(x):
        x[i] / total

def dirichlet(as array(prob)):
    xs <~ plate i of int2nat(size(as)-1):
            beta(summate j from i+1 to size(as): as[j],
                 as[i])
    return array i of size(as):
             x = product j from 0 to i: xs[j]
             x * if i+1==size(as): 1 else: real2prob(1-xs[i])
```
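The `dirichlet` helper is a stick-breaking construction: each `xs[i]` is the fraction of the remaining probability mass passed on to the components after $i$. Reading `summate` and `product` with an upper-exclusive bound (consistent with `as[j]` staying in range), for $\alpha = (\alpha_0, \ldots, \alpha_{K-1})$ it draws

$$
x_i \sim \operatorname{Beta}\Bigl(\sum_{j > i} \alpha_j,\ \alpha_i\Bigr), \qquad 0 \le i < K-1,
$$

and returns the weights

$$
\pi_i = (1 - x_i) \prod_{j < i} x_j \quad (i < K-1), \qquad
\pi_{K-1} = \prod_{j < K-1} x_j,
$$

which are jointly $\operatorname{Dirichlet}(\alpha)$-distributed.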
```
# num of clusters
K = 5
# num of points
N = 20

# prior probability of picking each cluster
pi <~ dirichlet(array _ of K: 1)

# prior on each cluster's mean and precision
mu <~ plate _ of K:
        normal(0, 5e-9)
tau <~ plate _ of K:
        gamma(2, 0.05)

# observed data
x <~ plate _ of N:
       i <~ categorical(pi)
       normal(mu[i], tau[i])

return (x, mu). pair(array(real), array(real))
```
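Saved to a file (say `gmm.hk`; the name is only illustrative), the program can be run with the `hakaru` command, which samples from the model and prints draws of the annotated return value, here a pair of the data `x` and the cluster means `mu`.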
## Latent Dirichlet Allocation
Below is the Latent Dirichlet Allocation (LDA) topic model.
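In the usual LDA notation, the program draws a word distribution $\phi_k$ for each topic, a topic distribution $\theta_m$ for each document, a topic $z_{mn}$ for each word position, and finally the word $w_{mn}$ itself:

$$
\begin{aligned}
\phi_k &\sim \operatorname{Dirichlet}(\beta), & k &= 1, \ldots, K \\
\theta_m &\sim \operatorname{Dirichlet}(\alpha), & m &= 1, \ldots, M \\
z_{mn} &\sim \operatorname{Categorical}(\theta_m), &
w_{mn} &\sim \operatorname{Categorical}(\phi_{z_{mn}})
\end{aligned}
$$

where $\alpha$ is `topic_prior`, $\beta$ is `word_prior`, and word position $n$ in document $m$ ranges over the `doc[m]` words of that document.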
```
K = 2  # number of topics
M = 3  # number of docs
V = 7  # size of vocabulary

# number of words in each document
doc = [4, 5, 3]

topic_prior = array _ of K: 1.0
word_prior  = array _ of V: 1.0

phi <~ plate _ of K:  # word dist for topic k
         dirichlet(word_prior)

# likelihood
z <~ plate m of M:
       theta <~ dirichlet(topic_prior)
       plate _ of doc[m]:  # topic marker for word n in doc m
         categorical(theta)

w <~ plate m of M:        # for doc m
       plate n of doc[m]: # for word n in doc m
         categorical(phi[z[m][n]])

return (w, z)
```
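As in the mixture example, `dirichlet` here refers to the prelude definition above rather than a built-in. The program returns the observed words `w` together with their latent topic assignments `z`.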