# Examples

### Gaussian Mixture Model

Below is a model for a Gaussian mixture. It can be seen as a Bayesian counterpart of K-means clustering.

```
# Prelude to define dirichlet

def sum(a array(prob)):
    summate i from 0 to size(a): a[i]

def normalize(x array(prob)):
    total = sum(x)
    array i of size(x):
        x[i] / total

def dirichlet(as array(prob)):
    xs <~ plate i of int2nat(size(as)-1):
              beta(summate j from i+1 to size(as): as[j],
                   as[i])
    return array i of size(as):
        x = product j from 0 to i: xs[j]
        x * if i+1==size(as): 1 else: real2prob(1-xs[i])
```
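The `dirichlet` helper is a stick-breaking construction: `xs[i]` is the fraction of the remaining probability mass *not* given to component `i`, so component `i` receives `(1 - xs[i])` times the product of all earlier fractions, and the last component takes whatever is left. A NumPy sketch of the same transform (the name `dirichlet_stick` is ours, not part of the model):

```python
import numpy as np

def dirichlet_stick(a, rng):
    """Draw from Dirichlet(a) via the stick-breaking scheme above:
    xs[i] ~ Beta(sum(a[i+1:]), a[i]) is the fraction of the remaining
    stick NOT assigned to component i."""
    n = len(a)
    xs = [rng.beta(sum(a[i+1:]), a[i]) for i in range(n - 1)]
    out = np.empty(n)
    remaining = 1.0
    for i in range(n):
        if i == n - 1:
            out[i] = remaining          # last component takes the rest
        else:
            out[i] = remaining * (1 - xs[i])
            remaining *= xs[i]          # stick left for later components
    return out

rng = np.random.default_rng(0)
sample = dirichlet_stick([1.0] * 5, rng)
print(sample)  # nonnegative components summing to 1
```

Because the per-component shares telescope, the output always sums to one, which is easy to check numerically.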

```
# num of clusters
K = 5
# num of points
N = 20

# prior probability of picking each cluster
pi  <~ dirichlet(array _ of K: 1)

# prior on mean and precision of each cluster
mu  <~ plate _ of K:
         normal(0, 5e-9)
tau <~ plate _ of K:
         gamma(2, 0.05)

# observed data
x   <~ plate _ of N:
         i <~ categorical(pi)
         normal(mu[i], tau[i])

return (x, mu). pair(array(real), array(real))
```
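To make the generative story concrete, here is a NumPy sketch of the same forward process. It assumes the gamma draws act as precisions (so points are sampled with standard deviation `1/sqrt(tau)`) and uses NumPy's `gamma(shape, scale)` parameterization; both are assumptions about the model above, not facts taken from it.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 5, 20                              # clusters, points

pi  = rng.dirichlet(np.ones(K))           # cluster weights
mu  = rng.normal(0.0, 5e-9, size=K)       # cluster means
tau = rng.gamma(2.0, 0.05, size=K)        # cluster precisions (assumed)

z = rng.choice(K, size=N, p=pi)           # latent cluster of each point
x = rng.normal(mu[z], 1.0 / np.sqrt(tau[z]))  # observed points

print(x.shape)  # one observation per point
```

Inference in the Hakaru model would condition on `x` and recover the posterior over `mu` (and the latent assignments); the sketch only runs the prior forward.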


### Latent Dirichlet Allocation

Below is the Latent Dirichlet Allocation (LDA) topic model.

```
K = 2 # number of topics
M = 3 # number of docs
V = 7 # size of vocabulary

# number of words in each document
doc = [4, 5, 3]

topic_prior = array _ of K: 1.0
word_prior  = array _ of V: 1.0

phi <~ plate _ of K:     # word dist for topic k
         dirichlet(word_prior)

# likelihood
z   <~ plate m of M:
         theta <~ dirichlet(topic_prior)
         plate _ of doc[m]: # topic marker for word n in doc m
             categorical(theta)

w   <~ plate m of M:       # for doc m
         plate n of doc[m]: # for word n in doc m
             categorical(phi[z[m][n]])

return (w, z)
```
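The same generative process can be sketched in NumPy: each row of `phi` is a topic's word distribution, each row of `theta` is a document's topic distribution, `z[m]` holds the per-word topic assignments for document `m`, and `w[m]` holds the words themselves. Variable names mirror the model above; the sketch only samples from the prior.

```python
import numpy as np

rng = np.random.default_rng(2)
K, M, V = 2, 3, 7          # topics, docs, vocabulary size
doc = [4, 5, 3]            # words per document

phi   = rng.dirichlet(np.ones(V), size=K)  # K rows: word dist per topic
theta = rng.dirichlet(np.ones(K), size=M)  # M rows: topic dist per doc

# z[m][n]: topic of word n in doc m; w[m][n]: the word itself
z = [rng.choice(K, size=doc[m], p=theta[m]) for m in range(M)]
w = [np.array([rng.choice(V, p=phi[t]) for t in z[m]]) for m in range(M)]

print([len(wm) for wm in w])  # → [4, 5, 3]
```

Conditioning the Hakaru model on observed words `w` yields a posterior over the topic assignments `z` (and, through them, the document-topic and topic-word distributions).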