# Dirichlet distribution¶

## Story¶

The Dirichlet distribution is a generalization of the Beta distribution. It is a probability distribution describing probabilities of outcomes. Instead of describing probability of one of two outcomes of a Bernoulli trial, like the Beta distribution does, it describes probability of $$K$$ outcomes. The Beta distribution is the special case of the Dirichlet distribution with $$K=2$$.

## Parameters¶

The parameters are $$\alpha_1$$, $$\alpha_2$$, …, $$\alpha_K$$, all strictly positive, defined analogously to $$\alpha$$ and $$\beta$$ of the Beta distribution.

## Support¶

The Dirichlet distribution has support on the interval [0, 1] such that $$\sum_{i=1}^K \theta_i = 1$$.

## Probability density function¶

\begin{align} f(\boldsymbol{\theta};\boldsymbol{\alpha}) = \frac{1}{B(\boldsymbol{\alpha})}\,\prod_{i=1}^K \theta_i^{\alpha_i-1}, \end{align}

where

\begin{align} B(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^K\Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^K \alpha_i\right)} \end{align}

is the multivariate Beta function.

## Moments¶

Mean of $$\theta_i$$: $$\left<\theta_i\right> = \displaystyle{\frac{\alpha_i}{\sum_{i=k}^K \alpha_k}}$$

Variance of $$\theta_i$$: $$\displaystyle{\frac{\left<\theta_i\right>(1-\left<\theta_i\right>)}{1 + \sum_{k=1}^K \alpha_k}}$$

Covariance of $$\theta_i, \theta_j$$ with $$j\ne i$$: $$\displaystyle{-\frac{\left<\theta_i\right>\left<\theta_j\right>}{1 + \sum_{k=1}^K \alpha_k}}$$

## Usage¶

The usage below assumes that alpha is a length $$K$$ array.

Package

Syntax

NumPy

rg.dirichlet(alpha)

SciPy

scipy.stats.dirichlet(alpha)

Stan

dirichlet(alpha)

## Notes¶

• In some cases, we may wish to specify the distribution of an ordered Dirichlet distributed vector $$\theta$$. That is, we want $$\theta \sim \text{Dirichlet}(\alpha_1, \alpha_2, \ldots, \alpha_L)$$ with $$\theta_i < \theta_{i+1}$$ for all $$i < K$$. Because of the relationship of the Dirchlet distribution to a set of Gamma distributed random variables, we may specify this in Stan as follows.

data {
int<lower=1> K;
}

parameters {
vector<lower=0>[K] alpha;
positive_ordered[K] lambda;
}

transformed parameters {
simplex[K] theta = lambda / sum(lambda);
}

model {
target += gamma_lpdf(lambda | alpha, 1);
}