# Multinomial distribution

## Story

The Multinomial distribution generalizes the Binomial distribution. Instead of a Bernoulli trial with two possible outcomes, each trial has $$K$$ possible outcomes. Across $$N$$ independent trials, the numbers of times each outcome occurs, $$y_1$$ of outcome 1, $$y_2$$ of outcome 2, …, and $$y_K$$ of outcome $$K$$, are Multinomially distributed.

## Example

There are two alleles in a population, A and a. Each individual may have genotype AA, Aa, or aa. In a population of $$N$$ total individuals, the numbers of individuals with each genotype, $$y_1$$ AA, $$y_2$$ Aa, and $$y_3$$ aa, are Multinomially distributed.

## Parameters

$$N$$, the total number of trials, and $$\boldsymbol{\theta} = \left\{\theta_1, \theta_2, \ldots,\theta_K\right\}$$, the probabilities of each outcome. The probabilities must satisfy $$\sum_{i=1}^K \theta_i = 1$$, and the counts are further restricted by $$N = \sum_{i=1}^K y_i$$.

## Support

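The $$K$$-nomial distribution is supported on the set of $$\mathbf{y} \in \mathbb{N}^K$$ (with $$\mathbb{N}$$ including zero) that satisfy $$\sum_{i=1}^K y_i = N$$.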
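
Because the counts must sum to $$N$$ (the restriction stated under Parameters), only finitely many points of $$\mathbb{N}^K$$ carry probability. The following is a minimal sketch that enumerates them for small $$N$$ and $$K$$; the function name `multinomial_support` is hypothetical and chosen only for illustration.

```python
from math import comb


def multinomial_support(N, K):
    """Yield every tuple of K nonnegative integers that sums to N."""
    if K == 1:
        yield (N,)
        return
    for first in range(N + 1):
        # Fix the first count, then distribute the remaining trials.
        for rest in multinomial_support(N - first, K - 1):
            yield (first,) + rest


points = list(multinomial_support(N=3, K=3))
print(points[:4])               # first few count vectors
print(len(points))              # number of points in the support: 10
print(comb(3 + 3 - 1, 3 - 1))   # stars-and-bars count C(N+K-1, K-1): 10
```

The number of support points grows quickly with $$N$$ and $$K$$, so explicit enumeration like this is only practical for small problems.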

## Probability mass function

\begin{align} f(\mathbf{y};\boldsymbol{\theta}, N) = \frac{N!}{y_1!\,y_2!\cdots y_K!}\,\theta_1^{y_1}\,\theta_2^{y_2}\cdots\theta_K^{y_K}. \end{align}
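
As a rough illustration, the probability mass function can be evaluated directly with the Python standard library. The function name `multinomial_pmf` and the genotype probabilities below are assumptions made for this sketch, not values from the text.

```python
from math import factorial, prod


def multinomial_pmf(y, theta):
    """PMF of counts y given outcome probabilities theta, with N = sum(y)."""
    N = sum(y)
    # Multinomial coefficient N! / (y_1! y_2! ... y_K!), kept as an exact integer.
    coeff = factorial(N)
    for y_i in y:
        coeff //= factorial(y_i)
    return coeff * prod(t ** y_i for t, y_i in zip(theta, y))


theta = [0.25, 0.5, 0.25]  # hypothetical probabilities for AA, Aa, aa
print(multinomial_pmf([1, 2, 1], theta))  # 12 * 0.25 * 0.25 * 0.25 = 0.1875
```

For large $$N$$, working with the logarithm of the PMF is usually the safer choice numerically, since the factorials and powers can overflow or underflow.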

## Moments

Mean of $$y_i$$: $$N\theta_i$$

Variance of $$y_i$$: $$N\theta_i(1-\theta_i)$$

Covariance of $$y_i, y_j$$ with $$j\ne i$$: $$-N\theta_i\theta_j$$
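
One way to sanity-check these formulas is to compare them against empirical moments of simulated draws. The sketch below assumes NumPy is available; the values of `N` and `theta`, the seed, and the sample size are arbitrary choices for illustration.

```python
import numpy as np

N = 10
theta = np.array([0.2, 0.3, 0.5])  # illustrative probabilities, not from the text

rg = np.random.default_rng(3252)
samples = rg.multinomial(N, theta, size=200_000)  # shape (200000, 3)

# Theoretical moments from the formulas above: the diagonal holds the
# variances N*theta_i*(1 - theta_i), the off-diagonal -N*theta_i*theta_j.
mean_theory = N * theta
cov_theory = N * (np.diag(theta) - np.outer(theta, theta))

# Empirical moments from the draws.
mean_sample = samples.mean(axis=0)
cov_sample = np.cov(samples, rowvar=False)

print(np.round(mean_sample, 2), mean_theory)
print(np.round(cov_sample, 2))
print(cov_theory)
```

The negative covariances reflect the constraint that the counts sum to $$N$$: a larger count for one outcome leaves fewer trials for the others.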

## Usage

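The usage below assumes `theta` is a length $$K$$ array and that `rg` is a NumPy random number generator, for example one created with `np.random.default_rng()`.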

| Package | Syntax |
| --- | --- |
| NumPy | `rg.multinomial(N, theta)` |
| SciPy | `scipy.stats.multinomial(N, theta)` |
| Stan sampling | `multinomial(theta)` |
| Stan rng | `multinomial_rng(theta, N)` |
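
The NumPy and SciPy calls listed above might be used as in the following sketch. It assumes `rg` is a generator from `np.random.default_rng()`; the values of `N` and `theta` and the seed are hypothetical.

```python
import numpy as np
import scipy.stats

N = 20
theta = np.array([0.25, 0.5, 0.25])  # hypothetical AA, Aa, aa probabilities

# NumPy: draw count vectors with a Generator.
rg = np.random.default_rng(seed=1)
y = rg.multinomial(N, theta)            # one draw, shape (3,)
ys = rg.multinomial(N, theta, size=5)   # five draws, shape (5, 3)

# SciPy: a frozen distribution object exposing the PMF and sampling.
dist = scipy.stats.multinomial(N, theta)
p = dist.pmf([5, 10, 5])                # probability of one specific outcome
draws = dist.rvs(size=5, random_state=rg)

print(y, y.sum())  # the counts in each draw always sum to N
print(p)
```

NumPy is the lighter choice when only samples are needed, while the SciPy object is convenient when the PMF is needed as well.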

## Notes

• For a sampling statement in Stan, the value of $$N$$ is implied by the data, since $$N = \sum_{i=1}^K y_i$$.