# Hypergeometric distribution

## Story

Consider an urn with $$a$$ white balls and $$b$$ black balls. Draw $$N$$ balls from this urn without replacement. The number of white balls drawn, $$n$$, is Hypergeometrically distributed.

## Example

There are $$a+b$$ finches on an island, and $$a$$ of them are tagged (and therefore $$b$$ of them are untagged). You capture $$N$$ finches without releasing any of them in between. The number of tagged finches among those captured, $$n$$, is Hypergeometrically distributed.

## Parameters

There are three parameters: the number of draws $$N$$, the number of white balls $$a$$, and the number of black balls $$b$$.

## Support

The Hypergeometric distribution is supported on the set of integers between $$\mathrm{max}(0, N-b)$$ and $$\mathrm{min}(N, a)$$, inclusive.

## Probability mass function

\begin{split}\begin{align} f(n; N, a, b) = \frac{\begin{pmatrix}a \\ n\end{pmatrix} \begin{pmatrix}b \\ N-n\end{pmatrix}}{\begin{pmatrix}a+b \\ N\end{pmatrix}}. \end{align}\end{split}
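
The PMF and its support can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `hypergeom_pmf` and the urn sizes are hypothetical choices made for illustration.

```python
from math import comb


def hypergeom_pmf(n, N, a, b):
    """Probability of drawing n white balls in N draws without replacement
    from an urn holding a white balls and b black balls."""
    # Values outside the support have probability zero.
    if n < max(0, N - b) or n > min(N, a):
        return 0.0
    return comb(a, n) * comb(b, N - n) / comb(a + b, N)


# Illustrative (assumed) urn: 7 white balls, 5 black balls, 6 draws.
a, b, N = 7, 5, 6

# Support runs from max(0, N - b) to min(N, a), inclusive.
support = range(max(0, N - b), min(N, a) + 1)
pmf = [hypergeom_pmf(n, N, a, b) for n in support]

# The PMF should sum to one over the support (up to floating-point error).
total = sum(pmf)
print(list(support), total)
```

With only 5 black balls and 6 draws, at least one white ball must be drawn, which is why the support starts at 1 rather than 0 in this example.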

## Moments

Mean: $$\displaystyle{N\,\frac{a}{a+b}}$$

Variance: $$\displaystyle{N\,\frac{ab}{(a + b)^2}\,\frac{a+b-N}{a+b-1}}$$
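
As a sanity check, the mean and variance formulas can be compared against moments computed directly from the PMF. This sketch uses exact rational arithmetic from the standard library; the parameter values are assumed purely for illustration.

```python
from fractions import Fraction
from math import comb

# Illustrative (assumed) parameters: 7 white balls, 5 black balls, 6 draws.
a, b, N = 7, 5, 6

# Exact PMF over the support, stored as rational numbers.
support = range(max(0, N - b), min(N, a) + 1)
pmf = {
    n: Fraction(comb(a, n) * comb(b, N - n), comb(a + b, N))
    for n in support
}

# Moments computed directly from the PMF.
mean = sum(n * p for n, p in pmf.items())
var = sum((n - mean) ** 2 * p for n, p in pmf.items())

# Closed-form expressions for the mean and variance.
mean_formula = Fraction(N * a, a + b)
var_formula = Fraction(N * a * b, (a + b) ** 2) * Fraction(a + b - N, a + b - 1)

print(mean == mean_formula, var == var_formula)
```

Using `Fraction` avoids floating-point round-off, so the comparison is an exact equality rather than an approximate one.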

## Usage

| Package | Syntax |
| --- | --- |
| NumPy | rg.hypergeometric(a, b, N) |
| SciPy | scipy.stats.hypergeom(a+b, a, N) |
| Stan | hypergeometric(N, a, b) |
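
A short sketch of the NumPy and SciPy calls follows. It assumes that `rg` is a NumPy random generator created with `np.random.default_rng()`; the seed, the sample size, and the parameter values are arbitrary choices for illustration.

```python
import numpy as np
import scipy.stats

# Illustrative (assumed) parameters: 7 white balls, 5 black balls, 6 draws.
a, b, N = 7, 5, 6

# Assumption: rg is a NumPy Generator; the seed is arbitrary.
rg = np.random.default_rng(seed=3252)

# NumPy argument order: (white balls, black balls, number of draws).
samples = rg.hypergeometric(a, b, N, size=10_000)

# SciPy argument order: (total balls, white balls, number of draws).
dist = scipy.stats.hypergeom(a + b, a, N)

# Probability of drawing exactly one white ball, and the mean of the distribution.
p_one = dist.pmf(1)
mean = dist.mean()

print(samples.mean(), mean, p_one)
```

The sample mean from NumPy should land close to the mean reported by SciPy, which is a quick way to confirm that the arguments were passed in the order each package expects.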

## Notes

• This distribution is analogous to the Binomial distribution, except that the Binomial distribution describes draws from an urn with replacement. In the analogy, the Binomial parameter $$\theta$$ is $$\theta = a/(a+b)$$.
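
The analogy with the Binomial distribution can be made concrete: when the urn is much larger than the number of draws, drawing without replacement barely changes the urn's composition, so the two PMFs nearly coincide. The sketch below illustrates this with the standard library only; the helper names, urn sizes, and the value of $$\theta$$ are hypothetical.

```python
from math import comb


def hypergeom_pmf(n, N, a, b):
    """Hypergeometric PMF (draws without replacement)."""
    return comb(a, n) * comb(b, N - n) / comb(a + b, N)


def binom_pmf(n, N, theta):
    """Binomial PMF (draws with replacement)."""
    return comb(N, n) * theta**n * (1 - theta) ** (N - n)


# Illustrative (assumed) settings: 10 draws, roughly 30% white balls.
N, theta = 10, 0.3


def max_abs_diff(total):
    """Largest pointwise gap between the two PMFs for an urn of the given size."""
    a = round(theta * total)
    b = total - a
    return max(
        abs(hypergeom_pmf(n, N, a, b) - binom_pmf(n, N, a / (a + b)))
        for n in range(N + 1)
    )


diff_small = max_abs_diff(50)      # small urn: replacement matters
diff_large = max_abs_diff(50_000)  # large urn: replacement barely matters

print(diff_small, diff_large)
```

The gap for the large urn should be far smaller than for the small one, consistent with the factor $$(a+b-N)/(a+b-1)$$ in the variance approaching one as $$a+b$$ grows.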

• SciPy uses a different parametrization than NumPy and Stan. Let $$M = a+b$$ be the total number of balls in the urn. With the parameters listed in the order that scipy.stats.hypergeom expects, namely $$(M, a, N)$$, the PMF may be written as

\begin{split}\begin{align} f(n;M,a,N) = \frac{\begin{pmatrix}a \\ n\end{pmatrix} \begin{pmatrix}M-a \\ N-n\end{pmatrix}}{\begin{pmatrix}M \\ N\end{pmatrix}}. \end{align}\end{split}

• NumPy and Stan use the same parametrization but order the arguments differently: NumPy takes $$(a, b, N)$$, whereas Stan takes $$(N, a, b)$$.