Negative Binomial distribution


We perform a series of Bernoulli trials with probability \(\beta/(1+\beta)\) of success. The number of failures, \(y\), before we get \(\alpha\) successes is Negative Binomially distributed.

An equivalent story is this: Draw a parameter \(\lambda\) out of a Gamma distribution with parameters \(\alpha\) and \(\beta\). Then draw a number \(y\) out of a Poisson distribution with parameter \(\lambda\). Then \(y\) is Negative Binomially distributed with parameters \(\alpha\) and \(\beta\). For this reason, the Negative Binomial distribution is sometimes called the Gamma-Poisson distribution.
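The equivalence of the two stories can be checked numerically. The following is a minimal sketch, assuming `rg` is a NumPy random generator created with `np.random.default_rng` and using arbitrary illustrative values for `alpha`, `beta`, and the seed. Note that NumPy's Gamma sampler takes a scale parameter, so the rate \(\beta\) enters as `1 / beta`.

```python
import numpy as np

# Assumption: rg is a NumPy Generator; the seed is arbitrary.
rg = np.random.default_rng(3252)

# Illustrative parameter values (not from the text)
alpha, beta = 5.0, 0.5
n_samples = 200_000

# Story 1: number of failures before alpha successes,
# with success probability beta / (1 + beta)
y_direct = rg.negative_binomial(alpha, beta / (1 + beta), size=n_samples)

# Story 2: draw lambda from a Gamma distribution with shape alpha and
# rate beta (scale 1 / beta), then draw y from Poisson(lambda)
lam = rg.gamma(alpha, 1 / beta, size=n_samples)
y_mixture = rg.poisson(lam)

# Both sets of samples should have similar sample means and variances
print("means:    ", y_direct.mean(), y_mixture.mean())
print("variances:", y_direct.var(), y_mixture.var())
```

With these values, both sample means should land near \(\alpha/\beta = 10\) and both sample variances near \(\alpha(1+\beta)/\beta^2 = 30\), up to sampling noise.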


Bursty gene expression can give mRNA count distributions that are Negative Binomially distributed. Here, “success” is that a burst in gene expression stops. In this case, the parameter \(1/\beta\) is the mean number of transcripts in a burst of expression. The parameter \(\alpha\) is related to the frequency of the bursts. If multiple bursts are possible within the lifetime of mRNA, then \(\alpha > 1\). Then, the number of “failures” is the number of mRNA transcripts that are made in the characteristic lifetime of mRNA.


There are two parameters: \(\alpha\), the desired number of successes, and \(\beta\), which is the rate parameter (inverse scale) of the Gamma distribution that gives the Negative Binomial in the Gamma-Poisson story. The probability of success of each Bernoulli trial is given by \(\beta/(1+\beta)\).


The Negative Binomial distribution is supported on the set of nonnegative integers.

Probability mass function

\[\begin{align} f(y;\alpha,\beta) = \begin{pmatrix} y+\alpha-1 \\ \alpha-1 \end{pmatrix} \left(\frac{\beta}{1+\beta}\right)^\alpha \left(\frac{1}{1+\beta}\right)^y. \end{align}\]

Generally speaking, \(\alpha\) need not be an integer, so we may write the PMF as

\[\begin{align} f(y;\alpha,\beta) = \frac{\Gamma(y+\alpha)}{\Gamma(\alpha) \, y!}\,\left(\frac{\beta}{1+\beta}\right)^\alpha \left(\frac{1}{1+\beta}\right)^y. \end{align}\]
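As a sketch of how the Gamma-function form of the PMF might be evaluated, the function below works on the log scale with `scipy.special.gammaln` to avoid overflow for large \(y\). The name `neg_binom_pmf` and the parameter values are hypothetical, chosen only for illustration; the result is compared against `scipy.stats.nbinom.pmf` with \(n=\alpha\) and \(p=\beta/(1+\beta)\).

```python
import numpy as np
import scipy.stats
from scipy.special import gammaln


def neg_binom_pmf(y, alpha, beta):
    """Negative Binomial PMF in the (alpha, beta) parametrization.

    Hypothetical helper; alpha need not be an integer.
    """
    y = np.asarray(y, dtype=float)
    log_pmf = (
        gammaln(y + alpha) - gammaln(alpha) - gammaln(y + 1)  # Gamma(y+alpha) / (Gamma(alpha) y!)
        + alpha * np.log(beta / (1 + beta))                   # (beta / (1+beta))^alpha
        - y * np.log1p(beta)                                  # (1 / (1+beta))^y
    )
    return np.exp(log_pmf)


# Illustrative values with a non-integer alpha
alpha, beta = 2.5, 0.4
y = np.arange(50)

pmf = neg_binom_pmf(y, alpha, beta)
pmf_scipy = scipy.stats.nbinom.pmf(y, alpha, beta / (1 + beta))

print(np.max(np.abs(pmf - pmf_scipy)))  # largest discrepancy between the two
```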

See the notes below for other parametrizations.


Mean: \(\displaystyle{\frac{\alpha}{\beta}}\)

Variance: \(\displaystyle{\frac{\alpha(1+\beta)}{\beta^2}}\)
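These expressions can be compared with the moments SciPy reports. This is a small sketch with arbitrary illustrative parameter values, using SciPy's \((n, p)\) arguments with \(n=\alpha\) and \(p=\beta/(1+\beta)\).

```python
import scipy.stats

# Illustrative values (not from the text)
alpha, beta = 3.0, 0.25

# Frozen SciPy distribution with n = alpha, p = beta / (1 + beta)
dist = scipy.stats.nbinom(alpha, beta / (1 + beta))
mean, var = dist.stats(moments="mv")

# Closed-form moments in the (alpha, beta) parametrization
mean_formula = alpha / beta
var_formula = alpha * (1 + beta) / beta**2

print("mean:    ", float(mean), mean_formula)
print("variance:", float(var), var_formula)
```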





NumPy

rg.negative_binomial(alpha, beta/(1+beta))

NumPy with (µ, φ) parametrization

rg.negative_binomial(phi, phi/(mu+phi))


SciPy

scipy.stats.nbinom(alpha, beta/(1+beta))

SciPy with (µ, φ) parametrization

scipy.stats.nbinom(phi, phi/(mu+phi))


Stan

neg_binomial(alpha, beta)

Stan with (µ, φ) parametrization

neg_binomial_2(mu, phi)
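A short usage sketch for the NumPy and SciPy calls follows. It assumes `rg` refers to a NumPy `Generator` created with `np.random.default_rng()`, and the values of `mu` and `phi` are arbitrary.

```python
import numpy as np
import scipy.stats

# Assumption: rg is a NumPy Generator instance
rg = np.random.default_rng()

# Illustrative (mu, phi) values
mu, phi = 10.0, 5.0
p = phi / (mu + phi)

# Draw samples with NumPy
samples = rg.negative_binomial(phi, p, size=1000)

# Build a frozen SciPy distribution to evaluate the PMF, CDF, and moments
dist = scipy.stats.nbinom(phi, p)
print("P(y = 0)   =", dist.pmf(0))
print("P(y <= 10) =", dist.cdf(10))
print("mean       =", dist.mean())
```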


  • The Negative Binomial distribution may be parametrized such that the probability mass function is

\[\begin{align} f(y;\mu,\phi) = \frac{\Gamma(y+\phi)}{\Gamma(\phi) \, y!}\,\left(\frac{\phi}{\mu+\phi}\right)^\phi\left(\frac{\mu}{\mu+\phi}\right)^y. \end{align}\]

These parameters are related to the parametrization above by \(\phi = \alpha\) and \(\mu = \alpha/\beta\). In the limit of \(\phi\to\infty\), which can be taken for the PMF, the Negative Binomial distribution becomes Poisson with parameter \(\mu\). This also gives meaning to the parameters \(\mu\) and \(\phi\); \(\mu\) is the mean of the Negative Binomial, and \(\phi\) controls extra width of the distribution beyond Poisson. The smaller \(\phi\) is, the broader the distribution.

In this parametrization, the pertinent moments are

Mean: \(\displaystyle{\mu}\)

Variance: \(\displaystyle{\mu\left(1 + \frac{\mu}{\phi}\right)}\).

In Stan, the Negative Binomial distribution using the \((\mu,\phi)\) parametrization is called neg_binomial_2.

  • SciPy and NumPy use yet another parametrization. The PMF for SciPy is

\[\begin{align} f(y;n, p) = \frac{\Gamma(y+n)}{\Gamma(n) \, y!}\,p^n \left(1-p\right)^y. \end{align}\]

The parameter \(p\) is the probability of success of a Bernoulli trial, as defined in the story above, so \(1-p\) is the probability of failure. The parameters are related to the others we have defined by \(n=\alpha=\phi\) and \(p=\beta/(1+\beta) = \phi/(\mu+\phi)\). In this parametrization, the pertinent moments are

Mean: \(\displaystyle{n\,\frac{1-p}{p}}\)

Variance: \(\displaystyle{n\,\frac{1-p}{p^2}}\).

Note that Wikipedia uses this parametrization except defining \(p\) to be the probability of failure of a Bernoulli trial, in accordance with the story above.
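The relationships among the three parametrizations in these notes can be sketched as small conversion helpers. The function names below are hypothetical and the parameter values are arbitrary. The second half of the sketch checks the Poisson limit numerically by comparing the Negative Binomial PMF to a Poisson PMF with the same \(\mu\) as \(\phi\) grows.

```python
import numpy as np
import scipy.stats


def alpha_beta_to_n_p(alpha, beta):
    """Convert (alpha, beta) to SciPy/NumPy's (n, p). Hypothetical helper."""
    return alpha, beta / (1 + beta)


def mu_phi_to_n_p(mu, phi):
    """Convert (mu, phi) to SciPy/NumPy's (n, p). Hypothetical helper."""
    return phi, phi / (mu + phi)


# The same distribution expressed two ways: phi = alpha, mu = alpha / beta
alpha, beta = 2.0, 0.5
mu, phi = alpha / beta, alpha
n_p_from_alpha_beta = alpha_beta_to_n_p(alpha, beta)
n_p_from_mu_phi = mu_phi_to_n_p(mu, phi)
print(n_p_from_alpha_beta, n_p_from_mu_phi)

# Poisson limit: hold mu fixed and increase phi
y = np.arange(40)
poisson_pmf = scipy.stats.poisson.pmf(y, mu)
max_diffs = []
for phi_value in [1.0, 10.0, 1e3, 1e6]:
    n, p = mu_phi_to_n_p(mu, phi_value)
    nb_pmf = scipy.stats.nbinom.pmf(y, n, p)
    max_diffs.append(np.max(np.abs(nb_pmf - poisson_pmf)))
    print(f"phi = {phi_value:g}: max |NB - Poisson| = {max_diffs[-1]:.2e}")
```

The two `(n, p)` pairs should agree, and the maximum difference from the Poisson PMF should shrink as `phi_value` increases.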

PMF and CDF plots

[Plots of the PMF and CDF in the α-β formulation]

[Plots of the PMF and CDF in the µ-φ formulation]