Beta distribution


Story

Say you wait for two multistep Poisson processes to arrive. The individual steps of each process happen at the same rate, but the first multistep process requires \(\alpha\) steps and the second requires \(\beta\) steps. The fraction of the total waiting time taken by the first process is Beta distributed.
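The story can be checked by simulation. The sketch below uses only the Python standard library and assumes that the "total waiting time" is the sum of the two processes' waiting times, and that each step takes an exponentially distributed time with a common rate. The function name `waiting_time_fraction`, the seed, and the parameter values are illustrative choices, not part of the story itself.

```python
import random

def waiting_time_fraction(alpha, beta, rate, rng):
    """Fraction of the total wait spent on the first multistep process."""
    # Each step of a Poisson process takes an exponentially distributed time.
    t_first = sum(rng.expovariate(rate) for _ in range(alpha))
    t_second = sum(rng.expovariate(rate) for _ in range(beta))
    return t_first / (t_first + t_second)

rng = random.Random(3252)          # arbitrary seed for reproducibility
alpha, beta, rate = 2, 3, 1.5      # illustrative values; the steps share one rate

fractions = [waiting_time_fraction(alpha, beta, rate, rng) for _ in range(20_000)]
sample_mean = sum(fractions) / len(fractions)

# The sample mean should be close to the Beta mean, alpha / (alpha + beta).
print(sample_mean, alpha / (alpha + beta))
```

Changing `rate` should leave the distribution of the fractions unchanged, since only the ratio of the waiting times matters.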


Parameters

There are two shape parameters, both strictly positive: \(\alpha\) and \(\beta\). In the story above they are numbers of steps, but in general they may take any positive real value.


Support

The Beta distribution has support on the interval [0, 1].


Probability density function

\[\begin{align} f(\theta; \alpha, \beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha, \beta)}, \end{align}\]

where \(B(\alpha, \beta)\) is the beta function.
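As a rough illustration, the density can be evaluated directly from the formula using the identity \(B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)\). The helper names `beta_function` and `beta_pdf` below are hypothetical, and the parameter values are chosen only for the example; in practice a library routine is likely preferable.

```python
import math

def beta_function(alpha, beta):
    """B(alpha, beta) = Gamma(alpha) Gamma(beta) / Gamma(alpha + beta)."""
    # Work with log-gamma to avoid overflow for large shape parameters.
    return math.exp(math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta))

def beta_pdf(theta, alpha, beta):
    """Beta PDF at theta; zero outside the support [0, 1]."""
    if not 0.0 <= theta <= 1.0:
        return 0.0
    # Note: at theta = 0 with alpha < 1 (or theta = 1 with beta < 1) the
    # density diverges and this expression raises ZeroDivisionError.
    return theta ** (alpha - 1) * (1 - theta) ** (beta - 1) / beta_function(alpha, beta)

# Midpoint-rule check that the density integrates to one on [0, 1].
n = 10_000
total = sum(beta_pdf((i + 0.5) / n, 2.0, 3.0) for i in range(n)) / n
print(beta_pdf(0.5, 2.0, 3.0), total)
```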


Cumulative distribution function

\[\begin{align} F(\theta; \alpha, \beta) = I_\theta(\alpha, \beta) = \frac{1}{B(\alpha, \beta)}\,\int_0^\theta \mathrm{d}x\,x^{\alpha-1}(1-x)^{\beta-1}, \end{align}\]

where \(I_\theta(\alpha, \beta)\) is the regularized incomplete beta function.
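For a few parameter values the integral has a simple closed form. As a worked example (the choice \(\alpha = 1\) is only for illustration), note that \(B(1, \beta) = 1/\beta\), so

\[\begin{align} F(\theta; 1, \beta) = \beta \int_0^\theta \mathrm{d}x\,(1-x)^{\beta-1} = 1 - (1-\theta)^\beta. \end{align}\]

Setting \(\beta = 1\) as well gives \(F(\theta; 1, 1) = \theta\), the CDF of the Uniform distribution on [0, 1].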


Moments

Mean: \(\displaystyle{\frac{\alpha}{\alpha + \beta}}\)

Variance: \(\displaystyle{\frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}}\)
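The moment formulas can be sanity-checked against samples. This is a minimal sketch using the standard library's `random.betavariate`; the seed, sample size, and parameter values are arbitrary choices made for the example.

```python
import random
import statistics

alpha, beta = 2.0, 5.0             # illustrative shape parameters
rng = random.Random(8675309)       # arbitrary seed for reproducibility
samples = [rng.betavariate(alpha, beta) for _ in range(50_000)]

# Moments from the formulas above.
mean_formula = alpha / (alpha + beta)
var_formula = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

# Moments estimated from the samples.
sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples, mu=sample_mean)

print(mean_formula, sample_mean)
print(var_formula, sample_var)
```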


Usage

| Package | Syntax |
|---|---|
| NumPy | `rng.beta(alpha, beta)` |
| NumPy with (φ, κ) parametrization | `rng.beta(phi*kappa, (1-phi)*kappa)` |
| SciPy | `scipy.stats.beta(alpha, beta)` |
| SciPy with (φ, κ) parametrization | `scipy.stats.beta(phi*kappa, (1-phi)*kappa)` |
| Distributions.jl | `Beta(alpha, beta)` |
| Distributions.jl with (φ, κ) parametrization | `Beta(phi*kappa, (1-phi)*kappa)` |
| Stan | `beta(alpha, beta)` |
| Stan with (φ, κ) parametrization | `beta(phi*kappa, (1-phi)*kappa)` |
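To put the NumPy and SciPy entries in context, here is a short sketch that draws samples and evaluates the PDF and CDF starting from a (φ, κ) specification. It assumes `rng` is a generator created with `np.random.default_rng`; the seed, sample size, and values of `phi` and `kappa` are illustrative.

```python
import numpy as np
import scipy.stats

phi, kappa = 0.3, 10.0                        # illustrative location and concentration
alpha, beta = phi * kappa, (1 - phi) * kappa  # convert to shape parameters

# Sampling with NumPy's Generator interface.
rng = np.random.default_rng(seed=3252)
samples = rng.beta(alpha, beta, size=100_000)

# A "frozen" SciPy distribution exposes the PDF, CDF, and moments.
dist = scipy.stats.beta(alpha, beta)

print(samples.mean(), dist.mean())   # both should be close to phi
print(dist.pdf(0.3), dist.cdf(0.3))  # density and cumulative probability at 0.3
```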



Notes

  • The story of the Beta distribution is difficult to parse. Most importantly, the Beta distribution allows us to put probabilities on unknown probabilities. It is only defined on \(0 \le \theta \le 1\), and \(\theta\) here can be interpreted as a probability, say of success in a Bernoulli trial.

  • The case where \(\alpha = \beta = 0\) is not technically a probability distribution because the PDF cannot be normalized. Nonetheless, it is often used as an improper prior, and this prior is known as the Haldane prior, named after biologist J. B. S. Haldane. The case where \(\alpha = \beta = 1/2\) is sometimes called a Jeffreys prior.

  • The Beta distribution may also be parametrized in terms of the location parameter \(\phi\) and concentration \(\kappa\), which are related to \(\alpha\) and \(\beta\) as

    \[\begin{split}\begin{align} &\phi = \frac{\alpha}{\alpha + \beta}, \\ &\kappa = \alpha + \beta. \end{align}\end{split}\]

    The location parameter \(\phi\) is the mean of the distribution, and the concentration \(\kappa\) controls how narrow it is: for fixed \(\phi\), a larger \(\kappa\) gives a smaller variance. To convert back to an \((\alpha, \beta)\) parametrization from a \((\phi, \kappa)\) parametrization, use

    \[\begin{split}\begin{align} &\alpha = \phi \kappa, \\ &\beta = (1-\phi)\kappa. \end{align}\end{split}\]

    The mean and variance in terms of \(\phi\) and \(\kappa\) are

    Mean: \(\displaystyle{\phi}\)

    Variance: \(\displaystyle{\frac{\phi(1-\phi)}{1+\kappa}}\).
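The two conversions in the last note can be written as a pair of small helpers. The function names `to_phi_kappa` and `to_alpha_beta` are hypothetical, and the numerical values are only an example; the sketch also checks that the two variance expressions agree.

```python
def to_phi_kappa(alpha, beta):
    """Convert shape parameters to (location, concentration)."""
    kappa = alpha + beta
    return alpha / kappa, kappa

def to_alpha_beta(phi, kappa):
    """Convert (location, concentration) back to shape parameters."""
    return phi * kappa, (1 - phi) * kappa

alpha, beta = 2.0, 6.0                        # illustrative values
phi, kappa = to_phi_kappa(alpha, beta)
alpha_back, beta_back = to_alpha_beta(phi, kappa)

# The variance written in either parametrization should be the same number.
var_ab = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
var_pk = phi * (1 - phi) / (1 + kappa)

print(phi, kappa)
print(alpha_back, beta_back)
print(var_ab, var_pk)
```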


PDF and CDF plots

In the α-β formulation: [plots of the PDF and CDF]

In the φ-κ formulation: [plots of the PDF and CDF]