# Beta distribution

## Story

Say you wait for two multistep Poisson processes to arrive. The individual steps of each process happen at the same rate, but the first multistep process requires $$\alpha$$ steps and the second requires $$\beta$$ steps. The fraction of the total waiting time taken by the first process is Beta distributed.

## Parameters

There are two parameters, both strictly positive: $$\alpha$$ and $$\beta$$, defined in the above story.

## Support

The Beta distribution has support on the interval [0, 1].

## Probability density function

\begin{align} f(\theta; \alpha, \beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha, \beta)}, \end{align}

where

\begin{align} B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)} \end{align}

is the Beta function.
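As a sketch, the PDF above can be coded up directly and checked against SciPy's implementation (the helper `beta_pdf` and the parameter values are illustrative, not part of the document):

```python
import numpy as np
from scipy.special import gamma
from scipy.stats import beta as beta_dist


def beta_pdf(theta, a, b):
    """Beta PDF computed directly from the formula above.

    Uses B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b); fine for
    moderate a, b (for large values, work with log-gammas instead).
    """
    B = gamma(a) * gamma(b) / gamma(a + b)
    return theta ** (a - 1) * (1 - theta) ** (b - 1) / B


# Example parameters (arbitrary choice for illustration)
theta, a, b = 0.3, 2.0, 5.0
direct = beta_pdf(theta, a, b)
scipy_val = beta_dist.pdf(theta, a, b)
```

The two values agree to floating-point precision.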

## Moments

Mean: $$\displaystyle{\frac{\alpha}{\alpha + \beta}}$$

Variance: $$\displaystyle{\frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}}$$
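As a quick numerical check of the two formulas above, here is a sketch comparing them with SciPy's built-in moments (parameter values are an arbitrary illustration):

```python
from scipy.stats import beta as beta_dist

a, b = 2.0, 5.0

# Moments from the formulas above
mean = a / (a + b)                               # alpha / (alpha + beta)
var = a * b / ((a + b) ** 2 * (a + b + 1))       # alpha*beta / ((alpha+beta)^2 (alpha+beta+1))

# SciPy's mean and variance for the same distribution
m, v = beta_dist.stats(a, b, moments="mv")
```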

## Usage

| Package | Syntax |
| --- | --- |
| NumPy | `rg.beta(alpha, beta)` |
| SciPy | `scipy.stats.beta(alpha, beta)` |
| Stan | `beta(alpha, beta)` |
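A minimal sketch of the NumPy and SciPy syntax above, assuming `rg` is a NumPy `Generator` instance (the parameter values and sample size are illustrative):

```python
import numpy as np
import scipy.stats

alpha, beta = 2.0, 5.0

# NumPy: rg is a Generator; rg.beta draws samples directly
rg = np.random.default_rng(seed=3252)
samples = rg.beta(alpha, beta, size=10_000)

# SciPy: a frozen distribution object with pdf, cdf, rvs, etc.
dist = scipy.stats.beta(alpha, beta)
density_at_half = dist.pdf(0.5)
more_samples = dist.rvs(size=10_000, random_state=rg)
```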

## Notes

• The story of the Beta distribution is difficult to parse. Most importantly, the Beta distribution allows us to put probabilities on unknown probabilities. It is only defined on $$0 \le \theta \le 1$$, and $$\theta$$ here can be interpreted as a probability, say of success in a Bernoulli trial.

• The case where $$\alpha = \beta = 0$$ is not technically a probability distribution because the PDF cannot be normalized. Nonetheless, it is often used as an improper prior, and this prior is known as a Haldane prior, named after biologist J. B. S. Haldane. The case where $$\alpha = \beta = 1/2$$ is sometimes called a Jeffreys prior.

• The Beta distribution may also be parametrized in terms of the location parameter $$\phi$$ and concentration $$\kappa$$, which are related to $$\alpha$$ and $$\beta$$ as

\begin{split}\begin{align} &\phi = \frac{\alpha}{\alpha + \beta}, \\ &\kappa = \alpha + \beta. \end{align}\end{split}

The location parameter $$\phi$$ is the mean of the distribution and $$\kappa$$ is a measure of how broad it is. To convert back to an $$(\alpha, \beta)$$ parametrization from a $$(\phi, \kappa)$$ parametrization, use

\begin{split}\begin{align} &\alpha = \phi \kappa, \\ &\beta = (1-\phi)\kappa. \end{align}\end{split}
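The two conversions above can be sketched as a pair of small functions (the function names are illustrative, not a standard API):

```python
def phi_kappa_to_alpha_beta(phi, kappa):
    """Convert (phi, kappa) parametrization to (alpha, beta)."""
    return phi * kappa, (1.0 - phi) * kappa


def alpha_beta_to_phi_kappa(alpha, beta):
    """Convert (alpha, beta) parametrization to (phi, kappa)."""
    return alpha / (alpha + beta), alpha + beta


# Round trip: phi = 0.3, kappa = 10 gives alpha = 3, beta = 7
a, b = phi_kappa_to_alpha_beta(0.3, 10.0)
phi, kappa = alpha_beta_to_phi_kappa(a, b)
```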

The mean and variance in terms of $$\phi$$ and $$\kappa$$ are

Mean: $$\displaystyle{\phi}$$

Variance: $$\displaystyle{\frac{\phi(1-\phi)}{1+\kappa}}$$.