Normal distribution


Story

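Any quantity that emerges as the sum of a large number of subprocesses tends to be Normally distributed, provided none of the subprocesses is very broadly distributed (that is, each has finite variance and no single one dominates the sum). This is the statement of the central limit theorem.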
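
The story can be checked numerically. The following is a minimal sketch, assuming NumPy is available and taking each subprocess to be Uniform(0, 1); that choice, the seed, and the sample sizes are illustrative assumptions and not part of the story itself. A sum of n such subprocesses has mean n/2 and variance n/12, so the sums should look like draws from a Normal distribution with those moments.

```python
import numpy as np

rng = np.random.default_rng(seed=3252)

n_subprocesses = 100  # terms in each sum (assumed value)
n_samples = 20_000    # number of sums to draw (assumed value)

# Each subprocess is Uniform(0, 1), which has mean 1/2 and variance 1/12.
sums = rng.uniform(0.0, 1.0, size=(n_samples, n_subprocesses)).sum(axis=1)

# Parameters of the Normal distribution the sums should approach.
mu = n_subprocesses / 2
sigma = np.sqrt(n_subprocesses / 12)

# A Normal distribution puts about 68.27% of its mass within one sigma of mu.
frac_within_one_sigma = np.mean(np.abs(sums - mu) < sigma)

print(f"sample mean {sums.mean():.3f}, expected {mu:.3f}")
print(f"sample std  {sums.std():.3f}, expected {sigma:.3f}")
print(f"fraction within one sigma: {frac_within_one_sigma:.3f}")
```

Replacing the uniform draws with a heavy-tailed distribution (one without finite variance) is a quick way to see the "not very broadly distributed" condition fail.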


Example

We measure the lengths of many C. elegans eggs. The lengths are approximately Normally distributed. Many biological measurements, like the heights of people, are approximately Normally distributed because many processes contribute to setting the length of an egg or the height of a person.


Parameters

The Normal distribution has two parameters, the location parameter \(\mu\), which determines the location of its peak, and the scale parameter \(\sigma\), which is strictly positive (the \(\sigma \to 0\) limit defines a Dirac delta function) and determines the width of the peak.

These parameters are commonly referred to as the mean and standard deviation, respectively. Because those terms are also widely used in other contexts, such as for point estimates of the moments of an arbitrary distribution, we avoid them here to prevent confusion.


Support

The Normal distribution is supported on the set of real numbers.


Probability density function

\[\begin{align} f(y;\mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\mathrm{e}^{-(y-\mu)^2/(2\sigma^2)}. \end{align}\]
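
As a minimal sketch, the density can be evaluated directly from the formula using only the Python standard library. The function name normal_pdf and the parameter values below are illustrative assumptions; statistics.NormalDist is used only as an independent cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(y, mu, sigma):
    """Normal PDF at y, written term by term from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be strictly positive")
    normalization = 1.0 / math.sqrt(2.0 * math.pi * sigma**2)
    return normalization * math.exp(-((y - mu) ** 2) / (2.0 * sigma**2))


# Illustrative parameter values (assumptions, not taken from the text).
mu, sigma = 50.0, 2.0

for y in (46.0, 50.0, 53.0):
    by_hand = normal_pdf(y, mu, sigma)
    reference = NormalDist(mu, sigma).pdf(y)
    print(f"y = {y}: by hand {by_hand:.6f}, NormalDist {reference:.6f}")
```

At y = mu the exponential factor equals one, so the peak height is just the normalization constant, which shrinks as sigma grows.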

Cumulative distribution function

\[\begin{align} F(y;\mu, \sigma) =\frac{1}{2}\left(1 + \text{erf}\left(\frac{y - \mu}{\sigma\sqrt{2}}\right)\right), \end{align}\]

where \(\text{erf}(x)\) denotes the error function.
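
The same kind of check works for the cumulative distribution function, since the error function is available as math.erf in the Python standard library. This is a sketch under assumed parameter values; normal_cdf is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def normal_cdf(y, mu, sigma):
    """Normal CDF at y, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


# Illustrative parameter values (assumptions, not taken from the text).
mu, sigma = 50.0, 2.0

# Half of the probability mass lies below the location parameter.
print(normal_cdf(mu, mu, sigma))

# Probability of landing within one sigma of mu.
p_one_sigma = normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma)
print(f"P(|y - mu| < sigma) = {p_one_sigma:.4f}")

# Cross-check against the standard library's own implementation.
print(normal_cdf(53.0, mu, sigma), NormalDist(mu, sigma).cdf(53.0))
```

Differences of CDF values, as in p_one_sigma, give the probability of any interval without numerical integration of the density.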


Moments

Mean: \(\mu\)

Variance: \(\sigma^2\)


Usage

Package             Syntax
NumPy               rng.normal(mu, sigma)
SciPy               scipy.stats.norm(mu, sigma)
Distributions.jl    Normal(mu, sigma)
Stan                normal(mu, sigma)
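
A short sketch of the NumPy and SciPy entries in use follows. It assumes that rng refers to a NumPy Generator created with np.random.default_rng(); the seed, the parameter values, and the sample size are illustrative assumptions.

```python
import numpy as np
import scipy.stats

# Illustrative parameter values (assumptions, not taken from the text).
mu, sigma = 50.0, 2.0

# NumPy: draw samples from a Generator; size sets the number of draws.
rng = np.random.default_rng(seed=3252)
samples = rng.normal(mu, sigma, size=10_000)

# SciPy: a "frozen" distribution with location mu and scale sigma.
dist = scipy.stats.norm(mu, sigma)

print("PDF at mu:", dist.pdf(mu))
print("CDF at mu:", dist.cdf(mu))
print("mean and variance:", dist.mean(), dist.var())

# Sample moments should land close to mu and sigma**2.
print("sample mean and variance:", samples.mean(), samples.var())
```

Note that both libraries take the scale parameter sigma, not the variance, as the second argument.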



Notes

  • The Normal distribution is often referred to as the Gaussian distribution, particularly in the physical sciences. The name honors one of its discoverers, Carl Friedrich Gauss.


PDF and CDF plots