# Pareto distribution

## Story

There is no real story to the Pareto distribution, except that it is a continuous distribution whose PDF follows a power law, $$f(y) \sim y^{-\alpha-1}$$. Such distributions often arise in physical scenarios.

## Example

The Gutenberg-Richter Law implies that the energies released by earthquakes in a given region are approximately Pareto distributed (the law itself is stated in terms of magnitudes, which are logarithmic in energy). Other random variables that are often described by power laws include the sizes of human settlements (many small towns, a few huge cities) and incomes (many poor, a few obscenely rich).

## Parameters

The Pareto distribution has two parameters, $$\alpha$$ and $$y_\mathrm{min}$$. The parameter $$\alpha$$ sets the power in the power law and $$y_\mathrm{min}$$ is a lower cutoff to ensure that the distribution is normalizable. Both $$\alpha$$ and $$y_\mathrm{min}$$ must be positive.

## Support

The Pareto distribution has support on real numbers greater than or equal to $$y_\mathrm{min}$$.

## Probability density function

\begin{align} f(y;y_\mathrm{min}, \alpha) = \frac{\alpha}{y} \,\left(\frac{y_\mathrm{min}}{y}\right)^\alpha. \end{align}
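
As a quick sanity check on the PDF above, the following is a minimal sketch that evaluates it with NumPy and confirms numerically that it integrates to approximately one. The function name `pareto_pdf`, the parameter values, and the integration grid are illustrative choices, not part of any library.

```python
import numpy as np


def pareto_pdf(y, y_min, alpha):
    """Pareto PDF: (alpha / y) * (y_min / y)**alpha for y >= y_min, else 0."""
    y = np.asarray(y, dtype=float)
    # Clamp values below the cutoff so the formula is only evaluated on the support
    y_safe = np.maximum(y, y_min)
    pdf = alpha / y_safe * (y_min / y_safe) ** alpha
    return np.where(y >= y_min, pdf, 0.0)


# Illustrative parameter values
y_min, alpha = 1.0, 2.0

# Integrate on a log-spaced grid with the trapezoidal rule
y = np.geomspace(y_min, 1e4 * y_min, 200_001)
f = pareto_pdf(y, y_min, alpha)
total = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y))

print(pareto_pdf([0.5, 1.0, 2.0], y_min, alpha))  # values below y_min are zero
print(total)  # close to 1; the mass beyond the grid is (1e-4)**alpha
```

The log-spaced grid is used because most of the probability mass sits near $$y_\mathrm{min}$$ while the tail extends over several decades.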

## Moments

Mean: The mean is infinite for $$\alpha \le 1$$ and $$\displaystyle{\frac{\alpha y_\mathrm{min}}{\alpha - 1}}$$ for $$\alpha > 1$$.

Variance: The variance is infinite for $$\alpha \le 2$$ and $$\displaystyle{\frac{\alpha y_\mathrm{min}^2}{(\alpha - 1)^2(\alpha - 2)}}$$ for $$\alpha > 2$$.
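
The moment formulas above can be checked against samples. This is a rough sketch: the helper names `pareto_mean` and `pareto_variance` are hypothetical, the seed is arbitrary, and $$\alpha = 5$$ is chosen only so that the sample variance converges reasonably quickly. For $$\alpha$$ close to 2 the sample variance is very noisy.

```python
import numpy as np


def pareto_mean(y_min, alpha):
    """Mean of the Pareto distribution; infinite for alpha <= 1."""
    if alpha <= 1:
        return np.inf
    return alpha * y_min / (alpha - 1)


def pareto_variance(y_min, alpha):
    """Variance of the Pareto distribution; infinite for alpha <= 2."""
    if alpha <= 2:
        return np.inf
    return alpha * y_min**2 / ((alpha - 1) ** 2 * (alpha - 2))


# Illustrative values; alpha is large enough for the sample variance to settle
y_min, alpha = 1.0, 5.0

rg = np.random.default_rng(3252)  # arbitrary seed
samples = y_min * (1 + rg.pareto(alpha, size=1_000_000))

# Each line prints the analytic value followed by the sample estimate
print(pareto_mean(y_min, alpha), samples.mean())
print(pareto_variance(y_min, alpha), samples.var())
```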

## Usage

| Package | Syntax |
|---|---|
| NumPy | `y_min * (1 + rg.pareto(alpha))` |
| SciPy | `scipy.stats.pareto(alpha, scale=y_min)` |
| Stan | `pareto(y_min, alpha)` |
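
To illustrate the NumPy and SciPy entries, here is a small sketch that draws samples both ways and compares them with the analytic median $$y_\mathrm{min}\,2^{1/\alpha}$$, which follows from setting the CDF equal to 1/2. It assumes `rg` is a NumPy `Generator` created with `np.random.default_rng`; the seed, sample size, and parameter values are arbitrary.

```python
import numpy as np
import scipy.stats

# Illustrative parameter values
y_min, alpha = 2.0, 3.0
n = 100_000

rg = np.random.default_rng(3252)  # arbitrary seed

# NumPy: rg.pareto draws from a Lomax distribution, so shift by one and rescale
y_numpy = y_min * (1 + rg.pareto(alpha, size=n))

# SciPy: alpha is the shape parameter and y_min enters as the scale
dist = scipy.stats.pareto(alpha, scale=y_min)
y_scipy = dist.rvs(size=n, random_state=rg)

# Both sample sets should respect the support and share the same median
median = y_min * 2 ** (1 / alpha)
print(y_numpy.min(), y_scipy.min())
print(median, np.median(y_numpy), np.median(y_scipy), dist.median())
```

Forgetting the `1 +` in the NumPy expression would give samples with support starting at zero rather than at `y_min`.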

## Notes

• A Pareto distribution is sometimes referred to as a power law distribution. Generically, a distribution is said to be a power law distribution if its tail decays like $$y^{-\beta}$$ for some positive $$\beta$$.

• The Type II Pareto distribution is often used. It is a Pareto distribution, except with a redefinition of $$y \to y - \mu + y_\mathrm{min}$$. This shifts $$y$$ such that its support starts at $$y=\mu$$. In the case where $$\mu = 0$$, the Type II distribution is called a Lomax distribution. NumPy’s Pareto sampler draws from a Lomax distribution with $$y_\mathrm{min}$$ set to one. Thus, to sample out of a Pareto distribution, the transformation given in the usage table above is necessary. To use a Type II Pareto distribution in Stan, $$y_\mathrm{min}$$ is renamed $$\lambda$$, and the syntax is pareto_type_2(mu, lambda, alpha).

• The Pareto distribution is often best visualized by plotting the complementary cumulative distribution function (CCDF), denoted $$\bar{F}(y)$$, which is related to the CDF $$F(y)$$ by $$\bar{F}(y) = 1 - F(y)$$. The CCDF for a Pareto distribution is

\begin{align} \bar{F}(y) = \begin{cases} \left(\frac{y_\mathrm{min}}{y}\right)^\alpha & y \ge y_\mathrm{min}, \\ 1 & y < y_\mathrm{min}. \end{cases} \end{align}

Thus, the power law is clear. Plotted on log-log axes, the CCDF for $$y \ge y_\mathrm{min}$$ is a line with slope equal to $$-\alpha$$, as shown below for $$y_\mathrm{min} = 1$$ and $$\alpha = 2$$.
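
The slope of the CCDF on log-log axes can also be checked numerically. The sketch below builds an empirical CCDF from samples with $$y_\mathrm{min} = 1$$ and $$\alpha = 2$$ and fits a line to it in log-log coordinates. The helper `pareto_ccdf`, the seed, the sample size, and the cutoff that excludes the sparsely sampled far tail are all illustrative assumptions.

```python
import numpy as np

# Parameter values quoted in the text
y_min, alpha = 1.0, 2.0


def pareto_ccdf(y, y_min, alpha):
    """Analytic CCDF: (y_min / y)**alpha for y >= y_min, else 1."""
    y = np.asarray(y, dtype=float)
    return np.where(y >= y_min, (y_min / np.maximum(y, y_min)) ** alpha, 1.0)


# Draw samples and build the empirical CCDF (fraction of samples >= each sorted value)
rg = np.random.default_rng(3252)  # arbitrary seed
y = np.sort(y_min * (1 + rg.pareto(alpha, size=100_000)))
eccdf = 1.0 - np.arange(len(y)) / len(y)

# Largest absolute gap between the empirical and analytic CCDFs
max_gap = np.max(np.abs(eccdf - pareto_ccdf(y, y_min, alpha)))

# Fit a line on log-log axes, skipping the sparsely sampled far tail
keep = eccdf > 1e-3
slope, intercept = np.polyfit(np.log(y[keep]), np.log(eccdf[keep]), 1)

print(max_gap)  # should be small
print(slope)  # should be near -alpha
```

A least-squares fit like this is only a visual-style check. For estimating $$\alpha$$ from data, a maximum likelihood estimate is generally a better choice than fitting a line to the CCDF.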