Pareto distribution


There is no real story to the Pareto distribution, except that it is a distribution where the tail of the PDF or PMF follows a power law (\(f(y) \sim y^{-\alpha-1}\)). Such distributions often arise in physical scenarios.


The Gutenberg-Richter Law implies that the energies released by earthquakes in a given region are approximately Pareto distributed (the magnitudes themselves, being logarithmic in energy, are then exponentially distributed). Other random variables that are often described by power laws include the sizes of human settlements (many small towns, a few huge cities) and incomes (many poor, few obscenely rich).


The Pareto distribution has two parameters, \(\alpha\) and \(y_\mathrm{min}\). The parameter \(\alpha\) sets the power in the power law and \(y_\mathrm{min}\) is a lower cutoff to ensure that the distribution is normalizable. Both \(\alpha\) and \(y_\mathrm{min}\) must be positive.


The Pareto distribution has support on real numbers greater than or equal to \(y_\mathrm{min}\).

Probability density function

\[\begin{align} f(y;y_\mathrm{min}, \alpha) = \frac{\alpha}{y} \,\left(\frac{y_\mathrm{min}}{y}\right)^\alpha. \end{align}\]
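The role of \(y_\mathrm{min}\) as a normalizing cutoff can be checked by integrating the PDF over its support:

\[\begin{align} \int_{y_\mathrm{min}}^\infty \frac{\alpha}{y} \,\left(\frac{y_\mathrm{min}}{y}\right)^\alpha \mathrm{d}y = \alpha\, y_\mathrm{min}^\alpha \int_{y_\mathrm{min}}^\infty y^{-\alpha-1}\,\mathrm{d}y = \alpha\, y_\mathrm{min}^\alpha \left[-\frac{y^{-\alpha}}{\alpha}\right]_{y_\mathrm{min}}^\infty = 1. \end{align}\]

The integral converges at the upper limit because \(\alpha > 0\), and it would diverge if the lower limit were taken to zero, which is why a positive \(y_\mathrm{min}\) is required.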


Mean: The mean is infinite for \(\alpha \le 1\) and \(\displaystyle{\frac{\alpha y_\mathrm{min}}{\alpha - 1}}\) for \(\alpha > 1\).

Variance: The variance is infinite for \(\alpha \le 2\) and \(\displaystyle{\frac{\alpha y_\mathrm{min}^2}{(\alpha - 1)^2(\alpha - 2)}}\) for \(\alpha > 2\).
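As a minimal sketch, the moment formulas above can be written as two small Python functions. The names pareto_mean and pareto_variance are hypothetical helpers introduced here for illustration. They are not part of any library.

```python
import math


def pareto_mean(y_min, alpha):
    """Mean of a Pareto distribution; infinite when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * y_min / (alpha - 1)


def pareto_variance(y_min, alpha):
    """Variance of a Pareto distribution; infinite when alpha <= 2."""
    if alpha <= 2:
        return math.inf
    return alpha * y_min**2 / ((alpha - 1) ** 2 * (alpha - 2))


print(pareto_mean(1.0, 3.0))      # 1.5
print(pareto_variance(1.0, 3.0))  # 0.75
print(pareto_variance(1.0, 2.0))  # inf
```

Note that for \(1 < \alpha \le 2\) the mean is finite while the variance is not, so sample averages converge slowly in that regime.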





Usage

NumPy: y_min * (1 + rg.pareto(alpha))

SciPy: scipy.stats.pareto(alpha, scale=y_min)

Stan: pareto(y_min, alpha)
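The NumPy and SciPy expressions above might be used as in the following sketch. The parameter values, the seed, and the sample size are arbitrary choices made for illustration, and rg is assumed to be a NumPy Generator instance.

```python
import numpy as np
import scipy.stats

# Illustrative parameter values (assumptions, not from the text)
alpha, y_min = 2.0, 3.0

# rg is assumed to be a NumPy random Generator
rg = np.random.default_rng(seed=3252)

# NumPy: shift and scale the draws so the support starts at y_min
samples_np = y_min * (1 + rg.pareto(alpha, size=10_000))

# SciPy: alpha is the shape parameter and y_min enters as the scale
dist = scipy.stats.pareto(alpha, scale=y_min)
samples_sp = dist.rvs(size=10_000, random_state=rg)

# Both sets of samples lie at or above y_min
print(samples_np.min() >= y_min, samples_sp.min() >= y_min)

# SciPy's mean can be compared with alpha * y_min / (alpha - 1)
print(dist.mean(), alpha * y_min / (alpha - 1))
```

The frozen SciPy distribution also exposes the PDF and CDF through dist.pdf and dist.cdf, which is convenient for plotting.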


  • A Pareto distribution is sometimes referred to as a power law distribution. Generically, a distribution is said to be a power law distribution if its tail decays like \(y^{-\beta}\) for some positive \(\beta\).

  • The Type II Pareto distribution is often used. It is a Pareto distribution, except with a redefinition of \(y \to y - \mu + y_\mathrm{min}\). This shifts \(y\) such that its support starts at \(y=\mu\). In the case where \(\mu = 0\), the Type II distribution is called a Lomax distribution. NumPy’s Pareto sampler draws from a Lomax distribution with \(y_\mathrm{min}\) set to one. Thus, to sample out of a Pareto distribution, the transformation shown in the usage table above is necessary. To use a Type II Pareto distribution in Stan, \(y_\mathrm{min}\) is renamed \(\lambda\), and the syntax is pareto_type_2(mu, lambda, alpha).

  • The Pareto distribution is often best visualized by plotting the complementary cumulative distribution function (CCDF), denoted \(\bar{F}(y)\), which is related to the CDF \(F(y)\) by \(\bar{F}(y) = 1 - F(y)\). The CCDF for a Pareto distribution is

\[\begin{split}\begin{align} \bar{F}(y) = \left\{\begin{array}{lll} \left(\frac{y_\mathrm{min}}{y}\right)^\alpha & & y \ge y_\mathrm{min} \\ 1 & & y < y_\mathrm{min} \end{array} \right. \end{align}\end{split}\]

Thus, the power law is clear. A plot of the CCDF on a log-log plot yields a line with slope equal to \(-\alpha\), as shown below for \(y_\mathrm{min} = 1\) and \(\alpha = 2\).
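Two of the notes above can be checked numerically. The sketch below first fits a line to the CCDF on log-log axes for \(y_\mathrm{min} = 1\) and \(\alpha = 2\), and then builds Type II Pareto samples from NumPy's unit-scale Lomax draws. The helper pareto_ccdf, the values of mu and lam, and the seed are assumptions introduced here for illustration.

```python
import numpy as np
import scipy.stats


def pareto_ccdf(y, y_min, alpha):
    """CCDF of a Pareto distribution, equal to 1 below y_min."""
    y = np.asarray(y, dtype=float)
    return np.where(y >= y_min, (y_min / y) ** alpha, 1.0)


# Log-log slope of the CCDF for y_min = 1 and alpha = 2
y_min, alpha = 1.0, 2.0
y = np.logspace(0, 3, 50)  # 50 points from 1 to 1000
ccdf = pareto_ccdf(y, y_min, alpha)
slope, intercept = np.polyfit(np.log10(y), np.log10(ccdf), 1)
print(slope)  # approximately -alpha

# Type II Pareto: shift and scale NumPy's unit-scale Lomax draws.
# mu and lam are illustrative values, with lam playing the role of y_min.
mu, lam = 0.5, 2.0
rg = np.random.default_rng(seed=3252)
samples = mu + lam * rg.pareto(alpha, size=10_000)

# The CCDF obtained from the redefinition y -> y - mu + lam should
# agree with SciPy's Lomax distribution with loc=mu and scale=lam.
y2 = np.array([1.0, 5.0, 20.0])
ccdf_type_2 = pareto_ccdf(y2 - mu + lam, lam, alpha)
ccdf_lomax = scipy.stats.lomax(alpha, loc=mu, scale=lam).sf(y2)
print(np.allclose(ccdf_type_2, ccdf_lomax))
```

Using the exact CCDF keeps the slope check free of sampling noise. An empirical CCDF built from samples would show the same slope, with scatter in the far tail.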

PDF and CDF plots