Categorical distribution¶
Story¶
A probability is assigned to each of a set of discrete outcomes.
Example¶
A hen will peck at grain A with probability \(\theta_\mathrm{A}\), grain B with probability \(\theta_\mathrm{B}\), and grain C with probability \(\theta_\mathrm{C}\).
Parameters¶
The distribution is parametrized by the probabilities assigned to each event. We define \(\theta_y\) to be the probability assigned to outcome \(y\). The \(\theta_y\)’s are the parameters, and are constrained by

\[\sum_y \theta_y = 1.\]
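The constraint on the parameters (nonnegative probabilities that sum to one) can be checked numerically. The following is a minimal sketch; the helper name `is_valid_theta` and the tolerance `atol` are illustrative assumptions and not part of any library.

```python
import numpy as np


def is_valid_theta(theta, atol=1e-8):
    """Check that theta is a valid Categorical parameter vector.

    Hypothetical helper: theta must be a non-empty 1D array of
    nonnegative entries that sum to one (within a tolerance).
    """
    theta = np.asarray(theta, dtype=float)
    return bool(
        theta.ndim == 1
        and theta.size > 0
        and np.all(theta >= 0)
        and np.isclose(theta.sum(), 1.0, atol=atol)
    )


print(is_valid_theta([0.2, 0.5, 0.3]))  # True
print(is_valid_theta([0.2, 0.5, 0.5]))  # False: sums to 1.2
```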
Support¶
If we index the categories with sequential integers from 1 to \(N\), the distribution is supported on the integers 1 through \(N\), inclusive.
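As an illustration of encoding categories by index, the sketch below maps the three grains from the hen example to the integers 1 through 3 and back. The variable names are assumptions chosen for this example.

```python
# Category labels from the hen example (grains A, B, and C).
categories = ["A", "B", "C"]

# Map each label to a sequential integer index starting at 1.
index_of = {label: i for i, label in enumerate(categories, start=1)}

# Inverse map, to recover the label from an index.
label_of = {i: label for label, i in index_of.items()}

print(index_of)     # {'A': 1, 'B': 2, 'C': 3}
print(label_of[2])  # B
```

Note that NumPy arrays are indexed from 0, so an index in this 1-to-N convention must be shifted by one before it is used to look up an entry of a parameter array.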
Probability mass function¶
\[f(y;\theta) = \theta_y.\]
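In code, the PMF amounts to looking up the probability assigned to an outcome. The sketch below assumes the categories are indexed by integers 1 through N and returns zero outside that range; the function name `categorical_pmf` is hypothetical.

```python
import numpy as np


def categorical_pmf(y, theta):
    """PMF of a Categorical distribution with 1-based category indices.

    Hypothetical helper: returns theta[y - 1] for y in 1..N and 0 elsewhere.
    """
    theta = np.asarray(theta, dtype=float)
    y = np.asarray(y, dtype=int)

    # Outcomes outside 1..N have zero probability.
    in_support = (y >= 1) & (y <= theta.size)

    # Clip so the lookup is always a valid index, then mask afterwards.
    idx = np.clip(y, 1, theta.size) - 1
    return np.where(in_support, theta[idx], 0.0)


theta = [0.2, 0.5, 0.3]
print(categorical_pmf([1, 2, 3, 4], theta))
```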
Moments¶
Moments are not defined for a Categorical distribution because the value of \(y\) is not necessarily numeric.
Usage¶
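Package    Syntax
NumPy      numpy.random.choice(len(theta), p=theta)
SciPy      scipy.stats.rv_discrete(values=(range(len(theta)), theta)).rvs()
Stan       categorical(theta)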
Notes¶
If you are using the scipy.stats module, this distribution must be constructed manually with scipy.stats.rv_discrete(). The categories need to be encoded by an index. For interactive plotting purposes, below, we need to specify a custom PMF and CDF.
To sample out of a Categorical distribution, use numpy.random.choice(), specifying the values of \(\theta\) using the p kwarg.
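The two approaches described in these notes can be sketched as follows. This is an illustration under stated assumptions: the values of `theta`, the seed, and the sample size are arbitrary, and the sketch uses the `choice()` method of a NumPy `Generator`, which takes the same `p` keyword argument as `numpy.random.choice()`.

```python
import numpy as np
import scipy.stats as st

# Assumed example parameters: three categories, indexed 1 through 3.
theta = np.array([0.2, 0.5, 0.3])
categories = np.arange(1, len(theta) + 1)

rng = np.random.default_rng(seed=3252)

# Sampling with NumPy: pass the probabilities with the p kwarg.
samples_np = rng.choice(categories, size=10_000, p=theta)

# Manual construction with SciPy: values=(indices, probabilities).
cat = st.rv_discrete(name="categorical", values=(categories, theta))
samples_sp = cat.rvs(size=10_000, random_state=rng)

# Empirical frequencies of categories 1..3 should be close to theta.
freq = np.bincount(samples_np, minlength=len(theta) + 1)[1:] / samples_np.size
print(freq)

# The constructed SciPy object also exposes the PMF and CDF.
print(cat.pmf(categories))
print(cat.cdf(categories))
```

Encoding the categories as integers is what allows `rv_discrete` to be used here; the labels themselves can be recovered afterwards from the sampled indices.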