# Dirichlet

### From Statipedia

## The Dirichlet Distribution

- Just as the Beta distribution describes knowledge about an unknown probability p and its complement 1-p, the Dirichlet distribution describes knowledge about a vector of probabilities.
- If the vector is of length k, the sum of the k probabilities is 1, so pk = 1 - p1 - p2 - ... - p(k-1).
- The sum of the k positive parameter values can be regarded as the "prior sample size," as though we had observed the result of n = n1+n2+...+nk Multinomial trials, though the parameter values can be non-integer.
- For any one of the k probabilities, the marginal distribution is Beta. For example, the marginal distribution for pk is Beta(nk, n-nk), where n-nk = n1+n2+...+n(k-1). The first parameter is that of the selected probability and the second parameter is the sum of all the other parameters.
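
The Beta marginal can be checked by simulation. The sketch below uses only base R and draws Dirichlet vectors by normalizing independent Gamma variates, a standard construction. The parameter vector `c(6, 3, 1)`, the seed, and the cutoff `q` are arbitrary choices for illustration.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
alpha   <- c(6, 3, 1)              # hypothetical parameter vector (n1, n2, n3)
n_draws <- 20000

# Dirichlet draws: independent Gamma(alpha_i, 1) variates, normalized by row.
g <- matrix(rgamma(n_draws * 3, shape = rep(alpha, each = n_draws)),
            nrow = n_draws)
p <- g / rowSums(g)

# Compare the empirical CDF of the last component with Beta(nk, n - nk).
k <- 3
q <- 0.1
empirical   <- mean(p[, k] <= q)
theoretical <- pbeta(q, alpha[k], sum(alpha) - alpha[k])
c(empirical = empirical, theoretical = theoretical)
```

The two numbers should agree to within Monte Carlo error, and every row of `p` sums to 1.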

## Probability Density Function

- http://en.wikipedia.org/wiki/File:LogDirichletDensity-alpha_0.3_to_alpha_2.0.gif is a neat movie that shows how the log of the Dirichlet density changes as the three parameter values change from (0.3,0.3,0.3) to (2.0,2.0,2.0). When the parameters are all 2.0, the density is not yet normal.

- When all parameters are large, the density approaches a Multivariate Normal. For Dirichlet(10,10,10), a contour plot of the probability density function shows that the density is not yet normal; otherwise the contours would all be ellipses.

- Here is the 3-parameter Dirichlet probability density function. The parameters are alpha1, alpha2 and alpha3:
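
For reference, the standard form of the three-parameter density, written with p3 = 1 - p1 - p2, is:

$$
f(p_1, p_2, p_3 \mid \alpha_1, \alpha_2, \alpha_3)
  = \frac{\Gamma(\alpha_1 + \alpha_2 + \alpha_3)}
         {\Gamma(\alpha_1)\,\Gamma(\alpha_2)\,\Gamma(\alpha_3)}
    \; p_1^{\alpha_1 - 1}\, p_2^{\alpha_2 - 1}\, p_3^{\alpha_3 - 1},
\qquad p_i > 0,\quad p_1 + p_2 + p_3 = 1 .
$$

A minimal base-R sketch of this formula follows. The function name `ddirichlet3` and its `as_log` argument are made up for this illustration and do not come from any package.

```r
# Density of a three-parameter Dirichlet at a single point p = (p1, p2, p3).
# ddirichlet3 is a hypothetical helper name, not a library function.
ddirichlet3 <- function(p, alpha, as_log = FALSE) {
  stopifnot(length(p) == 3, length(alpha) == 3, all(alpha > 0))
  # Outside the open simplex the density is zero.
  if (any(p <= 0) || abs(sum(p) - 1) > 1e-8) {
    return(if (as_log) -Inf else 0)
  }
  # Work on the log scale so large parameters do not overflow gamma().
  log_dens <- lgamma(sum(alpha)) - sum(lgamma(alpha)) +
    sum((alpha - 1) * log(p))
  if (as_log) log_dens else exp(log_dens)
}

ddirichlet3(c(0.2, 0.3, 0.5), alpha = c(10, 10, 10))
```

With all parameters equal to 1 the density is flat and equal to 2 everywhere on the simplex, which gives a quick sanity check.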

## Cumulative Distribution Function

- Is there a way to take advantage of the Beta marginal distributions in calculating the CDF?
- For example, if P ~ Dirichlet(alpha1,alpha2,alpha3), then pdirichlet(p1,p2) = Pr{P1<p1}*Pr{P2<p2|P1<p1} = pbeta(p1,alpha1,alpha2+alpha3)*Pr{P2<p2|P1<p1}
- Pr{P1<p1 and P2<p2} = ???

- Below is the CDF of the Dirichlet(2,2,2) distribution, expressed using integrals and evaluated using numerical integration.
**Is there a way to derive the CDF using R???**
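
One possible approach in base R is nested numerical integration of the density with `integrate()`. The sketch below covers only the three-parameter case; `dens3` and `pdirichlet3` are hypothetical helper names introduced here. It assumes the density is well behaved at the boundary (parameters of at least 1); smaller parameters create singularities that `integrate()` may not handle well.

```r
# Density on (p1, p2) with p3 = 1 - p1 - p2; returns 0 outside the simplex.
# Vectorized over p2 so that integrate() can call it directly.
dens3 <- function(p1, p2, alpha) {
  n  <- max(length(p1), length(p2))
  p1 <- rep_len(p1, n)
  p2 <- rep_len(p2, n)
  p3 <- 1 - p1 - p2
  ok  <- p1 > 0 & p2 > 0 & p3 > 0
  out <- numeric(n)
  log_const <- lgamma(sum(alpha)) - sum(lgamma(alpha))
  out[ok] <- exp(log_const +
                 (alpha[1] - 1) * log(p1[ok]) +
                 (alpha[2] - 1) * log(p2[ok]) +
                 (alpha[3] - 1) * log(p3[ok]))
  out
}

# Pr{P1 < q1 and P2 < q2} for P ~ Dirichlet(alpha1, alpha2, alpha3).
pdirichlet3 <- function(q1, q2, alpha) {
  q1 <- min(q1, 1)
  q2 <- min(q2, 1)
  if (q1 <= 0 || q2 <= 0) return(0)
  # Inner integral over p2 for each fixed value of p1.
  inner <- function(x) {
    sapply(x, function(xi) {
      upper <- min(q2, 1 - xi)
      if (upper <= 0) return(0)
      integrate(function(y) dens3(xi, y, alpha), lower = 0, upper = upper)$value
    })
  }
  integrate(inner, lower = 0, upper = q1)$value
}

pdirichlet3(0.5, 0.5, c(2, 2, 2))
```

Setting `q2 = 1` removes the constraint on P2, so `pdirichlet3(q1, 1, alpha)` should match the Beta marginal `pbeta(q1, alpha[1], alpha[2] + alpha[3])`.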

## Generating Random Variables

- In library(rv), use rvdirichlet(n, alpha), where alpha is the parameter vector.
  - `a <- rvdirichlet(1, alpha=c(6, 3, 1))`

- In library(LearnBayes), use rdirichlet(n, par), where n is the number of simulations required and par is the vector of parameters. Remember that you first need to issue the command library(LearnBayes).
  - `b <- rdirichlet(10, par=c(2,5,4,10))`

|       | [,1]       | [,2]       | [,3]      | [,4]      |
|-------|------------|------------|-----------|-----------|
| [1,]  | 0.03149738 | 0.21819450 | 0.2976455 | 0.4526626 |
| [2,]  | 0.15486848 | 0.31653148 | 0.0768092 | 0.4517908 |
| [3,]  | 0.06978287 | 0.21769461 | 0.3286143 | 0.3839083 |
| [4,]  | 0.12048778 | 0.28362439 | 0.1864341 | 0.4094537 |
| [5,]  | 0.03458861 | 0.22649794 | 0.1337141 | 0.6051994 |
| [6,]  | 0.02321834 | 0.25692287 | 0.3006527 | 0.4192061 |
| [7,]  | 0.20431257 | 0.03111881 | 0.2642011 | 0.5003675 |
| [8,]  | 0.03782862 | 0.25024523 | 0.4772171 | 0.2347090 |
| [9,]  | 0.14861767 | 0.24180046 | 0.2599963 | 0.3495856 |
| [10,] | 0.05559632 | 0.15193633 | 0.3154328 | 0.4770346 |
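
If neither package is installed, Dirichlet draws can also be produced with base R alone by normalizing independent Gamma variates, which is a standard construction. The function name `rdirichlet_gamma` below is hypothetical; it only mimics the `(n, par)` calling style shown above.

```r
# Draw n Dirichlet(alpha) vectors, one per row, using only base R.
# rdirichlet_gamma is an illustrative name, not a function from rv or LearnBayes.
rdirichlet_gamma <- function(n, alpha) {
  stopifnot(n >= 1, all(alpha > 0))
  k <- length(alpha)
  # Column j holds n draws from Gamma(shape = alpha[j], rate = 1).
  g <- matrix(rgamma(n * k, shape = rep(alpha, each = n)), nrow = n, ncol = k)
  # Dividing each row by its sum gives a point on the simplex.
  g / rowSums(g)
}

set.seed(1)                                  # arbitrary seed
b2 <- rdirichlet_gamma(10, c(2, 5, 4, 10))
rowSums(b2)                                  # each row sums to 1
```

The column means of a large sample should be close to the parameters divided by their sum, here c(2, 5, 4, 10) / 21.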

## Parameter Estimation

- The "String cutting" example (taken from Wikipedia) shows some data - a sample of size 15 from a Dirichlet distribution.

- The data can be shown as a set of ordered pairs, where one is the length of the **BLUE** cutting and the other is the length of the **RED** cutting. The length of the **YELLOW** cutting is determined by the other two cuttings. The above "data" are something like this:
  - `B = c(0.483,0.493,0.526,0.551,0.586,0.491,0.518,0.434,0.526,0.423,0.517,0.413,0.528,0.476,0.482)`
  - `R = c(0.178,0.178,0.197,0.156,0.152,0.195,0.191,0.132,0.098,0.163,0.087,0.204,0.211,0.183,0.153)`
- Below, the log likelihood function is maximized. Here, a = length(**BLUE**), b = length(**RED**), and c = length(**YELLOW**).

*"length" here is the length of the piece of string, not the number of elements in the vector*

- The maximum likelihood solution is Dirichlet(46.3, 15.5, 32.0). The probability density function associated with this solution agrees very nicely with the data (see below).
**I (Mike) used Mathcad to produce the plots below. Is there a way to find the max likelihood solution and plot the resulting data and PDF using R???**
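
A possible way to do the maximum likelihood step in base R is to minimize the negative log likelihood with `optim()`. The sketch below assumes the B and R vectors listed above are the data and takes YELLOW as 1 - B - R. The log-parameterization, the moment-based starting values, and the BFGS method are choices made for this illustration, not part of the original Mathcad solution. Because the listed values are only approximate, the result need not reproduce (46.3, 15.5, 32.0) exactly.

```r
B <- c(0.483,0.493,0.526,0.551,0.586,0.491,0.518,0.434,0.526,0.423,
       0.517,0.413,0.528,0.476,0.482)
R <- c(0.178,0.178,0.197,0.156,0.152,0.195,0.191,0.132,0.098,0.163,
       0.087,0.204,0.211,0.183,0.153)
X <- cbind(blue = B, red = R, yellow = 1 - B - R)   # one row per string

# Negative Dirichlet log likelihood; theta = log(alpha) keeps alpha positive.
neg_loglik <- function(theta, X) {
  alpha <- exp(theta)
  n <- nrow(X)
  -(n * (lgamma(sum(alpha)) - sum(lgamma(alpha))) +
      sum((alpha - 1) * colSums(log(X))))
}

# Rough starting values from the sample means and the variance of BLUE.
m  <- colMeans(X)
a0 <- unname(m[1] * (1 - m[1]) / var(X[, 1]) - 1)   # crude "prior sample size"
start <- log(m * a0)

fit <- optim(start, neg_loglik, X = X, method = "BFGS")
alpha_hat <- exp(fit$par)
alpha_hat
```

The fitted proportions `alpha_hat / sum(alpha_hat)` should be close to the sample means of the three cuttings, and `fit$convergence` equal to 0 indicates that `optim()` reported convergence.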

## Related Distributions

- The Dirichlet distribution is the conjugate prior for the Multinomial distribution in the same way that the Beta is the conjugate prior for the Binomial distribution. In a Bayesian analysis, if a Dirichlet(m1, m2, ..., mk) prior expresses prior knowledge about the parameters of a Multinomial random variable, and data are obtained from N trials with category counts n1, n2, ..., nk (N = n1 + n2 + ... + nk), then the posterior information about the parameters is expressed by Dirichlet(m1+n1, m2+n2, ..., mk+nk).
- The Beta distribution is a Dirichlet distribution with two parameters.
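
The conjugate update amounts to adding the observed counts to the prior parameters. A small sketch in R, with made-up prior parameters and counts:

```r
prior  <- c(2, 5, 4, 10)     # hypothetical Dirichlet prior (m1, ..., mk)
counts <- c(3, 0, 7, 5)      # hypothetical Multinomial counts (n1, ..., nk), N = 15

posterior <- prior + counts  # Dirichlet(m1 + n1, ..., mk + nk)
posterior

# Posterior mean of each probability, and the Beta marginal for the first one.
post_mean <- posterior / sum(posterior)
beta_marginal_1 <- c(shape1 = posterior[1],
                     shape2 = sum(posterior) - posterior[1])
post_mean
beta_marginal_1
```

With k = 2 the same arithmetic reduces to the familiar Beta-Binomial update.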
