In statistics, a Polya urn model (also known as a Polya urn scheme or simply as Pólya's urn), named after George Pólya, is a type of statistical model used as an idealized mental exercise to understand the nature of certain statistical distributions.
In an urn model, objects of real interest (such as atoms, people, cars, etc.) are represented as colored balls in an urn or other container. In the basic urn model, the urn contains x white and y black balls; one ball is drawn randomly from the urn and its color observed; it is then placed back in the urn, and the selection process is repeated. Questions can then be asked about the probability of drawing one color or another, or some other properties. (That is: draw a black or white ball, put it back, then draw at random again.)
The Polya urn model differs only in that, when a ball of a particular color is drawn, that ball is put back along with a new ball of the same color. Thus, unlike in the basic model, the contents of the urn change over time, with a self-reinforcing property sometimes expressed as "the rich get richer". (That is: draw a black or white ball at random, take one extra ball of the same color, and put both back in the urn together. Then draw again at random.)
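The drawing rule just described can be sketched as a short simulation. This is a minimal illustration, not a reference implementation: the function name `polya_urn_draws`, its parameters, and the starting counts of 2 white and 3 black balls are assumptions chosen for the example.

```python
import random


def polya_urn_draws(white, black, n_draws, seed=None):
    """Simulate n_draws from a Polya urn (hypothetical helper for illustration).

    Returns the sequence of drawn colors and the final urn contents.
    """
    rng = random.Random(seed)
    counts = {"white": white, "black": black}
    draws = []
    for _ in range(n_draws):
        total = counts["white"] + counts["black"]
        # Draw one ball uniformly at random from the current contents.
        color = "white" if rng.random() < counts["white"] / total else "black"
        draws.append(color)
        # Put the ball back together with one new ball of the same color,
        # so the urn grows by exactly one ball per draw.
        counts[color] += 1
    return draws, counts


if __name__ == "__main__":
    draws, counts = polya_urn_draws(white=2, black=3, n_draws=10, seed=0)
    print(draws)
    print(counts)
```

Because each draw adds a ball of the drawn color, a color that gets ahead early tends to stay ahead, which is the "rich get richer" effect.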
Note that in some sense, the Polya urn model is the "opposite" of the model of sampling without replacement (that is, drawn balls are not put back). When sampling without replacement, every time a particular value is observed, it is less likely to be observed again, whereas in a Polya urn model, an observed value is more likely to be observed again. In both of these models, the act of measurement has an effect on the outcome of future measurements. (For comparison, when sampling with replacement, observation of a particular value has no effect on how likely it is to observe that value again.) Note also that in a Polya urn model, successive acts of measurement over time have less and less effect on future measurements, whereas in sampling without replacement, the opposite is true: After a certain number of measurements of a particular value, that value will never be seen again.
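To make the comparison concrete, the sketch below computes the exact probability that the next draw is white given that the previous k draws were all white, under each of the three schemes. The function name, the scheme labels, and the starting urn of 2 white and 3 black balls are assumptions made for this illustration.

```python
from fractions import Fraction


def prob_white_after_k_whites(white, black, k, scheme):
    """P(next draw is white | the previous k draws were all white).

    Hypothetical helper; `scheme` is one of "with_replacement",
    "without_replacement", or "polya".
    """
    if scheme == "with_replacement":
        # The urn never changes, so earlier observations have no effect.
        return Fraction(white, white + black)
    if scheme == "without_replacement":
        if k > white:
            raise ValueError("cannot have drawn more white balls than exist")
        # Each observed white ball has left the urn for good.
        return Fraction(white - k, white + black - k)
    if scheme == "polya":
        # Each observed white ball added one extra white ball to the urn.
        return Fraction(white + k, white + black + k)
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    for k in range(3):
        row = {
            scheme: prob_white_after_k_whites(2, 3, k, scheme)
            for scheme in ("with_replacement", "without_replacement", "polya")
        }
        print(k, row)
```

With these starting counts, the Polya probability rises from 2/5 to 1/2 to 4/7, in steps that shrink as k grows, while the probability without replacement falls from 2/5 to 1/4 to 0 once both white balls have been removed.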
Distributions related to the Polya urn
- beta-binomial distribution: The distribution of the number of successful draws (trials), e.g. the number of white balls drawn, given n draws from a Polya urn.
- multivariate Polya distribution (also known as the Dirichlet compound multinomial distribution): The distribution over the number of balls of each color, given n draws from a Polya urn where there are k different colors instead of only two.
- martingales and the beta distribution: Let w and b be the number of white and black balls initially in the urn, and n_w the number of white balls currently in the urn after n draws. Then the sequence of values n_w / (w + b + n) for n = 1, 2, 3, ... is a martingale and converges, as n grows, to a random limit that follows the beta distribution.
- Dirichlet process, Chinese restaurant process: Imagine a modified Polya urn scheme as follows. We start with an urn with α black balls. When drawing a ball from the urn, if we draw a black ball, put the ball back along with a new ball of a new non-black color randomly generated from a uniform distribution, and consider the newly generated color to be the "value" of the draw. Otherwise, put the ball back along with another ball of the same color, as for the standard Polya urn scheme. The colors of an infinite sequence of draws from this modified Polya urn scheme follow a Chinese restaurant process. If, instead of generating a new color, we draw a random value from a given base distribution and use that value to label the ball, the labels of an infinite sequence of draws follow a Dirichlet process.
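The beta-binomial and martingale items in the list above can be checked exactly for a small two-color urn. The sketch below is an illustration under stated assumptions: it builds the distribution of the number of white draws step by step, compares it with the beta-binomial probabilities written in terms of rising factorials (with parameters equal to the initial counts w and b), and confirms that the expected fraction of white balls in the urn stays at its starting value w / (w + b). All function names and the example values are hypothetical.

```python
from fractions import Fraction
from math import comb


def white_draw_distribution(w, b, n):
    """Exact distribution of the number of white draws in n Polya urn draws."""
    dist = {0: Fraction(1)}
    for step in range(n):
        total = w + b + step  # balls in the urn before this draw
        new_dist = {}
        for k, p in dist.items():
            p_white = Fraction(w + k, total)  # k white balls were added so far
            new_dist[k + 1] = new_dist.get(k + 1, 0) + p * p_white
            new_dist[k] = new_dist.get(k, 0) + p * (1 - p_white)
        dist = new_dist
    return dist


def rising_factorial(x, m):
    """x * (x + 1) * ... * (x + m - 1), with the empty product equal to 1."""
    result = 1
    for i in range(m):
        result *= x + i
    return result


def beta_binomial_pmf(k, n, w, b):
    """Beta-binomial probability of k successes in n trials, parameters w and b."""
    numerator = comb(n, k) * rising_factorial(w, k) * rising_factorial(b, n - k)
    return Fraction(numerator, rising_factorial(w + b, n))


def expected_white_fraction(w, b, n):
    """Expected fraction of white balls in the urn after n draws."""
    dist = white_draw_distribution(w, b, n)
    return sum(p * Fraction(w + k, w + b + n) for k, p in dist.items())


if __name__ == "__main__":
    w, b, n = 2, 3, 5  # example values, chosen arbitrarily
    dist = white_draw_distribution(w, b, n)
    for k in sorted(dist):
        print(k, dist[k], beta_binomial_pmf(k, n, w, b))
    print(expected_white_fraction(w, b, n), Fraction(w, w + b))
```

Exact rational arithmetic is used here so that the two sets of probabilities can be compared for equality rather than approximately. A constant expectation is only a consequence of the martingale property, not a proof of it.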