Yule-Simon Distribution
The preferential attachment process can also be studied as an urn process in which balls are added to a
growing number of urns, each ball being allocated to an urn with probability linear in the number (of balls)
the urn already contains.
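A minimal simulation sketch of the urn process described above. The function name `simulate_urn_process` and the parameter `p_new` (the probability that a new ball opens a new urn) are illustrative assumptions; the text only specifies that an existing urn is chosen with probability linear in its current ball count.

```python
import random
from collections import Counter


def simulate_urn_process(n_balls, p_new=0.2, seed=None):
    """Return the list of urn sizes after n_balls balls have been placed.

    p_new is a hypothetical parameter: the chance that a ball starts a new urn.
    """
    rng = random.Random(seed)
    ball_to_urn = [0]  # urn index of every ball placed so far; one ball in urn 0
    n_urns = 1
    for _ in range(n_balls - 1):
        if rng.random() < p_new:
            # Open a new urn and put the ball there.
            ball_to_urn.append(n_urns)
            n_urns += 1
        else:
            # Copying the urn of a uniformly chosen existing ball selects
            # each urn with probability proportional to its ball count.
            ball_to_urn.append(rng.choice(ball_to_urn))
    counts = Counter(ball_to_urn)
    return [counts[u] for u in range(n_urns)]


sizes = simulate_urn_process(5000, p_new=0.2, seed=1)
histogram = Counter(sizes)  # number of urns holding exactly k balls
```

The `histogram` of urn sizes can then be compared with a fitted Yule–Simon probability mass function.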
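The distribution also arises as a compound distribution, in which the parameter of a geometric distribution is
treated as a function of a random variable having an exponential distribution. Specifically, assume that $W$
follows an exponential distribution with scale $1/\rho$ or rate $\rho$:
$$W \sim \operatorname{Exponential}(\rho),$$
with density
$$h(w;\rho) = \rho \exp(-\rho w).$$
Then a Yule–Simon distributed variable $K$ has the following geometric distribution conditional on $W$:
$$K \sim \operatorname{Geometric}(\exp(-W)).$$
The pmf of a geometric distribution is $g(k;p) = p(1-p)^{k-1}$ for $k \in \{1, 2, \dots\}$, so the Yule–Simon
pmf is the exponential-geometric compound distribution
$$f(k;\rho) = \int_0^\infty g(k;\exp(-w))\, h(w;\rho)\, dw.$$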
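The compound construction suggests a simple way to draw Yule–Simon variates: draw W from an exponential distribution with rate ρ, then draw K from a geometric distribution with success probability exp(−W). The sketch below uses only the Python standard library; the function name `sample_yule_simon` is an assumption made for illustration, and the geometric draw uses inverse-transform sampling, which is one of several possible choices.

```python
import math
import random


def sample_yule_simon(rho, rng=random):
    """Draw one Yule-Simon(rho) variate via the exponential-geometric compound."""
    w = rng.expovariate(rho)      # W ~ Exponential(rate=rho)
    p = math.exp(-w)              # success probability of the geometric draw
    if p >= 1.0:
        return 1                  # W == 0: success on the first trial
    if p == 0.0:
        # exp(-w) underflowed; only plausible for extremely small rho.
        raise OverflowError("success probability underflowed to zero")
    u = 1.0 - rng.random()        # uniform on (0, 1], so log(u) is finite
    # Inverse transform for a geometric variable supported on {1, 2, ...}.
    return int(math.floor(math.log(u) / math.log1p(-p))) + 1


rng = random.Random(0)
draws = [sample_yule_simon(2.0, rng) for _ in range(20000)]
share_of_ones = draws.count(1) / len(draws)  # should be near rho / (rho + 1)
```

The check against ρ/(ρ + 1) uses the fact that the Yule–Simon pmf at k = 1 is ρ B(1, ρ + 1) = ρ/(ρ + 1).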
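The maximum likelihood estimate of the parameter $\rho$ given the observations $k_1, k_2, \dots, k_N$ is the
solution of the fixed-point equation
$$\rho^{(t+1)} = \frac{N + a - 1}{b + \sum_{i=1}^{N} \sum_{j=1}^{k_i} \frac{1}{\rho^{(t)} + j}},$$
where $b = 0$ and $a = 1$ are the rate and shape parameters of the gamma distribution prior on $\rho$.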
This algorithm was derived by Garcia[2] by directly optimizing the likelihood. Roberts and Roberts[5]
generalize the algorithm to Bayesian settings using the compound geometric formulation described above.
They also use the Expectation Maximisation (EM) framework to show convergence of the fixed-point
algorithm, derive the sub-linearity of its convergence rate, and give two alternative derivations of the
standard error of the estimator from the fixed-point equation. The variance of the estimator $\hat{\rho}$
from $N$ observations $k_1, \dots, k_N$ is
$$\operatorname{Var}(\hat{\rho}) = \frac{1}{\frac{N}{\hat{\rho}^2} - \sum_{i=1}^{N} \sum_{j=1}^{k_i} \frac{1}{(\hat{\rho} + j)^2}},$$
and the standard error is the square root of the quantity of this estimate divided by N.
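A minimal sketch of the fixed-point iteration, assuming the update ρ ← (N + a − 1) / (b + Σᵢ Σⱼ 1/(ρ + j)) with j running from 1 to kᵢ, where a and b are the shape and rate of a gamma prior (a = 1, b = 0 gives the maximum likelihood case). The function names, starting value, tolerance, and sample data are illustrative assumptions, and the variance is computed as the inverse of the observed information of the Yule–Simon log-likelihood rather than by any particular derivation in the cited papers.

```python
def harmonic_sum(ks, rho, power=1):
    """Sum over observations k and j = 1..k of 1 / (rho + j)**power."""
    return sum(1.0 / (rho + j) ** power for k in ks for j in range(1, k + 1))


def fit_yule_simon(ks, a=1.0, b=0.0, rho0=1.0, tol=1e-10, max_iter=10000):
    """Iterate the fixed-point update until successive values differ by < tol."""
    n = len(ks)
    rho = rho0
    for _ in range(max_iter):
        new_rho = (n + a - 1.0) / (b + harmonic_sum(ks, rho))
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    return rho  # not converged within max_iter; caller may want to check


def estimator_variance(ks, rho_hat):
    """Inverse observed information of the log-likelihood at rho_hat."""
    info = len(ks) / rho_hat ** 2 - harmonic_sum(ks, rho_hat, power=2)
    return 1.0 / info


# Hypothetical sample; at least one k > 1 is needed for a finite estimate.
data = [1, 1, 1, 2, 1, 3, 1, 1, 5, 2, 1, 1, 10, 1, 2]
rho_hat = fit_yule_simon(data)
variance = estimator_variance(data, rho_hat)
```

If every observation equals 1 the update becomes ρ ← ρ + 1 and never settles, which is why the sketch caps the number of iterations.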
Generalizations
The two-parameter generalization of the original Yule distribution replaces the beta function with an
incomplete beta function. The probability mass function of the generalized Yule–Simon(ρ, α) distribution is
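defined as
$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\, \mathrm{B}_{1-\alpha}(k, \rho+1),$$
with $0 \le \alpha < 1$. For $\alpha = 0$ the ordinary Yule–Simon(ρ) distribution is obtained as a special case. The use
of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.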
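The sketch below evaluates the generalized pmf numerically, assuming the form f(k; ρ, α) = ρ / (1 − α^ρ) · B₁₋α(k, ρ + 1) with 0 ≤ α < 1, where B_x(a, b) is the (unregularized) incomplete beta function. The standard library has no incomplete beta function, so it is approximated here with Simpson's rule; the function names and the number of subintervals are illustrative assumptions, and the code is meant for integer k ≥ 1 and ρ > 0.

```python
import math


def incomplete_beta(x, a, b, n=2000):
    """Approximate B_x(a, b) = int_0^x t^(a-1) (1-t)^(b-1) dt by Simpson's rule.

    n must be even; intended for a >= 1 and b >= 1 so the integrand is bounded.
    """
    h = x / n

    def integrand(t):
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k, rho, alpha):
    """pmf of the two-parameter distribution under the assumed form above."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    norm = rho / (1.0 - alpha ** rho)
    return norm * incomplete_beta(1.0 - alpha, k, rho + 1.0)


def yule_simon_pmf(k, rho):
    """Ordinary pmf rho * B(k, rho + 1), computed through log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


# alpha = 0 should reproduce the ordinary distribution.
gap = abs(generalized_yule_simon_pmf(3, 2.0, 0.0) - yule_simon_pmf(3, 2.0))
# With alpha > 0 the tail decays quickly, so a finite sum is close to 1.
total_mass = sum(generalized_yule_simon_pmf(k, 2.0, 0.3) for k in range(1, 120))
```

Comparing `generalized_yule_simon_pmf(k, rho, alpha)` with `yule_simon_pmf(k, rho)` for growing k shows the cutoff in the upper tail that the incomplete beta function introduces.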
See also
Zeta distribution
Scale-free network
Beta negative binomial distribution
Bibliography
Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York:
Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule
distribution".)
References
1. Simon, H. A. (1955). "On a class of skew distribution functions". Biometrika. 42 (3–4): 425–440.
doi:10.1093/biomet/42.3-4.425 (https://doi.org/10.1093%2Fbiomet%2F42.3-4.425).
2. Garcia Garcia, Juan Manuel (2011). "A fixed-point algorithm to estimate the Yule-Simon
distribution parameter" (https://zenodo.org/record/848773). Applied Mathematics and
Computation. 217 (21): 8560–8566. doi:10.1016/j.amc.2011.03.092
(https://doi.org/10.1016%2Fj.amc.2011.03.092).
3. Yule, G. U. (1924). "A Mathematical Theory of Evolution, based on the Conclusions of Dr. J.
C. Willis, F.R.S" (https://doi.org/10.1098%2Frstb.1925.0002). Philosophical Transactions of
the Royal Society B. 213 (402–410): 21–87. doi:10.1098/rstb.1925.0002
(https://doi.org/10.1098%2Frstb.1925.0002).
4. Pachon, Angelica; Polito, Federico; Sacerdote, Laura (2015). "Random Graphs Associated
to Some Discrete and Continuous Time Preferential Attachment Models". Journal of
Statistical Physics. 162 (6): 1608–1638. arXiv:1503.06150
(https://arxiv.org/abs/1503.06150). doi:10.1007/s10955-016-1462-7
(https://doi.org/10.1007%2Fs10955-016-1462-7).
S2CID 119168040 (https://api.semanticscholar.org/CorpusID:119168040).
5. Roberts, Lucas; Roberts, Denisa (2017). "An Expectation Maximization Framework for
Preferential Attachment Models". arXiv:1710.08511 (https://arxiv.org/abs/1710.08511)
[stat.CO (https://arxiv.org/archive/stat.CO)].