Poisson Distribution Explained - Intuition, Examples, and Derivation - Towards Data Science
Ms Aerin
1. Why did Poisson invent the Poisson Distribution?
To predict the # of events occurring in the future!
If you’ve ever sold something, this “event” can be defined, for example, as a customer
purchasing something from you (the moment of truth, not just browsing). It can be
how many visitors you get on your website a day, how many clicks your ads get for the
next month, how many phone calls you get during your shift, or even how many
people will die from a fatal disease next year, etc.
I’d like to predict the # of ppl who would clap next week because I
get paid weekly by those numbers.
What is the probability that exactly 20 people (or 10, 30, 50, etc.)
will clap for the blog post next week?
2. For now, let’s assume we don’t know anything about the Poisson
Distribution. Then how do we solve this problem?
One way to solve this would be to start with the number of reads. Each person who
reads the blog has some probability that they will really like it and clap.
This is a classic job for the binomial distribution, since we are calculating the
probability of the number of successful events (claps).
However, here we are given only one piece of information — 17 ppl/week, which is a
“rate” (the average # of successes per week, or the expected value of x). We don’t know
anything about the clapping probability p, nor the number of blog visitors n.
Therefore, we need a little more information to tackle this problem. What more do we
need to frame this probability as a binomial problem? We need two things: the
probability of success (claps) p & the number of trials (visitors) n.
These are my blog’s stats for 1 year: a total of 59k people read my blog, and 888 of
them clapped.
Therefore, the # of people who read my blog per week (n) is 59k/52 ≈ 1134, and the # of
people who clapped per week (x) is 888/52 ≈ 17. The probability that any one reader
claps (p) is 888/59k ≈ 0.015.
Using the Binomial PMF, what is the probability that I’ll get exactly 20 successes (20 people
who clap) next week?
<Binomial probability for different values of x>
╔══════╦═════════════════╗
║  x   ║ Binomial P(X=x) ║
╠══════╬═════════════════╣
║  10  ║ 0.02250         ║
║  17  ║ 0.09701         ║ 🡒 The average rate has the highest P!
║  20  ║ 0.06962         ║ 🡒 Nice. 20 is also quite likely!
║  30  ║ 0.00121         ║
║  40  ║ < 0.000001      ║ 🡒 Well, I guess I won’t get 40 claps.
╚══════╩═════════════════╝
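As a rough check on the table above, here is a minimal sketch of the binomial calculation. It assumes n = 1134 readers per week and takes p = 888/59k as the per-reader clap probability; the exact p behind the table is not stated, so the printed values may differ from it in the last digits.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Yearly stats from the text: 59k readers, 888 claps.
n = 59_000 // 52   # readers per week (1134)
p = 888 / 59_000   # assumed per-reader clap probability, about 0.015

for k in (10, 17, 20, 30, 40):
    print(f"P(X = {k:2d}) = {binomial_pmf(k, n, p):.6f}")
```

The probabilities peak around x = 17 and fall off quickly on both sides, which is the pattern the table shows.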
Then, what is Poisson for? What are the things that only Poisson can do, but Binomial
can’t?
In the above example, we have 17 ppl/wk who clapped. This means 17/7 ≈ 2.4 people
clapped per day, and 17/(7*24) ≈ 0.1 people clapped per hour.
If we model the success probability by hour (0.1 people/hr) using the binomial
random variable, this means most of the hours get zero claps but some hours will get
exactly 1 clap. However, it is also very possible that certain hours will get more than 1
clap (2, 3, 5 claps, etc.)
The problem with the binomial is that each trial has a yes/no outcome, so one unit of
time (in this case, 1 hr) CANNOT contain more than 1 event. Each unit of time holds
either 0 or 1 event.
Then, how about dividing 1 hour into 60 minutes and making the unit of time smaller,
for example, a minute? Then 1 hour can contain multiple events. (Still, one minute will
contain exactly one or zero events.)
Kind of. But what if, during that one minute, we get multiple claps? (e.g., someone
shared your blog post on Twitter and the traffic spiked at that minute.) Then what? We
can divide a minute into seconds. Then our time unit becomes a second, and again a
minute can contain multiple events. But this binary container problem will always
exist for ever-smaller time units.
The idea is, we can make the Binomial random variable handle multiple events by
dividing a unit time into smaller units. By using smaller divisions, we can make the
original unit time contain more than one event.
Taking the limit as the number of subdivisions goes to infinity, the unit times become
infinitesimal. We no longer have to worry about more than one event occurring within
the same unit time. And this is how we derive the Poisson distribution.
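The subdivision argument above can be checked numerically. The sketch below keeps the weekly rate at 17 and cuts the week into hours, minutes, and seconds, setting p = 17/n for each choice of n (an assumption that keeps the expected count fixed). It then compares the binomial P(X = 20) with the Poisson formula λ^k e^(-λ) / k!.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


RATE_PER_WEEK = 17  # average claps per week, from the text
K = 20              # number of claps we ask about

# Number of time slots in a week for each unit of time.
slots_per_week = {
    "hours": 7 * 24,
    "minutes": 7 * 24 * 60,
    "seconds": 7 * 24 * 60 * 60,
}

for unit, n in slots_per_week.items():
    p = RATE_PER_WEEK / n  # chance of a clap in one slot, so that n * p = 17
    print(f"{unit:>7}: n = {n:>6}, binomial P(X = {K}) = {binomial_pmf(K, n, p):.6f}")

print(f"Poisson with rate {RATE_PER_WEEK}: P(X = {K}) = {poisson_pmf(K, RATE_PER_WEEK):.6f}")
```

As the slots get smaller, the binomial value moves toward the Poisson value, which is the limit the text describes.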
If you use Binomial, you cannot calculate the success probability only with the rate
(e.g., 17 ppl/week). You need “more info” (n & p) in order to use the binomial PMF.
The Poisson Distribution, on the other hand, doesn’t require you to know n or p. We
are assuming n is infinitely large and p is infinitesimal, with their product np = λ held
fixed. The only parameter of the Poisson distribution is the rate λ (the expected value
of x). In real life, only knowing the rate (e.g., between 2pm and 4pm, I received 3 phone
calls) is much more common than knowing both n & p.
4. Let’s derive the Poisson formula mathematically from the Binomial PMF.
From https://en.wikipedia.org/wiki/Poisson_distribution
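For reference, here is a sketch of the standard derivation. It assumes the substitution p = λ/n, so the expected count np = λ stays fixed while n goes to infinity; k is the number of events.

```latex
\begin{aligned}
P(X = k)
  &= \lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^{k}
     \left(1 - \frac{\lambda}{n}\right)^{n-k} \\
  &= \lim_{n \to \infty} \frac{n!}{k!\,(n-k)!} \cdot \frac{\lambda^{k}}{n^{k}}
     \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k} \\
  &= \frac{\lambda^{k}}{k!} \lim_{n \to \infty}
     \underbrace{\frac{n (n-1) \cdots (n-k+1)}{n^{k}}}_{\to\, 1}
     \underbrace{\left(1 - \frac{\lambda}{n}\right)^{n}}_{\to\, e^{-\lambda}}
     \underbrace{\left(1 - \frac{\lambda}{n}\right)^{-k}}_{\to\, 1} \\
  &= \frac{\lambda^{k} e^{-\lambda}}{k!}
\end{aligned}
```

The first factor tends to 1 because it is a product of k terms that each approach 1. The second is the usual limit definition of e^(-λ). The third tends to 1 because k is fixed.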
Plug your own data into the formula and see if P(x)
makes sense to you!
Below is mine.
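As a sketch of that plug-in step, the snippet below evaluates the Poisson PMF, P(X = k) = λ^k e^(-λ) / k!, with λ = 17 claps per week from the example above. The values of k are the same ones used in the binomial table; swap in your own rate and counts.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


lam = 17  # average claps per week, from the text

for k in (10, 17, 20, 30, 40):
    print(f"P(X = {k:2d}) = {poisson_pmf(k, lam):.6f}")
```

Note that only the rate goes in. Neither n nor p appears anywhere in the calculation.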
1. Even though the Poisson distribution models rare events, the rate λ can be any
positive number. It doesn’t always have to be small.
https://en.wikipedia.org/wiki/Poisson_distribution
If the number of events per unit time follows a Poisson distribution, then the amount
of time between events follows the exponential distribution. The Poisson distribution
is discrete and the exponential distribution is continuous, yet the two distributions are
closely related.
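One way to see this relationship is to simulate it in the other direction: draw exponential waiting times between events and count how many events land in each unit of time. The sketch below assumes a rate of 17 per week (borrowed from the clap example), a fixed seed, and 5000 simulated weeks. If the counts are Poisson, their mean and variance should both come out close to the rate.

```python
import random
from statistics import mean, pvariance


def count_events(rate: float, rng: random.Random) -> int:
    """Count events in one unit of time when the gaps between events
    are exponentially distributed with the given rate."""
    elapsed = rng.expovariate(rate)  # time of the first event
    count = 0
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(rate)  # waiting time until the next event
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable
rate = 17               # assumed average number of events per week

counts = [count_events(rate, rng) for _ in range(5000)]

print(f"mean of weekly counts:     {mean(counts):.2f}")
print(f"variance of weekly counts: {pvariance(counts):.2f}")
```

Each week is simulated independently here, which is fine because the exponential distribution is memoryless.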