The Poisson and geometric distributions model discrete random variables arising from counting
processes. The Poisson distribution counts the number of rare events in a fixed interval, while the
geometric distribution counts the number of trials until the first success.
Board Coverage

| Board | Paper | Notes |
| --- | --- | --- |
| AQA | Paper 2 | Both Poisson and geometric in depth |
| Edexcel | S2, S3 | Poisson in S2; geometric in S3 |
| OCR (A) | Paper 2 | Poisson and geometric |
| CIE (9231) | S2 | Poisson covered; geometric not required |

:::info
The formula booklet provides the Poisson PMF. You must know when to apply each distribution
and how to carry out hypothesis testing with discrete distributions. The geometric distribution has
two common conventions for the support: $r = 1, 2, 3, \ldots$ (number of trials) or
$r = 0, 1, 2, \ldots$ (number of failures). AQA uses $r = 1, 2, \ldots$.
:::
1. The Poisson Distribution
1.1 Definition
Definition. A discrete random variable $X$ follows a Poisson distribution with parameter
$\lambda$ (where $\lambda > 0$), written $X \sim \mathrm{Po}(\lambda)$, if

$$P(X = r) = \dfrac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$

The Poisson distribution models the number of events occurring in a fixed interval of time or space
when:

- Events occur independently
- Events occur at a constant average rate $\lambda$
- The probability of more than one event in a sufficiently small interval is negligible
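
The PMF in the definition can be evaluated directly. The following is a minimal Python sketch using only the standard library; the function name `poisson_pmf` and the value $\lambda = 2.5$ are illustrative assumptions.

```python
import math

def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam); zero outside the support r = 0, 1, 2, ..."""
    if r < 0:
        return 0.0
    return math.exp(-lam) * lam**r / math.factorial(r)

# Illustrative rate (an assumption for this sketch): 2.5 events per interval.
lam = 2.5
for r in range(5):
    print(r, round(poisson_pmf(r, lam), 4))

# The probabilities sum to 1 over the whole support (truncated here at r = 99,
# beyond which the terms are negligible for this rate).
total = sum(poisson_pmf(r, lam) for r in range(100))
print(total)
```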
1.2 Derivation as a Limit of the Binomial
Theorem. If $n \to \infty$ and $p \to 0$ such that $np = \lambda$ remains constant, then
$B(n, p) \to \mathrm{Po}(\lambda)$.
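
The limit in the theorem can be observed numerically. The sketch below holds $np = \lambda$ fixed and lets $n$ grow, printing the gap between the binomial and Poisson probabilities at one point; the values $\lambda = 4$ and $k = 2$ and the function names are illustrative assumptions.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 4.0, 2  # illustrative values, with p = lam / n so that np stays fixed
gaps = []
for n in (10, 100, 1000, 10000):
    gap = abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
    gaps.append(gap)
    print(n, gap)
```

The printed gaps shrink as $n$ increases, which is what the theorem predicts.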
A characteristic property of the Poisson distribution is that the mean equals the variance: $E(X) = \mathrm{Var}(X) = \lambda$.
1.5 Additivity of Poisson distributions
If X∼Po(λ) and Y∼Po(μ) are independent, then
X+Y∼Po(λ+μ)
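
Additivity can be checked numerically by convolving the two PMFs. This is a sketch under assumed rates $\lambda = 3$ and $\mu = 5$; `pmf_of_sum` is a hypothetical helper name.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def pmf_of_sum(s: int, lam: float, mu: float) -> float:
    """P(X + Y = s) for independent X ~ Po(lam), Y ~ Po(mu), by convolution."""
    return sum(poisson_pmf(k, lam) * poisson_pmf(s - k, mu) for k in range(s + 1))

lam, mu = 3.0, 5.0  # illustrative rates
for s in (0, 3, 6, 10):
    # The two printed columns should agree to floating-point accuracy.
    print(s, pmf_of_sum(s, lam, mu), poisson_pmf(s, lam + mu))
```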
1.6 Cumulative probabilities
Cumulative Poisson probabilities are found using:
$$P(X \leq r) = \sum_{k=0}^{r} \dfrac{e^{-\lambda}\lambda^k}{k!}$$

These are obtained from tables or a calculator. Key relationships:

$$P(X > r) = 1 - P(X \leq r)$$

$$P(a \leq X \leq b) = P(X \leq b) - P(X \leq a - 1)$$
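
Where tables are not to hand, the same relationships can be evaluated in code. A minimal sketch, assuming $\lambda = 5$ for illustration; `poisson_cdf` and `poisson_between` are hypothetical helper names.

```python
import math

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    if r < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

def poisson_between(a: int, b: int, lam: float) -> float:
    """P(a <= X <= b) = P(X <= b) - P(X <= a - 1)."""
    return poisson_cdf(b, lam) - poisson_cdf(a - 1, lam)

lam = 5.0  # illustrative rate
p_more_than_6 = 1 - poisson_cdf(6, lam)   # P(X > 6)
p_2_to_6 = poisson_between(2, 6, lam)     # P(2 <= X <= 6)
print(round(p_more_than_6, 4), round(p_2_to_6, 4))
```

Note the `a - 1` in the lower limit: subtracting $P(X \leq a)$ instead would wrongly exclude $X = a$.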
1.7 Poisson hypothesis testing
The procedure mirrors binomial hypothesis testing:

1. Define $X$ and state $X \sim \mathrm{Po}(\lambda_0)$ under $H_0$
2. State $H_0: \lambda = \lambda_0$ and $H_1$
3. State the significance level $\alpha$
4. Find the critical region
5. Compare the observed value
6. Conclude in context

Example. A call centre receives an average of 3.2 calls per minute. In a particular minute, 7
calls are received. Test at the 5% significance level whether the rate has increased.

$X \sim \mathrm{Po}(3.2)$. $H_0: \lambda = 3.2$, $H_1: \lambda > 3.2$.

$P(X \geq 7) = 1 - P(X \leq 6) = 1 - 0.9554 = 0.0446 < 0.05$.

Reject $H_0$. There is sufficient evidence that the rate has increased.
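
The p-value in this example can be reproduced as follows. This is a sketch only; `upper_tail_p_value` is a hypothetical helper, and the numbers are those of the call-centre example.

```python
import math

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    if r < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

def upper_tail_p_value(observed: int, lam0: float) -> float:
    """P(X >= observed) under H0: X ~ Po(lam0)."""
    return 1 - poisson_cdf(observed - 1, lam0)

alpha = 0.05
p_value = upper_tail_p_value(7, 3.2)
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)
```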
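Example. Find the critical region for a two-tailed test at the 5% level with
$X \sim \mathrm{Po}(5)$.

Each tail is allowed at most 2.5%.

Lower tail: $P(X \leq 0) = e^{-5} \approx 0.0067 \leq 0.025$, but $P(X \leq 1) = 0.0404 > 0.025$. So the lower part is
$X \leq 0$.

Upper tail: $P(X \geq 10) = 1 - P(X \leq 9) = 1 - 0.9682 = 0.0318 > 0.025$, but
$P(X \geq 11) = 1 - P(X \leq 10) = 1 - 0.9863 = 0.0137 \leq 0.025$. So the upper part is $X \geq 11$.

Critical region: $X \leq 0$ or $X \geq 11$.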
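
The tail-by-tail search in this example can be automated. The sketch below assumes the convention used above (each tail at most $\alpha/2$); `two_tailed_critical_region` and `search_limit` are hypothetical names introduced for illustration.

```python
import math

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    if r < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

def two_tailed_critical_region(lam: float, alpha: float, search_limit: int = 200):
    """Return (a, b) so that the critical region is X <= a or X >= b.

    a is None when even P(X <= 0) exceeds alpha / 2 (no lower region).
    """
    tail = alpha / 2
    # Lower tail: largest a with P(X <= a) <= alpha / 2.
    lower = None
    a = 0
    while poisson_cdf(a, lam) <= tail:
        lower = a
        a += 1
    # Upper tail: smallest b with P(X >= b) <= alpha / 2.
    upper = None
    for b in range(search_limit):
        if 1 - poisson_cdf(b - 1, lam) <= tail:
            upper = b
            break
    return lower, upper

print(two_tailed_critical_region(5.0, 0.05))
```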
2. The Geometric Distribution
2.1 Definition
Definition. A discrete random variable $X$ follows a geometric distribution with parameter
$p$ (where $0 < p \leq 1$), written $X \sim \mathrm{Geo}(p)$, if $X$ is the number of the trial on
which the first success occurs:

$$P(X = r) = (1 - p)^{r-1} p, \quad r = 1, 2, 3, \ldots$$

Each trial is independent with probability $p$ of success.
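2.2 Proof that $E(X) = \dfrac{1}{p}$
Proof

$$E(X) = \sum_{r=1}^{\infty} r q^{r-1} p \quad \text{where } q = 1 - p$$

Let $S = \sum_{r=1}^{\infty} r q^{r-1}$. Recall the geometric series
$\sum_{r=0}^{\infty} q^r = \dfrac{1}{1-q}$ for $|q| < 1$.
Differentiating both sides with respect to $q$:

$$\sum_{r=1}^{\infty} r q^{r-1} = \dfrac{1}{(1-q)^2}$$

Therefore:

$$E(X) = p \cdot \dfrac{1}{(1-q)^2} = p \cdot \dfrac{1}{p^2} = \dfrac{1}{p} \quad \blacksquare$$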
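2.3 Proof that $\mathrm{Var}(X) = \dfrac{1-p}{p^2}$
Proof
First compute $E(X^2) = E(X(X-1)) + E(X)$.

$$E(X(X-1)) = \sum_{r=2}^{\infty} r(r-1) q^{r-1} p = pq \sum_{r=2}^{\infty} r(r-1) q^{r-2}$$

Starting from $\sum_{r=0}^{\infty} q^r = \dfrac{1}{1-q}$ and differentiating twice:

$$\sum_{r=2}^{\infty} r(r-1) q^{r-2} = \dfrac{2}{(1-q)^3}$$

Hence $E(X(X-1)) = \dfrac{2pq}{p^3} = \dfrac{2q}{p^2}$, so $E(X^2) = \dfrac{2q}{p^2} + \dfrac{1}{p}$ and

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \dfrac{2q}{p^2} + \dfrac{1}{p} - \dfrac{1}{p^2} = \dfrac{2q + p - 1}{p^2} = \dfrac{q}{p^2} = \dfrac{1-p}{p^2} \quad \blacksquare$$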
:::info
Memoryless property: however many trials have already passed without a success, the probability of waiting at least $n$ more trials is exactly the same as if
you were starting fresh. The process “forgets” its history.
:::
2.5 Cumulative distribution function
$$P(X \leq r) = 1 - q^r = 1 - (1-p)^r$$
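
The closed form can be compared with a direct sum of the PMF. A minimal sketch, assuming $p = 0.3$ and $r = 5$ for illustration, with $X$ counting trials (support $1, 2, 3, \ldots$).

```python
def geometric_pmf(r: int, p: float) -> float:
    """P(X = r): r - 1 failures followed by one success."""
    return (1 - p) ** (r - 1) * p

def geometric_cdf(r: int, p: float) -> float:
    """P(X <= r) = 1 - (1 - p)^r."""
    return 1 - (1 - p) ** r

p, r = 0.3, 5  # illustrative values
summed = sum(geometric_pmf(k, p) for k in range(1, r + 1))
closed = geometric_cdf(r, p)
print(summed, closed)
```

The complement $q^r$ is the probability that the first $r$ trials all fail, which is why the formula needs no summation.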
2.6 Geometric hypothesis testing
Example. A bag contains red and blue balls. The probability of drawing a red ball is $p$. In an
experiment, the first red ball is drawn on the 10th draw. Test at the 5% level whether $p = 0.3$.

$X \sim \mathrm{Geo}(0.3)$. $H_0: p = 0.3$, $H_1: p < 0.3$ (the first red ball took longer than expected, so
$p$ may be smaller).

$$p\text{-value} = P(X \geq 10) = (1 - 0.3)^{10-1} = 0.7^9 \approx 0.0404 < 0.05$$

Reject $H_0$. There is sufficient evidence that $p < 0.3$.
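Critical region approach. For $H_1: p < 0.3$ at the 5% level, find the smallest $c$ such that
$P(X \geq c) = 0.7^{c-1} \leq 0.05$. Since $0.7^8 \approx 0.0576 > 0.05$ and $0.7^9 \approx 0.0404 \leq 0.05$, the smallest such value is $c = 10$, so the critical region is $X \geq 10$.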
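
The search for the critical value $c$ can be sketched in Python. The helper names are hypothetical; the tail formula $P(X \geq c) = (1-p)^{c-1}$ follows from the cumulative distribution function in Section 2.5.

```python
def geometric_upper_tail(c: int, p: float) -> float:
    """P(X >= c): the first c - 1 trials are all failures."""
    return (1 - p) ** (c - 1)

def smallest_critical_value(p0: float, alpha: float) -> int:
    """Smallest c with P(X >= c) <= alpha under H0: p = p0 (needs 0 < p0 < 1)."""
    c = 1
    while geometric_upper_tail(c, p0) > alpha:
        c += 1
    return c

c = smallest_critical_value(0.3, 0.05)
print(c, geometric_upper_tail(c, 0.3))
```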
3. Modelling with Poisson and Geometric Distributions
3.1 When to use each

| Situation | Distribution |
| --- | --- |
| Number of events in a fixed interval, rare events | Poisson $\mathrm{Po}(\lambda)$ |
| Number of trials until first success | Geometric $\mathrm{Geo}(p)$ |
| Fixed number of trials, counting successes | Binomial $B(n, p)$ |

3.2 Poisson as approximation to Binomial
When n is large and p is small such that np≤10:
B(n,p)≈Po(np)
Example. $X \sim B(200, 0.02)$. Then $\lambda = np = 4$, so $X \approx \mathrm{Po}(4)$.

$$P(X \leq 2) \approx e^{-4}\left(1 + 4 + \dfrac{16}{2}\right) = 13e^{-4} \approx 0.2381$$
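
To see how close the approximation is in this example, the exact binomial probability can be computed alongside it. This is an illustrative sketch; the function names are assumptions.

```python
import math

def binomial_cdf(r: int, n: int, p: float) -> float:
    """Exact P(X <= r) for X ~ B(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r + 1))

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

n, p = 200, 0.02
exact = binomial_cdf(2, n, p)
approx = poisson_cdf(2, n * p)
print(round(exact, 4), round(approx, 4), round(abs(exact - approx), 4))
```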
3.3 Conditions check
Before applying the Poisson distribution, verify:

- Events occur at a constant average rate
- Events are independent
- At most one event can occur in a sufficiently small sub-interval

:::caution
Do not confuse the Poisson approximation with the normal approximation to the binomial, which requires
$np > 5$ and $n(1-p) > 5$.
:::
Problems
Problem 1
A factory produces items with defects occurring at an average rate of 2.5 per hour. Find the probability of exactly 4 defects in a given hour, and the probability of more than 6 defects in a 2-hour period.
Solution 1
For one hour: $X \sim \mathrm{Po}(2.5)$. $P(X=4) = \dfrac{e^{-2.5}(2.5)^4}{4!} = \dfrac{0.08209 \times 39.0625}{24} \approx 0.1336$.
For two hours: $Y \sim \mathrm{Po}(5)$ (by additivity).
$P(Y > 6) = 1 - P(Y \leq 6) = 1 - 0.7622 = 0.2378$.
Problem 2
A die is rolled repeatedly until a 6 appears. Find the probability that the first 6 appears on the 5th roll, and the probability that it takes more than 10 rolls.
Solution 2
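$X \sim \mathrm{Geo}(1/6)$. $P(X=5) = \left(\dfrac{5}{6}\right)^4 \cdot \dfrac{1}{6} = \dfrac{625}{1296} \cdot \dfrac{1}{6} \approx 0.0804$.
More than 10 rolls are needed exactly when the first 10 rolls all fail, so $P(X > 10) = \left(\dfrac{5}{6}\right)^{10} \approx 0.1615$.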
Problem 3
Prove that $E(X) = \lambda$ for $X \sim \mathrm{Po}(\lambda)$, showing all steps of the summation.
Solution 3
$E(X) = \sum_{r=0}^{\infty}r\cdot\dfrac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=1}^{\infty}\dfrac{e^{-\lambda}\lambda^r}{(r-1)!} = \lambda e^{-\lambda}\sum_{r=1}^{\infty}\dfrac{\lambda^{r-1}}{(r-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$, using $\sum_{m=0}^{\infty}\dfrac{\lambda^m}{m!} = e^{\lambda}$ with $m = r - 1$.
Problem 4
The number of emails received per hour follows $\mathrm{Po}(8)$. Find the probability of receiving between 6 and 12 emails (inclusive) in a given hour.
Solution 4
$X \sim \mathrm{Po}(8)$. $P(6 \leq X \leq 12) = P(X \leq 12) - P(X \leq 5)$.
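
The difference of cumulative probabilities above can be evaluated without tables. A minimal sketch; `poisson_cdf` is a hypothetical helper built from the PMF.

```python
import math

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    if r < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

lam = 8.0
# P(6 <= X <= 12) = P(X <= 12) - P(X <= 5)
answer = poisson_cdf(12, lam) - poisson_cdf(5, lam)
print(round(answer, 4))
```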
Problem 5
A manufacturer claims that on average 1 in 20 items is defective. In a batch of 500 items, use the Poisson approximation to find the probability of at most 35 defectives.
Solution 5
$X \sim B(500, 1/20)$. $\lambda = np = 500/20 = 25$.
Problem 7
A shop receives an average of 6 customers per 30 minutes. Find the critical region for a test at the 5% significance level of $H_0: \lambda = 6$ against $H_1: \lambda > 6$, where $X$ is the number of customers in a 30-minute period.
Solution 7
Under $H_0$: $X \sim \mathrm{Po}(6)$.
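
One way to locate the critical region for this one-tailed test is to search for the smallest $c$ with $P(X \geq c) \leq 0.05$ under $H_0$. The sketch below does this; `upper_critical_value` and `search_limit` are hypothetical names.

```python
import math

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam)."""
    if r < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))

def upper_critical_value(lam0: float, alpha: float, search_limit: int = 1000) -> int:
    """Smallest c with P(X >= c) <= alpha under H0: X ~ Po(lam0)."""
    for c in range(search_limit):
        if 1 - poisson_cdf(c - 1, lam0) <= alpha:
            return c
    raise ValueError("no critical value found below search_limit")

c = upper_critical_value(6.0, 0.05)
actual_level = 1 - poisson_cdf(c - 1, 6.0)  # actual significance level of the test
print(c, round(actual_level, 4))
```

The critical region is then $X \geq c$, and the printed tail probability is the actual significance level, which is below the nominal 5% because the distribution is discrete.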
Problem 8
$X \sim \mathrm{Geo}(p)$. Find $P(X = 3 \mid X > 1)$ and show it equals $P(X = 2)$.
Solution 8
$P(X = 3 \mid X > 1) = \dfrac{P(X = 3)}{P(X > 1)} = \dfrac{q^2 p}{q} = qp = P(X = 2)$.
This is a direct consequence of the memoryless property: given that the first trial was a failure,
the distribution of the remaining trials is the same as starting fresh.
Problem 9
The number of accidents per week at a junction follows $\mathrm{Po}(3)$. After new traffic lights are installed, 8 accidents are observed in one week. Test at the 5% level whether the rate has increased.
Solution 9
$X \sim \mathrm{Po}(3)$. $H_0: \lambda = 3$, $H_1: \lambda > 3$. $\alpha = 0.05$.
p−value=P(X≥8)=1−P(X≤7)=1−0.9881=0.0119<0.05.
Reject H0. There is sufficient evidence that the accident rate has increased.
Problem. A factory produces items with a defect rate of 0.02. In a batch of 200 items, find the
probability of exactly 3 defective items using (a) the binomial distribution and (b) the Poisson
approximation.
Example 7.2: Geometric distribution and memoryless property
Problem. A fair die is rolled until a 6 appears. Find the probability that more than 4 rolls are
needed. Verify the memoryless property: $P(X > m + n \mid X > m) = P(X > n)$.
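
A numerical check of this problem can be sketched as follows, using the fact that $X > n$ exactly when the first $n$ trials all fail, so $P(X > n) = (1-p)^n$. The helper names are illustrative assumptions.

```python
def geometric_tail(n: int, p: float) -> float:
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n

def conditional_tail(m: int, n: int, p: float) -> float:
    """P(X > m + n | X > m) = P(X > m + n) / P(X > m)."""
    return geometric_tail(m + n, p) / geometric_tail(m, p)

p = 1 / 6  # fair die, success = rolling a 6
p_more_than_4 = geometric_tail(4, p)
print(round(p_more_than_4, 4))

# Memoryless property: the conditional tail matches the unconditional one.
for m, n in [(1, 2), (3, 4), (5, 5)]:
    print(m, n, conditional_tail(m, n, p), geometric_tail(n, p))
```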
Example 7.4: Hypothesis testing with the Poisson distribution
Problem. A traffic survey records the number of cars passing a point in 10-second intervals. The
observed frequencies for $k$ cars are compared with the expected frequencies under $H_0$:
$X \sim \mathrm{Po}(3)$. Calculate the expected frequency for each value of $k$ if 200 intervals
were observed.
Solution. Under $H_0$: $P(X = k) = \dfrac{e^{-3} \cdot 3^k}{k!}$.

| $k$ | $P(X = k)$ | Expected freq ($\times 200$) |
| --- | --- | --- |
| 0 | $e^{-3} = 0.0498$ | 9.96 |
| 1 | $3e^{-3} = 0.1494$ | 29.87 |
| 2 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 3 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 4 | $3.375e^{-3} = 0.1680$ | 33.61 |
| 5 | $2.025e^{-3} = 0.1008$ | 20.16 |
| $\geq 6$ | $1 - \sum_{k=0}^{5} P(X = k) = 0.0839$ | $\approx 16.78$ |
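
The expected frequencies can be generated in a few lines. This sketch follows the table's layout, pooling everything from 6 upwards into one class so that the frequencies total 200; the variable names are illustrative.

```python
import math

lam, n_intervals = 3.0, 200

# P(X = k) for k = 0..5, then the pooled tail P(X >= 6).
probs = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(6)]
tail_prob = 1 - sum(probs)

expected = [n_intervals * q for q in probs] + [n_intervals * tail_prob]
labels = [str(k) for k in range(6)] + [">=6"]

for label, freq in zip(labels, expected):
    print(f"{label:>3}  {freq:6.2f}")
```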
Example 7.5: Fitting a Poisson distribution
Problem. The number of email messages received per hour is recorded over 100 hours:
$\{0: 5,\ 1: 15,\ 2: 25,\ 3: 30,\ 4: 15,\ 5: 7,\ 6: 3\}$. Estimate the parameter $\lambda$ and calculate
expected frequencies.
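
A common way to estimate $\lambda$ is to use the sample mean, since $E(X) = \lambda$; that method is assumed in the sketch below, as is pooling the final class as "6 or more" so that the expected frequencies total 100.

```python
import math

# Observed data from the problem: value -> number of hours.
observed = {0: 5, 1: 15, 2: 25, 3: 30, 4: 15, 5: 7, 6: 3}

n = sum(observed.values())
lam_hat = sum(k * f for k, f in observed.items()) / n  # sample mean

def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam**k / math.factorial(k)

# Expected frequencies for 0..5, with the remainder pooled into ">=6" (assumption).
expected = {k: n * poisson_pmf(k, lam_hat) for k in range(6)}
expected[">=6"] = n - sum(expected.values())

print("lambda estimate:", lam_hat)
for label, freq in expected.items():
    print(label, round(freq, 2))
```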
Example 7.6: Conditional probability with geometric distribution
Problem. In a game, the probability of winning each round is $p = 0.3$ independently. Given that
a player has not won in the first 5 rounds, find the probability that they win within the next 3
rounds.
Solution. $X \sim \mathrm{Geo}(0.3)$. By the memoryless property:

$$P(X \leq 8 \mid X > 5) = P(X \leq 3) = 1 - (0.7)^3 = 1 - 0.343 = 0.657$$
Example 7.7: Sum of independent Poisson variables
Problem. $X \sim \mathrm{Po}(3)$ and $Y \sim \mathrm{Po}(5)$ are independent. State the
distribution of $X + Y$ and find $P(X + Y = 6)$.
As $n \to \infty$: $\dfrac{n(n-1)\cdots(n-k+1)}{n^k} \to 1$ and
$\left(1 - \dfrac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}$.
Therefore $P(X = k) \to \dfrac{e^{-\lambda}\lambda^k}{k!}$. $\blacksquare$
8. Connections to Other Topics
8.1 Poisson distribution and exponential distribution
If events occur according to a Poisson process with rate $\lambda$, the time between consecutive
events follows the exponential distribution $\mathrm{Exp}(\lambda)$. See
Exponential and Continuous Random Variables.
8.2 Geometric distribution and series summation
The probability generating function $G_X(t) = \dfrac{pt}{1 - qt}$ of the geometric distribution
connects to the summation of geometric series. See
Further Algebra.
8.3 Poisson and hypothesis testing
Goodness-of-fit tests using the chi-squared statistic compare observed and expected (Poisson)
frequencies. See
Chi-Squared Tests.
9. Additional Exam-Style Questions
Question 11
A shop receives on average 4 customers per hour. Find the probability that: (a) exactly 3
customers arrive in a given hour; (b) more than 2 customers arrive in a 30-minute period.
Solution
(a) $X \sim \mathrm{Po}(4)$.

$$P(X = 3) = \dfrac{e^{-4} \cdot 64}{6} = \dfrac{32}{3e^4} \approx 0.1954$$

(b) For 30 minutes: $Y \sim \mathrm{Po}(2)$.

$$P(Y > 2) = 1 - P(Y \leq 2) = 1 - e^{-2}(1 + 2 + 2) = 1 - 5e^{-2} \approx 0.3233$$
Question 12
A coin is tossed until the first head appears. The probability of heads is p.
(a) Find E(X) and Var(X) where X is the number of tosses needed.
Prove that if $X_1, X_2, \ldots, X_n$ are independent with $X_i \sim \mathrm{Po}(\lambda_i)$,
then $S = \sum X_i \sim \mathrm{Po}\left(\sum \lambda_i\right)$.
Solution
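The probability generating function of $X_i \sim \mathrm{Po}(\lambda_i)$ is
$G_{X_i}(t) = e^{\lambda_i(t-1)}$.
For independent random variables, the PGF of the sum is the product:

$$G_S(t) = \prod_{i=1}^{n} e^{\lambda_i(t-1)} = e^{(t-1)\sum \lambda_i}$$

This is the PGF of $\mathrm{Po}\left(\sum \lambda_i\right)$. Therefore
$S \sim \mathrm{Po}\left(\sum \lambda_i\right)$. $\blacksquare$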
Question 14
A typist makes on average 2 errors per page. Find the probability that a particular page has:
(a) no errors; (b) at most 3 errors; (c) exactly 2 errors given that it has at most 3
errors.
Problem. A factory produces items with a defect rate of 0.002. In a batch of 1000, find the
probability of exactly 3 defects using (a) the binomial distribution and (b) the Poisson
approximation.
Example 8.2: Sum of independent Poisson random variables
Problem. Emails arrive at a rate of 5 per hour and texts at 3 per hour. Find the probability
that the total number of messages in a 2-hour period exceeds 20.
Problem. A call centre claims an average of 6 calls per minute. In a 10-minute period, 72 calls
are received. Test at the 5% level whether the rate has increased.
So $\lambda - 1 \leq m \leq \lambda$, meaning the mode is $\lfloor \lambda \rfloor$ (and also $\lambda - 1$
if $\lambda$ is an integer).
Example 8.6: Relationship between Poisson and exponential
Problem. Events occur according to a Poisson process with rate $\lambda = 4$ per hour. Find the
probability that the time between two consecutive events exceeds 30 minutes.
Solution. For a Poisson process with rate $\lambda$, the inter-arrival time
$T \sim \mathrm{Exp}(\lambda)$.

$$P(T > 0.5) = e^{-4 \times 0.5} = e^{-2} \approx 0.135$$
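
The link between the two distributions can be made concrete: the gap exceeds $t$ exactly when no events occur in an interval of length $t$, and that count is $\mathrm{Po}(\lambda t)$. A minimal sketch with the numbers from this example (variable names are illustrative):

```python
import math

rate = 4.0   # events per hour
t = 0.5      # 30 minutes, expressed in hours

# Exponential tail: P(T > t) = exp(-rate * t).
p_gap_exceeds_t = math.exp(-rate * t)

# Poisson view: P(no events in an interval of length t), with mean rate * t.
mean_count = rate * t
p_no_events = math.exp(-mean_count) * mean_count**0 / math.factorial(0)

print(round(p_gap_exceeds_t, 4), round(p_no_events, 4))
```

Keeping the time units consistent matters here: the rate is per hour, so 30 minutes must enter as 0.5.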
Example 8.7: Variance of the geometric distribution
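Problem. Derive $\mathrm{Var}(X)$ for $X \sim \mathrm{Geo}(p)$, defined as the number of trials
until the first success.
Solution. $E(X) = \dfrac{1}{p}$. Using $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$:

$$E(X^2) = \sum_{k=1}^{\infty} k^2 p (1-p)^{k-1}$$

Using the identity $\sum_{k=1}^{\infty} k^2 r^{k-1} = \dfrac{1+r}{(1-r)^3}$ with
$r = 1 - p$:

$$E(X^2) = \dfrac{p(2-p)}{p^3} = \dfrac{2-p}{p^2}$$

$$\mathrm{Var}(X) = \dfrac{2-p}{p^2} - \dfrac{1}{p^2} = \dfrac{1-p}{p^2}$$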
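
The result can be sanity-checked by summing the series numerically. This is a rough sketch assuming $p = 0.3$ and truncating at 500 terms, beyond which the terms are negligible for this $p$.

```python
p = 0.3  # illustrative success probability
terms = range(1, 500)

# Truncated series for E(X) and E(X^2) with PMF p * (1 - p)^(k - 1).
mean = sum(k * p * (1 - p) ** (k - 1) for k in terms)
second_moment = sum(k * k * p * (1 - p) ** (k - 1) for k in terms)
variance = second_moment - mean**2

print(mean, 1 / p)
print(variance, (1 - p) / p**2)
```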
9. Common Pitfalls

| Pitfall | Correct Approach |
| --- | --- |
| Confusing the two definitions of the geometric distribution | “Number of trials until first success”: $E(X) = 1/p$; “Number of failures before first success”: $E(X) = (1-p)/p$ |
| Using the Poisson approximation when $np > 10$ or $n < 20$ | The Poisson approximation requires $n$ large and $p$ small, with $np$ moderate |
| Forgetting that Poisson probabilities sum to 1 only over all $k$ from 0 to $\infty$ | Never truncate without adjusting: pool the remaining probability into a final “$\geq k$” class |
| Applying the Poisson to events that are not independent | The Poisson process requires independent events at a constant average rate |

10. Additional Exam-Style Questions
Question 8
A typist makes an average of 2 errors per page. Find the probability that a 3-page document contains
exactly 5 errors.
The inter-arrival times of a Poisson process follow the exponential distribution. If events occur at
rate $\lambda$ per unit time, the time between consecutive events is $\mathrm{Exp}(\lambda)$. See
Exponential and Continuous Random Variables.
11.2 Poisson and binomial
The Poisson distribution approximates the binomial when n is large and p is small, with
λ=np.
11.3 Poisson and chi-squared tests
The chi-squared goodness-of-fit test is used to test whether data follows a Poisson or geometric
distribution. See
Chi-Squared Tests.
12. Key Results Summary
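
| Distribution | PMF | $E(X)$ | $\mathrm{Var}(X)$ |
| --- | --- | --- | --- |
| $\mathrm{Po}(\lambda)$ | $P(X = x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ | $\lambda$ | $\lambda$ |
| $\mathrm{Geo}(p)$ (trials) | $P(X = x) = p(1-p)^{x-1}$ | $\dfrac{1}{p}$ | $\dfrac{1-p}{p^2}$ |
| $\mathrm{Geo}(p)$ (failures) | $P(X = x) = p(1-p)^x$ | $\dfrac{1-p}{p}$ | $\dfrac{1-p}{p^2}$ |
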

| Property | Poisson | Geometric |
| --- | --- | --- |
| Memoryless | No | Yes |
| Additive: $X_1 + X_2$ | $\mathrm{Po}(\lambda_1 + \lambda_2)$ if independent | Not simple |
| PMF tail behaviour | Decays faster than geometric | Slower decay |

13. Further Exam-Style Questions
Question 11
A shop receives customers at a rate of 8 per hour. Find the probability that: (a) exactly 5
customers arrive in a 30-minute period; (b) more than 10 customers arrive in an hour; (c) the time
between two consecutive arrivals exceeds 20 minutes.
If events of type A and type B occur as independent Poisson processes with rates $\lambda_A$ and $\lambda_B$, then
the total event process is Poisson with rate $\lambda_A + \lambda_B$.
14.2 Poisson distribution and the Poisson point process
A Poisson point process in 2D with rate $\lambda$ per unit area has the property that the number of
points in a region of area $A$ follows $\mathrm{Po}(\lambda A)$.
14.3 The geometric distribution as a special case of the negative binomial
The negative binomial distribution counts the number of trials until $r$ successes. The geometric
distribution is the case $r = 1$.
$\mathrm{NegBin}(r, p)$: $P(X = n) = \dbinom{n-1}{r-1} p^r (1-p)^{n-r}$ for $n = r, r+1, \ldots$
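
The special case $r = 1$ can be checked directly against the geometric PMF. A minimal sketch, assuming $p = 0.3$ for illustration; `negbin_pmf` is a hypothetical helper implementing the formula above.

```python
import math

def negbin_pmf(n: int, r: int, p: float) -> float:
    """P(X = n): the r-th success occurs on trial n, for n = r, r + 1, ..."""
    if n < r:
        return 0.0
    return math.comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

def geometric_pmf(n: int, p: float) -> float:
    """P(X = n) for X ~ Geo(p), counting trials."""
    return (1 - p) ** (n - 1) * p

p = 0.3  # illustrative success probability
for n in range(1, 6):
    # With r = 1 the binomial coefficient is 1 and the two PMFs coincide.
    print(n, negbin_pmf(n, 1, p), geometric_pmf(n, p))
```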
14.4 Relationship to exponential families
Both the Poisson and geometric distributions belong to the exponential family of distributions,
which have PDF/PMF of the form $f(x; \theta) = h(x)\exp\left(\eta(\theta)T(x) - A(\theta)\right)$.
15. Further Exam-Style Questions
Question 13
A radioactive source emits particles at a rate of 12 per minute. Find the probability that in a
2-minute period, the number of particles emitted is between 20 and 30 (inclusive).