Probability (Extended Treatment)
This document extends the core probability material with rigorous treatments of conditional
probability, independence, Venn diagrams, tree diagrams, and Bayes’ theorem.
:::info Probability problems reward careful notation and clear event definitions. Always define your
events explicitly before writing any equations.
:::
1. Conditional Probability
1.1 Definition
The conditional probability of event A given that event B has occurred is:
$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
$$
provided $P(B) > 0$.
Interpretation. P(A∣B) is the probability of A within the “reduced sample space” B.
1.2 Multiplication rule
For any two events A and B:
P(A∩B)=P(A)⋅P(B∣A)=P(B)⋅P(A∣B)
Extension to three events:
P(A∩B∩C)=P(A)⋅P(B∣A)⋅P(C∣A∩B)
1.3 Worked example
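Problem. A bag contains 5 red and 3 blue balls. Two balls are drawn without replacement. Find
the probability that both are red.
Let $R_1$ = “first ball is red” and $R_2$ = “second ball is red”. After a red first draw, 4 of the remaining 7 balls are red, so:
$$
P(R_1 \cap R_2) = P(R_1) \cdot P(R_2 \mid R_1) = \frac{5}{8} \times \frac{4}{7} = \frac{20}{56} = \frac{5}{14}
$$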
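The multiplication rule used in this example extends to any number of draws. The following is a minimal sketch, assuming a hypothetical helper name `prob_all_red` and exact arithmetic with `fractions.Fraction`; it is an illustration, not part of the original material.

```python
from fractions import Fraction


def prob_all_red(red: int, blue: int, draws: int) -> Fraction:
    """Probability that `draws` balls taken without replacement are all red."""
    total = red + blue
    p = Fraction(1)
    for k in range(draws):
        # Multiplication rule: multiply by P(next ball is red | first k draws were red).
        p *= Fraction(red - k, total - k)
    return p


print(prob_all_red(5, 3, 2))  # 5/14
```

Using `Fraction` rather than floats keeps the result in the same exact form as the hand calculation.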
1.4 The Law of Total Probability
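If $\{B_1, B_2, \ldots, B_n\}$ is a partition of the sample space (mutually exclusive and
exhaustive, with each $P(B_i) > 0$), then for any event $A$:
$$
P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)
$$
Proof. Since the $B_i$ partition $\Omega$:
$$
A = A \cap \Omega = A \cap \left(\bigcup_{i=1}^{n} B_i\right) = \bigcup_{i=1}^{n} (A \cap B_i)
$$
The sets $A \cap B_i$ are mutually exclusive, so:
$$
P(A) = \sum_{i=1}^{n} P(A \cap B_i) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i) \qquad \blacksquare
$$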
1.5 Worked example: law of total probability
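Problem. In a factory, Machine A produces 60% of items and Machine B produces 40%. Machine
A has a defect rate of 2% and Machine B has a defect rate of 5%. Find the probability that a
randomly selected item is defective.
Let $D$ = “item is defective”, and let $A$ and $B$ denote “item was produced by Machine A” and “item was produced by Machine B”. Every item comes from exactly one machine, so $A$ and $B$ partition the sample space.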
P(D)=P(D∣A)P(A)+P(D∣B)P(B)=0.02×0.6+0.05×0.4=0.012+0.020=0.032
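The same weighted sum works for any finite partition. Below is a minimal sketch, assuming a hypothetical function `total_probability` that takes the partition probabilities and the matching conditional probabilities as two lists.

```python
def total_probability(priors, likelihoods):
    """Law of total probability: sum of P(A | B_i) * P(B_i) over a partition."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for a partition")
    return sum(l * p for l, p in zip(likelihoods, priors))


# Machines A and B: shares 60% and 40%, defect rates 2% and 5%.
p_defective = total_probability([0.6, 0.4], [0.02, 0.05])
print(round(p_defective, 3))  # 0.032
```

The check that the priors sum to 1 guards against accidentally passing events that do not cover the whole sample space.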
2. Bayes’ Theorem
2.1 Statement
Bayes’ Theorem. For events A and B with P(B)>0:
$$
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
$$
Using the law of total probability in the denominator, for a partition $\{A_1, \ldots, A_n\}$:
$$
P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)\,P(A_j)}
$$
2.2 Proof
$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}
$$
The first step is the definition of conditional probability. The second step applies the
multiplication rule to the numerator. ■
2.3 Worked example
Problem. A disease affects 1% of a population. A test for the disease has a 95% true positive
rate ($P(\text{positive} \mid \text{disease}) = 0.95$) and a 10% false positive rate
($P(\text{positive} \mid \text{no disease}) = 0.10$). If a person tests positive, what is the
probability they actually have the disease?
Let $D$ = “has disease”, $T^+$ = “tests positive”.
$$
\begin{aligned}
P(D \mid T^+) &= \frac{P(T^+ \mid D)\,P(D)}{P(T^+ \mid D)\,P(D) + P(T^+ \mid D')\,P(D')} \\
&= \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.10 \times 0.99} = \frac{0.0095}{0.0095 + 0.099} = \frac{0.0095}{0.1085} \approx 0.0876
\end{aligned}
$$
So even with a positive test, there is only about an 8.8% chance of having the disease.
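:::caution Warning This counterintuitive result arises because the disease is rare, so
false positives far outnumber true positives. In a population of 10,000, about 95 of the 100 people with the disease test positive, against about 990 of the 9,900 without it. Ignoring the prior probability of the
condition in this way is known as the base rate fallacy.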
:::
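The calculation in this worked example can be sketched as a small function. The name `posterior_given_positive` and its parameter names are assumptions made for illustration. Calling it with different prevalences shows how strongly the answer depends on the base rate.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' theorem for a binary test: P(disease | positive test)."""
    true_pos = sensitivity * prevalence                  # P(T+ | D) P(D)
    false_pos = false_positive_rate * (1 - prevalence)   # P(T+ | D') P(D')
    return true_pos / (true_pos + false_pos)


print(round(posterior_given_positive(0.01, 0.95, 0.10), 4))  # 0.0876

# Same test, different base rates (values not taken from the text).
for prevalence in (0.01, 0.10, 0.50):
    print(prevalence, posterior_given_positive(prevalence, 0.95, 0.10))
```

With the test characteristics held fixed, the posterior rises steeply as the condition becomes more common.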
2.4 Worked example: factory with three machines
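Problem. A factory has three machines producing bolts. Machine 1 produces 50%, Machine 2
produces 30%, and Machine 3 produces 20%. Defect rates are 1%, 2%, and 3% respectively. A bolt is
found to be defective. What is the probability it came from Machine 3?
Let $M_i$ = “bolt came from Machine $i$” and $D$ = “bolt is defective”.
$$
\begin{aligned}
P(M_3 \mid D) &= \frac{P(D \mid M_3)\,P(M_3)}{P(D \mid M_1)\,P(M_1) + P(D \mid M_2)\,P(M_2) + P(D \mid M_3)\,P(M_3)} \\
&= \frac{0.03 \times 0.20}{0.01 \times 0.50 + 0.02 \times 0.30 + 0.03 \times 0.20} \\
&= \frac{0.006}{0.005 + 0.006 + 0.006} = \frac{0.006}{0.017} \approx 0.353
\end{aligned}
$$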
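Since the denominator is shared, the posterior for every machine can be computed in one pass. This is a minimal sketch, assuming a hypothetical function `posteriors` that takes the priors and defect rates as lists in machine order.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem over a partition: P(A_i | B) for every i."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(B | A_i) P(A_i)
    evidence = sum(joint)                                 # P(B), by total probability
    return [j / evidence for j in joint]


post = posteriors([0.5, 0.3, 0.2], [0.01, 0.02, 0.03])
print([round(p, 3) for p in post])  # [0.294, 0.353, 0.353]
```

Machines 2 and 3 come out equally likely here because their joint probabilities, 0.02 × 0.30 and 0.03 × 0.20, are equal.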
3. Venn Diagrams
3.1 Notation and regions
For two events A and B, the Venn diagram has four regions:
| Region | Description | Probability |
|---|---|---|
| A∩B | In both A and B | P(A∩B) |
| A∩B′ | In A but not in B | P(A)−P(A∩B) |
| A′∩B | In B but not in A | P(B)−P(A∩B) |
| A′∩B′ | In neither A nor B | 1−P(A∪B) |
3.2 Worked example
Problem. In a group of 100 students, 45 study Maths, 30 study Physics, and 15 study both. A
student is chosen at random. Find: (a) the probability they study at least one subject; (b) the
probability they study Maths given they study Physics.
$$
P(M) = 0.45, \quad P(P) = 0.30, \quad P(M \cap P) = 0.15
$$
(a) $P(M \cup P) = 0.45 + 0.30 - 0.15 = 0.60$
(b) $P(M \mid P) = \frac{P(M \cap P)}{P(P)} = \frac{0.15}{0.30} = 0.50$
3.3 Three-event Venn diagrams
For three events A, B, C, the inclusion-exclusion formula gives:
P(A∪B∪C)=P(A)+P(B)+P(C)−P(A∩B)−P(A∩C)−P(B∩C)+P(A∩B∩C)
3.4 Worked example: three events
Problem. In a survey, 60% of people like tea, 50% like coffee, 40% like chocolate, 30% like tea
and coffee, 25% like tea and chocolate, 20% like coffee and chocolate, and 10% like all three. What
proportion likes none of these?
Let T, C, and H denote liking tea, coffee, and chocolate respectively.
P(T∪C∪H)=0.6+0.5+0.4−0.3−0.25−0.2+0.1=0.85
P(none)=1−0.85=0.15
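Filling in a three-event Venn diagram amounts to working outwards from the central region. The sketch below assumes a hypothetical helper `venn3_regions` and whole-number percentages as inputs; it splits the survey into its eight disjoint regions.

```python
def venn3_regions(t, c, h, tc, th, ch, tch):
    """Split three overlapping events into the eight disjoint Venn regions.

    All arguments and results are whole-number percentages.
    """
    regions = {
        "T, C and H": tch,
        "T and C only": tc - tch,
        "T and H only": th - tch,
        "C and H only": ch - tch,
        # Inclusion-exclusion applied within a single circle.
        "T only": t - tc - th + tch,
        "C only": c - tc - ch + tch,
        "H only": h - th - ch + tch,
    }
    regions["none"] = 100 - sum(regions.values())
    return regions


regions = venn3_regions(60, 50, 40, 30, 25, 20, 10)
print(regions["none"])  # 15
```

A negative value in any region would signal that the given percentages are inconsistent.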
4. Tree Diagrams
4.1 Structure
A tree diagram represents a sequence of events. Each branch represents a possible outcome with its
probability. The probability of any path through the tree is the product of the probabilities along
that path.
4.2 Rules
- The probabilities on branches from any single node must sum to 1.
- The probability of an outcome is the product of probabilities along the path to that outcome.
- To find the probability of a compound event, add the probabilities of all paths leading to that
event.
4.3 Worked example: two-stage selection
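Problem. A box contains 7 red and 5 green counters. Two counters are drawn at random without
replacement. Find the probability that: (a) both are the same colour; (b) exactly one is red.
(a)
$$
P(\text{both red}) = \frac{7}{12} \times \frac{6}{11} = \frac{42}{132} = \frac{7}{22}
$$
$$
P(\text{both green}) = \frac{5}{12} \times \frac{4}{11} = \frac{20}{132} = \frac{5}{33}
$$
$$
P(\text{same colour}) = \frac{7}{22} + \frac{5}{33} = \frac{21 + 10}{66} = \frac{31}{66}
$$
(b) The red counter can be drawn first or second:
$$
P(\text{one red}) = \frac{7}{12} \times \frac{5}{11} + \frac{5}{12} \times \frac{7}{11} = \frac{35}{132} + \frac{35}{132} = \frac{70}{132} = \frac{35}{66}
$$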
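Each path through the tree corresponds to a set of equally likely ordered draws, so the answers can be checked by brute-force enumeration. This is a minimal sketch using `itertools.permutations`; the variable names are chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

counters = ["R"] * 7 + ["G"] * 5

# All 12 * 11 = 132 equally likely ordered draws of two distinct counters.
pairs = list(permutations(counters, 2))

same_colour = sum(1 for a, b in pairs if a == b)
one_red = sum(1 for a, b in pairs if a != b)

print(Fraction(same_colour, len(pairs)))  # 31/66
print(Fraction(one_red, len(pairs)))      # 35/66
```

The two results sum to 1, as they must, since the two counters are either the same colour or one of each.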
4.4 Worked example: with replacement
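Problem. Two dice are rolled. Find the probability that the sum is at least 9, given that the
first die shows at least 4.
Let $A$ = “sum $\geq 9$” and $B$ = “first die $\geq 4$”.
$$
P(B) = \frac{3}{6} = \frac{1}{2}
$$
To find $P(A \cap B)$, count the favourable outcomes among the 36 equally likely ordered pairs:
- First die = 4: need second $\geq 5$, giving 2 outcomes.
- First die = 5: need second $\geq 4$, giving 3 outcomes.
- First die = 6: need second $\geq 3$, giving 4 outcomes.
$$
P(A \cap B) = \frac{2 + 3 + 4}{36} = \frac{9}{36} = \frac{1}{4}
$$
$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2}
$$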
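The “reduced sample space” view of conditioning can be made concrete by listing the outcomes in B and counting how many also lie in A. The following is a minimal sketch with illustrative variable names.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

b = [r for r in rolls if r[0] >= 4]        # reduced sample space: first die >= 4
a_and_b = [r for r in b if sum(r) >= 9]    # outcomes in B whose sum is >= 9

print(len(a_and_b), len(b))                # 9 18
print(Fraction(len(a_and_b), len(b)))      # 1/2
```

Counting within `b` directly gives the same value as dividing $P(A \cap B)$ by $P(B)$.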
5. Independence
5.1 Definition
Events A and B are independent if and only if:
P(A∩B)=P(A)⋅P(B)
Equivalently: $P(A \mid B) = P(A)$ when $P(B) > 0$, or $P(B \mid A) = P(B)$ when $P(A) > 0$.
Interpretation. Knowing that B occurred provides no information about whether A occurred.
5.2 Pairwise vs mutual independence
For three events A, B, C:
- Pairwise independence means each pair is independent.
- Mutual independence means pairwise independence and
P(A∩B∩C)=P(A)P(B)P(C).
Mutual independence is a stronger condition. Pairwise independence does not imply mutual
independence.
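A standard counterexample, not taken from the text above, uses two fair coin flips: let A = “first flip is heads”, B = “second flip is heads”, and C = “both flips agree”. The sketch below enumerates the four outcomes and checks both conditions; the helper `prob` and the event names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))  # four equally likely outcomes


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


A = lambda w: w[0] == "H"   # first flip is heads
B = lambda w: w[1] == "H"   # second flip is heads
C = lambda w: w[0] == w[1]  # both flips agree

pairwise = all(
    prob(lambda w, X=X, Y=Y: X(w) and Y(w)) == prob(X) * prob(Y)
    for X, Y in [(A, B), (A, C), (B, C)]
)
triple = prob(lambda w: A(w) and B(w) and C(w)) == prob(A) * prob(B) * prob(C)

print(pairwise, triple)  # True False
```

Each pair multiplies correctly, but $P(A \cap B \cap C) = \frac{1}{4}$ while $P(A)P(B)P(C) = \frac{1}{8}$, so the three events are pairwise independent without being mutually independent.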
5.3 Worked example
Problem. Events A and B are independent with P(A)=0.4 and P(B)=0.7. Find: (a)
P(A∩B); (b) P(A∪B); (c) P(A∣B); (d) P(A′∩B′).
(a) P(A∩B)=0.4×0.7=0.28
(b) P(A∪B)=0.4+0.7−0.28=0.82
(c) P(A∣B)=P(A)=0.4 (by independence)
(d) P(A′∩B′)=P((A∪B)′)=1−0.82=0.18
Note: $P(A' \cap B') = P(A') \cdot P(B') = 0.6 \times 0.3 = 0.18$, which confirms that the complements are
also independent.
5.4 Theorem: complements of independent events are independent
Theorem. If A and B are independent, then A′ and B′ are also independent.
Proof.
P(A′∩B′)=P((A∪B)′)=1−P(A∪B)=1−P(A)−P(B)+P(A)P(B)
=(1−P(A))(1−P(B))=P(A′)⋅P(B′)■
:::caution Warning “Independent” and “mutually exclusive” are different concepts. In fact, if A
and B both have positive probability and are mutually exclusive, they cannot be
independent: $P(A \cap B) = 0 \neq P(A)\,P(B)$.
6. Practice Problems
Problem 1
In a class of 40 students, 25 play football, 18 play cricket, and 5 play neither. A student is
chosen at random. Given that they play football, find the probability they also play cricket.
Solution
$P(F) = 25/40 = 0.625$, $P(C) = 18/40 = 0.45$, and, since 5 students play neither sport, $P(F \cup C) = 35/40 = 0.875$.
P(F∩C)=0.625+0.45−0.875=0.20.
P(C∣F)=0.20/0.625=0.32.
Problem 2
A test for a condition has sensitivity 92% (true positive rate) and specificity 96% (true negative
rate). The condition prevalence is 3%. Find: (a) $P(\text{condition} \mid \text{positive})$; (b)
$P(\text{condition} \mid \text{negative})$.
Solution
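$P(T^+ \mid C) = 0.92$, $P(T^- \mid C') = 0.96$, $P(C) = 0.03$. Taking complements, $P(T^+ \mid C') = 0.04$ and $P(T^- \mid C) = 0.08$.
(a)
$$
P(C \mid T^+) = \frac{0.92 \times 0.03}{0.92 \times 0.03 + 0.04 \times 0.97} = \frac{0.0276}{0.0276 + 0.0388} = \frac{0.0276}{0.0664} \approx 0.416
$$
(b)
$$
P(C \mid T^-) = \frac{0.08 \times 0.03}{0.08 \times 0.03 + 0.96 \times 0.97} = \frac{0.0024}{0.0024 + 0.9312} = \frac{0.0024}{0.9336} \approx 0.00257
$$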
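Both parts follow the same pattern, so they can be checked together. This is a minimal sketch, assuming a hypothetical function `condition_given_result` parameterised by prevalence, sensitivity, and specificity.

```python
def condition_given_result(prevalence, sensitivity, specificity):
    """Return (P(C | T+), P(C | T-)) for a binary diagnostic test."""
    # Total probability of a positive result: true positives plus false positives.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos
    p_c_given_pos = sensitivity * prevalence / p_pos
    p_c_given_neg = (1 - sensitivity) * prevalence / p_neg
    return p_c_given_pos, p_c_given_neg


pos, neg = condition_given_result(0.03, 0.92, 0.96)
print(round(pos, 3), round(neg, 5))  # 0.416 0.00257
```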
Problem 3
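Events A and B are independent with $P(A) = \frac{1}{3}$ and $P(A \cup B) = \frac{3}{4}$. Find
$P(B)$.
Solution
By independence, $P(A \cup B) = P(A) + P(B) - P(A)\,P(B)$.
$\frac{3}{4} = \frac{1}{3} + P(B) - \frac{1}{3} P(B)$.
$\frac{3}{4} - \frac{1}{3} = \frac{2}{3} P(B)$.
$\frac{5}{12} = \frac{2}{3} P(B) \implies P(B) = \frac{5}{8}$.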
Problem 4
A bag contains 4 red, 6 green, and 5 blue balls. Three balls are drawn without replacement. Find the
probability that they are all different colours.
Solution
There are 15 balls in total. The number of ways to draw one ball of each colour is:
$$
\binom{4}{1}\binom{6}{1}\binom{5}{1} = 120
$$
Total ways to draw 3 from 15 $= \binom{15}{3} = 455$.
$$
P = \frac{120}{455} = \frac{24}{91} \approx 0.264
$$
Alternatively, using conditional probability:
$$
P = \frac{4}{15} \times \frac{6}{14} \times \frac{5}{13} \times 6 = \frac{720}{2730} = \frac{24}{91}
$$
(The factor of 6 accounts for the $3! = 6$ orderings of the three colours.)
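The agreement between the counting argument and the sequential argument can be confirmed with exact arithmetic. A minimal sketch, using `math.comb` and `fractions.Fraction` with illustrative variable names:

```python
from fractions import Fraction
from math import comb

# Counting approach: choose one ball of each colour out of all 3-ball selections.
favourable = comb(4, 1) * comb(6, 1) * comb(5, 1)
total = comb(15, 3)
p_counting = Fraction(favourable, total)

# Sequential approach: one fixed colour order, times the 3! = 6 possible orders.
p_sequential = Fraction(4, 15) * Fraction(6, 14) * Fraction(5, 13) * 6

print(p_counting, p_sequential)  # 24/91 24/91
```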
Common Pitfalls
- Misreading the question, particularly with ‘hence’ versus ‘hence or otherwise’: the former requires
using previous work.
- Confusing the domain and range of functions, or not considering restrictions (e.g., a denominator
cannot be zero).
- Dropping negative signs during algebraic manipulation. Substitute back to verify your answer.
- Losing marks by not showing sufficient working. Always write out each step, especially in proof
questions.
Worked Examples
Example 1: Bayes’ Theorem
Problem. A disease affects 1% of the population. A test has 95% sensitivity
(P(+∣disease)=0.95) and 90% specificity (P(-∣no disease)=0.90).
Find the probability a person has the disease given a positive test.
Solution. By Bayes’ theorem:
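$$
\begin{aligned}
P(D \mid +) &= \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid D^c)\,P(D^c)} = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.10 \times 0.99} \\
&= \frac{0.0095}{0.0095 + 0.099} = \frac{0.0095}{0.1085} \approx 0.0876
\end{aligned}
$$
Here $P(+ \mid D^c) = 1 - 0.90 = 0.10$ is the complement of the specificity.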
Despite the positive test, there is only an 8.8% chance of having the disease, due to the low
prevalence.
■
Example 2: Conditional Probability and Tree Diagrams
Problem. Bag A contains 4 red and 6 blue balls. Bag B contains 7 red and 3 blue balls. A fair
die is rolled: if it shows 1 or 2, a ball is drawn from Bag A; otherwise from Bag B. Find the
probability the ball is red, and the probability Bag A was chosen given the ball is red.
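Solution. Let $A$ and $B$ denote the bag chosen and $R$ = “ball is red”. Then $P(A) = \frac{2}{6} = \frac{1}{3}$ and $P(B) = \frac{2}{3}$.
$P(R \mid A) = \frac{4}{10} = 0.4$, $P(R \mid B) = \frac{7}{10} = 0.7$.
$$
P(R) = P(R \mid A)\,P(A) + P(R \mid B)\,P(B) = 0.4 \times \frac{1}{3} + 0.7 \times \frac{2}{3} = \frac{0.4 + 1.4}{3} = \frac{1.8}{3} = 0.6
$$
$$
P(A \mid R) = \frac{P(R \mid A)\,P(A)}{P(R)} = \frac{0.4 \times \frac{1}{3}}{0.6} = \frac{0.1333}{0.6} \approx 0.222
$$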
■
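As an informal cross-check of Example 2, the two-stage experiment can be simulated. This is a rough Monte Carlo sketch, assuming a hypothetical function `simulate` with a fixed seed for repeatability; the estimates should land near 0.6 and 0.222 but will not match them exactly.

```python
import random


def simulate(trials: int, seed: int = 0):
    """Estimate P(R) and P(A | R) for the die-then-bag experiment."""
    rng = random.Random(seed)
    red = 0
    red_from_a = 0
    for _ in range(trials):
        use_bag_a = rng.randint(1, 6) <= 2  # die shows 1 or 2: draw from Bag A
        p_red = 0.4 if use_bag_a else 0.7   # 4/10 red in Bag A, 7/10 red in Bag B
        if rng.random() < p_red:
            red += 1
            red_from_a += use_bag_a         # counts red draws that came from Bag A
    return red / trials, red_from_a / red


p_red, p_a_given_red = simulate(200_000)
print(p_red, p_a_given_red)
```

Restricting attention to the trials that produced a red ball is the simulation counterpart of conditioning on R.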
Summary
- $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$; multiplication rule: $P(A \cap B) = P(A \mid B)\,P(B)$.
- Independence: $A$ and $B$ are independent iff $P(A \cap B) = P(A)\,P(B)$.
- Bayes’ theorem: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$ inverts conditional probabilities.
- Total probability: $P(B) = \sum_i P(B \mid A_i)\,P(A_i)$, where the $A_i$ partition the sample space.
- Tree diagrams organise multi-stage probability calculations systematically.
:::