Statistics

Statistics is the science of collecting, analysing, and drawing conclusions from data under uncertainty. A-Level Statistics covers data representation, probability theory, statistical distributions, and hypothesis testing — the foundations of data-driven decision making.

Topics Covered

Data Representation

  • Types of data — qualitative vs. quantitative, discrete vs. continuous, primary vs. secondary
  • Measures of central tendency — mean $\bar{x} = \frac{\sum x}{n}$, median, mode; when each is appropriate
  • Measures of spread — range, interquartile range (IQR), variance $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$, standard deviation
  • Visual representations — histograms (with varying class widths), cumulative frequency curves, box plots, stem-and-leaf diagrams
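  • Outliers — identified as values more than $1.5 \times \text{IQR}$ below the lower quartile or above the upper quartile, or more than $2\sigma$ from the mean (outside $\bar{x} \pm 2\sigma$); deciding whether to exclude them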
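
The measures and outlier rules above can be sketched in a few lines of Python. This is only an illustration: the data set is invented, and the quartiles use the "inclusive" interpolation method from Python's statistics module, which is an assumption here and may differ slightly from the convention your exam board uses.

```python
from statistics import mean, pstdev, pvariance, quantiles

# Hypothetical data set, invented purely for illustration.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 25]

x_bar = mean(data)        # sum(x) / n
var = pvariance(data)     # sum((x - x_bar)^2) / n  (divides by n, as in the notes)
sd = pstdev(data)         # square root of the variance

# Quartiles: the "inclusive" method is one of several conventions (assumption).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Rule 1: more than 1.5 x IQR below Q1 or above Q3.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers_iqr = [x for x in data if x < lower_fence or x > upper_fence]

# Rule 2: more than 2 standard deviations from the mean.
outliers_sd = [x for x in data if abs(x - x_bar) > 2 * sd]

print("mean:", x_bar, "variance:", var, "sd:", round(sd, 3))
print("Q1, median, Q3:", q1, q2, q3, "IQR:", iqr)
print("outliers (IQR rule):", outliers_iqr)
print("outliers (2 sd rule):", outliers_sd)
```

For this particular data set both rules flag the same value, but that will not always happen; an exam question normally states which rule to apply.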

Correlation and Regression

  • Scatter diagrams — visual assessment of correlation (positive, negative, none)
  • Pearson’s product-moment correlation coefficient: $r$ measures linear correlation; $-1 \leq r \leq 1$
  • Regression line: $y = a + bx$; least squares; interpreting $a$ (intercept) and $b$ (gradient) in context
  • Interpolation vs. extrapolation — reliability of predictions within and beyond the data range
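
As a rough sketch of how $r$ and the least-squares line $y = a + bx$ are calculated, the Python below works from the summary quantities $S_{xx}$, $S_{yy}$ and $S_{xy}$. Those names are the usual textbook notation rather than anything defined on this page, and the paired data are invented for illustration.

```python
from math import sqrt

# Hypothetical paired data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Summary quantities (standard textbook notation, assumed here).
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = s_xy / sqrt(s_xx * s_yy)   # product-moment correlation coefficient
b = s_xy / s_xx                # gradient of the least-squares line
a = y_bar - b * x_bar          # intercept: the line passes through (x_bar, y_bar)


def predict(x):
    """Predicted y from the fitted line y = a + b x."""
    return a + b * x


print("r =", round(r, 4))
print("regression line: y =", round(a, 4), "+", round(b, 4), "x")
print("interpolated prediction at x = 3.5:", round(predict(3.5), 4))
```

Calling predict with an x value inside the range 1 to 5 is interpolation; a value such as x = 20 is extrapolation, and the code will return a number even though the prediction may be unreliable.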

Probability

  • Axioms and the addition rule: $0 \leq P(A) \leq 1$, $P(\Omega) = 1$, and $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
  • Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$
  • Independence: $A$ and $B$ are independent if and only if $P(A \cap B) = P(A) \times P(B)$
  • Tree diagrams and Venn diagrams — systematic approaches to multi-stage probability problems
  • Mutually exclusive vs. independent — these are different concepts; mutually exclusive events cannot be independent (unless one has probability 0)
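
The rules above can be checked on a small sample space. The sketch below assumes a single roll of a fair six-sided die, with events A, B and C chosen only for illustration, and uses exact fractions so that the equalities can be tested without rounding error.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(event) when every outcome in omega is equally likely."""
    return Fraction(len(event & omega), len(omega))


A = {2, 4, 6}      # "even"
B = {1, 2, 3, 4}   # "at most 4"
C = {1, 3, 5}      # "odd", mutually exclusive with A

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
addition_rule_holds = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

# Independence test: does P(A and B) equal P(A) x P(B)?
independent_ab = prob(A & B) == prob(A) * prob(B)
independent_ac = prob(A & C) == prob(A) * prob(C)

print("addition rule holds:", addition_rule_holds)
print("P(A | B) =", p_a_given_b)
print("A, B independent:", independent_ab)
print("A, C independent:", independent_ac)
```

Here A and B turn out to be independent even though they overlap, while A and C are mutually exclusive and therefore not independent, which is the distinction drawn in the last bullet.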

Statistical Distributions

  • Binomial distribution: $X \sim \text{Bin}(n, p)$; $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$; conditions (fixed trials, two outcomes, constant probability, independence)
  • Normal distribution: $X \sim N(\mu, \sigma^2)$; bell curve, symmetry, the empirical rule (68-95-99.7%)
  • Standard normal: $Z = \frac{X - \mu}{\sigma}$; using tables for $P(Z < z)$
  • Approximations — normal approximation to binomial when $n$ is large and $p \approx 0.5$
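
To see how these distributions fit together, the sketch below computes a binomial probability exactly and then approximates it with a normal distribution of the same mean $np$ and variance $np(1-p)$. The values $n = 100$, $p = 0.5$ and the cut-off 55 are invented for illustration, and the continuity correction (using 55.5 rather than 55) is the usual adjustment for approximating a discrete variable, which this page does not state explicitly.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Hypothetical parameters: n large and p close to 0.5.
n, p = 100, 0.5
exact = binomial_cdf(55, n, p)

# Normal approximation with matching mean and standard deviation.
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(55.5)   # continuity correction (assumption)

# The same value via standardising: Z = (X - mu) / sigma, then P(Z < z).
z = (55.5 - mu) / sigma
via_z = NormalDist().cdf(z)

print("exact binomial P(X <= 55):", round(exact, 4))
print("normal approximation:     ", round(approx, 4))
print("z =", z, " P(Z < z) =", round(via_z, 4))
```

NormalDist().cdf(z) plays the role of the printed table for $P(Z < z)$, so it can also be used to check table look-ups.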

Hypothesis Testing

  • Null and alternative hypotheses: $H_0$ (no effect) vs. $H_1$ (effect exists); one-tailed vs. two-tailed
  • Test statistics — calculating from sample data and comparing to critical values
  • Significance level: typically $\alpha = 0.05$ or $0.01$; the probability of a Type I error (rejecting $H_0$ when it is true)
  • $p$-values: $P(\text{a result at least as extreme as the one observed} \mid H_0)$; reject $H_0$ if $p$-value $< \alpha$
  • Critical regions — the set of values of the test statistic that lead to rejecting $H_0$
  • Binomial hypothesis tests — testing a population proportion
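
A one-tailed binomial test can be sketched as follows. The scenario is hypothetical: $H_0: p = 0.5$ against $H_1: p > 0.5$, with $n = 20$ trials, 15 observed successes and $\alpha = 0.05$, none of which come from this page. The code finds the $p$-value and the critical region, using the convention that the upper-tail probability must not exceed $\alpha$.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def upper_tail(x, n, p):
    """P(X >= x) for X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(x, n + 1))


# Hypothetical test: H0: p = 0.5, H1: p > 0.5 (one-tailed, upper tail).
n, p0, alpha = 20, 0.5, 0.05
observed = 15

# p-value: probability under H0 of a result at least as extreme as the one observed.
p_value = upper_tail(observed, n, p0)
reject_h0 = p_value < alpha

# Critical region: X >= c, where c is the smallest value with P(X >= c) <= alpha.
critical_value = next(c for c in range(n + 1) if upper_tail(c, n, p0) <= alpha)
actual_significance = upper_tail(critical_value, n, p0)

print("p-value:", round(p_value, 4))
print("reject H0:", reject_h0)
print("critical region: X >=", critical_value)
print("actual significance level:", round(actual_significance, 4))
```

Both routes give the same decision: the observed value lies in the critical region exactly when the $p$-value is at most $\alpha$. A written answer would still need a conclusion in context, for example that there is sufficient evidence at the 5% level that the proportion is greater than 0.5.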

Study Tips

  1. Draw diagrams — Venn diagrams for probability, scatter plots for correlation, normal distribution sketches for every $Z$-score question.
  2. Show full working in hypothesis tests — state $H_0$ and $H_1$, calculate the test statistic, find the $p$-value or critical value, compare, and state the conclusion in context.
  3. Know when to use Binomial vs. Normal — Binomial for counting successes in fixed trials; Normal for continuous measurements with a bell-shaped distribution.
  4. Practise reading tables — normal distribution tables and binomial tables require careful reading. Check whether the table gives $P(Z < z)$ or $P(Z > z)$.
  5. Interpret in context — never just say “reject $H_0$”; say “there is sufficient evidence at the 5% significance level to suggest that the mean has increased.”

How to Use These Notes

Follow the sidebar order. Each page provides definitions, worked examples with full calculations, and exam-style problems. Start with data representation and probability before moving to distributions and hypothesis testing.