Mathematics (9709) Study Notes: Probability & Statistics 2 (Paper 6)

6.1 The Poisson Distribution

Welcome to the world of the Poisson distribution! This topic is absolutely essential for Paper 6. While the Binomial distribution deals with counts of success in a fixed number of trials, the Poisson distribution helps us model events that happen randomly over a specific time interval or space. Think of it as counting rare events!

Don't worry if this seems tricky at first. The key to mastering Poisson is understanding the conditions for its use and remembering the crucial approximations.


1. Understanding the Poisson Distribution Model

The Poisson distribution, written \(X \sim \text{Po}(\lambda)\), is a discrete probability distribution for a random variable \(X\) that counts the number of times an event occurs in a fixed interval (time, area, volume, etc.).

When is Poisson the right model? (The Conditions)

A random variable \(X\) can be modelled by a Poisson distribution if the following conditions are met:

  • Events occur independently: The occurrence of one event does not influence the chance of another occurring.
  • Events occur singly: Two events cannot happen at exactly the same instant (e.g., two cars are never treated as arriving at a junction at precisely the same moment).
  • The rate is constant: The average rate of occurrence must be uniform throughout the interval, so the mean number of events, \(\lambda\), is proportional to the size of the interval.

Key Term: The Mean Rate Parameter (\(\lambda\))

The symbol \(\lambda\) (pronounced "lambda") represents the average number of occurrences in the specified interval. If a question gives you a different interval, you must adjust \(\lambda\) proportionally.

Example: If the average number of calls to a switchboard is 4 per minute, then for a 5-minute interval, \(\lambda = 4 \times 5 = 20\).
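
The scaling in this example can be sketched in a couple of lines of Python. The function name `scaled_rate` is a hypothetical helper introduced here for illustration, and it assumes the rate really is constant over the whole interval.

```python
def scaled_rate(rate_per_unit, interval_length):
    """Mean number of events in an interval, assuming a constant rate.

    Hypothetical helper for illustration: lambda = rate x interval size.
    """
    return rate_per_unit * interval_length


# Switchboard example: 4 calls per minute over a 5-minute interval.
lam = scaled_rate(4, 5)
print(lam)  # 20
```

The same helper handles shorter intervals too, for example `scaled_rate(4, 0.5)` for a 30-second window.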

Key Takeaway (Section 1)

The Poisson distribution counts events that occur randomly, independently, singly, and at a constant mean rate in a fixed interval of space or time, governed by the mean \(\lambda\).


2. The Poisson Probability Formula

To calculate the probability of observing exactly \(r\) events, we use the formula provided in your List of Formulae (MF19):

If \(X \sim \text{Po}(\lambda)\), the probability of \(r\) occurrences is:

$$P(X=r) = \frac{e^{-\lambda} \lambda^r}{r!}$$

Where:

  • \(e\) is the base of the natural logarithm (approximately 2.718).
  • \(\lambda\) is the mean rate parameter.
  • \(r\) is the specific number of events we are interested in (\(r = 0, 1, 2, 3, \dots\)).
  • \(r!\) is \(r\) factorial.

Step-by-Step Calculation Example

Suppose the average number of mistakes on a page is \(\lambda = 1.5\). Find the probability of exactly 3 mistakes on a page, \(P(X=3)\).

  1. Identify parameters: \(\lambda = 1.5\), \(r = 3\).
  2. Substitute into the formula: $$P(X=3) = \frac{e^{-1.5} (1.5)^3}{3!}$$
  3. Calculate: $$P(X=3) = \frac{(0.22313) \times (3.375)}{6} \approx 0.1255$$
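
As a check on the arithmetic above, here is a minimal Python sketch of the formula. The function name `poisson_pmf` is an illustrative choice, not something from MF19; it uses only the standard `math` module.

```python
import math


def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam), straight from the formula e^(-lam) * lam^r / r!."""
    return math.exp(-lam) * lam**r / math.factorial(r)


# Mistakes-per-page example: lam = 1.5, r = 3.
p = poisson_pmf(3, 1.5)
print(round(p, 4))  # 0.1255
```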

Calculating Cumulative Probabilities

Since the Poisson distribution is discrete, remember:

  • \(P(X \le r) = P(X=0) + P(X=1) + \dots + P(X=r)\).
  • \(P(X > r) = 1 - P(X \le r)\).
  • \(P(X \ge r) = 1 - P(X \le r-1)\).

Memory Aid: If you need to find \(P(X \ge 5)\), you calculate \(1 - [P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)]\).
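
The three rules can be sketched in Python as follows. The value \(\lambda = 1.5\) is an assumption borrowed from the mistakes-per-page example, and `poisson_pmf` and `poisson_cdf` are illustrative names.

```python
import math


def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**r / math.factorial(r)


def poisson_cdf(r, lam):
    """P(X <= r): add up P(X = 0), P(X = 1), ..., P(X = r)."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))


lam = 1.5  # assumed value, for illustration only

p_at_most_2 = poisson_cdf(2, lam)        # P(X <= 2)
p_more_than_2 = 1 - poisson_cdf(2, lam)  # P(X > 2)  = 1 - P(X <= 2)
p_at_least_5 = 1 - poisson_cdf(4, lam)   # P(X >= 5) = 1 - P(X <= 4)

print(round(p_at_most_2, 4), round(p_more_than_2, 4), round(p_at_least_5, 4))
```

Note that `p_at_least_5` subtracts the sum up to 4, not 5, which is exactly the point of the memory aid.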

Key Takeaway (Section 2)

Use the formula \(P(X=r) = \frac{e^{-\lambda} \lambda^r}{r!}\) for exact probabilities and remember standard cumulative rules for 'less than or equal to' or 'greater than'.


3. Mean and Variance

One of the most elegant features of the Poisson distribution is the relationship between its mean and variance.

The Golden Rule of Poisson

If \(X \sim \text{Po}(\lambda)\), then:

$$E(X) = \lambda, \qquad \text{Var}(X) = \lambda$$

This means the mean and variance are exactly the same, both equal to the rate parameter \(\lambda\).

Did you know? This equality (\(E(X) = \text{Var}(X)\)) is sometimes used in practice to check whether the Poisson model is appropriate for a real-world dataset. If the observed mean and variance are very different, the model might not be suitable.
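
The equality of mean and variance can also be checked numerically from the probability formula. The sketch below is only a sanity check, not a proof: it truncates the infinite sums at 100 terms (an assumption that is safe for small \(\lambda\)), and the function names are illustrative.

```python
import math


def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**r / math.factorial(r)


def poisson_mean_and_variance(lam, terms=100):
    """Approximate E(X) and Var(X) by summing the first `terms` values of r."""
    probs = [poisson_pmf(r, lam) for r in range(terms)]
    mean = sum(r * p for r, p in enumerate(probs))
    second_moment = sum(r**2 * p for r, p in enumerate(probs))
    return mean, second_moment - mean**2  # Var(X) = E(X^2) - [E(X)]^2


mean, variance = poisson_mean_and_variance(2.5)
print(round(mean, 6), round(variance, 6))  # both should be very close to 2.5
```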

Key Takeaway (Section 3)

For Poisson, \(\text{Mean} = \text{Variance} = \lambda\).


4. The Poisson Approximation to the Binomial Distribution

Sometimes, a problem looks like a Binomial scenario, but \(n\) is so large that calculating the Binomial probability by hand becomes impractical (imagine evaluating terms involving \(\binom{1000}{3}\)!). This is where Poisson steps in as a useful approximation.

Recall: \(X \sim B(n, p)\) requires a fixed number of trials \(n\).

Conditions for Po \(\approx\) B

The Poisson distribution can be used to approximate the Binomial distribution \(B(n, p)\) when:

  1. \(n\) is large (The number of trials is big).
  2. \(p\) is small (The probability of success is low).

The syllabus suggests the following approximate rules for when this approximation is valid:

$$n > 50 \quad \text{and} \quad np < 5$$

How to set up the Poisson parameter \(\lambda\):

When approximating, the mean rate \(\lambda\) is calculated using the mean of the Binomial distribution:

$$\lambda = np$$

Example: A factory produces 500 items daily. The probability that an item is defective is 0.005. Let X be the number of defectives.

Binomial: \(X \sim B(500, 0.005)\). Here \(n=500\) (large) and \(p=0.005\) (small).

Poisson Approximation: Calculate \(\lambda = np = 500 \times 0.005 = 2.5\). We use \(X \sim \text{Po}(2.5)\).
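
To see how close the approximation is for this factory example, the sketch below prints the exact Binomial probabilities next to the Poisson ones for small \(r\). The function names are illustrative, and `math.comb` assumes Python 3.8 or later.

```python
import math


def binomial_pmf(r, n, p):
    """Exact P(X = r) for X ~ B(n, p)."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**r / math.factorial(r)


n, p = 500, 0.005
lam = n * p  # 2.5

# Side-by-side comparison: r, exact Binomial, Poisson approximation.
for r in range(6):
    print(r, round(binomial_pmf(r, n, p), 4), round(poisson_pmf(r, lam), 4))
```

The two columns should agree to about three decimal places, which is why the approximation is acceptable here.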

Common Mistake Alert!

Always check the conditions. If \(n\) is large but \(p\) is not small (e.g., \(p=0.9\)), or if \(n\) is small, the Poisson approximation is invalid. Instead, use the Normal approximation to the Binomial (if \(np > 5\) and \(nq > 5\), where \(q = 1 - p\)) or stick to the exact Binomial calculation.

Key Takeaway (Section 4)

If \(n\) is large and \(p\) is small (\(n > 50\) and \(np < 5\)), use the approximation \(B(n, p) \approx \text{Po}(\lambda)\), where \(\lambda = np\).


5. The Normal Approximation to the Poisson Distribution

Just as Poisson approximates Binomial when \(n\) is large and \(p\) is small, the Normal distribution can approximate Poisson when \(\lambda\) is large.

Conditions for N \(\approx\) Po

The Normal distribution, \(N(\mu, \sigma^2)\), can be used to approximate the Poisson distribution, \(X \sim \text{Po}(\lambda)\), when:

$$\lambda \text{ is large}$$

The syllabus suggests the following criterion:

$$\lambda > 15 \text{ (approximately)}$$

Setting up the Normal Parameters

Since the mean and variance of Poisson are both \(\lambda\):

$$\mu = \lambda, \qquad \sigma^2 = \lambda$$

Thus, we approximate \(X\) by a Normal variable \(Y\): $$Y \sim N(\lambda, \lambda)$$

The Continuity Correction (CC) - CRITICAL STEP!

The Poisson distribution is discrete (it deals with whole numbers: 0, 1, 2, ...), but the Normal distribution is continuous. When moving from discrete to continuous, we must apply a continuity correction (CC) by adjusting the boundary value by 0.5.

This is where students often lose marks! Remember to picture the block of probabilities.

| Discrete Probability | Continuous Approximation (Y) | Explanation |
|---|---|---|
| \(P(X=r)\) | \(P(r - 0.5 < Y < r + 0.5)\) | Taking the whole block centered at \(r\). |
| \(P(X \le r)\) | \(P(Y < r + 0.5)\) | Including the entire block up to \(r\). |
| \(P(X < r)\) | \(P(Y < r - 0.5)\) | Excluding the block at \(r\). |
| \(P(X \ge r)\) | \(P(Y > r - 0.5)\) | Including the block starting at \(r\). |
| \(P(X > r)\) | \(P(Y > r + 0.5)\) | Excluding the block at \(r\). |

Analogy: Imagine a bar chart (discrete counts). If you want \(P(X \le 5)\), you want all the bars up to and including the bar at 5. In continuous terms, the bar for 5 extends from 4.5 to 5.5, so you must integrate up to 5.5.
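
The five continuity-correction cases can be written as a small lookup. This is only a sketch: `continuity_bounds` is a hypothetical helper, and the relation strings (`"=="`, `"<="`, and so on) are a notation chosen here for illustration.

```python
def continuity_bounds(relation, r):
    """Return (lower, upper) limits on Y for the discrete event 'X relation r'.

    Each whole number r is treated as the block from r - 0.5 to r + 0.5.
    """
    inf = float("inf")
    bounds = {
        "==": (r - 0.5, r + 0.5),  # the whole block centered at r
        "<=": (-inf, r + 0.5),     # include the block at r
        "<": (-inf, r - 0.5),      # exclude the block at r
        ">=": (r - 0.5, inf),      # include the block at r
        ">": (r + 0.5, inf),       # exclude the block at r
    }
    return bounds[relation]


print(continuity_bounds("<=", 5))  # (-inf, 5.5)
print(continuity_bounds(">", 5))   # (5.5, inf)
```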

Step-by-Step Example (N \(\approx\) Po)

Let \(X \sim \text{Po}(18)\). Calculate \(P(X \le 20)\) using the Normal approximation.

  1. Check condition: \(\lambda = 18\). Since \(18 > 15\), the approximation is appropriate.
  2. Define Normal approximation parameters: \(\mu = 18\), \(\sigma^2 = 18\). \(\sigma = \sqrt{18} \approx 4.243\).
  3. Apply Continuity Correction: \(P(X \le 20) \rightarrow P(Y < 20.5)\).
  4. Standardise: Use \(Z = \frac{Y - \mu}{\sigma}\). $$Z = \frac{20.5 - 18}{\sqrt{18}} = \frac{2.5}{4.2426} \approx 0.589$$
  5. Use Normal tables (MF19): \(P(Z < 0.589)\).
    From the tables, \(\Phi(0.589) \approx 0.7221\).
    So, \(P(X \le 20) \approx 0.722\).
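
The steps above can be reproduced in Python and compared with the exact Poisson sum. In this sketch the Normal tables are replaced by the standard identity \(\Phi(z) = \tfrac{1}{2}\left[1 + \text{erf}(z/\sqrt{2})\right]\) using `math.erf`; the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard Normal CDF, Phi(z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def poisson_cdf(r, lam):
    """Exact P(X <= r) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))


lam = 18
z = (20.5 - lam) / math.sqrt(lam)  # continuity correction: 20 -> 20.5
approx = normal_cdf(z)             # Normal approximation to P(X <= 20)
exact = poisson_cdf(20, lam)       # exact Poisson value, for comparison

print(round(z, 3))       # 0.589
print(round(approx, 3))  # 0.722
print(round(exact, 3))
```

The approximate and exact values differ by roughly 0.01, which gives a feel for the accuracy of the method when \(\lambda\) is only a little above 15.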

Key Takeaway (Section 5)

If \(\lambda\) is large (\(\lambda > 15\)), use the approximation \(X \sim N(\lambda, \lambda)\) and NEVER forget the continuity correction when shifting from the discrete \(X\) to the continuous \(Y\).


Quick Revision Summary: Choosing the Right Distribution

When solving a probability problem involving counts, use this checklist:

  • Exact Poisson: If there is no fixed number of trials \(n\), and events occur randomly at a constant mean rate in an interval. Use \(P(X=r) = \frac{e^{-\lambda} \lambda^r}{r!}\).
  • Poisson Approximation: If \(n\) is known and large (\(n>50\)) and \(p\) is small (\(np < 5\)). Use \(\lambda = np\).
  • Normal Approximation to Poisson: If \(\lambda\) is large (\(\lambda > 15\)). Use \(N(\lambda, \lambda)\) and Continuity Correction.
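
The checklist can be expressed as a rough decision function. This is a sketch only: `choose_model` and its return labels are hypothetical, and the thresholds are the approximate guidelines quoted in these notes rather than strict rules.

```python
def choose_model(n=None, p=None, lam=None):
    """Suggest a method from the checklist (hypothetical helper).

    Give n and p for a Binomial setting, or lam for a Poisson setting.
    """
    if n is not None and p is not None:
        if n > 50 and n * p < 5:
            return "poisson_approximation"  # use Po(np)
        return "binomial_or_normal"         # exact Binomial, or Normal if np > 5 and nq > 5
    if lam is not None:
        if lam > 15:
            return "normal_approximation"   # use N(lam, lam) with continuity correction
        return "exact_poisson"              # use the Poisson formula directly
    raise ValueError("Provide either n and p, or lam.")


print(choose_model(n=500, p=0.005))  # poisson_approximation
print(choose_model(lam=18))          # normal_approximation
print(choose_model(lam=1.5))         # exact_poisson
```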

Debugging Corner: Common Mistakes to Avoid

1. Forgetting to Scale \(\lambda\): If the mean is given per hour, but the question asks about a 30-minute period, halve your \(\lambda\)!

2. Incorrect Continuity Correction: Students often confuse \(P(X < r)\) with \(P(X \le r)\) when applying the correction. Always use \(\pm 0.5\) correctly based on whether the boundary point is included or excluded.

3. Misusing Mean/Variance: Remember that for Poisson (and its Normal approximation), the mean and the variance are equal: \(\mu = \sigma^2 = \lambda\).