Hello there! Welcome to The Binomial and Poisson Distributions

Welcome to Unit S2! This chapter is incredibly important because it moves us from basic probability into the powerful world of discrete probability distributions. Don't worry if this sounds intimidating; we are essentially learning specialized tools (formulas and tables) that allow us to predict the likelihood of certain events happening in the real world—like how many successful penalty kicks a football player might score, or how many emails you receive in an hour.

We will focus on two key models: the Binomial Distribution and the Poisson Distribution. Mastering these will give you a fundamental understanding of how statistics models randomness!

Section 1: The Binomial Distribution (Counting Successes)

What is a Binomial Distribution?

Imagine you are repeating the same action over and over again, and each time, there are only two possible results: Success or Failure. If you have a fixed number of tries, the Binomial Distribution helps you find the probability of getting a specific number of successes.

We write this mathematically as: \(X \sim B(n, p)\).
Here, \(X\) is the random variable (the number of successes we are counting).
\(n\) is the total number of trials (fixed).
\(p\) is the probability of success in a single trial (must be constant).

The Four Key Conditions for Binomial (The BINS Check)

A random variable \(X\) can only be modelled by a Binomial distribution if all four of these conditions are met. Use the mnemonic BINS to remember them:

  1. Binary outcomes: Each trial must have only two outcomes (Success or Failure).
  2. Independent trials: The result of one trial must not affect the result of any other trial.
  3. Number of trials is fixed: The value of \(n\) must be determined before the experiment starts.
  4. Same probability: The probability of success (\(p\)) must be constant for every trial.

Example: Flipping a coin 10 times and counting the number of heads. \(n=10\), \(p=0.5\). This fits BINS perfectly!

Calculating Binomial Probabilities: The Formula

The probability of getting exactly \(x\) successes out of \(n\) trials is given by:

\(P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}\)

Where:

  • \(\binom{n}{x}\) (often written as \({}^n C_x\)) means "n choose x". This counts the number of different ways \(x\) successes can happen in \(n\) trials.
  • \(p^x\) is the probability that \(x\) particular trials are all successes.
  • \((1-p)^{n-x}\) is the probability that the remaining \((n-x)\) trials are all failures.
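
If it helps to see the formula as code, here is a minimal sketch in Python. The function name `binomial_pmf` is just an illustrative choice, and it uses only the standard library's `math.comb` for "n choose x".

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p), straight from the formula."""
    if not 0 <= x <= n:
        return 0.0  # impossible number of successes
    # (ways to place x successes) * (x successes) * (n - x failures)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Coin example: 10 flips of a fair coin, exactly 5 heads.
print(f"{binomial_pmf(5, 10, 0.5):.4f}")  # 0.2461
```

Adding up `binomial_pmf(x, n, p)` for every \(x\) from 0 to \(n\) should give 1, which is a handy sanity check.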

Tip for Struggling Students: Understanding \(\binom{n}{x}\)

Imagine you flip a coin 3 times (\(n=3\)). You want exactly 2 heads (\(x=2\)). The possibilities are: HHT, HTH, THH. There are 3 ways.
The formula \(\binom{3}{2}\) simply calculates this count (3).
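
The same count can be checked by brute force. This short Python sketch (an illustration only, not something you need in the exam) lists every sequence of 3 flips and keeps the ones with exactly 2 heads.

```python
from itertools import product
from math import comb

# Every possible sequence of 3 flips: HHH, HHT, ..., TTT (8 in total).
sequences = ["".join(flips) for flips in product("HT", repeat=3)]

# Keep only the sequences containing exactly 2 heads.
two_heads = [s for s in sequences if s.count("H") == 2]

print(two_heads)                     # ['HHT', 'HTH', 'THH']
print(len(two_heads) == comb(3, 2))  # True
```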

Using Binomial Tables (Cumulative Probabilities)

For Edexcel exams, you often use statistical tables, which list Cumulative Probabilities:

\(P(X \le x)\) = the probability of getting \(x\) successes or fewer.

You must be extremely careful with inequalities when using the tables:

  • \(P(X < 5)\) is the same as \(P(X \le 4)\). (You look up the value for 4).
  • \(P(X \ge 3)\) must be calculated as: \(1 - P(X \le 2)\).
  • \(P(3 \le X \le 7)\) must be calculated as: \(P(X \le 7) - P(X \le 2)\). (Subtract the probabilities up to the one *before* the start).
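
The three conversions above can be sketched in Python as follows. The helper `binomial_cdf` plays the role of the table (it returns \(P(X \le x)\)), and the values \(n = 10\), \(p = 0.5\) are chosen purely for illustration.

```python
from math import comb

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p): the value a cumulative table lists."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(min(x, n) + 1))

n, p = 10, 0.5  # illustrative values only

p_less_than_5 = binomial_cdf(4, n, p)                      # P(X < 5) = P(X <= 4)
p_at_least_3 = 1 - binomial_cdf(2, n, p)                   # P(X >= 3) = 1 - P(X <= 2)
p_3_to_7 = binomial_cdf(7, n, p) - binomial_cdf(2, n, p)   # P(3 <= X <= 7)

print(f"{p_less_than_5:.4f}")  # 0.3770
print(f"{p_at_least_3:.4f}")   # 0.9453
print(f"{p_3_to_7:.4f}")       # 0.8906
```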

Parameters of the Binomial Distribution (Mean and Variance)

While you could calculate the Mean (Expected Value) and Variance using standard discrete random variable formulas, the Binomial distribution has simple shortcuts:

Expected Value (Mean):
\(E(X) = \mu = np\)

Variance:
\(Var(X) = \sigma^2 = np(1-p)\)

Example: Suppose 20% of packages are delayed (\(p=0.2\)) and you send 50 packages (\(n=50\)).
Expected number of delayed packages: \(E(X) = 50 \times 0.2 = 10\).
Variance of the number of delayed packages: \(Var(X) = 50 \times 0.2 \times 0.8 = 8\).
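
To see that the shortcuts agree with the standard discrete random variable formulas, here is a rough Python check for the package example. It builds the whole probability distribution and then computes \(\sum x P(X=x)\) and \(\sum x^2 P(X=x) - \mu^2\) directly; the variable names are illustrative.

```python
from math import comb

n, p = 50, 0.2  # package example: 50 packages, 20% delayed

# Full probability distribution P(X = 0), P(X = 1), ..., P(X = 50).
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Standard formulas for any discrete random variable.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum(x**2 * px for x, px in enumerate(pmf)) - mean**2

print(mean, "vs shortcut", n * p)
print(variance, "vs shortcut", n * p * (1 - p))
```

Both pairs should agree up to tiny floating-point rounding.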

Key Takeaway for Binomial:

The Binomial distribution models a fixed number of independent trials with two outcomes. Remember the BINS conditions and be precise when using cumulative tables!


Section 2: The Poisson Distribution (Events in Time or Space)

What is a Poisson Distribution?

The Poisson distribution is used to model the number of events that occur randomly and independently in a fixed interval of time or space.

Examples include: the number of typos on a page, the number of customers arriving at a checkout queue per hour, or the number of accidents at a specific intersection per month.

We write this mathematically as: \(X \sim Po(\lambda)\).
Here, \(X\) is the random variable (the number of events we are counting).
\(\lambda\) (Lambda) is the average rate of occurrence in the given interval. It is the mean number of events.

The Conditions for Poisson

For \(X\) to be modelled by a Poisson distribution, the following assumptions must hold:

  • Events occur singly (one at a time, not in clusters).
  • Events occur randomly and independently of each other.
  • Events occur at a constant rate (uniform rate) over the interval.

Did you know? The Poisson distribution is named after the French mathematician Siméon Denis Poisson (1781–1840).

The Poisson Rate (\(\lambda\)): Scaling is Key!

The value of \(\lambda\) must match the interval you are interested in. If you change the time period, you must adjust \(\lambda\)!

Example: If a shop receives an average of 4 customers every hour, then for a 1-hour interval, \(\lambda = 4\).

  • For a 2-hour interval, the average rate would double: \(\lambda = 4 \times 2 = 8\).
  • For a 30-minute interval (half an hour): \(\lambda = 4 \times 0.5 = 2\).

Common Mistake: Forgetting to scale \(\lambda\) when the time or space interval changes!
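
As a tiny illustration, the scaling is just a multiplication. The function name `scale_rate` below is made up for this sketch.

```python
def scale_rate(rate_per_unit: float, interval_length: float) -> float:
    """Mean number of events in an interval of the given length.

    rate_per_unit   -- average number of events in ONE unit (e.g. per hour)
    interval_length -- length of the new interval, in the same units
    """
    return rate_per_unit * interval_length

# Shop example: 4 customers per hour on average.
print(scale_rate(4, 2))    # 8    (2-hour interval)
print(scale_rate(4, 0.5))  # 2.0  (30-minute interval)
```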

Calculating Poisson Probabilities: The Formula

The probability of getting exactly \(x\) events when the mean rate is \(\lambda\) is:

\(P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!}\)

Where:

  • \(e\) is Euler's number (approx 2.718...).
  • \(x!\) is \(x\) factorial (\(x \times (x-1) \times \dots \times 1\)), with \(0! = 1\).
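
Here is the formula as a short Python sketch, using only the standard library. The name `poisson_pmf` and the example values (\(\lambda = 4\), \(x = 3\)) are illustrative choices.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam), straight from the formula."""
    if x < 0:
        return 0.0  # a count of events cannot be negative
    return exp(-lam) * lam**x / factorial(x)

# Shop example: on average 4 customers per hour; exactly 3 in one hour.
print(f"{poisson_pmf(3, 4):.4f}")  # 0.1954
```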

Parameters of the Poisson Distribution (The Great Equality)

The Poisson distribution has a very neat characteristic that simplifies calculations: its Mean (Expected Value) is always equal to its Variance.

Expected Value (Mean):
\(E(X) = \lambda\)

Variance:
\(Var(X) = \lambda\)

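This equality (\(E(X) = Var(X)\)) is a key feature used to check whether a real-world data set might be modelled by Poisson: if the sample mean and sample variance are close, a Poisson model may be suitable; if they are very different, it probably is not.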
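
A quick numerical check of the equality, sketched in Python. Because a Poisson variable has no upper limit, the sums are cut off after a fixed number of terms (the `terms` parameter, an assumption of this sketch); the leftover tail is negligible for moderate \(\lambda\).

```python
from math import exp

def poisson_mean_and_variance(lam: float, terms: int = 200) -> tuple[float, float]:
    """Approximate E(X) and Var(X) for X ~ Po(lam) by direct summation."""
    prob = exp(-lam)  # P(X = 0)
    mean = 0.0
    second_moment = 0.0
    for x in range(1, terms):
        prob *= lam / x  # P(X = x) obtained from P(X = x - 1)
        mean += x * prob
        second_moment += x * x * prob
    return mean, second_moment - mean**2

mean, variance = poisson_mean_and_variance(4)
print(mean, variance)  # both should be very close to 4
```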

Using Poisson Tables

Just like the Binomial, Poisson tables provide Cumulative Probabilities \(P(X \le x)\). The same rules for using inequalities apply here:

  • To find \(P(X > 5)\), calculate \(1 - P(X \le 5)\).
  • To find \(P(X = 4)\), calculate \(P(X \le 4) - P(X \le 3)\).
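
The two lookups above can be sketched in Python like this. The helper `poisson_cdf` stands in for the table, and \(\lambda = 5\) is an illustrative value.

```python
from math import exp, factorial

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for X ~ Po(lam): the value a cumulative table lists."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

lam = 5  # illustrative value only

p_more_than_5 = 1 - poisson_cdf(5, lam)                  # P(X > 5) = 1 - P(X <= 5)
p_exactly_4 = poisson_cdf(4, lam) - poisson_cdf(3, lam)  # P(X = 4)

print(f"{p_more_than_5:.4f}")  # 0.3840
print(f"{p_exactly_4:.4f}")    # 0.1755
```
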

Key Takeaway for Poisson:

Poisson models events occurring randomly over an interval. The core parameter is the average rate, \(\lambda\). Remember to adjust \(\lambda\) if the interval changes, and recall that \(Mean = Variance = \lambda\).


Section 3: Connecting the Distributions

Poisson Approximation to the Binomial Distribution

In the early days of statistics (before powerful computers), calculating a Binomial probability when \(n\) was very large was incredibly difficult. Mathematicians found that under certain conditions, the Poisson distribution provides an excellent, much simpler approximation for the Binomial.

When can we use the Approximation?

We approximate \(B(n, p)\) using \(Po(\lambda)\) ONLY when the following two conditions are met:

  1. \(n\) is large: Generally, \(n > 50\).
  2. \(p\) is small: Generally, \(p < 0.1\).

Think of this as modelling rare events (small \(p\)) over many trials (large \(n\)). For example, counting how many people in a very large population (\(n\) huge) catch a rare disease (\(p\) tiny).

The Conversion Rule

If the conditions are met, we approximate \(B(n, p)\) by \(Po(\lambda)\) where:

\(\lambda = np\)

We simply use the expected value of the Binomial distribution as the mean rate for the Poisson distribution.

Step-by-Step Example of Approximation

A company produces light bulbs, and the probability of a bulb being defective is 0.005. If a batch contains 1000 bulbs, estimate the probability that exactly 3 are defective.

Step 1: Check the Binomial parameters.
\(n = 1000\) (Large)
\(p = 0.005\) (Small)
Conclusion: Approximation is suitable.

Step 2: Calculate \(\lambda\).
\(\lambda = np = 1000 \times 0.005 = 5\)

Step 3: Define the Poisson approximation.
\(X \sim Po(5)\)

Step 4: Calculate the required probability using Poisson.
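We want \(P(X=3)\) with \(\lambda=5\). Using the formula: \(P(X=3) = \frac{e^{-5} \times 5^3}{3!} \approx 0.1404\).
Using tables instead: \(P(X=3) = P(X \le 3) - P(X \le 2) = 0.2650 - 0.1247 = 0.1403\) (the small difference is due to rounding in the tables).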
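
To get a feel for how good the approximation is, this Python sketch compares the exact Binomial probability with the Poisson approximation for the light bulb example. The variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 1000, 0.005, 3  # light bulb example
lam = n * p               # conversion rule: lambda = np

# Exact answer from B(1000, 0.005).
exact = comb(n, x) * p**x * (1 - p) ** (n - x)

# Approximate answer from Po(5).
approx = exp(-lam) * lam**x / factorial(x)

print(f"Binomial: {exact:.4f}")   # Binomial: 0.1403
print(f"Poisson:  {approx:.4f}")  # Poisson:  0.1404
```

The two values agree to about three decimal places, which is typical when \(n\) is this large and \(p\) this small.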

Why does this work? (Briefly)

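When \(n\) is large and \(p\) is small, successes are rare and spread thinly across many independent trials, so they behave like events occurring singly, independently and at a constant average rate of \(np\) per set of \(n\) trials. These are exactly the conditions required for the Poisson distribution. More formally, if \(np = \lambda\) is held fixed while \(n\) grows, the Binomial probabilities get closer and closer to the Poisson probabilities.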
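
For the curious, here is a brief sketch of the limit (not needed for the exam). It assumes \(\lambda = np\) is held fixed while \(n \to \infty\), so that \(p = \frac{\lambda}{n}\). Substituting into the Binomial formula and regrouping:

\(P(X=x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n(n-1)\cdots(n-x+1)}{n^x} \times \frac{\lambda^x}{x!} \times \left(1-\frac{\lambda}{n}\right)^{n} \times \left(1-\frac{\lambda}{n}\right)^{-x}\)

As \(n \to \infty\) with \(x\) fixed, the first factor tends to 1, \(\left(1-\frac{\lambda}{n}\right)^{n}\) tends to \(e^{-\lambda}\), and the last factor tends to 1, leaving:

\(P(X=x) \to \frac{e^{-\lambda} \lambda^x}{x!}\)

This is the Poisson formula with mean \(\lambda = np\).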

Quick Review: Key Parameters

Binomial, \(B(n, p)\):
\(E(X) = np\)
\(Var(X) = np(1-p)\)

Poisson, \(Po(\lambda)\):
\(E(X) = \lambda\)
\(Var(X) = \lambda\)

Approximation: Requires large \(n\) and small \(p\). Use \(\lambda = np\).

You’ve done a great job getting through this section! By understanding the conditions and the parameters for the Binomial and Poisson distributions, you are fully equipped to tackle complex probability problems in Statistics 2. Keep practicing those table lookups—that is where most errors occur!