M1 Study Notes: Discrete Random Variables
Hello! Welcome to your study notes for one of the most interesting topics in M1 Statistics: Discrete Random Variables. Don't worry if the name sounds complicated. We're going to break it all down into simple, easy-to-understand pieces.
In this chapter, we'll learn how to attach numbers to the outcomes of random events (like rolling a die or flipping a coin) and then explore what we can predict about these numbers. This is a super important idea that's used in everything from playing games and insurance to quality control in factories. Let's get started!
1. What are Discrete Random Variables?
Let's start with the basics: What's a Random Variable?
Imagine you roll a standard six-sided die. The outcome is random, right? A random variable, which we usually call X, is simply a way to assign a number to each possible outcome.
Example: If we roll a die, we can let the random variable X be the number that faces up. So, the possible values for X are 1, 2, 3, 4, 5, and 6.
Think of X as a placeholder for a numerical result before the event actually happens.
The key idea: 'Discrete'
The word 'discrete' means the variable can only take on specific, separate values that you can count. There are gaps between the values. You can list all the possible outcomes.
- Good Example (Discrete): The number of students in a classroom. You can have 30 students or 31 students, but you can't have 30.5 students. You can count them.
- Good Example (Discrete): The score you get when you roll two dice. The possible values are 2, 3, 4, ..., 12. There are gaps between these values.
- Non-Example (Continuous): The height of a student. Someone could be 175cm, 175.1cm, or even 175.11cm tall. You can't list all the possible heights because the values are on a continuous scale. (Continuous variables are covered in a later chapter!)
Key Takeaway
A Discrete Random Variable is a variable whose value is a number determined by the outcome of a random experiment, and its possible values are countable.
2. Probability Distributions: The Rulebook for Random Variables
What is a Probability Distribution?
A probability distribution is a table, graph, or formula that tells us the probability for each possible value of our discrete random variable X. It's like a complete summary of the random variable's behaviour.
Analogy: Think of it like a menu. The possible values of X are the dishes, and the probabilities are their prices. The probability distribution lists every dish and its price.
The Two Golden Rules of Discrete Probability Distributions
For any valid probability distribution, two rules MUST be true:
- The probability of each value x must be between 0 and 1 (inclusive).
$$0 \leq P(X=x) \leq 1$$
- The sum of all the probabilities for all possible values must be exactly 1.
$$\sum P(X=x) = 1$$
This makes sense, right? It means that one of the possible outcomes *must* happen!
How to Show a Probability Distribution
There are a few ways, but the most common for us is a table.
Step-by-Step Example
Let's create a probability distribution for a biased coin, where the probability of getting a Head is 0.6. Let the random variable X be the number of heads in a single flip.
Step 1: Identify the possible values of X.
If you flip a coin once, you can either get 0 heads (a tail) or 1 head. So, the values for X are 0 and 1.
Step 2: Find the probability for each value.
- P(X=1) = P(Head) = 0.6
- P(X=0) = P(Tail) = 1 - P(Head) = 1 - 0.6 = 0.4
Step 3: Put it in a table.
x | 0 | 1
--- | --- | ---
P(X = x) | 0.4 | 0.6
Step 4: Check the two golden rules!
Are all probabilities between 0 and 1? Yes. Do they sum to 1? Yes, 0.4 + 0.6 = 1. So this is a valid probability distribution!
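The two golden rules are easy to check by hand here, but for larger tables it helps to see the check written out. Here is a minimal Python sketch (the notes themselves don't use code) that verifies the biased-coin distribution from the example:

```python
# Biased-coin distribution from the example: X = number of heads in one flip.
dist = {0: 0.4, 1: 0.6}

# Golden rule 1: every probability lies between 0 and 1 (inclusive).
rule1 = all(0 <= p <= 1 for p in dist.values())

# Golden rule 2: the probabilities sum to 1 (allowing for float rounding).
rule2 = abs(sum(dist.values()) - 1) < 1e-9

print(rule1, rule2)  # a valid distribution passes both checks
```

The same two-line check works for any distribution table you build.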
Key Takeaway
A probability distribution links each possible numerical outcome of a random event with its probability of occurring. The total probability must always sum to 1.
3. Expected Value: What to Expect on Average
What is Expected Value, E(X)?
The Expected Value of a random variable, written as E(X), is the long-term average value you would expect if you repeated the experiment many, many times. It's the mean of the probability distribution.
Analogy: Imagine a game where you roll a die and win that many dollars. Sometimes you'll win $1, sometimes $6. If you played this game thousands of times, what would your average winning per game be? That's the expected value.
Important: The expected value might not be a value that X can actually take! For a fair die roll, E(X) = 3.5, but you can never roll a 3.5.
The Formula for Expected Value
To calculate E(X), you multiply each possible value x by its probability P(X=x), and then add them all up.
$$E(X) = \sum x \cdot P(X=x)$$
Step-by-Step Example
Let's say a game has the following probability distribution for winnings (X).
x ($) | 0 | 10 | 50
--- | --- | --- | ---
P(X = x) | 0.7 | 0.2 | 0.1
Calculation:
E(X) = (0 × 0.7) + (10 × 0.2) + (50 × 0.1)
E(X) = 0 + 2 + 5
E(X) = 7
So, on average, you would expect to win $7 per game if you played many times.
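The "multiply each value by its probability, then add" recipe translates directly into one line of Python. This is a small sketch using the winnings table above:

```python
# Winnings game from the example: value -> probability.
dist = {0: 0.7, 10: 0.2, 50: 0.1}

# E(X) = sum of x * P(X = x) over all values.
expected = sum(x * p for x, p in dist.items())

print(expected)  # 7, matching the hand calculation (up to float rounding)
```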
Properties of Expected Value
These are super useful shortcuts!
For any constants a and b: $$E(aX + b) = aE(X) + b$$
In simple terms: If you take your random outcome X, multiply it by a, and then add b, the new expected value is just the old expected value multiplied by a, plus b.
Example: In the game above, E(X) = 7. If the game host decides to double your winnings and add a $5 bonus, the new winning is Y = 2X + 5. The new expected winning is E(Y) = 2E(X) + 5 = 2(7) + 5 = $19.
Expectation of a Function, E[g(X)]
Sometimes we need the expectation of a function of X, like E(X²). The rule is similar: apply the function to each x value, then multiply by the probability and sum it up.
$$E[g(X)] = \sum g(x) \cdot P(X=x)$$For E(X²), this becomes: $$E(X^2) = \sum x^2 \cdot P(X=x)$$
Using our game example:
E(X²) = (0² × 0.7) + (10² × 0.2) + (50² × 0.1)
E(X²) = (0 × 0.7) + (100 × 0.2) + (2500 × 0.1)
E(X²) = 0 + 20 + 250 = 270. (We'll need this for variance!)
Key Takeaway
Expected Value E(X) is the weighted average of the outcomes of a random variable. It tells you the long-run average. Calculate it using $$E(X) = \sum x \cdot P(X=x)$$.
4. Variance: Measuring the Spread
What is Variance, Var(X)?
Variance, written as Var(X), measures how spread out the values of a random variable are from its expected value (the mean).
- A small variance means the outcomes are usually very close to the expected value. (Consistent, predictable)
- A large variance means the outcomes are very spread out. (Inconsistent, risky)
Analogy: Two basketball players might both have an expected value of 20 points per game. Player A scores 19, 20, 21, 20 (low variance). Player B scores 40, 0, 30, 10 (high variance). They have the same average, but Player B is far less predictable.
The Best Formula for Variance (The Shortcut!)
While the definition of variance is $$Var(X) = E[(X - E(X))^2]$$, it's difficult to calculate. We use a much easier formula:
$$Var(X) = E(X^2) - [E(X)]^2$$Be careful! Notice the difference between `E(X²) ` (the average of the squared values) and `[E(X)]²` (the square of the average value). This is a common place to make a mistake!
Step-by-Step Example (using our game from before)
We already found that E(X) = 7 and E(X²) = 270.
Step 1: Plug the values into the formula.
Var(X) = E(X²) - [E(X)]²
Var(X) = 270 - (7)²
Var(X) = 270 - 49
Var(X) = 221
This number tells us the data is quite spread out.
What about Standard Deviation?
The standard deviation is simply the square root of the variance. Its symbol is σ (sigma) or SD(X).
$$SD(X) = \sqrt{Var(X)}$$The main advantage is that its units are the same as the random variable X, making it a bit easier to interpret the spread.
Properties of Variance
Just like with expectation, we have some handy shortcuts!
For any constants a and b: $$Var(aX + b) = a^2 Var(X)$$
In simple terms:
- Adding a constant b just shifts the whole distribution; it doesn't change the spread. That's why b disappears from the formula.
- Multiplying by a constant a stretches the distribution, increasing the spread by a factor of a².
Example: For our game, Var(X) = 221. If the host doubles winnings and adds a $5 bonus (Y = 2X + 5), the new variance is Var(Y) = 2² Var(X) = 4 × 221 = 884. The spread has become much larger!
Key Takeaway
Variance Var(X) measures the spread of the data around the mean. Use the shortcut formula $$Var(X) = E(X^2) - [E(X)]^2$$ to calculate it.
5. The Binomial Distribution
The Binomial and Poisson distributions are specific, common types of discrete distributions. Knowing when to use them is key!
When to Use the Binomial Distribution
The Binomial distribution is your tool when an experiment consists of a fixed number of independent trials, each having only two possible outcomes.
First, a quick definition: A Bernoulli trial is a single experiment with only two outcomes, typically called 'success' and 'failure'. (e.g., one coin flip).
An experiment follows a Binomial distribution if it meets these four conditions. A good mnemonic is B.I.N.S.:
- B - Binary: There are only two possible outcomes for each trial ('success' or 'failure').
- I - Independent: The outcome of one trial does not affect the outcome of another.
- N - Number: There is a fixed number of trials, which we call n.
- S - Success: The probability of success, which we call p, is the same for every trial.
If these conditions are met, and X is the random variable for the number of successes, we write: $$X \sim B(n, p)$$ which reads "X follows a Binomial distribution with n trials and probability of success p."
Classic Example: Tossing a fair coin 10 times and counting the number of heads. Here, n=10, p=0.5.
The Binomial Probability Formula
The probability of getting exactly k successes in n trials is:
$$P(X=k) = C_k^n p^k (1-p)^{n-k}$$Where:
- $$C_k^n$$ is the number of ways to choose k successes from n trials.
- $$p^k$$ is the probability of k successes.
- $$(1-p)^{n-k}$$ is the probability of the remaining n-k trials being failures.
Step-by-Step Example
A student takes a 5-question multiple-choice test. Each question has 4 options, and the student guesses every answer. What is the probability they get exactly 3 questions right?
Step 1: Check if it's Binomial (B.I.N.S.).
- Binary: Yes, each question is either 'right' (success) or 'wrong' (failure).
- Independent: Yes, guessing one question doesn't affect another.
- Number: Yes, there's a fixed number of trials, n = 5.
- Success: Yes, the probability of guessing right is 1/4 for each question, so p = 0.25.
It's Binomial! So, $$X \sim B(5, 0.25)$$. We want to find P(X=3).
Step 2: Use the formula with n=5, p=0.25, k=3.
$$P(X=3) = C_3^5 (0.25)^3 (1-0.25)^{5-3}$$ $$P(X=3) = 10 \cdot (0.25)^3 (0.75)^2$$ $$P(X=3) = 10 \cdot (0.015625) \cdot (0.5625)$$ $$P(X=3) \approx 0.0879$$So there's about an 8.8% chance of guessing exactly 3 questions correctly.
Mean and Variance of a Binomial Distribution
Luckily, we don't need the big formulas for E(X) and Var(X). For a Binomial distribution, there are simple shortcuts (you don't need to know the proofs!):
- Mean (Expected Value): $$E(X) = np$$
- Variance: $$Var(X) = np(1-p)$$
For our test-guessing example:
Expected number of correct answers: E(X) = 5 × 0.25 = 1.25.
Variance: Var(X) = 5 × 0.25 × (0.75) = 0.9375.
Key Takeaway
Use the Binomial distribution for a fixed number of independent trials with two outcomes. Remember B.I.N.S. and the key formulas: $$P(X=k) = C_k^n p^k (1-p)^{n-k}$$, $$E(X) = np$$, and $$Var(X) = np(1-p)$$.
6. The Poisson Distribution
When to Use the Poisson Distribution
The Poisson distribution is different. It describes the number of times an event occurs in a fixed interval of time or space, when you know the average rate of occurrence.
Use Poisson when:
- Events are random and independent of each other.
- We are counting the number of occurrences over an interval (e.g., time, area, distance).
- The only thing we know is the average rate of occurrence, which we call λ (lambda).
If these conditions are met, and X is the random variable for the number of events in that interval, we write: $$X \sim Po(\lambda)$$ which reads "X follows a Poisson distribution with an average rate of λ."
Classic Examples: The number of emails arriving in your inbox in one hour; the number of flaws in a 10-metre roll of fabric; the number of calls to a help centre in one minute.
The Poisson Probability Formula
The probability of observing exactly k events in an interval is:
$$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$Where:
- λ is the average number of events per interval.
- k is the number of events we are interested in.
- e is Euler's number (approximately 2.718...).
The Most Important Trick: Adjusting λ
A very common exam question gives you a rate for one interval but asks for a probability over a different interval. You MUST adjust λ before using the formula.
Example: If a shop receives an average of 10 customers per hour, and you want to find the probability of getting customers in a 30-minute period, your new λ will be:
λ = (10 customers / 60 minutes) × 30 minutes = 5 customers.
Step-by-Step Example
A hospital emergency room receives patients at an average rate of 3 per hour. What is the probability that exactly 4 patients arrive in a particular hour?
Step 1: Identify the distribution and parameters.
This is about the number of events (arrivals) in a fixed interval (one hour) with a known average rate. It's Poisson!
The rate is λ = 3 per hour. The interval is one hour, so no adjustment is needed. We want P(X=4).
Step 2: Use the formula with λ=3, k=4.
$$P(X=4) = \frac{e^{-3} \cdot 3^4}{4!}$$ $$P(X=4) = \frac{e^{-3} \cdot 81}{24}$$ $$P(X=4) \approx \frac{0.049787 \cdot 81}{24}$$ $$P(X=4) \approx 0.168$$So there's about a 16.8% chance of exactly 4 patients arriving in that hour.
Mean and Variance of a Poisson Distribution
This is the easiest part to remember! For a Poisson distribution, the mean and variance are the same, and they are both equal to λ.
- Mean (Expected Value): $$E(X) = \lambda$$
- Variance: $$Var(X) = \lambda$$
If you see a problem where the mean and variance of a distribution are approximately equal, it's a big clue that it might be a Poisson distribution!
Did You Know?
The Poisson distribution is named after French mathematician Siméon Denis Poisson. It's sometimes called the "distribution of rare events" because it works well for events that have a low probability of happening at any specific moment, but occur many times over a long period.
Key Takeaway
Use the Poisson distribution for the number of events in a fixed interval when you know the average rate λ. Always check if you need to adjust λ! The key formulas are $$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$, and the unique property $$E(X) = Var(X) = \lambda$$.