Welcome to Discrete Random Variables!
Hello! This chapter is where we bridge the gap between simple probability calculations and serious statistical modeling. Don't worry if the term "random variable" sounds intimidating—it's just a mathematical way of describing the possible numerical outcomes of a random event.
Understanding Discrete Random Variables (DRVs) is fundamental to Statistics 1. It helps us analyze games of chance, understand experimental data, and make predictions about what we can expect to happen on average. We'll learn how to calculate the average outcome and measure how spread out those outcomes are.
Section 1: What is a Discrete Random Variable?
1.1 Defining the Random Variable (X)
A Random Variable (X) is a function that assigns a numerical value to every outcome in a sample space. It is called random because we don't know which value it will take until the event occurs.
- We typically use capital letters (like \(X\) or \(Y\)) to denote the random variable itself.
- We use lowercase letters (like \(x\) or \(y\)) to denote the specific values the variable can take.
Example: If you roll a standard die once, let \(X\) be the score. The possible values \(x\) are \(\{1, 2, 3, 4, 5, 6\}\).
1.2 Discrete vs. Continuous
The key word here is Discrete.
- Discrete Random Variable (DRV): A variable where the possible outcomes are countable. They usually take integer values (whole numbers).
- Analogy: A DRV is like counting the number of students in a class, or the number of heads in five coin flips. You can't have 2.5 students or 3.1 heads.
- (Non-examinable quick note): A Continuous Random Variable (CRV) can take any value within a range (e.g., height, weight, time). We cover these later in Statistics.
Quick Review: DRV Checklist
For a variable \(X\) to be a Discrete Random Variable, the outcomes must be:
- Numerical.
- Determined by chance.
- Countable (often integers).
Section 2: Probability Distributions
Once we know the possible values \(x\) a DRV can take, we need to know the probability associated with each value. A Probability Distribution (sometimes called a Probability Mass Function, or PMF) lists all possible outcomes and their corresponding probabilities.
2.1 Displaying the Distribution
Probability distributions are usually displayed in a table format:
| \(x\) (Possible values) | \(x_1\) | \(x_2\) | \(x_3\) | ... |
| --- | --- | --- | --- | --- |
| \(P(X=x)\) (Probabilities) | \(p_1\) | \(p_2\) | \(p_3\) | ... |
We often use the notation \(P(X=x)\) to mean "the probability that the random variable \(X\) takes the specific value \(x\)."
Example: If \(X\) is the result of rolling a die:
| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| \(P(X=x)\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) | \(1/6\) |
2.2 The Two Essential Rules
For any function or table to be a valid probability distribution, it must satisfy two crucial conditions:
1. All probabilities must be between 0 and 1:
   $$0 \le P(X=x) \le 1$$
2. The sum of all probabilities must equal exactly 1:
   $$\sum P(X=x) = 1$$
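The two rules are easy to check mechanically. Here is a quick sketch in Python (the chapter itself uses no code, and the helper name `is_valid_distribution` is our own invention), representing a distribution as a dictionary mapping each value \(x\) to \(P(X=x)\):

```python
import math

def is_valid_distribution(dist):
    """Check the two validity rules for a distribution {x: P(X=x)}."""
    probs = dist.values()
    # Rule 1: every probability lies in [0, 1]
    in_range = all(0 <= p <= 1 for p in probs)
    # Rule 2: the probabilities sum to 1 (allowing tiny float error)
    sums_to_one = math.isclose(sum(probs), 1.0)
    return in_range and sums_to_one

die = {x: 1/6 for x in range(1, 7)}          # fair die: valid
print(is_valid_distribution(die))             # True
print(is_valid_distribution({1: 0.5, 2: 0.6}))  # sums to 1.1: False
```

Note the use of `math.isclose` rather than `== 1`: summing six copies of `1/6` in floating point may fall a hair short of exactly 1.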
!!! Common Exam Scenario !!!
You will frequently encounter questions where the probabilities involve an unknown constant (like \(k\)). You MUST use the second rule (\(\sum P(X=x) = 1\)) to form an equation and solve for \(k\).
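As a quick illustration of this technique, suppose (our own made-up example, not from an exam) that \(P(X=x) = kx\) for \(x = 1, 2, 3, 4\). The rule \(\sum P(X=x) = 1\) gives \(k(1+2+3+4) = 1\), so \(k = 1/10\):

```python
# Made-up example: P(X = x) = k*x for x = 1, 2, 3, 4.
# The rule sum P(X=x) = 1 forces k * (1 + 2 + 3 + 4) = 1.
xs = [1, 2, 3, 4]
k = 1 / sum(xs)                    # k = 1/10 = 0.1
probs = {x: k * x for x in xs}     # {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
print(k)
print(sum(probs.values()))         # ≈ 1, as required
```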
Key Takeaway for Section 2
The probability distribution shows us the likelihood of every possible outcome. If the probabilities don't sum to 1, you've made a mistake or the distribution is invalid!
Section 3: Expectation (The Mean)
The Expectation (or Expected Value) of a discrete random variable \(X\), denoted as \(E(X)\) or \(\mu\) (the Greek letter mu), is the long-run average value of the variable.
Analogy: If \(X\) represents the winnings in a game, \(E(X)\) is the amount you would expect to win, on average, if you played the game thousands of times. It tells you the center point of the distribution.
3.1 The Formula for Expectation, \(E(X)\)
To calculate the expected value, you multiply each possible outcome \(x\) by its corresponding probability \(P(X=x)\), and then sum up all those results.
$$E(X) = \mu = \sum x P(X=x)$$
3.2 Step-by-Step Calculation of \(E(X)\)
Let's use a quick example: A game where you can win \$1, \$2, or \$5 with given probabilities.
| \(x\) | 1 | 2 | 5 |
| --- | --- | --- | --- |
| \(P(X=x)\) | 0.5 | 0.3 | 0.2 |
1. Create a third row: calculate \(x P(X=x)\) for each value.
   - \(1 \times 0.5 = 0.5\)
   - \(2 \times 0.3 = 0.6\)
   - \(5 \times 0.2 = 1.0\)
2. Sum the results:
   $$E(X) = 0.5 + 0.6 + 1.0 = 2.1$$
The expected value is 2.1. Notice that 2.1 is not an actual outcome in the table (you can't win \$2.10), but it represents the average outcome over many trials.
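The same two steps translate directly into code. A minimal sketch in Python, using the distribution from the table above (the chapter itself presents this only as a table calculation):

```python
# E(X) for the winnings game in Section 3.2:
# multiply each outcome by its probability, then sum.
dist = {1: 0.5, 2: 0.3, 5: 0.2}

expectation = sum(x * p for x, p in dist.items())
print(expectation)   # ≈ 2.1
```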
3.3 Expectation of a Function of X, \(E(g(X))\)
Sometimes, you need to find the expected value of a function of \(X\), like \(E(X^2)\) or \(E(2X-1)\). The rule remains the same:
$$E(g(X)) = \sum g(x) P(X=x)$$
To calculate \(E(X^2)\), you simply square the \(x\) values first, then multiply by the probability. This specific calculation is essential for finding the variance (see Section 4).
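In code, the only change from the \(E(X)\) sketch above is applying \(g\) to each \(x\) before multiplying by its probability. For example, with \(g(x) = x^2\) and the same distribution:

```python
# E(X^2) for the Section 3.2 distribution:
# square each x first, then weight by its probability.
dist = {1: 0.5, 2: 0.3, 5: 0.2}

e_x_squared = sum(x**2 * p for x, p in dist.items())
print(e_x_squared)   # ≈ 6.7 (used again in Section 4)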
Key Takeaway for Section 3
Expectation \(E(X)\) tells you the weighted average of the outcomes. It’s the ‘center’ of your distribution. Remember the formula: "x times P(x), then sum."
Section 4: Measuring Spread (Variance and Standard Deviation)
While \(E(X)\) tells us the average, it doesn't tell us how spread out the results are. A game where you always win \$2 is less risky than a game where you either lose \$100 or win \$104, even if both have an expected value of \$2.
Variance and Standard Deviation measure this spread or risk.
4.1 Variance, \(Var(X)\) or \(\sigma^2\)
The variance is the expected squared difference between the outcome and the mean. The definition formula is:
$$Var(X) = \sum (x - \mu)^2 P(X=x)$$
However, this definition is messy to calculate directly. Instead, we use a much simpler identity (which you must know and apply):
$$Var(X) = E(X^2) - [E(X)]^2$$
Memory Aid: "The Expectation of the Squares MINUS the Square of the Expectation."
4.2 Step-by-Step Calculation of Variance
To find \(Var(X)\), you need two things:
- \(E(X)\): The mean (calculated in Section 3).
- \(E(X^2)\): The expectation of \(X\) squared.
Example Continuing from 3.2 (where \(E(X) = 2.1\)):
1. Calculate \(x^2\) for each outcome:
   - \(1^2 = 1\)
   - \(2^2 = 4\)
   - \(5^2 = 25\)
2. Calculate \(E(X^2) = \sum x^2 P(X=x)\):
   - \(1 \times 0.5 = 0.5\)
   - \(4 \times 0.3 = 1.2\)
   - \(25 \times 0.2 = 5.0\)
   - So \(E(X^2) = 0.5 + 1.2 + 5.0 = 6.7\).
3. Apply the variance formula:
   $$Var(X) = E(X^2) - [E(X)]^2 = 6.7 - (2.1)^2 = 6.7 - 4.41 = 2.29$$
!!! Common Mistake Alert !!!
Do NOT forget the square brackets! Students often calculate \(E(X^2)\) correctly but forget to square the \(E(X)\) term at the end.
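The whole calculation fits in a few lines. A sketch in Python for the Section 3.2 distribution, computing both the variance and its square root (the standard deviation, covered next):

```python
# Var(X) via the shortcut formula Var(X) = E(X^2) - [E(X)]^2,
# for the winnings game from Section 3.2.
dist = {1: 0.5, 2: 0.3, 5: 0.2}

e_x = sum(x * p for x, p in dist.items())       # E(X)   = 2.1
e_x2 = sum(x**2 * p for x, p in dist.items())   # E(X^2) = 6.7
variance = e_x2 - e_x**2                         # 6.7 - 4.41 = 2.29
sd = variance ** 0.5                             # ≈ 1.513
print(variance, sd)
```

Note that `e_x**2` squares the already-summed mean, exactly the step the warning above says students forget.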
4.3 Standard Deviation, \(\sigma\)
The Standard Deviation (SD) is simply the square root of the variance, \(\sigma = \sqrt{Var(X)}\).
- It is preferred because it is measured in the same units as \(X\) and \(E(X)\).
In our example: $$SD(X) = \sqrt{2.29} \approx 1.513$$
Key Takeaway for Section 4
Variance (\(Var(X)\)) measures the spread of the data. Always use the shortcut formula: \(E(X^2) - [E(X)]^2\). The Standard Deviation is just the square root of that result.
Section 5: Linear Transformations (Coding)
What happens if you scale or shift the random variable? For instance, if \(X\) is a score in points, but you want to find the statistics for \(Y\), where \(Y = 2X + 5\) (doubling the score and adding 5 bonus points).
If \(Y = aX + b\), where \(a\) and \(b\) are constants, the new mean and variance follow these rules:
5.1 The Effect on Expectation (Mean)
Expectation is affected by both scaling (\(a\)) and shifting (\(b\)).
$$E(aX + b) = a E(X) + b$$
This makes perfect sense: if you double the scores and add 5, the average score will also double and increase by 5.
5.2 The Effect on Variance (Spread)
Variance measures spread. Shifting the entire distribution (adding \(b\)) does not change how spread out the values are, only where the center is. Therefore, \(b\) has no effect on variance.
However, scaling the distribution by \(a\) multiplies the variance by a factor of \(a^2\).
$$Var(aX + b) = a^2 Var(X)$$
5.3 The Effect on Standard Deviation
Since SD is the square root of variance, the change is simpler:
$$SD(aX + b) = |a| SD(X)$$
We use \(|a|\) because a standard deviation cannot be negative: scaling by a negative \(a\) still stretches the spread by \(|a|\).
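You can verify all three rules numerically. A sketch in Python using the Section 3.2 distribution with the transformation \(Y = 2X + 5\) from the opening example: we compute \(E(Y)\) and \(Var(Y)\) directly from the transformed distribution, then via the coding rules, and the two routes agree.

```python
# Check the linear coding rules on the Section 3.2 distribution
# with Y = 2X + 5 (a = 2, b = 5).
dist = {1: 0.5, 2: 0.3, 5: 0.2}
a, b = 2, 5

# Direct route: transform each outcome, keep its probability.
y_dist = {a * x + b: p for x, p in dist.items()}
e_y = sum(y * p for y, p in y_dist.items())
var_y = sum(y**2 * p for y, p in y_dist.items()) - e_y**2

# Rule route: E(aX+b) = aE(X) + b and Var(aX+b) = a^2 Var(X).
e_x = sum(x * p for x, p in dist.items())                 # 2.1
var_x = sum(x**2 * p for x, p in dist.items()) - e_x**2   # 2.29
print(e_y, a * e_x + b)       # both ≈ 9.2
print(var_y, a**2 * var_x)    # both ≈ 9.16
```

Building `y_dist` as a dictionary works here because \(2x + 5\) sends distinct outcomes to distinct values, so no probabilities collide.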
Did You Know? (Why \(a^2\)?)
The variance is measured in squared units of the original variable. If \(X\) is measured in meters (m), \(Var(X)\) is measured in \(m^2\). If you scale \(X\) by 3, you are scaling the units by 3, meaning the variance (the squared units) must be scaled by \(3^2 = 9\).
Summary of Linear Coding Rules
| Statistic of \(Y = aX + b\) | Transformation Rule |
| --- | --- |
| \(E(Y)\) | \(a E(X) + b\) |
| \(Var(Y)\) | \(a^2 Var(X)\) |
| \(SD(Y)\) | \(|a| SD(X)\) |
Key Takeaway for Section 5
Linear transformations affect the mean exactly as expected (scale and shift). However, they only affect the variance through the scaling factor (\(a^2\)); shifting (\(b\)) has no impact on the spread.
Chapter Review: Discrete Random Variables
You've covered the core foundation of statistical modeling! Here is a final quick checklist:
- A DRV takes countable, numerical values.
- A valid Probability Distribution must have probabilities summing to 1.
- Expectation (Mean): \(E(X) = \sum x P(X=x)\). (The center of the data.)
- Variance (Spread): \(Var(X) = E(X^2) - [E(X)]^2\). (The risk/spread of the data.)
- Linear Coding \(Y=aX+b\): \(E(Y)=aE(X)+b\) and \(Var(Y)=a^2 Var(X)\).
Great job! Master these core concepts and calculations, and you will be well prepared for the specific named distributions (like the Binomial Distribution) that build upon this general theory. Keep practicing those calculation steps!