Welcome to Discrete Random Variables!

Hello! If you’ve ever tried to predict the outcome of a dice roll, a sports match, or even how many times you’ll hit traffic lights on your commute, you’ve been thinking about probability. This chapter, Discrete Random Variables, is where we stop just calculating simple chances and start calculating the expected outcomes and spread of those outcomes in a structured way.

Don't worry if this seems tricky at first—we are simply taking the concepts of probability and linking them to the concepts of mean and spread you already know from descriptive statistics. Let's dive in!

1. Defining Discrete Random Variables (DRVs)

What is a Random Variable?

A Random Variable, usually denoted by $X$ or $Y$, is essentially a function that assigns a numerical value to every possible outcome of a random experiment.

A Discrete Random Variable (DRV) is a random variable that can take only a countable number of values. These values are often, but not always, integers (whole numbers).

  • Example of Discrete: The number of heads when flipping a coin 3 times (outcomes: 0, 1, 2, 3).
  • Example of NOT Discrete (Continuous): The height of a student (can be any value within a range).

The Probability Distribution

The Probability Distribution (or Probability Mass Function, PMF) of a DRV tells us exactly which values $X$ can take and the probability associated with each value.

Distributions are typically presented in two ways:

  1. A Table: The most common way you will see these problems.
  2. A Simple Function: A formula that generates the probability $P(X=x)$ for a given value $x$.

Key Rule: The Sum of Probabilities

The total probability for all possible outcomes must always equal 1.

$P(X=x_i) \ge 0$ for all $i$.
Important Formula: \(\sum P(X=x) = 1\)

Example: Rolling a biased four-sided die.

$x$        1     2     3     4
$P(X=x)$   0.1   0.2   $k$   0.4

Since the probabilities must sum to 1: $0.1 + 0.2 + k + 0.4 = 1$. Therefore, $k = 1 - 0.7 = 0.3$.
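The same sum-to-1 calculation can be sketched in a few lines of Python. The dictionary `probs` is just a convenient encoding of the table above, with the unknown entry for $x=3$ left out:

```python
# Encode the known probabilities from the biased-die table; x = 3 is unknown.
probs = {1: 0.1, 2: 0.2, 4: 0.4}

# The sum-to-1 rule forces the missing probability.
k = 1 - sum(probs.values())
probs[3] = k

# k is 0.3 (up to floating-point rounding), and the completed PMF sums to 1.
```

A `Fraction`-based version would avoid the floating-point rounding entirely, but decimals keep the sketch closest to the table.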


Key Takeaway for Section 1: A DRV has countable outcomes. Its distribution lists these outcomes and their probabilities, which must always sum up to 1.


2. Measures of Central Tendency: The Mean (Expected Value)

When we talk about the "mean" of a random variable, we use the term Expected Value. It’s what you would expect the average outcome to be if you repeated the experiment many, many times.

We denote the Expected Value as $E(X)$ or the Greek letter $\mu$ (mu).

Calculating Expected Value \(E(X)\)

The expected value is a weighted average. We weight each outcome ($x$) by how likely it is ($P$).

Formula: $$E(X) = \mu = \sum x_i P(X=x_i)$$

Analogy: Imagine your score in a module depends on four tasks. If Task 1 is worth 10% and Task 4 is worth 40%, you weight the score for Task 4 much higher. Similarly, an outcome with a higher probability contributes more to the expected value.

Step-by-Step Example (using the biased die from before):

$x$        1     2     3     4
$P(X=x)$   0.1   0.2   0.3   0.4

1. Multiply each $x$ value by its probability $P(X=x)$.
\(1 \times 0.1 = 0.1\)
\(2 \times 0.2 = 0.4\)
\(3 \times 0.3 = 0.9\)
\(4 \times 0.4 = 1.6\)
2. Sum the results:
\(E(X) = 0.1 + 0.4 + 0.9 + 1.6 = 3.0\)

The expected value is $3.0$. Note that $E(X)$ does not have to be an actual possible outcome (e.g., if we expected $2.5$ heads, that would be fine, even though you can't have half a head).
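The two steps above (multiply, then sum) collapse into a single weighted sum in code. Here is a minimal Python sketch using the same biased-die PMF:

```python
# PMF of the biased die from the worked example above.
pmf = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}

# E(X) = sum of x * P(X = x): each outcome weighted by its probability.
mean = sum(x * p for x, p in pmf.items())

# mean is 3.0, up to floating-point rounding.
```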


Key Takeaway for Section 2: The Expected Value $E(X)$ is the long-run average outcome, calculated by summing $x \times P(X=x)$ for all outcomes.


3. Measures of Spread: Variance and Standard Deviation

The mean tells us the center, but we also need to know the spread—how far, on average, the values deviate from that mean. This is measured using Variance, $Var(X)$, and Standard Deviation, $\sigma$.

3.1. Calculating Variance \(Var(X)\)

The variance is the expected value of the squared difference between the outcome and the mean.

Definition Formula (Less used for calculation):
$$Var(X) = E((X - \mu)^2) = \sum (x_i - \mu)^2 P(X=x_i)$$

This formula can be tedious to use: you must calculate $\mu$, subtract it from every $x$, square each result, multiply by the corresponding probability, and then sum everything up.

Calculation Formula (The one you should master):

This equivalent formula is much easier computationally:

$$Var(X) = E(X^2) - [E(X)]^2$$

To use this, you need two steps:

  1. Calculate $E(X^2)$: The Expected Value of $X$ squared.
  2. Square $E(X)$: Square the mean you found in Section 2.

Calculating \(E(X^2)\): $$E(X^2) = \sum x_i^2 P(X=x_i)$$

Don't confuse \(E(X^2)\) with \([E(X)]^2\)!

Step-by-Step Example (Continuing from the biased die, where $E(X)=3.0$):

$x$        1     2     3     4
$x^2$      1     4     9     16
$P(X=x)$   0.1   0.2   0.3   0.4

1. Calculate $E(X^2)$: Multiply each $x^2$ value by its probability $P(X=x)$.
\(E(X^2) = (1 \times 0.1) + (4 \times 0.2) + (9 \times 0.3) + (16 \times 0.4)\)
\(E(X^2) = 0.1 + 0.8 + 2.7 + 6.4 = 10.0\)

2. Calculate $Var(X)$: Use $Var(X) = E(X^2) - [E(X)]^2$.
We know $E(X)=3.0$, so $[E(X)]^2 = 3.0^2 = 9.0$.
\(Var(X) = 10.0 - 9.0 = 1.0\)

3.2. Standard Deviation \(\sigma\)

The Standard Deviation ($\sigma$) is simply the square root of the variance. It is useful because it is in the same units as the random variable $X$.

$$ \sigma = \sqrt{Var(X)} $$

In our example, $\sigma = \sqrt{1.0} = 1.0$.
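The whole calculation chain, $E(X)$, then $E(X^2)$, then $Var(X)$, then $\sigma$, can be sketched in Python directly from the PMF. Note how `mean_of_sq` ($E(X^2)$) and `mean**2` ($[E(X)]^2$) are computed separately, exactly the distinction warned about above:

```python
import math

# PMF of the biased die, with E(X) = 3.0 from the previous section.
pmf = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}

mean = sum(x * p for x, p in pmf.items())            # E(X)
mean_of_sq = sum(x**2 * p for x, p in pmf.items())   # E(X^2), NOT [E(X)]^2

variance = mean_of_sq - mean**2                      # Var(X) = E(X^2) - [E(X)]^2
sigma = math.sqrt(variance)                          # standard deviation

# Both variance and sigma come out as 1.0, up to floating-point rounding.
```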

Common Mistakes to Avoid!

  • DO NOT use the frequency formula for variance that you use for raw data (dividing by $n$). Here no division is needed: the probabilities already act as the weights, and they sum to 1. Stick strictly to $E(X^2) - [E(X)]^2$.
  • Remember that $E(X^2)$ is not the same as $E(X)$ squared!

Key Takeaway for Section 3: Variance measures spread. Use the easier formula: $Var(X) = E(X^2) - [E(X)]^2$. Standard deviation is the square root of variance.


4. Functions of a Discrete Random Variable

Sometimes we are interested not in $X$ itself, but in a new variable that is a function of $X$, such as $Y = X^2$ or $Z = 3X + 5$.

4.1. Expected Value of a General Function \(E(g(X))\)

If $Y = g(X)$ is any function of $X$, the expected value of $Y$ is found by applying the function to the outcomes first, and then weighting by the probabilities.

Formula: $$E(g(X)) = \sum g(x_i) P(X=x_i)$$

Note: You already did this when calculating $E(X^2)$! Here, $g(X)=X^2$.

4.2. Linear Functions: $Y = aX + b$ (Scaling and Shifting)

This is the most common type of function encountered. $a$ is a scaling factor and $b$ is a shift.

Analogy: Converting temperature from Celsius (X) to Fahrenheit (Y). $Y = 1.8X + 32$.

Rules for Expected Value (Mean):

The expected value follows the transformation exactly.

$$E(aX + b) = aE(X) + b$$
Rules for Variance and Standard Deviation (Spread):

Spread is affected by scaling ($a$), but not by shifting ($b$). If you add 10 to every outcome, the average increases by 10, but the spread of the data remains the same.

Crucially, because variance measures squared distance, the scaling factor $a$ becomes $a^2$:

$$Var(aX + b) = a^2Var(X)$$ $$SD(aX + b) = |a|SD(X)$$

Example: If $E(X)=10$ and $Var(X)=4$. Find $E(2X - 5)$ and $Var(2X - 5)$.

\(E(2X - 5) = 2E(X) - 5 = 2(10) - 5 = 15\)
\(Var(2X - 5) = 2^2 Var(X) = 4(4) = 16\)
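You can check the transformation rules numerically by transforming an actual distribution. The sketch below applies $Y = 2X - 5$ to the biased-die PMF from earlier (where $E(X)=3$ and $Var(X)=1$), computes $E(Y)$ and $Var(Y)$ directly from the transformed outcomes, and confirms they match the shortcut rules:

```python
# Check the linear-transformation rules on the biased-die PMF used earlier.
pmf = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
a, b = 2, -5  # Y = 2X - 5

ex = sum(x * p for x, p in pmf.items())                   # E(X)
var_x = sum(x**2 * p for x, p in pmf.items()) - ex**2     # Var(X)

# Compute E(Y) and Var(Y) directly from the outcomes y = a*x + b.
ey = sum((a * x + b) * p for x, p in pmf.items())
var_y = sum((a * x + b) ** 2 * p for x, p in pmf.items()) - ey**2

# Direct computation agrees with the rules (up to floating-point rounding):
# E(Y) = a*E(X) + b  and  Var(Y) = a^2 * Var(X).
assert abs(ey - (a * ex + b)) < 1e-9
assert abs(var_y - a**2 * var_x) < 1e-9
```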


Key Takeaway for Section 4: For linear transformations, the mean follows the rule ($E(aX+b) = aE(X)+b$), but the variance is only affected by the scaling factor, which is squared ($Var(aX+b) = a^2Var(X)$). The constant $b$ does nothing to the spread.


5. Combining Independent Random Variables

Often, we deal with the combined results of two separate, unrelated experiments.

Two random variables, $X$ and $Y$, are independent if the outcome of one does not affect the outcome of the other (e.g., rolling two different dice).

5.1. Rules for Expected Value (Sum or Difference)

The expected value of a sum or difference is simply the sum or difference of the individual expected values. This applies whether or not $X$ and $Y$ are independent.

$$E(aX + bY) = aE(X) + bE(Y)$$ $$E(X - Y) = E(X) - E(Y)$$

5.2. Rules for Variance (Sum or Difference) - The Independence Requirement

For variance rules to hold, $X$ and $Y$ MUST BE INDEPENDENT.

When combining independent variables, the spread always increases, whether you are adding them or subtracting them. This is because combining results introduces more overall randomness.

Crucial Formula: $$Var(aX + bY) = a^2Var(X) + b^2Var(Y)$$ $$Var(aX - bY) = a^2Var(X) + b^2Var(Y)$$

Memory Aid: Variance is ALWAYS Added

If $X$ and $Y$ are independent, the variances are always added, never subtracted, even when the variables themselves are subtracted.

Example: If $E(X)=10$, $Var(X)=4$, $E(Y)=20$, $Var(Y)=9$. Find $E(X-Y)$ and $Var(3X + 2Y)$.

\(E(X - Y) = E(X) - E(Y) = 10 - 20 = -10\)
\(Var(3X + 2Y) = 3^2 Var(X) + 2^2 Var(Y)\)
\(Var(3X + 2Y) = 9(4) + 4(9) = 36 + 36 = 72\)
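The combination rules can be verified by brute force: under independence, $P(X=x, Y=y) = P(X=x)P(Y=y)$, so we can enumerate the full joint distribution of $Z = 3X + 2Y$ and compute its mean and variance directly. The sketch below uses the biased die from earlier for $X$ and a hypothetical 0/1 coin flip as a stand-in for $Y$ (the original example gives only $E(Y)$ and $Var(Y)$, not a PMF):

```python
def ev(pmf, g=lambda x: x):
    """Expected value of g(X) for a dict-encoded PMF."""
    return sum(g(x) * p for x, p in pmf.items())

def var(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    return ev(pmf, lambda x: x**2) - ev(pmf) ** 2

px = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}   # biased die from earlier
py = {0: 0.5, 1: 0.5}                    # hypothetical fair coin (0 or 1)

# Build the PMF of Z = 3X + 2Y from the joint distribution under independence.
pz = {}
for x, p in px.items():
    for y, q in py.items():
        z = 3 * x + 2 * y
        pz[z] = pz.get(z, 0) + p * q     # P(X=x, Y=y) = P(X=x) * P(Y=y)

# The enumerated distribution matches the shortcut rules (up to rounding).
assert abs(ev(pz) - (3 * ev(px) + 2 * ev(py))) < 1e-9
assert abs(var(pz) - (9 * var(px) + 4 * var(py))) < 1e-9
```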

5.3. Sum of \(n\) Independent Random Variables

This rule extends naturally. If $X_1, X_2, ..., X_n$ are independent DRVs (e.g., $n$ rolls of a die), then:

$$E(\sum X_i) = \sum E(X_i)$$ $$Var(\sum X_i) = \sum Var(X_i)$$

Did you know? If you roll a single fair die 10 times, the expected total score is $10 \times E(X)$, and the total variance is $10 \times Var(X)$.
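The fair-die fact above is quick to confirm in Python. Using exact fractions sidesteps rounding: a fair die has $E(X) = 7/2$ and $Var(X) = 35/12$, so ten independent rolls give a total mean of $35$ and a total variance of $350/12 = 175/6$:

```python
from fractions import Fraction

# A single fair six-sided die: each face 1..6 with probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

ex = sum(x * p for x, p in pmf.items())                  # E(X) = 7/2
var_x = sum(x**2 * p for x, p in pmf.items()) - ex**2    # Var(X) = 35/12

n = 10
total_mean = n * ex       # E(sum) = 10 * E(X) = 35
total_var = n * var_x     # Var(sum) = 10 * Var(X) = 175/6
```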

Quick Review Box: The Core Formulae

Let $X$ and $Y$ be independent DRVs, and $a, b$ be constants.

Expected Value (Mean):

$E(X) = \sum x P(X=x)$
$E(aX + b) = aE(X) + b$
$E(X \pm Y) = E(X) \pm E(Y)$

Variance (Spread):

$Var(X) = E(X^2) - [E(X)]^2$
$Var(aX + b) = a^2Var(X)$ (Constant $b$ disappears!)
$Var(aX \pm bY) = a^2Var(X) + b^2Var(Y)$ (Variance is always added for independent variables!)