Moment Generating Functions (MGFs): The Statistical Super-Tool

Hello future Further Mathematicians! Welcome to one of the most powerful and elegant topics in statistics: Moment Generating Functions (MGFs).

This chapter might look intimidating due to the exponential functions and calculus involved, but don't worry! An MGF is essentially a secret weapon—a single, clever function that packages up all the most important information (like the mean and variance) about a probability distribution.

By the end of these notes, you will understand how to define an MGF, use it to quickly find the mean and variance of a distribution, and combine MGFs to analyse sums of independent variables.

What is a Moment?

In statistics, a moment is a specific type of expected value.

  • The first moment is the expectation, \(E(X)\), which is the mean (\(\mu\)).
  • The second moment is \(E(X^2)\).

The MGF is so named because its derivatives, when evaluated at \(t=0\), generate these moments!

1. Defining the Moment Generating Function, \(M_X(t)\)

The Definition

The Moment Generating Function (MGF) of a random variable \(X\) is the expected value of \(e^{tX}\), where \(t\) is a dummy variable (a placeholder).

It is denoted by \(M_X(t)\):

\[ M_X(t) = E(e^{tX}) \]

Calculating \(M_X(t)\) (The Formulas)

How you calculate the expectation depends on whether the random variable \(X\) is discrete or continuous.

(a) For Discrete Random Variables (like Poisson)

If \(X\) is discrete with probability mass function (p.m.f.) \(P(X = x_i) = p_i\):

\[ M_X(t) = \sum_i e^{t x_i} p_i \]

(This is a summation over all possible values \(x_i\).)
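If you'd like to see the discrete formula in action on a computer, here is a short Python sketch. The fair six-sided die is our own illustrative example (it is not one of the syllabus distributions); the function simply sums \(e^{tx_i} p_i\) over all values.

```python
import math

def mgf_discrete(pmf, t):
    """MGF of a discrete variable: sum of e^(t*x) * P(X = x) over all values x."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Our own example: a fair six-sided die, P(X = x) = 1/6 for x = 1..6
die = {x: 1 / 6 for x in range(1, 7)}

# M_X(0) = E(e^0) = E(1) = 1 for ANY distribution -- a handy sanity check
m_at_zero = mgf_discrete(die, 0.0)
```

Checking that \(M_X(0) = 1\) is a useful habit: it catches most slips in setting up the sum.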

(b) For Continuous Random Variables (like Exponential, Normal)

If \(X\) is continuous with probability density function (p.d.f.) \(f(x)\):

\[ M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x) dx \]

(This is an integration over the entire range of \(x\).)
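As a numerical sanity check (our own experiment, not a required derivation), we can approximate the continuous-case integral for the Exponential distribution with a simple midpoint Riemann sum and compare it with the known closed form \(\frac{\lambda}{\lambda - t}\). The choices \(\lambda = 2\), \(t = 0.5\), the truncation point, and the number of strips are all arbitrary.

```python
import math

def mgf_exponential_numeric(lam, t, upper=60.0, n=200_000):
    """Approximate M_X(t) = integral of e^(t x) * lam * e^(-lam x) dx on [0, upper]
    with a midpoint Riemann sum. Only valid for t < lam (otherwise it diverges)."""
    dx = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += math.exp(t * x) * lam * math.exp(-lam * x) * dx
    return total

lam, t = 2.0, 0.5
approx = mgf_exponential_numeric(lam, t)
exact = lam / (lam - t)  # known closed form for the Exponential MGF, t < lam
```

The two values agree to several decimal places, which is reassuring before attempting the integral by hand.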

Quick Review: The MGF Recipe

The MGF is simply the Expected Value of \(e^{tX}\). You just replace \(X\) in the standard \(E(g(X))\) formula with the specific function \(g(X) = e^{tX}\).

2. Key Properties: Finding the Mean and Variance

This is where MGFs truly shine. Instead of using the lengthy methods for finding \(E(X)\) and \(E(X^2)\) (summing or integrating \(x p_i\) or \(x f(x)\), etc.), we can use differentiation.

Finding the Mean (\(\mu\))

The mean, \(\mu = E(X)\), is found by differentiating the MGF once and then setting \(t=0\).

\[ \mu = E(X) = M'_X(0) \]

Step-by-Step for the Mean:

  1. Find the first derivative of \(M_X(t)\), denoted \(M'_X(t)\).
  2. Substitute \(t=0\) into \(M'_X(t)\).

Finding the Variance (\(\sigma^2\))

The variance, \(\sigma^2 = Var(X) = E(X^2) - [E(X)]^2\), requires the second moment, \(E(X^2)\).

The second moment is found by differentiating the MGF twice and then setting \(t=0\).

\[ E(X^2) = M''_X(0) \]

The final variance formula is:

\[ \sigma^2 = M''_X(0) - [M'_X(0)]^2 \]

Step-by-Step for the Variance:

  1. Find the second derivative of \(M_X(t)\), denoted \(M''_X(t)\).
  2. Substitute \(t=0\) into \(M''_X(t)\). This gives you \(E(X^2)\).
  3. Use the result from finding the mean, \(\mu = M'_X(0)\).
  4. Calculate \(\sigma^2 = E(X^2) - \mu^2\).

Memory Aid: The number of dashes (derivatives) tells you the power of \(X\) in the expectation. \(M'_X(0) \rightarrow E(X^1)\). \(M''_X(0) \rightarrow E(X^2)\). Always evaluate at \(t=0\)!
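The whole recipe can be mimicked numerically with central differences: \(M'(0) \approx \frac{M(h) - M(-h)}{2h}\) and \(M''(0) \approx \frac{M(h) - 2M(0) + M(-h)}{h^2}\). The fair die below is our own example, and the step size \(h = 10^{-4}\) is an arbitrary choice.

```python
import math

def mgf_die(t):
    """MGF of a fair six-sided die: (1/6) * sum of e^(t x) for x = 1..6."""
    return sum(math.exp(t * x) for x in range(1, 7)) / 6

h = 1e-4
# Central-difference approximations to M'(0) and M''(0)
mean = (mgf_die(h) - mgf_die(-h)) / (2 * h)                   # approx E(X) = 3.5
second = (mgf_die(h) - 2 * mgf_die(0) + mgf_die(-h)) / h**2   # approx E(X^2) = 91/6
var = second - mean**2                                        # approx 35/12
```

For a fair die the exact answers are \(E(X) = 3.5\) and \(\sigma^2 = \frac{35}{12} \approx 2.92\), and the finite-difference values match to several decimal places.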

⚠️ Common Mistake Alert!

Do NOT forget to substitute \(t=0\) after differentiating. If you substitute \(t=0\) first, the MGF will simply be \(M_X(0) = E(e^0) = E(1) = 1\), and the derivatives of a constant (1) are 0! You must differentiate, then substitute \(t=0\).

3. MGFs of Standard Distributions (Derivations Required)

The syllabus requires you to know the MGFs and their derivations for the Poisson, Exponential, and Normal distributions. While we won't show the full algebraic proof here, knowing the final forms is essential for applications.

3.1. Poisson Distribution

If \(X \sim \text{Po}(\lambda)\), then the MGF is:

\[ M_X(t) = e^{\lambda(e^t - 1)} \]

(Derivation involves using the Taylor series for \(e^u\).)

Did you know? Using the differentiation rules on this MGF confirms that for a Poisson distribution, \(\mu = \lambda\) and \(\sigma^2 = \lambda\).
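Here is a quick numerical confirmation of that claim (our own check, not a syllabus derivation), again using central differences on the Poisson MGF; \(\lambda = 3\) and \(h = 10^{-4}\) are arbitrary choices.

```python
import math

def mgf_poisson(lam, t):
    """MGF of Po(lam): e^(lam * (e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))

lam, h = 3.0, 1e-4
mean = (mgf_poisson(lam, h) - mgf_poisson(lam, -h)) / (2 * h)
second = (mgf_poisson(lam, h) - 2 * mgf_poisson(lam, 0) + mgf_poisson(lam, -h)) / h**2
var = second - mean**2
# Both mean and var come out approximately equal to lam, i.e. mu = sigma^2 = lambda
```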

3.2. Exponential Distribution

If \(X \sim \text{Exp}(\lambda)\), the p.d.f. is \(f(x) = \lambda e^{-\lambda x}\) for \(x \geq 0\). The MGF is:

\[ M_X(t) = \frac{\lambda}{\lambda - t}, \quad \text{for } t < \lambda \]

(Derivation involves calculating the integral \(\int_0^\infty e^{tx} \cdot \lambda e^{-\lambda x} dx\).)
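Once you have differentiated \(\frac{\lambda}{\lambda - t}\) by hand (chain rule, twice), a short script confirms the standard results \(\mu = \frac{1}{\lambda}\) and \(\sigma^2 = \frac{1}{\lambda^2}\). The derivative expressions below are the hand-computed ones; \(\lambda = 2\) is an arbitrary choice.

```python
lam = 2.0

def M(t):  return lam / (lam - t)            # MGF of Exp(lam), valid for t < lam
def M1(t): return lam / (lam - t) ** 2       # first derivative (chain rule)
def M2(t): return 2 * lam / (lam - t) ** 3   # second derivative (chain rule again)

mean = M1(0)              # 1/lam
var = M2(0) - mean ** 2   # 2/lam^2 - (1/lam)^2 = 1/lam^2
```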

3.3. Normal Distribution

If \(X \sim N(\mu, \sigma^2)\), this MGF is surprisingly complex to derive, but its final form is very neat:

\[ M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2} \]
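Even without the derivation, we can test this formula by simulation (our own check): estimate \(E(e^{tX})\) from Normal samples and compare it with \(e^{\mu t + \frac{1}{2}\sigma^2 t^2}\). The parameters, seed, and sample size are arbitrary choices.

```python
import math
import random

random.seed(0)
mu, sigma, t = 0.0, 1.0, 0.5
n = 200_000

# Monte Carlo estimate of E(e^(tX)) for X ~ N(mu, sigma^2)
estimate = sum(math.exp(t * random.gauss(mu, sigma)) for _ in range(n)) / n

exact = math.exp(mu * t + 0.5 * sigma**2 * t**2)
```

The simulated average lands close to the closed-form value, exactly as the MGF definition \(M_X(t) = E(e^{tX})\) predicts.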

Key Takeaway (Standard MGFs)

Recognising these specific forms is vital. If a calculation yields the exponential MGF form, you immediately know the distribution and its parameter \(\lambda\).

4. Manipulating MGFs: Linear Combinations and Sums

MGFs are essential because they make manipulating random variables straightforward, often replacing complex convolutions or density transformations with simple algebra.

4.1. Linear Transformations: \(Y = a + bX\)

If we define a new variable \(Y\) by scaling and shifting \(X\), its MGF \(M_Y(t)\) is easily found:

\[ M_Y(t) = M_{a+bX}(t) = e^{at} M_X(bt) \]

How does this work?
We use the definition: \(M_{a+bX}(t) = E(e^{t(a+bX)}) = E(e^{at} e^{btX})\). Since \(e^{at}\) is a constant (it doesn't depend on the random variable \(X\)), we can pull it out of the expectation: \(M_{a+bX}(t) = e^{at} E(e^{btX})\). And since \(E(e^{uX}) = M_X(u)\), we have \(E(e^{btX}) = M_X(bt)\).

This property is particularly useful when standardising a random variable (e.g., \(Z = \frac{X - \mu}{\sigma}\)), where \(a = -\frac{\mu}{\sigma}\) and \(b = \frac{1}{\sigma}\).
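We can verify the property concretely (our own example): build the distribution of \(Y = a + bX\) directly, compute its MGF, and compare with \(e^{at} M_X(bt)\). The die and the values \(a = 2\), \(b = 3\), \(t = 0.1\) are arbitrary choices.

```python
import math

def mgf(pmf, t):
    """MGF of a discrete variable given as a {value: probability} dictionary."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # fair six-sided die (our own example)
a, b, t = 2.0, 3.0, 0.1

# MGF of Y = a + bX built directly from Y's own distribution
y_pmf = {a + b * x: p for x, p in die.items()}
lhs = mgf(y_pmf, t)

# The same quantity via the property e^(a t) * M_X(b t)
rhs = math.exp(a * t) * mgf(die, b * t)
```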

4.2. Sum of Independent Random Variables

One of the most important results in this section is how MGFs handle the sum of independent variables.

If \(X_1\) and \(X_2\) are independent random variables, and \(Z = X_1 + X_2\), then the MGF of the sum is the product of their individual MGFs:

\[ M_{X_1 + X_2}(t) = M_{X_1}(t) \cdot M_{X_2}(t) \]

Analogy: Think of MGFs as the 'DNA' or 'blueprint' of a distribution. When you combine two independent systems (like adding two scores), you multiply their statistical blueprints to get the blueprint of the total score.
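To see the product rule working, take two independent fair dice (our own example): build the distribution of the total by enumerating the joint outcomes, then compare its MGF with the product of the individual MGFs. The value \(t = 0.2\) is an arbitrary choice.

```python
import math
from collections import defaultdict

def mgf(pmf, t):
    """MGF of a discrete variable given as a {value: probability} dictionary."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # fair six-sided die

# Distribution of Z = X1 + X2, enumerating independent joint outcomes
z_pmf = defaultdict(float)
for x1, p1 in die.items():
    for x2, p2 in die.items():
        z_pmf[x1 + x2] += p1 * p2  # independence: multiply probabilities

t = 0.2
lhs = mgf(z_pmf, t)            # MGF of the sum, computed directly
rhs = mgf(die, t) * mgf(die, t)  # product of the two individual MGFs
```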

Application: Why is this powerful?

This property is essential for proving statistical theorems, especially involving the Normal and Poisson distributions.

  • Example 1 (Normal): If \(X_1 \sim N(\mu_1, \sigma_1^2)\) and \(X_2 \sim N(\mu_2, \sigma_2^2)\) are independent, we multiply their MGFs:
    \(M_{X_1+X_2}(t) = e^{\mu_1 t + \frac{1}{2}\sigma_1^2 t^2} \cdot e^{\mu_2 t + \frac{1}{2}\sigma_2^2 t^2}\)
    \(M_{X_1+X_2}(t) = e^{(\mu_1 + \mu_2) t + \frac{1}{2}(\sigma_1^2 + \sigma_2^2) t^2}\)

This resulting MGF is clearly the MGF of a Normal distribution \(N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)\). This proves that the sum of two independent Normal variables is also Normal!

  • Example 2 (Poisson): If \(X_1 \sim \text{Po}(\lambda_1)\) and \(X_2 \sim \text{Po}(\lambda_2)\) are independent, multiplying their MGFs proves that \(X_1 + X_2 \sim \text{Po}(\lambda_1 + \lambda_2)\).
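The Poisson closure result can be checked numerically as well (our own check): at any \(t\), the product of the two MGFs should equal the \(\text{Po}(\lambda_1 + \lambda_2)\) MGF. The values \(\lambda_1 = 1.5\), \(\lambda_2 = 2.5\) and the test points are arbitrary choices.

```python
import math

def mgf_po(lam, t):
    """MGF of Po(lam): e^(lam * (e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))

lam1, lam2 = 1.5, 2.5

# Compare the product of the two MGFs with the Po(lam1 + lam2) MGF at several t
worst = 0.0
for t in (-1.0, -0.3, 0.0, 0.4, 1.0):
    product = mgf_po(lam1, t) * mgf_po(lam2, t)
    combined = mgf_po(lam1 + lam2, t)
    worst = max(worst, abs(product - combined) / combined)
# worst stays at floating-point rounding level: the two functions are identical
```

Because the MGFs agree everywhere, the Uniqueness Theorem (Section 5) lets us conclude that the sum really is \(\text{Po}(\lambda_1 + \lambda_2)\).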

5. Study Tips and Key Mathematical Skills

Prerequisite Calculus Skills

Success in this chapter relies heavily on your ability to differentiate exponential and algebraic functions accurately.

You must be comfortable with the following:

  • Chain Rule: Essential when differentiating MGFs that have composite functions, such as \(M_X(t) = e^{\lambda(e^t - 1)}\). Remember to multiply by the derivative of the inner function (\(\lambda e^t\)).
  • Product Rule: Required if your differentiation results in a product of two functions of \(t\) (e.g., \(t \cdot e^{-t}\)).
  • Basic Differentiation of \(e^{kt}\): \(\frac{d}{dt}(e^{kt}) = k e^{kt}\).

The MGF Uniqueness Theorem

The MGF Uniqueness Theorem states that if two random variables have the same MGF, they must have the same distribution.

Why is this important? It means if you calculate the MGF of a sum of variables and the result looks exactly like the MGF of a standard distribution (like the Normal MGF), you have automatically identified the distribution of the sum without further work!

Quick Recap Checklist

When solving an MGF problem, ask yourself:

  • Am I dealing with a discrete (sum) or continuous (integral) variable?
  • To find the mean, did I differentiate once and set \(t=0\)? \(M'_X(0)\)
  • To find \(E(X^2)\), did I differentiate twice and set \(t=0\)? \(M''_X(0)\)
  • When combining independent variables, did I multiply the MGFs?