Probability Generating Functions (PGFs): Study Notes (9231 Further Probability & Statistics)

Hello there! This chapter introduces one of the most elegant tools in Further Statistics: the Probability Generating Function (PGF). Don't worry if it sounds complicated—it’s essentially a clever way to package all the probability information of a discrete random variable into a single function. Once packaged, we can use the powerful tools of calculus (differentiation!) to easily find the mean and variance. Pretty neat, right?

This topic is essential for Paper 4 and focuses exclusively on discrete random variables.


1. Understanding the Concept and Definition of a PGF

1.1 What is a PGF?

The PGF, usually denoted by \(G_X(t)\), is an algebraic function that uses the powers of a dummy variable, \(t\), to store the probability distribution of a discrete random variable \(X\).

Think of it like a filing cabinet: the index number (the exponent of \(t\)) tells you which outcome you’re looking at, and the contents of the file (the coefficient of that power of \(t\)) tell you the probability of that outcome.

1.2 The Formal Definition

For a discrete random variable \(X\) which takes values \(x_i\) with probabilities \(P(X=x_i)\), the PGF is defined as the expected value of \(t^X\):

Definition: $$G_X(t) = E(t^X) = \sum_{x} P(X=x) t^x$$

Since the probabilities must sum to 1, if we set \(t=1\), we must get 1:

$$G_X(1) = \sum P(X=x) (1)^x = \sum P(X=x) = 1$$

Memory Aid: Always check that G(1) = 1. This is a great way to verify your PGF formulation!

1.3 Extracting Probabilities

The beauty of the PGF is that if you know the function \(G_X(t)\), you can find the probability of any specific outcome simply by looking at the coefficient of the corresponding power of \(t\).

  • \(P(X=0)\) is the coefficient of \(t^0\).
  • \(P(X=1)\) is the coefficient of \(t^1\).
  • \(P(X=k)\) is the coefficient of \(t^k\).

If the PGF is a simple polynomial, you can read the probabilities directly. If it is a more complex function (like those involving \(e\) or fractions), you may need to use the Maclaurin series expansion (a concept from Further Pure Mathematics) to find the coefficients.

$$P(X=k) = \frac{G_X^{(k)}(0)}{k!}$$ (the \(k\)th derivative evaluated at \(t=0\), divided by \(k!\))

Key Takeaway: The PGF \(G_X(t)\) is a sum of terms where the exponent of \(t\) is the outcome and the coefficient is the probability of that outcome. \(G_X(1)=1\).
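The "filing cabinet" idea is easy to check numerically. Here is a minimal sketch (the helper name `G` and the dictionary representation are my own, not syllabus notation) that stores a finite PGF as a dictionary mapping each outcome to its probability, evaluates \(G_X(t)\), and confirms \(G_X(1)=1\) for a fair die:

```python
from fractions import Fraction

def G(pmf, t):
    """Evaluate G_X(t) = sum over x of P(X = x) * t**x."""
    return sum(p * t**x for x, p in pmf.items())

# Fair die: P(X = x) = 1/6 for x = 1, ..., 6
die = {x: Fraction(1, 6) for x in range(1, 7)}

assert G(die, 1) == 1            # probabilities sum to 1, so G(1) = 1
assert die[3] == Fraction(1, 6)  # P(X = 3) is the coefficient of t^3
```

Using exact `Fraction` arithmetic means the \(G_X(1)=1\) check holds exactly rather than to floating-point tolerance.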

2. Constructing PGFs for Standard Distributions

You must be able to construct and use the PGFs for the most common discrete distributions.

2.1 Discrete Uniform Distribution (D(n))

If \(X\) is the outcome of a fair die roll (\(X=1, 2, 3, 4, 5, 6\)), then \(P(X=x) = 1/6\) for all outcomes.

$$G_X(t) = \sum_{x=1}^{6} P(X=x) t^x = \frac{1}{6}t + \frac{1}{6}t^2 + ... + \frac{1}{6}t^6$$

More generally, for \(X \sim D(n)\):

$$G_X(t) = \frac{1}{n} \sum_{x=1}^{n} t^x = \frac{t(1-t^n)}{n(1-t)} \quad (t \neq 1)$$ (Note: the closed form is just the sum of a Geometric Progression!)
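We can verify that the GP closed form agrees with the direct sum. This is an illustrative check (the function names are my own) using exact fractions, remembering that the closed form only applies when \(t \neq 1\):

```python
from fractions import Fraction

def uniform_pgf_sum(n, t):
    # Direct sum: (1/n)(t + t^2 + ... + t^n)
    return Fraction(1, n) * sum(t**x for x in range(1, n + 1))

def uniform_pgf_closed(n, t):
    # GP closed form t(1 - t^n) / (n(1 - t)), valid for t != 1
    return t * (1 - t**n) / (n * (1 - t))

for t in [Fraction(1, 2), Fraction(3, 4), Fraction(2)]:
    assert uniform_pgf_sum(6, t) == uniform_pgf_closed(6, t)

assert uniform_pgf_sum(6, 1) == 1  # G(1) = 1, via the direct sum
```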

2.2 Binomial Distribution (\(X \sim B(n, p)\))

We use the definition \(G_X(t) = \sum P(X=x) t^x\). Recall that \(P(X=x) = \binom{n}{x} p^x q^{n-x}\), where \(q=1-p\).

$$G_X(t) = \sum_{x=0}^{n} \left[ \binom{n}{x} q^{n-x} p^x \right] t^x$$ $$G_X(t) = \sum_{x=0}^{n} \binom{n}{x} q^{n-x} (pt)^x$$

By the Binomial Theorem, this sum simplifies beautifully:

$$G_X(t) = (q + pt)^n$$
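A quick numerical check of this simplification: the sketch below (illustrative only; parameter values are arbitrary) evaluates the definition \(\sum P(X=x)t^x\) term by term and confirms it equals \((q+pt)^n\) at several values of \(t\):

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(2, 5)
q = 1 - p

for t in [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3)]:
    # Definition: sum of C(n, x) p^x q^(n-x) t^x over x = 0, ..., n
    direct = sum(comb(n, x) * p**x * q**(n - x) * t**x for x in range(n + 1))
    assert direct == (q + p * t)**n  # Binomial Theorem simplification

assert (q + p * 1)**n == 1  # G(1) = (q + p)^n = 1^n = 1
```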

2.3 Geometric Distribution (\(X \sim Geo(p)\))

The Geometric distribution counts either the number of trials up to and including the first success, or the number of failures before it, depending on the definition used in your base A Level course. We use the trials-until-first-success definition, where \(P(X=x) = p q^{x-1}\) for \(x=1, 2, 3, ...\) and \(q = 1-p\).

$$G_X(t) = \sum_{x=1}^{\infty} P(X=x) t^x = \sum_{x=1}^{\infty} p q^{x-1} t^x$$

Factoring out \(pt\):

$$G_X(t) = pt \sum_{x=1}^{\infty} q^{x-1} t^{x-1} = pt \sum_{k=0}^{\infty} (qt)^k$$

Since this is a geometric series with common ratio \(qt\), and we assume \(|qt| < 1\), the sum is \(\frac{1}{1-qt}\).

$$G_X(t) = \frac{pt}{1 - qt}$$
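Because the Geometric PGF comes from an infinite series, we can only check it numerically by truncation. This sketch (illustrative function names, arbitrary parameters) compares a long partial sum of \(\sum pq^{x-1}t^x\) against \(\frac{pt}{1-qt}\) at a value of \(t\) with \(|qt| < 1\):

```python
from fractions import Fraction

p = Fraction(1, 3)
q = 1 - p

def partial(t, terms):
    # Truncated series: sum of p q^(x-1) t^x for x = 1, ..., terms
    return sum(p * q**(x - 1) * t**x for x in range(1, terms + 1))

def closed(t):
    return p * t / (1 - q * t)  # valid only when |q t| < 1

t = Fraction(1, 2)  # here qt = 1/3, so the series converges
# The truncation error shrinks like (qt)^terms, so 60 terms is tiny
assert abs(partial(t, 60) - closed(t)) < Fraction(1, 10**20)

assert closed(1) == 1  # G(1) = p/(1 - q) = p/p = 1
```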

2.4 Poisson Distribution (\(X \sim Po(\lambda)\))

Recall \(P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!}\) for \(x=0, 1, 2...\)

$$G_X(t) = \sum_{x=0}^{\infty} \left[ \frac{e^{-\lambda} \lambda^x}{x!} \right] t^x$$

We can factor out \(e^{-\lambda}\) and rearrange:

$$G_X(t) = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda t)^x}{x!}$$

The sum inside is the Maclaurin series expansion of \(e^{\lambda t}\).

$$G_X(t) = e^{-\lambda} e^{\lambda t} = e^{\lambda t - \lambda}$$

$$G_X(t) = e^{\lambda(t-1)}$$
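The Maclaurin-series step can be checked numerically too. In this sketch (arbitrary \(\lambda\) and \(t\), chosen for illustration) a truncated sum of \(P(X=x)t^x\) matches \(e^{\lambda(t-1)}\) to floating-point accuracy, since the Poisson terms vanish rapidly:

```python
from math import exp, factorial

lam, t = 2.5, 0.7

# Truncated sum of P(X = x) t^x = e^(-lam) (lam t)^x / x!
# 50 terms is far beyond what these parameters need to converge.
series = sum(exp(-lam) * lam**x / factorial(x) * t**x for x in range(50))

assert abs(series - exp(lam * (t - 1))) < 1e-12  # G(t) = e^(lam(t-1))
assert abs(exp(lam * (1 - 1)) - 1.0) < 1e-15    # G(1) = e^0 = 1
```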

Standard PGF Summary (Must Memorise/Derive Quickly)

  • Binomial: \(G_X(t) = (q + pt)^n\)
  • Poisson: \(G_X(t) = e^{\lambda(t-1)}\)
  • Geometric: \(G_X(t) = \frac{pt}{1 - qt}\)

3. Using PGFs to Find Mean and Variance

This is where the magic happens! We use differentiation (calculus) to calculate the mean and variance, making complex expectation calculations much easier.

The key properties rely on evaluating the first and second derivatives of \(G_X(t)\) at \(t=1\).

3.1 The Mean (Expected Value)

The mean, \(E(X)\), is found by calculating the first derivative of the PGF and evaluating it at \(t=1\):

$$E(X) = G'_X(1)$$

Did you know? This works because when you differentiate the PGF with respect to \(t\), the coefficient of \(t^{x-1}\) becomes \(x P(X=x)\). Setting \(t=1\) then gives you \(\sum x P(X=x)\), which is the definition of the mean, \(E(X)\).

3.2 The Variance (\(Var(X)\))

The variance is a little more complex, requiring the second derivative. The formula provided in the MF19 booklet is:

$$Var(X) = G''_X(1) + G'_X(1) - \{G'_X(1)\}^2$$

This formula may look scary, but notice that: $$E(X) = G'_X(1)$$ and $$E[X(X-1)] = G''_X(1)$$

So, the variance formula is just an adaptation of the familiar identity \(Var(X) = E(X^2) - [E(X)]^2\), rewritten using PGFs.

Step-by-Step Calculation Process:

  1. Find the first derivative, \(G'_X(t)\).
  2. Calculate the mean, \(E(X)\), by evaluating \(G'_X(1)\).
  3. Find the second derivative, \(G''_X(t)\). (Remember to use the product rule if necessary!)
  4. Calculate \(G''_X(1)\).
  5. Substitute \(G''_X(1)\) and \(G'_X(1)\) into the variance formula: \(Var(X) = G''_X(1) + E(X) - [E(X)]^2\).

Common Mistake to Avoid: Always evaluate the derivatives at \(t=1\) AFTER differentiation, not before! If you calculate \(G'_X(1)\) and then differentiate it again, you will just get zero.
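For a PGF with finite support, the five-step process can be carried out exactly, because differentiating a polynomial and setting \(t=1\) gives \(G'_X(1) = \sum x P(X=x)\) and \(G''_X(1) = \sum x(x-1)P(X=x)\). Here is a minimal sketch (the helper `mean_and_variance` is my own name, not syllabus notation), checked against the known mean and variance of a fair die:

```python
from fractions import Fraction

def mean_and_variance(pmf):
    g1 = sum(x * p for x, p in pmf.items())            # G'(1) = sum x P(X=x)
    g2 = sum(x * (x - 1) * p for x, p in pmf.items())  # G''(1) = E[X(X-1)]
    mean = g1
    var = g2 + g1 - g1**2                              # MF19 variance formula
    return mean, var

die = {x: Fraction(1, 6) for x in range(1, 7)}
mean, var = mean_and_variance(die)

assert mean == Fraction(7, 2)   # E(X) = 3.5 for a fair die
assert var == Fraction(35, 12)  # Var(X) = 91/6 - (7/2)^2 = 35/12
```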

Example: Finding Mean and Variance of \(X \sim Po(\lambda)\)

We know \(G_X(t) = e^{\lambda(t-1)}\).

1. First Derivative: $$G'_X(t) = \lambda e^{\lambda(t-1)}$$

2. Mean: Evaluate at \(t=1\). $$E(X) = G'_X(1) = \lambda e^{\lambda(1-1)} = \lambda e^0 = \lambda$$ (Matches the known result for Poisson!)

3. Second Derivative: Differentiate \(G'_X(t)\) again. $$G''_X(t) = \frac{d}{dt} \left[ \lambda e^{\lambda(t-1)} \right] = \lambda (\lambda e^{\lambda(t-1)}) = \lambda^2 e^{\lambda(t-1)}$$

4. Evaluate \(G''_X(1)\): $$G''_X(1) = \lambda^2 e^{\lambda(1-1)} = \lambda^2$$

5. Variance: $$Var(X) = G''_X(1) + G'_X(1) - \{G'_X(1)\}^2$$ $$Var(X) = \lambda^2 + \lambda - (\lambda)^2$$ $$Var(X) = \lambda$$ (Matches the known result for Poisson! This confirms the PGF method works perfectly.)

Key Takeaway: PGFs allow us to find \(E(X)\) and \(Var(X)\) using simple differentiation and evaluation at \(t=1\). Use the formulas provided in the MF19 booklet exactly.

4. The Sum of Independent Random Variables

This is arguably the most powerful application of PGFs—it allows us to easily combine independent random variables.

4.1 The Multiplication Rule

If \(X\) and \(Y\) are two independent discrete random variables, and \(Z = X + Y\), then the PGF of the sum \(Z\) is the product of the individual PGFs:

$$G_Z(t) = G_X(t) \times G_Y(t)$$

This result holds true for any number of independent variables being summed.

Analogy: If PGFs are "probability recipe books," multiplying them is like mixing two separate recipes together to get the recipe for the combined outcome.
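We can see the multiplication rule in action numerically. In this sketch (illustrative only), the distribution of \(Z = X + Y\) for two independent fair dice is built directly by summing over all \(36\) equally likely pairs, and its PGF agrees with the product \(G_X(t)G_Y(t)\) at several values of \(t\):

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}

# pmf of Z = X + Y by direct convolution over independent outcomes
pmf_z = {}
for x, y in product(die, die):
    pmf_z[x + y] = pmf_z.get(x + y, 0) + die[x] * die[y]

def G(pmf, t):
    return sum(p * t**x for x, p in pmf.items())

for t in [Fraction(1, 2), Fraction(1), Fraction(2)]:
    assert G(pmf_z, t) == G(die, t) * G(die, t)  # G_Z(t) = G_X(t) G_Y(t)
```

Multiplying two polynomials convolves their coefficients, which is exactly how probabilities of independent sums combine; that is why the rule works.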

4.2 Applications: Summing Standard Distributions

This property is particularly useful for showing that the sum of two variables of the same distribution type maintains that type (but with updated parameters).

Example 1: Sum of Independent Binomial Variables
Let \(X_1 \sim B(n_1, p)\) and \(X_2 \sim B(n_2, p)\), and let \(Z = X_1 + X_2\). (Note: they must have the same probability \(p\)).

  • \(G_{X_1}(t) = (q + pt)^{n_1}\)
  • \(G_{X_2}(t) = (q + pt)^{n_2}\)

The PGF of the sum is: $$G_Z(t) = G_{X_1}(t) G_{X_2}(t) = (q + pt)^{n_1} (q + pt)^{n_2} = (q + pt)^{n_1 + n_2}$$ Since \(G_Z(t)\) has the form of a Binomial PGF, we conclude that \(Z \sim B(n_1 + n_2, p)\).

Example 2: Sum of Independent Poisson Variables
Let \(X_1 \sim Po(\lambda_1)\) and \(X_2 \sim Po(\lambda_2)\), and let \(Z = X_1 + X_2\).

  • \(G_{X_1}(t) = e^{\lambda_1(t-1)}\)
  • \(G_{X_2}(t) = e^{\lambda_2(t-1)}\)

The PGF of the sum is: $$G_Z(t) = G_{X_1}(t) G_{X_2}(t) = e^{\lambda_1(t-1)} e^{\lambda_2(t-1)}$$ $$G_Z(t) = e^{(\lambda_1 + \lambda_2)(t-1)}$$ Since \(G_Z(t)\) has the form of a Poisson PGF, we conclude that \(Z \sim Po(\lambda_1 + \lambda_2)\).
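Example 2 can also be confirmed numerically without PGFs, by convolving the two Poisson pmfs directly. This sketch (arbitrary \(\lambda_1, \lambda_2\), chosen for illustration) checks that the convolution matches the \(Po(\lambda_1 + \lambda_2)\) pmf:

```python
from math import exp, factorial

def po_pmf(lam, x):
    return exp(-lam) * lam**x / factorial(x)

lam1, lam2 = 1.2, 2.3

# P(Z = z) = sum over x of P(X1 = x) P(X2 = z - x), by independence
for z in range(10):
    conv = sum(po_pmf(lam1, x) * po_pmf(lam2, z - x) for x in range(z + 1))
    assert abs(conv - po_pmf(lam1 + lam2, z)) < 1e-12
```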

Quick Review: The PGF Toolbox

  • Finding probabilities: Coefficient of \(t^k\) is \(P(X=k)\).
  • Finding the mean: \(E(X) = G'_X(1)\).
  • Finding the variance: \(Var(X) = G''_X(1) + G'_X(1) - \{G'_X(1)\}^2\).
  • Summing independent variables: Multiply the PGFs: \(G_{X+Y}(t) = G_X(t) G_Y(t)\).