A Level Maths (9709) S2: Probability & Statistics 2
Chapter 6.5: Comprehensive Study Notes on Hypothesis Tests
Hello future statistician! Hypothesis testing might seem intimidating, but it is one of the most powerful and satisfying topics in statistics. It's essentially a formal way of deciding whether a belief (hypothesis) about a population parameter (like a mean or a probability) is supported by the evidence (sample data).
Think of it like a jury trial: we start with an assumption and use evidence to decide if that assumption should be rejected. Ready to get started? Let’s break it down!
Part 1: The Language of Hypothesis Testing
You need to be fluent in the vocabulary before you can solve problems. Here are the core terms:
1. The Hypotheses (The Claims)
Every test involves two opposing statements:
- Null Hypothesis (\(H_0\)): This is the default or status quo assumption. It always contains an equality sign (\(=\)).
  Example: The average height of students is 170 cm (\(\mu = 170\)).
- Alternative Hypothesis (\(H_1\)): This is what we suspect or are trying to find evidence for. It never contains an equality sign. It challenges \(H_0\).
  Example: The average height is *not* 170 cm (\(\mu \neq 170\)).
2. One-Tailed vs. Two-Tailed Tests
This tells us the direction of the suspected change, defined entirely by \(H_1\):
- Two-Tailed Test: Used when we are interested in whether the parameter has changed in either direction (increased or decreased). \(H_1\) uses \(\neq\).
- One-Tailed Test: Used when we suspect the parameter has changed in one specific direction (only increased, or only decreased). \(H_1\) uses \(<\) or \(>\).
Analogy: A two-tailed test asks, "Is this coin unfair?" A one-tailed test asks, "Is this coin biased towards heads?"
3. Significance Level and Regions
- Significance Level (\(\alpha\)): This is the probability of rejecting \(H_0\) when it is actually true. It represents our maximum allowable risk of making this error (usually 5% or 1%). If \(\alpha = 0.05\), we are willing to accept a 5% chance of rejecting a true \(H_0\).
- Test Statistic: The value calculated from the sample data that is used to decide whether to reject \(H_0\).
- Critical Region (or Rejection Region): The range of values for the test statistic that leads to the rejection of \(H_0\). These are the "extreme" values.
- Acceptance Region: The range of values that leads to the conclusion that there is insufficient evidence to reject \(H_0\).
Quick Review Box — The Relationship: The total probability area under the distribution curve is 1. The Critical Region has a total probability equal to \(\alpha\). For a two-tailed test with \(\alpha = 0.05\), the critical region is split into two tails, each having an area of \(0.025\).
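The relationship between \(\alpha\) and the tail areas can be checked numerically. Here is a minimal sketch using Python's standard library (`statistics.NormalDist`), assuming a standard Normal test statistic:

```python
from statistics import NormalDist

Z = NormalDist()          # standard Normal, N(0, 1)
alpha = 0.05

# Two-tailed: the critical region is split, with alpha/2 in each tail.
z_two = Z.inv_cdf(1 - alpha / 2)
print(round(z_two, 3))             # 1.96

# Check: the area beyond the upper critical value is alpha/2.
print(round(1 - Z.cdf(z_two), 3))  # 0.025

# One-tailed (upper): the whole of alpha sits in one tail.
z_one = Z.inv_cdf(1 - alpha)
print(round(z_one, 3))             # 1.645
```

These are exactly the critical values \(\pm 1.96\) and \(1.645\) you would read from the Normal tables.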
Part 2: The Five Steps to Carrying Out a Hypothesis Test
Regardless of whether you are using Binomial, Poisson, or Normal distribution, follow these steps methodically:
Step 1: State the Hypotheses (\(H_0\) and \(H_1\))
Define the population parameter (e.g., probability \(p\), or mean \(\mu\)) and state \(H_0\) and \(H_1\). Make sure \(H_0\) is an equality.
Step 2: Define the Significance Level and Test Type
State \(\alpha\) (e.g., 5%) and determine if the test is one-tailed or two-tailed (based on \(H_1\)).
Step 3: Calculate the Test Statistic (or find the Critical Region)
This is where the calculation begins. The method depends on the distribution you are using (see Parts 3a and 3b below).
Step 4: Make the Decision (Comparison)
Compare your result from Step 3 to the critical value or significance level:
- If using Critical Region: If the calculated Test Statistic falls into the Critical Region, Reject \(H_0\).
- If using \(p\)-value (Direct Probability): If the probability of observing the sample data (or something more extreme) is less than \(\alpha\), Reject \(H_0\).
Step 5: Interpret the Conclusion in Context
This is essential! State your final decision in the context of the original problem. Do not just say "Reject \(H_0\)".
Example: "There is sufficient evidence at the 5% significance level to conclude that the average height has increased."
Part 3: Testing with Specific Distributions
3a: Hypothesis Testing for Binomial and Poisson (Single Observation)
When testing a claim about the population probability (\(p\)) in a Binomial distribution \(B(n, p)\) or the mean rate (\(\lambda\)) in a Poisson distribution \(Po(\lambda)\), we often use the direct probability method for small samples.
Procedure Example (Binomial): A company claims 10% of their devices are faulty (\(p=0.1\)). A sample of 20 devices contains 5 faulty ones. Test the claim that \(p\) has increased (\(H_1: p > 0.1\)) at \(\alpha = 5\%\).
1. Hypotheses: \(H_0: p = 0.1\), \(H_1: p > 0.1\). (One-tailed, upper tail)
2. Distribution under \(H_0\): \(X \sim B(20, 0.1)\). Observed result: \(x=5\).
3. Calculate the $p$-value: We calculate the probability of observing 5 or more faulty devices, assuming \(H_0\) is true (i.e., using \(p=0.1\)).
\(p\text{-value} = P(X \geq 5 \mid p = 0.1)\)
\(P(X \geq 5) = 1 - P(X \leq 4)\)
(Using tables/calculator, suppose \(P(X \leq 4) = 0.9568\))
\(p\text{-value} = 1 - 0.9568 = 0.0432\)
4. Decision: Since \(0.0432 < 0.05\), the $p$-value is less than \(\alpha\). Reject \(H_0\).
5. Interpretation: There is evidence that the proportion of faulty devices has increased.
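The calculation above can be reproduced directly from the Binomial probability formula. A minimal sketch in Python using only `math.comb`, with the numbers from the worked example:

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p0, observed = 20, 0.1, 5

# p-value for H1: p > 0.1 is P(X >= 5) = 1 - P(X <= 4), computed under H0.
p_le_4 = sum(binom_pmf(n, p0, x) for x in range(observed))
p_value = 1 - p_le_4

print(round(p_le_4, 4))   # 0.9568
print(round(p_value, 4))  # 0.0432
print(p_value < 0.05)     # True -> reject H0
```

This confirms the table value \(P(X \leq 4) = 0.9568\) used in Step 3.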
Important Note on Critical Regions for Discrete Data:
Since Binomial/Poisson distributions are discrete, the tail probability rarely equals \(\alpha\) exactly. The Critical Region is therefore defined by the boundary value \(k\) whose tail probability is as large as possible without exceeding \(\alpha\):
- If \(H_1: p > p_0\), find the smallest \(k\) such that \(P(X \geq k) \leq \alpha\).
- If \(H_1: p < p_0\), find the largest \(k\) such that \(P(X \leq k) \leq \alpha\).
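These rules can be applied mechanically. A sketch that searches for the upper-tail critical region of the example above (\(X \sim B(20, 0.1)\), \(H_1: p > 0.1\), \(\alpha = 5\%\)):

```python
from math import comb

def binom_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

n, p0, alpha = 20, 0.1, 0.05

# H1: p > p0 -- smallest k such that P(X >= k) <= alpha.
k = next(k for k in range(n + 1) if 1 - binom_cdf(n, p0, k - 1) <= alpha)
print(k)  # 5, so the critical region is X >= 5
```

This agrees with the \(p\)-value method: \(P(X \geq 5) = 0.0432 \leq 0.05\), while \(P(X \geq 4) = 0.133 > 0.05\), so 5 is the boundary.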
3b: Normal Approximation for Binomial and Poisson
When \(n\) is large (for Binomial) or \(\lambda\) is large (for Poisson), we use the Normal approximation, which turns the problem into a Z-test.
Conditions for Approximation:
- Binomial: \(n > 50\), and \(np > 5\) and \(nq > 5\). Use \(N(np, npq)\).
- Poisson: \(\lambda > 15\). Use \(N(\lambda, \lambda)\).
Crucial Step: Continuity Correction (CC)
Since we are approximating a discrete distribution with a continuous one, we MUST use continuity correction (CC).
Example: \(P(X \leq 10)\) becomes \(P(Y < 10.5)\). \(P(X > 15)\) becomes \(P(Y > 15.5)\).
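To see why the continuity correction matters, we can compare an exact Binomial probability with its Normal approximation. The parameters below (\(n = 100\), \(p = 0.1\), so \(np = 10 > 5\) and \(nq = 90 > 5\)) are illustrative, not from the text:

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.1
approx = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)  # N(10, 9)

# P(X <= 10): exact Binomial vs Normal approximation
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(11))
with_cc = approx.cdf(10.5)   # P(Y < 10.5), continuity correction applied
without_cc = approx.cdf(10)  # forgetting the CC

print(round(exact, 4))
print(round(with_cc, 4))
print(round(without_cc, 4))
```

The continuity-corrected value lands noticeably closer to the exact probability than the uncorrected one, which is why the CC is compulsory here.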
3c: Hypothesis Test Concerning Population Mean (\(\mu\))
This test is used when investigating a claim about the population mean. It is always a Z-test if the sample is large OR if the population is Normal with known variance.
Prerequisites (Why we use Normal/Z):
We rely on the Central Limit Theorem (CLT) or the assumption of a Normal population:
- The distribution of the sample mean, \(\bar{X}\), is Normal (or approximately Normal if \(n\) is large).
- We use the distribution \(\bar{X} \sim N\left( \mu, \frac{\sigma^2}{n} \right)\).
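A quick simulation illustrates this distribution of \(\bar{X}\) (the parameters are illustrative): sampling repeatedly from \(N(50, 10^2)\) with \(n = 100\), the sample means should scatter with standard deviation close to \(\sigma/\sqrt{n} = 1\).

```python
import random
from statistics import mean, stdev

random.seed(1)
mu, sigma, n = 50, 10, 100

# Draw many samples of size n and record each sample mean.
means = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(2000)]

# The sample means cluster around mu with standard error sigma/sqrt(n) = 1.
print(round(mean(means), 1))   # close to 50
print(round(stdev(means), 1))  # close to 1
```

The spread of \(\bar{X}\) is much smaller than the spread of individual observations; this shrinking standard error is what gives large samples their power.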
The Test Statistic (The Z-Value)
The standard way to measure how many standard errors the sample mean is away from the hypothesized population mean \(\mu_0\) is:
$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$
where \(\bar{X}\) is the sample mean, \(\mu_0\) is the mean assumed under \(H_0\), \(\sigma\) is the population standard deviation (or sample estimate $s$ if $n$ is large), and \(n\) is the sample size.
Step-by-Step Z-Test Example:
1. Hypotheses: \(H_0: \mu = 50\), \(H_1: \mu \neq 50\). (\(\alpha = 5\%\), Two-tailed)
2. Critical Value: Since \(\alpha = 0.05\) (two-tailed), the critical Z-values are \(Z = \pm 1.96\) (using the Normal tables, critical region area is 0.025 in each tail).
3. Calculate Z-statistic: (Suppose sample mean \(\bar{X} = 52\), \(\sigma=10\), \(n=100\))
$$Z = \frac{52 - 50}{10 / \sqrt{100}} = \frac{2}{1} = 2.00$$
4. Decision: The calculated \(Z=2.00\) falls outside the acceptance region (between -1.96 and 1.96). It is in the critical region. Reject \(H_0\).
5. Interpretation: There is sufficient evidence at the 5% level to conclude that the population mean is not 50.
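The whole Z-test above can be sketched in a few lines of Python, using the same numbers as the worked example:

```python
from statistics import NormalDist

Z = NormalDist()

# Data from the worked example above
x_bar, mu0, sigma, n, alpha = 52, 50, 10, 100, 0.05

z = (x_bar - mu0) / (sigma / n**0.5)    # test statistic
z_crit = Z.inv_cdf(1 - alpha / 2)       # two-tailed critical value

print(round(z, 2))        # 2.0
print(round(z_crit, 2))   # 1.96
print(abs(z) > z_crit)    # True -> z lies in the critical region, reject H0
```

Because the test is two-tailed, we compare \(|Z|\) against the critical value; a value of \(-2.00\) would have led to the same decision.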
Part 4: The Errors in Decision Making
Since we rely on samples, there is always a chance our decision is wrong. There are two types of error you must understand and calculate.
4.1 Type I Error (\(\alpha\))
- Definition: Rejecting the Null Hypothesis (\(H_0\)) when it is actually true.
- Also known as: A "false positive."
- Probability: The probability of making a Type I Error is equal to the significance level, \(\alpha\).
- Example: Concluding the average height is *not* 170 cm, when it actually is 170 cm.
4.2 Type II Error (\(\beta\))
- Definition: Accepting (or failing to reject) the Null Hypothesis (\(H_0\)) when it is actually false (i.e., when \(H_1\) is true).
- Also known as: A "false negative."
- Probability ($\beta$): This is harder to calculate and requires assuming a specific value for the true parameter under \(H_1\).
- Example: Concluding the average height *is* 170 cm, when it is actually 172 cm.
How to Calculate the Probability of a Type II Error (\(\beta\))
Calculating \(\beta\) involves two steps:
Step A: Find the Acceptance Region (Critical Values) based on \(H_0\) and \(\alpha\).
We find the boundary values (critical values \(k\)) that separate the acceptance and rejection regions under the distribution defined by \(H_0\).
Step B: Calculate the probability of the Test Statistic falling into the Acceptance Region, assuming the NEW parameter (from \(H_1\)).
The probability of Type II Error, \(\beta\), is \(P(\text{Acceptance Region } | \text{ True Parameter})\).
Don't worry if this seems tricky at first! This is the most complex calculation in the chapter. Practice questions where \(H_1\) specifies a single value (e.g., \(H_1: \mu = 51\) instead of just \(\mu > 50\)) are usually the easiest starting point for \(\beta\) calculation.
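As a sketch of Steps A and B, suppose we test \(H_0: \mu = 50\) against the one-tailed \(H_1: \mu > 50\) at the 5% level, borrowing \(\sigma = 10\) and \(n = 100\) from the Z-test example, and suppose the true mean is actually 51 (all numbers hypothetical):

```python
from statistics import NormalDist

Z = NormalDist()

mu0, sigma, n, alpha = 50, 10, 100, 0.05
se = sigma / n**0.5                     # standard error = 1

# Step A: acceptance region under H0 (one-tailed, H1: mu > 50).
# We accept H0 whenever x_bar < k, where k = mu0 + 1.645 * se.
k = mu0 + Z.inv_cdf(1 - alpha) * se
print(round(k, 3))                      # 51.645

# Step B: beta = P(X_bar falls in the acceptance region | true mean = 51).
mu_true = 51
beta = NormalDist(mu=mu_true, sigma=se).cdf(k)
print(round(beta, 2))                   # 0.74
```

Notice how large \(\beta\) is here: with the true mean only one standard error above \(\mu_0\), the test fails to detect the shift about 74% of the time.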
Summary of Type I and Type II Errors
| Decision | \(H_0\) is True | \(H_0\) is False (\(H_1\) is True) |
|---|---|---|
| Accept \(H_0\) | Correct Decision | Type II Error (\(\beta\)) |
| Reject \(H_0\) | Type I Error (\(\alpha\)) | Correct Decision |
Key Takeaway: There is an inherent trade-off. If you decrease the probability of a Type I error (e.g., lower \(\alpha\)), you increase the size of the acceptance region, which makes it more likely you’ll make a Type II error (\(\beta\)).