Introduction: Why Sampling Matters in Further Maths Statistics

Hello future statisticians! Welcome to the exciting world of Sampling.
This chapter, part of Unit S3, is fundamental. It teaches us how to collect reliable data without measuring every single member of a population. Imagine trying to measure every person in the UK just to find the average height: it would be wildly impractical! Instead, we take a sample.

Understanding good sampling techniques is crucial because bad sampling leads to bad data, and bad data leads to incorrect conclusions (or failing hypothesis tests!). Don't worry if this seems tricky at first; we will break down the methods step-by-step.

Key Learning Outcomes:

  • Define core terminology (Population, Sample, Frame).
  • Distinguish between different random and non-random sampling techniques.
  • Evaluate the advantages and disadvantages of each method.

Section 1: The Essential Terminology

Before we learn how to select a sample, we must understand the language used.

1.1 Population and Sample

The Population

The Population is the entire group of items or individuals that we are interested in studying. This could be people, animals, objects, or data points.
Example: If you want to study the quality of lightbulbs produced by a factory today, the population is ALL lightbulbs produced today.

The Sample

The Sample is a small, manageable sub-set of the population that is selected for measurement or observation.
Analogy: Think of baking a cake. You don't eat the whole bowl of batter to check the flavour; you taste a small spoonful (the sample) to judge the whole batch (the population).

The Census

A Census occurs when every single member of the population is included in the study.
When is it used? Only when the population is very small, or when legislation requires it (like a national population census).

1.2 Sampling Units and Sampling Frame

Sampling Unit

A Sampling Unit is an individual member or item within the population that could be selected for the sample.
Example: If the population is students in a school, the unit is one student.

Sampling Frame

The Sampling Frame is the comprehensive list of all sampling units in the population. It acts as the "address book" from which we choose our sample.
Example: A school register containing the names of all current students.

Quick Review: The Foundation
  • Population: The whole group.
  • Sample: A piece of the group.
  • Frame: The list you choose from.

Section 2: Random Sampling Methods

In statistical inference, we aim for a representative sample. This means the characteristics of the sample should reflect the characteristics of the population. The best way to achieve this is through Random Sampling, where every unit has a known, non-zero chance of being selected.

2.1 Simple Random Sample (SRS)

This is the simplest form of random sampling. Every possible sample of the required size has an equal chance of being selected, and every unit has an equal chance of being selected.

Process for SRS:
  1. Create a comprehensive Sampling Frame (a list of all units).
  2. Assign a unique number to every unit in the frame.
  3. Use a completely random method (e.g., random number generator, pulling numbered slips from a hat) to select the required sample size.
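
The three steps above can be sketched in Python. This is a minimal illustration rather than a prescribed method: the function name `simple_random_sample` and the list of 50 students are made up for the example, and `random.sample` from the standard library plays the role of the "numbers from a hat".

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    # Step 2: give every unit in the frame a unique number, 1..N.
    numbered = dict(enumerate(frame, start=1))
    # Step 3: pick n distinct numbers at random (sampling without replacement).
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)
    return [numbered[i] for i in chosen_numbers]

# Step 1: the sampling frame (a hypothetical register of 50 students).
frame = [f"Student {i}" for i in range(1, 51)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

Fixing `seed` only makes the run repeatable for checking; leaving it as `None` gives a fresh sample each time.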

Advantages of SRS:

  • It is theoretically unbiased (no human judgment involved).
  • Easy to understand and implement if the population size is small.

Disadvantages of SRS:

  • A complete Sampling Frame is required, which may not exist or be difficult to create.
  • It can be time-consuming, especially for large populations.

2.2 Systematic Sampling

In Systematic Sampling, units are selected at regular intervals from the sampling frame.

Step-by-Step Systematic Process:

Let \(N\) be the population size and \(n\) be the required sample size.

  1. Calculate the interval size, \(k\): \(k = \frac{N}{n}\) (usually rounded down to the nearest integer).
  2. Choose a random starting point, \(r\), between 1 and \(k\).
  3. Select the unit numbered \(r\), then the unit numbered \(r + k\), then \(r + 2k\), and so on, until the sample size \(n\) is reached.

Example: Population \(N=100\), sample \(n=10\). Interval \(k = 100/10 = 10\). Random start chosen is 4. Sample units are 4, 14, 24, 34, ..., 94.
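
As a rough sketch, the same procedure can be written in Python. The function name `systematic_sample` and its `start` parameter are assumptions made for illustration; passing `start=4` reproduces the worked example, while leaving it out picks a random start between 1 and \(k\).

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers chosen by systematic sampling from units 1..N."""
    k = N // n  # interval, rounded down to the nearest integer
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start r with 1 <= r <= k
    # Select r, r + k, r + 2k, ... until n units have been chosen.
    return [start + i * k for i in range(n)]

print(systematic_sample(100, 10, start=4))
# [4, 14, 24, 34, 44, 54, 64, 74, 84, 94]
```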

Advantages of Systematic Sampling:

  • It is generally quick and straightforward to implement.
  • It is often a good representation of the population if the units are listed randomly in the sampling frame.

Disadvantages of Systematic Sampling:

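  • If there is a hidden pattern or cycle in the sampling frame that matches the interval \(k\), the sample can become heavily biased. (Example: Sampling every 7th day from a list of daily records always lands on the same day of the week, so a trend that occurs only at weekends is either missed completely or hugely over-represented.)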
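
A small Python sketch makes the danger concrete. The frame here is hypothetical: 70 consecutive daily records, with day 1 assumed to be a Monday. With an interval of \(k = 7\), every selected record falls on the same weekday, whatever the starting point.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical frame: 10 weeks of daily records, where record 1 is a Monday.
frame = [days[(d - 1) % 7] for d in range(1, 71)]

k = 7  # interval matches the weekly cycle in the frame
r = 3  # starting point (here, a Wednesday)
sample = [frame[i - 1] for i in range(r, len(frame) + 1, k)]

print(set(sample))  # {'Wed'}
```

No Saturday or Sunday ever appears in the sample, so any weekend effect is invisible.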

2.3 Stratified Sampling

If the population naturally divides into distinct groups (called strata) based on characteristics like gender, age, or location, Stratified Sampling ensures each group is represented proportionally in the sample.

When to Use Stratified Sampling:

Use this when the population is heterogeneous (not uniform) and you believe the characteristic being measured (e.g., opinion) is affected by the different strata.

The Proportional Rule (Crucial Calculation):

The number of units selected from each stratum must be proportional to the size of that stratum in the population.

$$ \text{Sample size from stratum } i = \frac{\text{Size of stratum } i}{\text{Size of population}} \times \text{Total sample size} $$

Example: A college has 600 males and 400 females (Total 1000). We need a sample of 100.
Male sample: \(\frac{600}{1000} \times 100 = 60\) males.
Female sample: \(\frac{400}{1000} \times 100 = 40\) females.

Once the number required from each stratum is calculated, the actual selection within each stratum is done using SRS or Systematic sampling.
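
The proportional rule and the within-stratum selection can be sketched in Python as follows. The function names and the made-up unit labels (`M0`, `F0`, ...) are illustrative assumptions. Note that when the proportions do not come out as whole numbers, simple rounding may give a total that is slightly off the target, so the allocations should be checked and adjusted by hand.

```python
import random

def proportional_allocation(strata_sizes, total_sample):
    """Number to sample from each stratum, proportional to stratum size."""
    population = sum(strata_sizes.values())
    return {
        name: round(size * total_sample / population)
        for name, size in strata_sizes.items()
    }

def stratified_sample(frame_by_stratum, total_sample, seed=None):
    """Allocate proportionally, then take a simple random sample in each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in frame_by_stratum.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {
        name: rng.sample(units, allocation[name])
        for name, units in frame_by_stratum.items()
    }

print(proportional_allocation({"male": 600, "female": 400}, 100))
# {'male': 60, 'female': 40}

# Hypothetical frame that records which stratum each unit belongs to.
college = {
    "male": [f"M{i}" for i in range(600)],
    "female": [f"F{i}" for i in range(400)],
}
chosen = stratified_sample(college, 100, seed=2)
print({name: len(units) for name, units in chosen.items()})
# {'male': 60, 'female': 40}
```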

Advantages of Stratified Sampling:

  • It guarantees that the sample accurately reflects the structure of the population regarding the key characteristic (e.g., gender balance).
  • It often yields the most representative and reliable data.

Disadvantages of Stratified Sampling:

  • The population must be clearly classified into strata.
  • A detailed Sampling Frame showing which stratum each unit belongs to is essential.

Common Mistake Alert!

Do not confuse Stratified Sampling with Cluster Sampling (which is often covered in higher-level university courses but sometimes mentioned in context). In Cluster Sampling, you randomly select entire groups (clusters) and survey *everyone* in those selected groups. In Stratified, you select *some* people from *all* groups.


Section 3: Non-Random (Non-Probability) Sampling Methods

Non-random sampling methods are often quicker and cheaper, but they rely on the subjective judgment of the researcher. This means they are highly susceptible to bias and cannot be reliably used for statistical inference (like hypothesis testing).

3.1 Quota Sampling

Quota sampling is similar to stratified sampling because the population is segmented into groups (like age or gender), and the researcher sets targets (quotas) for the number of units needed in each segment.

How Quota Sampling Works:

The interviewer goes out and actively seeks people until the quotas are met. The selection within the quota is left up to the interviewer's judgment (e.g., stopping the first 10 men they see).
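
The following Python sketch mimics that process under simple assumptions: the interviewer accepts people in the order they walk past, and each person's group is known on sight. The function name `quota_sample` and the list of passers-by are hypothetical.

```python
def quota_sample(passers_by, quotas):
    """Accept people in the order encountered until every quota is filled."""
    remaining = dict(quotas)
    chosen = []
    for person, group in passers_by:
        if remaining.get(group, 0) > 0:
            chosen.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas met, so the interviewer stops
    return chosen

# Hypothetical stream of people walking past the interviewer.
passers_by = [
    ("P1", "male"), ("P2", "female"), ("P3", "male"),
    ("P4", "male"), ("P5", "female"), ("P6", "female"),
]
print(quota_sample(passers_by, {"male": 2, "female": 2}))
# [('P1', 'male'), ('P2', 'female'), ('P3', 'male'), ('P5', 'female')]
```

Nothing here is random: who ends up in the sample depends entirely on who happened to pass by first, which is exactly where the bias creeps in.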

Advantages of Quota Sampling:

  • No sampling frame is needed.
  • It is quick, easy, and inexpensive.
  • It allows research to be conducted even when the researcher can only reach part of the population (e.g., when subjects must be met in person).

Disadvantages of Quota Sampling:

  • It is highly susceptible to interviewer bias (the interviewer might subconsciously choose people who look approachable or agreeable).
  • The results cannot be reliably used to generalise to the entire population.

3.2 Opportunity (or Convenience) Sampling

This is the simplest, quickest, and usually the worst method for scientific research. The sample is chosen simply because the units are available at the time of the study.

Example: Surveying the first 20 people you see outside the library.

Advantages of Opportunity Sampling:

  • Extremely easy and inexpensive.

Disadvantages of Opportunity Sampling:

  • It is almost certainly unrepresentative. (The sample only reflects the opinion of people who happened to be in that place at that time.)
  • Leads to significant bias.

Section 4: Summary and Evaluation

4.1 When is a Census Appropriate?

A Census (studying the entire population) is only appropriate if:

  • The population is very small.
  • The study involves non-destructive testing (i.e., you aren't testing lightbulbs until they break).
  • High accuracy is required and enough time and resources are available to measure every unit.

4.2 Comparison Table: Methods, Pros, and Cons

| Method | Key Feature | Pros (Advantages) | Cons (Disadvantages) |
| --- | --- | --- | --- |
| Simple Random | Selection based purely on chance. | Unbiased; easy to analyse results. | Needs a full sampling frame; expensive for large areas. |
| Systematic | Regular intervals (\(k = N/n\)). | Quick and easy to execute. | Can be biased if there's a periodic cycle in the frame. |
| Stratified | Proportional representation from key subgroups. | Highly representative; reduces variability. | Requires knowledge of strata sizes; complex frame needed. |
| Quota | Interviewer selects units until quotas are met. | No frame needed; quick and cheap fieldwork. | Highly prone to interviewer bias; not truly random. |
| Opportunity | Units chosen because they are readily available. | Extremely quick and convenient. | Produces the highest level of bias; unrepresentative. |

Did you know? Political polling companies invest heavily in making their samples representative, often balancing many characteristics at once (age, geography, previous voting history) to reduce bias and predict election outcomes more accurately!

Key Takeaway for Exams:

The most common exam questions ask you to justify why one method is better than another in a specific scenario. If the population has clear, known subgroups, Stratified Sampling is usually the best answer. If a sampling frame is impossible, you must rely on Quota or Opportunity, but remember to mention the high risk of bias.


You’ve successfully covered the core methodology of Sampling! Use this knowledge as you move on to Unit S3’s later chapters, where we use these samples to make powerful statistical inferences. Keep practising those proportional calculations, and you'll be sampling experts in no time!