📊 Graphical Representation of Data: Making Sense of Numbers

Welcome to the chapter on displaying data! Don't worry if Statistics sometimes feels overwhelming—this topic is all about turning messy lists of numbers into beautiful, easy-to-read pictures. When we visualize data correctly, patterns jump right out at us, making analysis much simpler.
In this section, we will learn how to choose the right graph for the right type of data, and how to correctly interpret the information presented. Let’s get started!

1. Charts for Simple and Discrete Data

Discrete data is data that can only take specific values (like the number of siblings you have, or shoe size). The charts in this section are usually straightforward to construct.

1.1 Bar Charts and Vertical Line Graphs

A Bar Chart uses rectangular bars to show the frequency of different categories.

Key Features:

  • The height of the bar represents the frequency (how often something occurs).
  • The bars are usually separated by gaps. This is the crucial difference from a histogram! The gaps emphasize that the data is discrete or categorical.
  • The x-axis labels the categories.

Tip for Students:

If the data is truly numerical and discrete (like the number of cars passing a point), sometimes a Vertical Line Graph (or frequency diagram) is used, where a thin line replaces the bar. The principles remain the same.

1.2 Pie Charts

Pie charts are used to show how a whole amount is divided into different parts or categories. They are excellent for showing relative proportions.

Step-by-Step: Creating a Pie Chart

  1. Find the Total Frequency (the total number of items in the dataset).
  2. Calculate the angle for each category. Since a full circle is \(360^\circ\), the angle is proportional to the category's frequency.
  3. Formula: Angle \( = \frac{\text{Frequency}}{\text{Total Frequency}} \times 360^\circ \)
  4. Draw the circle and use a protractor to mark out the calculated angles.

Common Mistake to Avoid: Don't forget to check that all the calculated angles add up to \(360^\circ\)! If you rounded the angles, the total may be off by a degree or so; a bigger difference means you made a calculation error.
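The angle formula above can be sketched in a few lines of Python. This is only an illustration: the function name `pie_angles` and the fruit survey data are made up for this example.

```python
def pie_angles(frequencies):
    """Return the sector angle (in degrees) for each category."""
    total = sum(frequencies.values())  # Step 1: total frequency
    # Steps 2-3: angle = frequency / total frequency * 360
    return {category: freq * 360 / total for category, freq in frequencies.items()}

# Hypothetical survey: favourite fruit of 40 students.
favourite_fruit = {"Apple": 10, "Banana": 16, "Cherry": 8, "Grape": 6}
angles = pie_angles(favourite_fruit)

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")

# The check from the text: the angles should total 360 degrees.
print("Total:", sum(angles.values()))
```

Here the angles come out as 90, 144, 72 and 54 degrees, which add up to 360.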

1.3 Stem and Leaf Diagrams

Stem and leaf diagrams are a brilliant way to display data because, unlike a bar chart, they keep all the original raw data while still showing the shape of the distribution.

Structure:

  • The Stem (left side) holds the larger place values (e.g., tens or hundreds).
  • The Leaf (right side) holds the smallest place values (usually the units digit).
  • The leaves must always be written in ascending order (from smallest to largest).

The Golden Rule: The Key
You must include a Key to explain what the stem and leaf represent. Example: If 2 | 5 means 25, you must write: Key: 2 | 5 = 25.
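As a rough sketch, the structure above can be built in Python for two-digit whole numbers. The function name `stem_and_leaf` and the test scores are assumptions made for this example, and the sketch only handles the tens | units case.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit whole numbers into stems (tens) and ordered leaves (units)."""
    rows = defaultdict(list)
    for value in sorted(values):         # sorting first keeps the leaves in ascending order
        stem, leaf = divmod(value, 10)   # e.g. 25 -> stem 2, leaf 5
        rows[stem].append(leaf)
    return dict(rows)

scores = [25, 31, 27, 42, 38, 25, 40, 33]  # hypothetical test scores
diagram = stem_and_leaf(scores)

for stem in sorted(diagram):
    leaves = " ".join(str(leaf) for leaf in diagram[stem])
    print(f"{stem} | {leaves}")
print("Key: 2 | 5 = 25")
```

Notice that the repeated value 25 appears as two separate leaves, so no raw data is lost. This sketch skips any stem that has no leaves; in a hand-drawn diagram you would normally still write that stem with an empty row.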

Quick Review: Discrete Data

Bar Charts show categories clearly (with gaps). Pie Charts show proportions. Stem and Leaf diagrams keep the raw data.

2. Dealing with Continuous Data (Grouped Frequency)

Continuous data is data that can take any value within a range (like height, time, or weight). When we have lots of continuous data, we group it into class intervals.

2.1 Histograms

Histograms are used exclusively for displaying continuous data (often grouped data). They look similar to bar charts, but there is a major conceptual difference.

Crucial Difference: Area vs. Height
In a bar chart, the height is the frequency. In a Histogram, the Area of the bar represents the Frequency.

Since the bars might have different widths (class intervals), we cannot use frequency on the vertical axis. We must calculate Frequency Density.

The Histogram Formula (Memorize This!): $$ \text{Frequency Density} = \frac{\text{Frequency}}{\text{Class Width}} $$

Step-by-Step: Drawing a Histogram
  1. Calculate the Class Width for each group (\( \text{Upper Boundary} - \text{Lower Boundary} \)).
  2. Calculate the Frequency Density for each group using the formula above.
  3. Label the vertical axis (y-axis) as Frequency Density.
  4. Draw the rectangles. Unlike bar charts, there are no gaps between the bars because the data is continuous.

Common Mistake Alert!

A frequent error is labelling the y-axis of a histogram "Frequency". DO NOT DO THIS if the class intervals are unequal. You must use Frequency Density.
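Here is a minimal sketch of the frequency density calculation with unequal class widths. The function name `frequency_densities` and the height data are invented for illustration.

```python
def frequency_densities(classes):
    """classes: list of (lower_boundary, upper_boundary, frequency) tuples."""
    densities = []
    for lower, upper, frequency in classes:
        width = upper - lower                 # Step 1: class width
        densities.append(frequency / width)   # Step 2: frequency density
    return densities

# Hypothetical heights in cm, grouped with unequal class widths.
heights = [(140, 150, 5), (150, 155, 10), (155, 160, 12), (160, 180, 8)]
densities = frequency_densities(heights)

for (lower, upper, frequency), density in zip(heights, densities):
    area = density * (upper - lower)  # area of the bar = frequency
    print(f"{lower}-{upper}: density {density}, area {area}")
```

The widest class (160 to 180) has the lowest bar even though it does not have the lowest frequency, which is exactly why plotting frequency on the y-axis would be misleading here.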

2.2 Frequency Polygons

A Frequency Polygon is simply another way to display the information contained in a histogram or grouped frequency table.

Step-by-Step: Drawing a Frequency Polygon

  1. Find the Midpoint of each class interval.
  2. Plot the points using the coordinates: (\(\text{Midpoint}, \text{Frequency}\)).
  3. Join the points with straight lines.

Did You Know? To fully enclose the polygon and make it touch the axis, we usually include an extra class interval at the start and end (with zero frequency). This helps show the overall shape of the data distribution clearly.
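The plotting points for a frequency polygon can be sketched as below. This assumes all class intervals have the same width, so that the extra zero-frequency classes at each end are well defined; the function name `polygon_points` and the data are hypothetical.

```python
def polygon_points(classes):
    """classes: list of (lower, upper, frequency) tuples with equal class widths.

    Returns the (midpoint, frequency) points, closed off with a
    zero-frequency point one class width before and after the data.
    """
    points = [((lower + upper) / 2, frequency) for lower, upper, frequency in classes]
    width = classes[0][1] - classes[0][0]   # assumed to be the same for every class
    start = (points[0][0] - width, 0)       # midpoint of the extra class at the start
    end = (points[-1][0] + width, 0)        # midpoint of the extra class at the end
    return [start] + points + [end]

# Hypothetical grouped data with classes of width 10.
marks = [(10, 20, 3), (20, 30, 7), (30, 40, 12), (40, 50, 6)]
points = polygon_points(marks)
print(points)
```

Joining these points in order with straight lines gives a polygon that starts and ends on the horizontal axis.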

Key Takeaway: Histograms

Continuous data requires Histograms. The vertical axis is Frequency Density, and the area of the bar equals the frequency. No gaps!

3. Cumulative Frequency and Box Plots

These graphs help us find specific values within the distribution, like the median or the quartiles, especially when dealing with large amounts of grouped continuous data.

3.1 Cumulative Frequency Graphs (Ogive)

Cumulative Frequency (CF) means the "running total" of the frequencies. It tells you how many pieces of data are less than or equal to a certain value.

Step-by-Step: Drawing a Cumulative Frequency Graph
  1. Calculate the Cumulative Frequency by adding up frequencies sequentially.
  2. Plot the points using the coordinates: (\(\text{Upper Class Boundary}, \text{Cumulative Frequency}\)).
  3. Start the graph at the lower boundary of the first class interval with a Cumulative Frequency of 0.
  4. Join the points with a smooth curve (called an Ogive). Note: It is typically S-shaped, and it never goes down because a running total cannot decrease.

Why the Upper Boundary? We plot against the upper boundary because the cumulative frequency tells us the total number of items up to that point.

Interpreting the Graph: Estimating Statistics

If \(N\) is the total frequency, we locate each position below on the y-axis (CF), read across to the curve, and then read down to the x-axis to find:

  • Median (Q2): Find the value corresponding to \(\frac{N}{2}\) on the y-axis.
  • Lower Quartile (Q1): Find the value corresponding to \(\frac{N}{4}\) on the y-axis.
  • Upper Quartile (Q3): Find the value corresponding to \(\frac{3N}{4}\) on the y-axis.

The Interquartile Range (IQR) is a measure of spread and is calculated as: $$ IQR = Q3 - Q1 $$
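The sketch below builds the cumulative frequency points and estimates the quartiles. One assumption to note: instead of reading from a hand-drawn smooth curve, it joins the points with straight lines (linear interpolation), so its answers may differ slightly from values read off an Ogive. The function names and the data are made up for this example.

```python
from itertools import accumulate

def cumulative_points(classes):
    """classes: list of (lower, upper, frequency) tuples.

    Returns (boundary, cumulative frequency) points, starting at CF = 0
    on the lower boundary of the first class.
    """
    running_totals = accumulate(frequency for _, _, frequency in classes)
    points = [(classes[0][0], 0)]
    points += [(upper, cf) for (_, upper, _), cf in zip(classes, running_totals)]
    return points

def estimate_value(points, target_cf):
    """Estimate the x-value at target_cf, assuming straight lines between points."""
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1 and cf1 > cf0:
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target_cf is outside the range of the graph")

# Hypothetical times in minutes; total frequency N = 40.
times = [(0, 10, 4), (10, 20, 12), (20, 30, 16), (30, 40, 8)]
points = cumulative_points(times)
n = points[-1][1]

q1 = estimate_value(points, n / 4)       # lower quartile at CF = N/4
median = estimate_value(points, n / 2)   # median at CF = N/2
q3 = estimate_value(points, 3 * n / 4)   # upper quartile at CF = 3N/4
iqr = q3 - q1

print(points)
print("Q1:", q1, "Median:", median, "Q3:", q3, "IQR:", iqr)
```

For this data the estimates are Q1 = 15, median = 22.5 and Q3 = 28.75, giving an IQR of 13.75.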

3.2 Box Plots (Box and Whisker Diagrams)

A Box Plot is a standardized way of displaying the distribution of data based on five key numbers. It is very useful for comparing two distributions side-by-side.

The Five-Number Summary (The essential components):

  1. Minimum Value (Smallest observation)
  2. Lower Quartile (Q1) (25th percentile)
  3. Median (Q2) (50th percentile)
  4. Upper Quartile (Q3) (75th percentile)
  5. Maximum Value (Largest observation)

The 'box' itself stretches from Q1 to Q3, and the median is drawn inside the box. The 'whiskers' extend out to the minimum and maximum values.

What does a Box Plot tell us?

The length of the box (the IQR) tells us how spread out the middle 50% of the data is. A shorter box means the middle data is tightly clustered.

Quick Review: Key Terms for Spread

Range: \( \text{Maximum Value} - \text{Minimum Value} \)
Interquartile Range (IQR): \( Q3 - Q1 \)
(The IQR is often a better measure of spread than the range because it only uses the middle 50% of the data, so it is not affected by extremely high or low outliers!)
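To see the five-number summary and the effect of an outlier in practice, here is a short sketch using Python's standard `statistics` module. The quartile rule used here (`method="inclusive"`) is an assumption; textbooks and exam boards sometimes use a slightly different convention, so quartiles found by hand may not match exactly. The helper names and the data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # Assumption: quartiles use the "inclusive" interpolation rule.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

def spread(data):
    """Return (range, IQR) for the data."""
    minimum, q1, _, q3, maximum = five_number_summary(data)
    return maximum - minimum, q3 - q1

typical = [2, 4, 5, 7, 8, 9, 11, 12, 14]        # hypothetical data
with_outlier = [2, 4, 5, 7, 8, 9, 11, 12, 50]   # same data with one extreme value

print("Five-number summary:", five_number_summary(typical))
print("Range and IQR:", spread(typical))
print("Range and IQR with outlier:", spread(with_outlier))
```

Swapping 14 for the outlier 50 pushes the range from 12 up to 48, while the IQR stays at 6.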