Study Notes: Representing Data - Units of Information (Syllabus 3.5.2)

Hello future Computer Scientists! This chapter is absolutely fundamental because it teaches you the language of measurement inside a computer. Think of it as learning the metric system for digital data. Everything a computer does—from running complex algorithms to storing a simple photo—is built upon these basic units. Let’s dive into the core building blocks!

1. The Fundamental Units: Bit and Byte

In the digital world, data is represented by electrical signals in circuits. Each signal is treated as being in one of two states: on or off.

The Bit (Binary Digit)
  • The bit is the smallest, fundamental unit of information.
  • It can only hold one of two possible values: 0 (off/false) or 1 (on/true).
  • Analogy: A single light switch. It's either on or off.
The Byte
  • A byte is a group of 8 bits.
  • The byte is the standard unit of storage. In simple encodings such as ASCII, one byte stores a single character (like the letter 'A' or the digit '5').
  • Since 1 Byte = 8 bits, it can represent \(2^8 = 256\) different possible values.
Quick Review: The Foundation
1 Bit = 0 or 1
1 Byte = 8 Bits
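
To see a byte as its 8 individual bits, here is a minimal Python sketch. The helper name `to_bits` is made up for these notes, and the example assumes the ASCII code for 'A' (65).

```python
def to_bits(value: int) -> str:
    """Return the 8-bit pattern for a value that fits in one byte (0 to 255)."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit in one byte")
    # "08b" means: binary digits, padded with leading zeros to width 8
    return format(value, "08b")


letter_a = to_bits(ord("A"))  # ord("A") is 65 in ASCII
print(letter_a)               # 01000001
print(len(letter_a))          # 8
```

Each character of the printed string is one bit, so the length of 8 matches 1 Byte = 8 Bits.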

2. Representational Capacity: How Much Can \(n\) Bits Hold?

Knowing how many different values a string of bits can represent is crucial. This calculation uses powers of 2.

The \(2^n\) Rule

If you have \(n\) bits, you can represent \(2^n\) different values.

  • Example 1: If \(n = 1\) bit, you can represent \(2^1 = 2\) values (0 or 1).
  • Example 2: If \(n = 4\) bits (often called a nibble, sometimes spelled nybble), you can represent \(2^4 = 16\) different values (0000 to 1111).
  • Example 3: If \(n = 8\) bits (1 Byte), you can represent \(2^8 = 256\) different values (0 to 255).

Did you know? If you are using 3 bits, the possible configurations are: 000, 001, 010, 011, 100, 101, 110, 111. That's exactly \(2^3 = 8\) different ways.
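
The \(2^n\) rule can be checked by listing every pattern. The sketch below is one possible way to do it in Python; the function name `bit_patterns` is hypothetical and chosen only for illustration.

```python
from itertools import product


def bit_patterns(n: int) -> list:
    """List every possible string of n bits, in counting order."""
    # product("01", repeat=n) yields every combination of n characters from "0" and "1"
    return ["".join(bits) for bits in product("01", repeat=n)]


patterns = bit_patterns(3)
print(patterns)       # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(patterns))  # 8, which equals 2 ** 3
```

Trying `len(bit_patterns(8))` gives 256, the number of values one byte can hold.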

Key Takeaway: The amount of unique information a system can encode increases exponentially with the number of bits available.

3. Measuring Large Quantities: The Kilo vs. Kibi Problem

As storage and memory capacities grew, we needed prefixes (like kilo, mega, giga) to talk about large quantities of bytes. Historically, this caused confusion because two different systems of measurement were used interchangeably: Decimal (the metric system, powers of 10) and Binary (the computing convention, powers of 2).

The core of the issue is that 1000 (\(10^3\)) is very close to 1024 (\(2^{10}\)). For a long time, the term "Kilobyte (kB)" was used sometimes to mean 1000 Bytes, and sometimes to mean 1024 Bytes.

To fix this confusion, the International Electrotechnical Commission (IEC) introduced the official Binary Prefixes, known as IEC Prefixes (kibi, mebi, gibi, etc.).

⚠ Common Mistake Alert ⚠
You must be able to distinguish between the two systems:
1. Decimal Prefixes: Use powers of 10 (used primarily by storage manufacturers like hard drive companies).
2. Binary Prefixes: Use powers of 2 (used by Computer Scientists and Operating Systems for accurate memory measurement).

4. Decimal Prefixes (Powers of 10)

These prefixes are based on the standard metric system, where each step is a multiplication by one thousand (1000).

Name    Symbol    Power of 10     Value (Bytes)
kilo    k         \(10^3\)        1,000
mega    M         \(10^6\)        1,000,000
giga    G         \(10^9\)        1,000,000,000
tera    T         \(10^{12}\)     1,000,000,000,000

Example: A 1 terabyte (TB) hard drive actually contains \(10^{12}\) bytes.
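
As a rough sketch, the decimal prefixes can be turned into a lookup in Python. The names `DECIMAL_PREFIXES` and `bytes_in` are invented for these notes and are not part of any library.

```python
# Multipliers for the decimal prefixes, keyed by symbol
DECIMAL_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def bytes_in(amount: float, symbol: str) -> int:
    """Convert an amount with a decimal prefix (e.g. 500, "G") into bytes."""
    return int(amount * DECIMAL_PREFIXES[symbol])


print(bytes_in(1, "T"))    # 1000000000000
print(bytes_in(500, "G"))  # 500000000000
```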

5. Binary Prefixes (Powers of 2)

These prefixes, officially recognised by the IEC, are based on powers of two (2), where each step is a multiplication by 1024.

The key difference is the addition of the "bi" (for binary) in the name (e.g., kibi instead of kilo) and the "i" in the symbol (e.g., KiB instead of kB).

Name    Symbol    Power of 2      Value (Bytes)
kibi    Ki        \(2^{10}\)      1,024
mebi    Mi        \(2^{20}\)      1,048,576
gibi    Gi        \(2^{30}\)      1,073,741,824
tebi    Ti        \(2^{40}\)      1,099,511,627,776

Example: 1 kibibyte (KiB) = \(2^{10}\) Bytes.
Real-World Connection: When you buy a hard drive advertised as 1 TB (a decimal measurement), your operating system might report it as 0.909 TiB (a binary measurement). This difference often confuses users, but it is technically accurate according to the distinct definitions above!
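
The 0.909 figure comes from dividing the decimal size by the binary unit. A short Python check, using the variable names `TB` and `TiB` purely as labels for the two definitions:

```python
TB = 10**12   # 1 terabyte, decimal definition
TiB = 2**40   # 1 tebibyte, binary definition

drive_bytes = 1 * TB               # what the manufacturer advertises
reported_tib = drive_bytes / TiB   # the same bytes measured in TiB

print(round(reported_tib, 3))      # 0.909
```

No bytes are missing from the drive. The same number of bytes is simply being divided by a larger unit.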

Memory Aid: The Powers of 2
To remember the sequence of binary prefixes, remember the powers go up in tens:
kibi = \(2^{10}\)
mebi = \(2^{20}\)
gibi = \(2^{30}\)
tebi = \(2^{40}\)
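
The "powers go up in tens" pattern can be generated with a short loop. This is only an illustrative sketch; the function name `binary_prefix_values` is hypothetical.

```python
def binary_prefix_values() -> dict:
    """Map each binary prefix name to its value in bytes."""
    names = ["kibi", "mebi", "gibi", "tebi"]
    # The exponent rises by 10 at each step: 10, 20, 30, 40
    return {name: 2 ** (10 * step) for step, name in enumerate(names, start=1)}


for name, value in binary_prefix_values().items():
    print(f"{name} = {value:,}")
# kibi = 1,024
# mebi = 1,048,576
# gibi = 1,073,741,824
# tebi = 1,099,511,627,776
```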

Key Takeaway Summary

  • The bit (0 or 1) is the absolute core.
  • The byte (8 bits) is the standard grouping.
  • The number of values you can represent is calculated by \(2^n\).
  • Always distinguish between the two prefix systems:
    • Decimal (kilo, mega, giga, tera): each step multiplies by \(10^3\) (1000).
    • Binary (kibi, mebi, gibi, tebi): each step multiplies by \(2^{10}\) (1024).