Introduction: Keeping Data Safe in Transit
Welcome to a crucial topic in Computer Science: ensuring that data sent from one place to another arrives correctly! Data transmission is never perfectly reliable; signals get weak, interference happens, and bits can flip.
If you download a large software update, you need to be certain that the file you receive is exactly the same as the file the server sent.
This chapter focuses on Error Detection—the techniques computers use to spot corrupted data, and what happens once an error is found. This process is essential for maintaining data integrity.
1. The Need to Check for Errors
Errors occur during data transmission primarily due to interference (like electrical noise or signal distortion) along the transmission medium (cables, wireless signals).
Types of Errors That Can Occur:
When data (represented as 0s and 1s) is sent, the signal corruption can result in:
- Data Change (Corruption): A 0 bit might accidentally become a 1, or a 1 bit might become a 0. This is the kind of error that parity checks and checksums are mainly designed to catch.
- Data Loss: A chunk of data or an entire packet might be lost completely.
- Data Gain: Unwanted or duplicate data might be accidentally added into the stream.
2. Methods for Detecting Transmission Errors
Once data has been transmitted, the receiving device immediately performs checks. We need methods that are fast and reliable.
2.1 Parity Check (Parity Byte and Block)
The parity check is one of the simplest methods used. It checks whether the total number of 1s in a set of bits is either even or odd.
How Parity Works:
A rule must first be agreed upon: either Even Parity or Odd Parity. A single extra bit, the Parity Bit, is added to the block of data (usually 7 or 8 bits) to enforce this rule.
Step-by-step Process:
- Sender Action: Counts the number of 1s in the original data.
- Sender Action: Sets the Parity Bit (0 or 1) so that the total count of 1s in the entire group (data + parity bit) matches the agreed rule.
- Receiver Action: Counts the total number of 1s in the received group.
- Receiver Action: If the total matches the rule, the data is assumed to be correct. If it doesn't match, an error is detected.
Example: Using Even Parity
We agree that the total number of 1s must be even.
- Data to send: 1 0 0 1 1 0 1 0 (Four 1s – already even).
- Parity Bit: Set to 0.
- Total sent: 1 0 0 1 1 0 1 0 0 (Still four 1s. Total is even.)
Now, imagine a single error occurs during transmission: the third bit flips from 0 to 1.
- Data received: 1 0 1 1 1 0 1 0 0 (Five 1s).
- Receiver Check: Five 1s is odd. The system was expecting an even number. Error detected!
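The worked example above can be sketched in Python. This is a minimal illustration rather than a standard library routine; the function names `add_even_parity` and `passes_even_parity` are made up for this example, and the bits are held in a simple list.

```python
def add_even_parity(bits):
    """Sender: append one parity bit so the total number of 1s is even."""
    parity_bit = sum(bits) % 2  # 1 only when the data holds an odd number of 1s
    return bits + [parity_bit]


def passes_even_parity(group):
    """Receiver: the group passes if it contains an even number of 1s."""
    return sum(group) % 2 == 0


data = [1, 0, 0, 1, 1, 0, 1, 0]      # four 1s, already even
sent = add_even_parity(data)         # parity bit is 0

received = sent.copy()
received[2] ^= 1                     # the third bit flips from 0 to 1

print(sent)                          # [1, 0, 0, 1, 1, 0, 1, 0, 0]
print(passes_even_parity(sent))      # True
print(passes_even_parity(received))  # False: five 1s, error detected
```

For Odd Parity, the same idea applies with the rule reversed: the parity bit is chosen so that the total count of 1s is odd.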
Limitations of Simple Parity Check
The biggest flaw is that if two bits flip (or any even number of bits flip), the parity check will still pass! The system will mistakenly assume the corrupted data is correct.
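To see this limitation in action, here is a small sketch (the helper name `passes_even_parity` is invented for the example). Two bits are flipped in a group that was sent with even parity, and the check still passes.

```python
def passes_even_parity(group):
    """Receiver check: True when the group holds an even number of 1s."""
    return sum(group) % 2 == 0


sent = [1, 0, 0, 1, 1, 0, 1, 0, 0]    # four 1s, valid under even parity

corrupted = sent.copy()
corrupted[2] ^= 1                     # first flipped bit
corrupted[5] ^= 1                     # second flipped bit

print(corrupted != sent)              # True: the data really has changed
print(passes_even_parity(corrupted))  # True: six 1s, so the error goes unnoticed
```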
Parity Block Check (Increased Reliability)
To overcome the limitations of checking data one byte at a time, data can be sent in a block (a grid of bytes). An extra byte, the Parity Byte, is added to the end of the block, containing the parity bit for each column. This is known as a Parity Block Check.
- The system checks parity for every row (standard parity check).
- The system also checks parity for every column (using the Parity Byte).
If a single bit has flipped, exactly one row check and one column check will fail. The flipped bit sits where that row and that column cross, so the system can pinpoint it and even correct it by flipping it back.
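A parity block check might be sketched as follows. This assumes even parity, rows of 7 data bits, and a block stored as a list of lists; the names `build_block` and `locate_single_error` are hypothetical and exist only for this illustration.

```python
def build_block(rows):
    """Add an even-parity bit to each row, then a final parity row for the columns."""
    with_row_parity = [row + [sum(row) % 2] for row in rows]
    parity_row = [sum(column) % 2 for column in zip(*with_row_parity)]
    return with_row_parity + [parity_row]


def locate_single_error(block):
    """Return (row, column) of a single flipped bit, or None if every check passes."""
    bad_rows = [r for r, row in enumerate(block) if sum(row) % 2 != 0]
    bad_cols = [c for c, column in enumerate(zip(*block)) if sum(column) % 2 != 0]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("more than one bit appears to be corrupted")


rows = [
    [1, 0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 1, 1],
]
block = build_block(rows)

received = [row.copy() for row in block]
received[1][4] ^= 1                       # one bit flips in transit

position = locate_single_error(received)
print(position)                           # (1, 4)

row, column = position
received[row][column] ^= 1                # correct the error by flipping the bit back
print(received == block)                  # True
```

If two or more bits flip, the failing rows and columns no longer point at a single cell, which is why this sketch raises an error in that case instead of guessing.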
2.2 Checksum
Checksum is a more robust mathematical method often used for checking large chunks of data, like an entire file or a transmission packet.
How Checksum Works:
- Sender Calculation: The data is divided into fixed-size segments. These segments are added together mathematically (often using a specialised adding method).
- Checksum Creation: The result of this calculation is stored as the Checksum.
- Transmission: The data block and the Checksum are sent together.
- Receiver Calculation: The receiver performs the exact same mathematical addition on the received data segments.
- Verification: The receiver compares its newly calculated checksum against the checksum received from the sender.
- Result: If the two checksums match, the data is accepted. If they differ, an error occurred during transmission.
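The steps above can be sketched with a deliberately simple scheme: add up every byte and keep the remainder after dividing by 256. This is an assumption made for illustration only; real protocols use their own, more specialised calculations. The names `simple_checksum` and `verify` are invented for this example.

```python
def simple_checksum(data: bytes) -> int:
    """Add all the byte values together and keep the result within one byte (0-255)."""
    return sum(data) % 256


def verify(data: bytes, received_checksum: int) -> bool:
    """Receiver: recalculate the checksum and compare it with the one that was sent."""
    return simple_checksum(data) == received_checksum


message = b"HELLO"
checksum = simple_checksum(message)   # sent along with the data

print(checksum)                       # 116
print(verify(b"HELLO", checksum))     # True: data accepted
print(verify(b"HELLP", checksum))     # False: an error occurred during transmission
```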
Analogy: Think of a big delivery box (the data block). The sender puts a total weight sticker (the Checksum) on the outside. If the receiver weighs the contents and the total weight doesn't match the sticker, they know something was added or removed during shipping.
Key Takeaway: Checksums are more effective than simple parity in detecting multiple errors within a data block.
2.3 Echo Check
Echo checking is a very straightforward method where the receiving device sends the received data straight back to the sender for confirmation.
Step-by-step Process:
- Sender transmits the data.
- Receiver immediately sends the data back (the "echo").
- Sender compares the echoed data with the original data it sent.
- If they are identical, the transmission is assumed to be correct.
Drawback: An error could also occur during the echo (the return journey). The sender would then compare the corrupted echo with its original data and spot a mismatch, but it could not tell whether the error happened on the way out or on the way back. It may therefore re-send data that actually arrived correctly.
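An echo check can be simulated in a few lines. In this sketch the two directions of the link are represented by ordinary functions, which is purely an assumption for demonstration; `echo_check`, `clean` and `flip_first_bit` are hypothetical names.

```python
def clean(data: bytes) -> bytes:
    """A link that delivers the data unchanged."""
    return data


def flip_first_bit(data: bytes) -> bytes:
    """A noisy link that flips the lowest bit of the first byte."""
    return bytes([data[0] ^ 1]) + data[1:]


def echo_check(original: bytes, outbound, inbound) -> bool:
    """Send the data out, receive the echo, and compare it with the original."""
    received = outbound(original)   # what the receiver actually gets
    echoed = inbound(received)      # what comes back to the sender
    return echoed == original


print(echo_check(b"DATA", clean, clean))            # True: assumed correct
print(echo_check(b"DATA", flip_first_bit, clean))   # False: error on the way out
print(echo_check(b"DATA", clean, flip_first_bit))   # False: error only on the way back
```

The last two results look identical to the sender, even though in the final case the receiver holds a perfect copy of the data.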
3. Check Digits (Detecting Data Entry Errors)
While Parity and Checksum deal with transmission errors, the Check Digit is a specific technique used primarily to detect errors made by a human when typing in or scanning a long reference number (a data entry error).
What is a Check Digit?
A check digit is an extra digit appended to a code number. This digit is calculated mathematically using the other digits in the code. It acts as a safety validation measure.
How Check Digits are Used:
- International Standard Book Numbers (ISBN): The final digit is a check digit.
- Bar Codes (UPC/EAN): The final digit visible beneath the bar code lines is a check digit.
The Process of Verification:
When the number is entered (or scanned):
- The computer takes the main digits of the code.
- It applies the specific calculation algorithm (which often involves multiplying each digit by a different weight).
- It calculates what the check digit should be.
- It compares the calculated check digit with the actual check digit that was entered (the last digit of the code).
If the entered number has errors, such as a transposition error (swapping adjacent digits, e.g., typing 1234 instead of 1243) or a single digit mistake, the calculated check digit will usually not match the one that was entered, and the number will be rejected.
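As an illustration only, here is how the verification could look for a 13-digit ISBN, where the first twelve digits are weighted 1, 3, 1, 3 and so on, and the check digit brings the weighted total up to a multiple of 10. The function names `isbn13_check_digit` and `is_valid_isbn13` are made up for this sketch.

```python
def isbn13_check_digit(first_twelve: str) -> int:
    """Work out what the check digit should be from the first twelve digits."""
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)   # weights alternate 1, 3, 1, 3, ...
        for index, digit in enumerate(first_twelve)
    )
    return (10 - total % 10) % 10


def is_valid_isbn13(code: str) -> bool:
    """Compare the calculated check digit with the last digit that was entered."""
    if len(code) != 13 or not code.isdigit():
        return False
    return isbn13_check_digit(code[:12]) == int(code[-1])


print(is_valid_isbn13("9780306406157"))   # True: entered correctly
print(is_valid_isbn13("9780304606157"))   # False: the 6 and 4 were swapped
print(is_valid_isbn13("9780306406167"))   # False: one digit mistyped
```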
Encouraging Note: Don't worry about learning the specific ISBN calculation. Just understand the purpose (detecting data entry/transposition errors) and when it is used (codes, not general data files).
4. Automatic Repeat Query (ARQ)
Once an error is detected using a method like Parity Check or Checksum, the system needs to fix it. Since error detection methods usually cannot correct corrupted data, the simplest solution is to ask the sender to transmit the packet again. This is the job of the Automatic Repeat Query (ARQ) protocol, also known as Automatic Repeat reQuest.
ARQ uses two main mechanisms to ensure reliable delivery:
4.1 Acknowledgements (ACK and NACK)
The receiver sends a signal back to the sender to confirm the status of the received packet:
- Positive Acknowledgement (ACK): If the packet is received correctly (e.g., the checksum matched), the receiver sends an ACK. The sender can then proceed to send the next packet.
- Negative Acknowledgement (NACK): If the packet is received, but an error is detected (e.g., the checksum failed), the receiver sends a NACK. This signal tells the sender, "I got packet 5, but it's corrupted, please re-send it."
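The receiver's decision can be sketched like this. The example assumes a very simple byte-sum checksum as the error detection method; `simple_checksum` and `receive_packet` are hypothetical names, not part of any real protocol library.

```python
def simple_checksum(data: bytes) -> int:
    """Illustrative checksum: sum of all bytes, kept within 0-255."""
    return sum(data) % 256


def receive_packet(payload: bytes, received_checksum: int) -> str:
    """Reply ACK if the packet passes the check, otherwise NACK."""
    if simple_checksum(payload) == received_checksum:
        return "ACK"    # sender may move on to the next packet
    return "NACK"       # sender should re-send this packet


sent_checksum = simple_checksum(b"HELLO")

print(receive_packet(b"HELLO", sent_checksum))   # ACK
print(receive_packet(b"HELLP", sent_checksum))   # NACK
```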
4.2 Timeout
What if the data packet is completely lost and the receiver sends nothing back?
The timeout mechanism handles this:
- When the sender transmits a packet, it starts a timer.
- The timer is set for a reasonable period, called the timeout.
- If the sender does not receive either an ACK or a NACK before the timer runs out, it assumes the original packet (or the acknowledgement) was lost.
- The sender then automatically re-sends the packet.
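The sender's side can be simulated without a real network or a real clock. In this sketch, which is an assumption-laden model and not an actual protocol implementation, each attempt is given a scripted outcome: "ok", "corrupt" or "lost". The function name `send_with_arq` and the `max_attempts` limit are invented for the example.

```python
def send_with_arq(packet, channel_outcomes, max_attempts=5):
    """Keep re-sending a packet until an ACK arrives; return a log of what happened."""
    log = []
    for outcome in channel_outcomes[:max_attempts]:
        # A lost packet produces no reply at all, so the sender's timer runs out.
        reply = {"ok": "ACK", "corrupt": "NACK"}.get(outcome)
        if reply is None:
            log.append(f"{packet}: timeout, re-sending")
        elif reply == "NACK":
            log.append(f"{packet}: NACK, re-sending")
        else:
            log.append(f"{packet}: ACK")
            return log
    raise RuntimeError("gave up: no ACK received")


# First attempt is lost, second arrives corrupted, third arrives correctly.
for line in send_with_arq("packet 5", ["lost", "corrupt", "ok"]):
    print(line)
```

Real systems also cap the number of retries in some way, so that a broken link does not cause the sender to repeat the same packet forever.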
Key Takeaway: ARQ combines acknowledgements and timeouts so that corrupted or lost packets are re-sent, keeping data transmission reliable even when errors occur.
Summary Checklist for Methods of Error Detection (Syllabus 2.2)
- Parity Check (Odd/Even): Adds a bit to ensure an odd or even number of 1s. Good for single-bit errors. (Remember: It misses the error if an even number of bits flip.)
- Parity Block Check: Adds a parity bit to each row and a parity byte covering the columns, making it possible to pinpoint and correct a single error in a block.
- Checksum: Mathematically sums sections of data. More reliable for detecting multiple errors in a large block/packet.
- Echo Check: Receiver sends data back to the sender for comparison. Simple but vulnerable to errors on the return journey.
- Check Digit (Data Entry): Mathematically calculated digit added to codes (like ISBNs/Bar Codes) to check if a number was typed incorrectly.
- ARQ: The protocol that *requests* retransmission if an error is detected, using ACK, NACK, and timeout.