Welcome to the World of Searching Algorithms!
Hi there! In this chapter, we dive into one of the most fundamental skills a computer scientist needs: finding things quickly. Whether you are searching a huge database or finding a contact on your phone, algorithms are working behind the scenes.
We will explore two main searching algorithms: the straightforward (but slow) Linear Search and the clever (but conditional) Binary Search. Understanding these helps you build more efficient and smarter programs. Don't worry if this seems tricky at first—we will use plenty of analogies to break down these concepts!
3.4.1 Searching Algorithms
1. The Linear Search Algorithm (The Slow but Sure Method)
The Linear Search (or Sequential Search) is the simplest way to find an item in a list or array. It does exactly what its name suggests: it checks items one by one, in a straight line, from the start of the list until it finds the target item.
How Linear Search Works (Step-by-Step)
Imagine you have a row of unlabeled lockers and you are looking for Locker 42. You start at the first locker and check them individually until you find 42.
- Start at the very beginning of the list (index 0).
- Look at the current item.
- Is the current item the target item you are searching for?
- If YES, stop and return the position (index). Job done!
- If NO, move to the next item in the list.
- If you reach the end of the list and haven't found the item, then the item is not present.
Key Features of Linear Search
- Order doesn't matter: The list does not need to be sorted. This is a huge advantage!
- Implementation: This algorithm is easy to code, usually using a simple loop (iteration).
- Efficiency (Time): In the worst-case scenario, you have to check every single item in the list. If the list has $N$ items, you might perform up to $N$ checks. This is described as $O(n)$ efficiency (linear time).
Did you know? (Efficiency Trick)
While the basic version of Linear Search checks every item, an efficient version will stop as soon as the item is found. If the target is the very first item, the search is instant! If the target is the last item, or not present at all, the search takes the maximum possible time.
Analogy: Checking every card in a shuffled deck one by one.
Requirement: List can be UNORDERED.
Time Efficiency: Scales poorly (slow) for huge lists.
2. The Binary Search Algorithm (The Fast but Picky Method)
The Binary Search algorithm is incredibly fast, but it has one vital rule: the list or array MUST be sorted. If the data isn't ordered (e.g., numerically or alphabetically), Binary Search will not work correctly.
Binary Search uses a powerful strategy called Divide and Conquer. Instead of checking one item at a time, it eliminates half of the remaining data in every step.
Analogy: The Dictionary Search
Imagine looking up a word in a huge paper dictionary (which is always sorted!).
You don't start at page 1. You open the dictionary roughly in the middle. If your target word starts with 'S', and the middle page is 'K', you immediately discard the entire first half of the book! You then repeat this process on the remaining half, constantly cutting the search space in half.
How Binary Search Works (Step-by-Step)
We keep track of the search space using three pointers: Low, High, and Mid.
- Ensure the list is sorted.
- Set the Low pointer to the start of the list (index 0).
- Set the High pointer to the end of the list (index $N-1$).
- While Low is less than or equal to High:
- Calculate the Mid point: $Mid = (Low + High) / 2$ (rounded down).
- Compare the item at Mid with the Target:
- If $List[Mid] = Target$: Success! Return Mid.
- If $List[Mid] < Target$: The target must be in the upper half. Move the Low pointer: $Low = Mid + 1$.
- If $List[Mid] > Target$: The target must be in the lower half. Move the High pointer: $High = Mid - 1$.
- If the loop finishes and the target wasn't found, the item is not present.
Tracing Example
Search for Target = 45 in the list: [10, 20, 30, 40, 45, 50, 60, 70] (8 items)
Pass 1:
- Low = 0, High = 7
- Mid = (0 + 7) / 2 = 3. $List[3]$ is 40.
- 45 > 40. Target is greater. Discard the lower half.
- New Low = $Mid + 1 = 4$. (New range: [45, 50, 60, 70])
Pass 2:
- Low = 4, High = 7
- Mid = (4 + 7) / 2 = 5. $List[5]$ is 50.
- 45 < 50. Target is smaller. Discard the upper half.
- New High = $Mid - 1 = 4$. (New range: [45])
Pass 3:
- Low = 4, High = 4
- Mid = (4 + 4) / 2 = 4. $List[4]$ is 45.
- 45 = 45. FOUND!
It took only 3 checks to find the item! A Linear Search would have taken 5 checks.
Efficiency of Binary Search
Binary search is extremely efficient in terms of time. Every comparison halves the search space. For a list of $N$ items, the maximum number of checks is proportional to the logarithm of $N$. This is written as $O(\log n)$.
Example: If you have 1 million items ($N=1,000,000$), a Linear Search might take 1 million checks. A Binary Search would take at most 20 checks!
Analogy: Looking up a word in a sorted dictionary.
Requirement: List MUST BE ORDERED (sorted).
Time Efficiency: Scales very well (very fast) for huge lists.
3. Comparing Searching Algorithms
When choosing an algorithm, we compare them based on two main criteria: Efficiency (how fast they are) and Conditions (what state the data needs to be in).
Comparison Table
| Feature | Linear Search | Binary Search |
|---|---|---|
| Prerequisite (Condition) | List can be UNORDERED. | List MUST be ORDERED. |
| Time Efficiency (Worst Case) | $O(n)$ (Linear Time) | $O(\log n)$ (Logarithmic Time) |
| Operation | Checks one item sequentially. | Eliminates half the data repeatedly. |
| Use Case | Small lists, or lists that change frequently and are hard to keep sorted. | Very large, stable, and sorted datasets (like databases). |
Understanding Time Efficiency
At AS level, you need to be able to compare the time efficiency, even without relying solely on Big-O notation.
- Linear Search: If you double the size of the list (from 10 items to 20 items), the search time approximately doubles. The time grows in a straight line relative to the list size.
- Binary Search: If you double the size of the list, the search time only increases by one single step (one more halving operation). The time grows very, very slowly relative to the list size.
Conclusion: For large datasets, the initial effort required to sort the list (if it isn't already sorted) is almost always worth it, because the Binary Search is dramatically faster than Linear Search.
Always state the prerequisite condition when discussing Binary Search. If an exam question describes an unordered list, Binary Search cannot be used until the list is sorted.
If asked to compare efficiency, focus on the rate of growth: Binary search time increases minimally when the list size grows, whereas Linear search time increases proportionally to the list size.