👋 Welcome to HL Topic 5: Abstract Data Structures (ADS)!
This chapter is where we move beyond simple variables and arrays to look at highly structured and powerful ways of organizing data. Don't worry if this sounds intimidating! An Abstract Data Structure (ADS) is just a formal blueprint for how data should be organized and accessed.
Think of ADSs as specialized tools in your programming toolkit. If you need to manage a waiting list, you use a Queue. If you need to track function calls, you use a Stack. Understanding these structures is essential for writing efficient and elegant Higher Level computer science solutions.
Key Learning Goals for Topic 5
- Understand the difference between abstraction and implementation.
- Master the behavior and uses of Stacks (LIFO) and Queues (FIFO).
- Grasp the concept of dynamic structures like Linked Lists.
- Explore non-linear structures, specifically Binary Search Trees (BSTs).
1. The Concept of Abstraction
What is an Abstract Data Structure (ADS)?
An ADS defines data, the relationships between the data, and the operations that can be performed on the data, without specifying how the data is stored in memory.
This separation is called abstraction.
Analogy: The Remote Control
When you press the "Volume Up" button on your TV remote, you don't care about the circuits, the infrared light, or the internal programming of the TV (the implementation). You only care that the operation ("Volume Up") works as expected (the abstraction).
Similarly, when we use a Stack ADS, we only define the operations (PUSH and POP). We don't worry whether that Stack is being implemented internally using a rigid array or a flexible linked list.
Abstraction vs. Implementation
- Abstraction: The logical view. What does the data structure do? (e.g., A Queue holds waiting items in order.)
- Implementation: The physical view. How is the data structure built? (e.g., Using an array, or using a linked list of objects.)
Why this separation? It allows us to focus on the problem definition first. If a software engineer needs a waiting list, they grab the Queue ADS blueprint. Later, a systems programmer decides the most efficient *way* to build that Queue for their specific computer.
Quick Takeaway
ADS = What it does.
Implementation = How it does it.
2. Stacks: Last-In, First-Out (LIFO)
Definition and Behavior
A Stack is a linear ADS where all additions and deletions occur at the same end, called the top.
This results in a LIFO (Last-In, First-Out) behavior. The last item placed on the stack is always the first item to be removed.
Analogy: A Stack of Cafeteria Trays
Imagine trays in a cafeteria dispenser. You can only add a new tray to the top, and you can only take a tray from the top. The tray that was put in most recently is the one you take out first.
Key Stack Operations
- PUSH: Adds an item to the top of the stack.
- POP: Removes and returns the item currently at the top of the stack.
- PEEK (or TOP): Looks at, but does not remove, the item at the top.
- isEmpty: Checks if the stack currently contains any items.
Did you know? (Real-World Application)
Stacks are fundamental to how computers manage programs. The Run-Time Stack (or Call Stack) tracks function calls. When function A calls function B, B is PUSHED onto the stack. When B finishes, it is POPPED, and the program returns exactly where it left off in A.
Common Mistake to Avoid: Confusing PEEK and POP. POP *changes* the stack; PEEK only *reads* the stack.
Quick Review: Stack Mnemonic
LIFO: The Last In is the First Out.
3. Queues: First-In, First-Out (FIFO)
Definition and Behavior
A Queue is a linear ADS where insertion occurs at one end (the rear or tail), and deletion occurs at the other end (the front or head).
This results in a FIFO (First-In, First-Out) behavior. The first item that entered the queue is the first item to leave.
Analogy: Waiting in Line
Think of a line at the supermarket or a queue for a concert ticket. The person who arrived first is the person who is served first. New people always join the back (rear).
Key Queue Operations
- ENQUEUE: Adds an item to the rear (tail) of the queue.
- DEQUEUE: Removes and returns the item from the front (head) of the queue.
- PEEK (or FRONT): Looks at the item at the front without removing it.
- isEmpty: Checks if the queue is empty.
Real-World Application: Printing Jobs
If you send three documents to a shared printer, they don't print simultaneously. They wait in a print queue. The first document sent (FIFO) is the first document printed.
Implementation Note (HL focus)
While a queue can be implemented using a simple array, this is often inefficient because DEQUEUE requires shifting every remaining item forward. For better efficiency, queues are often implemented using a circular array or a linked list.
4. Linked Lists: Dynamic Storage
When Arrays Fail
Arrays (static data structures) have two main limitations: their size is fixed when created, and inserting/deleting items in the middle is slow because many elements have to be shifted.
Definition and Structure
A Linked List is a dynamic ADS where elements are not necessarily stored sequentially in memory. Instead, each element, called a Node, contains two parts:
- The Data itself (the value).
- A Pointer (or link) to the next Node in the sequence.
The list always starts with a special pointer called the Head, which points to the first Node. The list ends when a Node's pointer is Null.
Analogy: The Scavenger Hunt
Imagine a scavenger hunt where each clue (Node) tells you the location of the *next* clue (the Pointer). You only need to know where the first clue (Head) is to find all the others, even if the clues are physically scattered across a wide area (non-sequential memory).
Key Benefits of Linked Lists
- Dynamic Size: They grow and shrink exactly as needed, utilizing memory efficiently.
- Efficient Insertion/Deletion: To insert or delete a Node, you only need to change the pointers of its neighboring Nodes. No need to shift large blocks of data!
Types of Linked Lists (Syllabus Scope)
- Singly Linked List: Each Node points only to the next Node. (One-way street).
- Doubly Linked List: Each Node has two pointers: one to the next Node and one to the previous Node. (Two-way street, making traversal backwards possible and deletion easier).
Quick Takeaway: Efficiency
Linked Lists trade fast sequential access (which arrays do well) for fast insertion and deletion in the middle of the list.
5. Trees: Non-Linear Structures
Definition and Terminology
A Tree is a non-linear ADS used to represent hierarchical data. It consists of Nodes connected by Edges.
Key Tree Terminology:
- Root: The top-most Node in the tree. (It has no parents.)
- Child: A Node directly connected to another Node moving away from the Root.
- Parent: The inverse of a Child.
- Leaf: A Node that has no children. (It's at the end of the branch.)
- Subtree: A portion of a Tree that is itself a Tree.
Binary Search Trees (BSTs)
A Binary Tree is a Tree where every Node has at most two children (left and right).
A Binary Search Tree (BST) enforces an organizational rule that allows for highly efficient searching, insertion, and deletion:
- All values in the left subtree must be less than the value of the Parent Node.
- All values in the right subtree must be greater than the value of the Parent Node.
Why are BSTs useful?
If you want to find the number 15 in a BST, you start at the root. If the root is 20, you know instantly you only need to look at the left subtree, effectively cutting your search space in half. This speed is critical!
Step-by-Step: Inserting into a BST
To insert a value (e.g., 55):
- Start at the Root.
- Is 55 < Root? Go Left. Is 55 > Root? Go Right.
- Repeat step 2 down the tree until you hit a Null pointer.
- Insert the new Node (55) at that empty spot.
Tree Traversal Methods
Traversal is the process of visiting every Node in the tree exactly once. There are three main recursive methods, defined by when the Root (Parent) is visited relative to its Left and Right children.
Traversal Mnemonic: Position of the Root (R)
We always visit Left (L) then Right (T). The difference is where the Root (R) goes.
1. In-order Traversal (L R T):
Order: Left Subtree -> Root -> Right Subtree
- Effect: If performed on a BST, In-order traversal always produces the values in numerical or alphabetical sorted order. This is the most famous property of a BST.
2. Pre-order Traversal (R L T):
Order: Root -> Left Subtree -> Right Subtree
- Effect: Used to create a copy of the tree or to reconstruct the tree structure.
3. Post-order Traversal (L T R):
Order: Left Subtree -> Right Subtree -> Root
- Effect: Used primarily for deleting or processing the tree, ensuring that a node's children are processed/deleted before the parent itself.
Quick Review Box: The ADSs at a Glance
Stack: LIFO (Last In, First Out). Think trays.
Queue: FIFO (First In, First Out). Think waiting line.
Linked List: Dynamic memory, flexible insertion/deletion via Pointers.
BST: Hierarchical (Tree), rule-based organization (Left < Root < Right) for fast searching.