Data Structures and Algorithms
As Niklaus Wirth famously put it, "Algorithms + Data Structures = Programs". The CSS Computer Science paper rewards candidates who can analyse complexity, pick the right structure, and explain trade-offs.
A data structure is a particular way of organising data in computer memory so that operations such as insertion, deletion, search and update can be performed efficiently.
Complexity analysis
Algorithms are evaluated by how their resource use grows with input size n.
- Time complexity — number of elementary operations.
- Space complexity — additional memory used.
Big-O notation
| Class | Name | Example |
|---|---|---|
| O(1) | Constant | Array index access |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Linear search |
| O(n log n) | Log-linear | Merge sort, heap sort |
| O(n²) | Quadratic | Bubble, insertion, selection sort |
| O(n³) | Cubic | Naive matrix multiplication |
| O(2ⁿ) | Exponential | Naive recursive Fibonacci |
| O(n!) | Factorial | Brute-force TSP |
Other notations: Ω (omega) for lower bound, Θ (theta) for tight bound.
- Big-O describes worst-case growth.
- Constants and lower-order terms are dropped — O(3n + 5) = O(n).
- A faster-growing function eventually dominates regardless of constants.
- For very small n, an O(n²) algorithm can outperform an O(n log n) one due to constants.
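The growth classes above can be made concrete by counting operations rather than timing. The sketch below (illustrative only, not a benchmark) counts comparisons made by linear search versus binary search on the same sorted data, showing O(n) versus O(log n) growth:

```python
# Count comparisons made by linear vs binary search on a sorted list,
# to illustrate O(n) vs O(log n) growth.
def linear_search(items, target):
    comparisons = 0
    for i, x in enumerate(items):
        comparisons += 1
        if x == target:
            return i, comparisons
    return -1, comparisons

def binary_search(items, target):
    comparisons = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if items[mid] == target:
            return mid, comparisons
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons

data = list(range(1_000_000))
_, linear_ops = linear_search(data, 999_999)  # roughly n comparisons
_, binary_ops = binary_search(data, 999_999)  # roughly log2(n) comparisons
```

On a million elements, the linear scan makes a million comparisons while binary search needs about twenty, which is why the logarithmic class matters at scale.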
Linear data structures
Arrays
Contiguous fixed-size memory.
- Access: O(1)
- Insertion/deletion (middle): O(n)
- Search (unsorted): O(n); (sorted): O(log n) with binary search.
Linked lists
Sequence of nodes connected by pointers.
- Singly, doubly, circularly linked.
- Insertion at head: O(1); search: O(n); access by index: O(n).
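A minimal singly linked list sketch showing why head insertion is O(1) (no traversal) while search is O(n) (the chain must be walked):

```python
# Minimal singly linked list: O(1) insertion at head, O(n) search.
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # O(1): just rewire the head pointer, no traversal needed
        node = Node(value)
        node.next = self.head
        self.head = node

    def search(self, value):
        # O(n): may have to walk the entire chain
        current = self.head
        while current is not None:
            if current.value == value:
                return True
            current = current.next
        return False

lst = SinglyLinkedList()
for v in [3, 2, 1]:
    lst.push_front(v)
# list is now 1 -> 2 -> 3
```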
Stack (LIFO)
- push, pop, peek — all O(1).
- Uses: function call stack, expression parsing, undo operations.
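One of the uses listed above, expression parsing, can be sketched with a Python list as the stack (append/pop at the end are amortised O(1)). This bracket-matching check is an illustrative example, not a full parser:

```python
# Stack application sketch: checking balanced brackets in an expression.
# A Python list serves as the stack; push = append, pop = pop.
def balanced(expr):
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in expr:
        if ch in '([{':
            stack.append(ch)                 # push opening bracket
        elif ch in pairs:
            # pop must yield the matching opener, else unbalanced
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                         # empty stack => balanced
```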
Queue (FIFO)
- enqueue, dequeue — both O(1).
- Variants: circular queue, double-ended queue (deque), priority queue.
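In Python, `collections.deque` gives the O(1) enqueue and dequeue promised above; a plain list would make dequeue O(n) because every remaining element shifts left:

```python
# FIFO queue using collections.deque: O(1) at both ends.
from collections import deque

queue = deque()
queue.append('first')     # enqueue at the back
queue.append('second')
front = queue.popleft()   # dequeue from the front -> 'first' (FIFO)
```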
Hash tables
Map keys to values via a hash function.
- Average operations: O(1); worst case: O(n) due to collisions.
- Collision handling: separate chaining, open addressing (linear/quadratic probing, double hashing).
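Separate chaining, the first collision strategy above, can be sketched as a toy table where each bucket is a list of key-value pairs, so colliding keys coexist in one bucket (illustrative only; Python's built-in `dict` uses open addressing):

```python
# Toy hash table with separate chaining: each bucket holds a list
# of (key, value) pairs, so colliding keys share a bucket.
class ChainedHashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        # O(1) average: hash straight to the right bucket
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # update existing key
                return
        bucket.append((key, value))

    def get(self, key):
        # O(n) worst case: every key hashed into the same bucket
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put('pi', 3.14)
table.put('e', 2.72)
```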
Non-linear data structures
Trees
A tree has nodes connected hierarchically with no cycles. Key terms: root, leaf, internal node, height, depth.
- Binary Tree — each node ≤ 2 children.
- Binary Search Tree (BST) — left subtree < node < right subtree.
- Balanced BSTs — AVL trees, Red-Black trees. Guarantee O(log n) operations.
- B-trees — used in databases and file systems (multiway, balanced).
- Heap — complete binary tree satisfying heap property; min-heap or max-heap. Basis of heap sort and priority queue.
- Trie — prefix tree for strings.
Tree traversals
- In-order (Left-Root-Right): visits BST nodes in sorted order.
- Pre-order (Root-Left-Right): used to copy a tree.
- Post-order (Left-Right-Root): used for deletion.
- Level-order (BFS): uses a queue.
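The claim that in-order traversal of a BST yields sorted output can be verified with a minimal sketch (unbalanced insertion, for illustration only):

```python
# BST insertion plus in-order traversal: Left-Root-Right visits
# the keys in sorted order, as the BST property guarantees.
class BSTNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def in_order(root):
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)

root = None
for k in [5, 2, 8, 1, 3]:
    root = insert(root, k)
```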
Graphs
A graph G = (V, E) is a set of vertices and edges. Representations:
- Adjacency matrix — O(V²) space; O(1) edge query.
- Adjacency list — O(V + E) space; preferred for sparse graphs.
Types: directed/undirected, weighted/unweighted, cyclic/acyclic, connected/disconnected.
Graph traversals
- Breadth-First Search (BFS) — uses a queue; explores layer by layer; O(V + E).
- Depth-First Search (DFS) — uses recursion or stack; O(V + E).
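Both traversals can be sketched over an adjacency-list graph; the only structural difference is queue versus stack (the graph below is a made-up four-node example):

```python
# BFS (queue) and DFS (stack) over an adjacency list, both O(V + E).
from collections import deque

graph = {                       # hypothetical undirected graph
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C'],
}

def bfs(start):
    visited, order = {start}, []
    queue = deque([start])              # BFS uses a queue
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

def dfs(start):
    visited, order = set(), []
    stack = [start]                     # DFS uses a stack (or recursion)
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            order.append(node)
            stack.extend(reversed(graph[node]))
    return order
```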
Sorting algorithms
| Algorithm | Best | Average | Worst | Space | Stable? |
|---|---|---|---|---|---|
| Bubble | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Selection | O(n²) | O(n²) | O(n²) | O(1) | No |
| Insertion | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Merge | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Quick | O(n log n) | O(n log n) | O(n²) | O(log n) | No |
| Heap | O(n log n) | O(n log n) | O(n log n) | O(1) | No |
| Counting | O(n+k) | O(n+k) | O(n+k) | O(k) | Yes |
| Radix | O(n·k) | O(n·k) | O(n·k) | O(n+k) | Yes |
Quick sort is usually fastest in practice due to good cache behaviour; merge sort is preferred when stability or external (disk-based) sorting is needed.
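Merge sort's table entries (O(n log n), O(n) extra space, stable) follow directly from its structure, sketched below; the `<=` in the merge step is what preserves stability:

```python
# Merge sort: divide, recursively sort halves, merge.
# O(n log n) time, O(n) extra space, stable.
def merge_sort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal keys in input order
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]   # append the leftover run
```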
Searching algorithms
- Linear Search — O(n); works on unsorted data.
- Binary Search — O(log n); requires sorted array.
- Hash-based — O(1) average.
- Tree-based — O(log n) in balanced trees.
Classical graph algorithms
- Dijkstra's algorithm — single-source shortest paths, non-negative weights, O((V+E) log V) with a priority queue.
- Bellman-Ford — handles negative weights; detects negative cycles; O(VE).
- Floyd-Warshall — all-pairs shortest paths; O(V³).
- Kruskal's and Prim's — minimum spanning tree; O(E log E) and O(E + V log V) respectively.
- Topological sort — DAG ordering; O(V + E).
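Dijkstra's algorithm from the list above can be sketched with Python's `heapq` as the priority queue (the four-node weighted graph is a made-up example):

```python
# Dijkstra's single-source shortest paths with a binary heap,
# O((V + E) log V). Weights must be non-negative.
import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbour, weight), ...]}
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float('inf')):
            continue                    # stale heap entry, skip
        for neighbour, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist.get(neighbour, float('inf')):
                dist[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return dist

g = {
    'A': [('B', 1), ('C', 4)],
    'B': [('C', 2), ('D', 6)],
    'C': [('D', 3)],
    'D': [],
}
```

Note that A reaches C more cheaply via B (1 + 2 = 3) than directly (4), which is exactly the relaxation step at work.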
Algorithm design paradigms
- Divide and Conquer — break into smaller subproblems (Merge sort, Quick sort, FFT).
- Greedy — make locally optimal choices (Dijkstra, Kruskal, Huffman coding).
- Dynamic Programming — overlapping subproblems + optimal substructure (Fibonacci, LCS, Knapsack, Bellman-Ford).
- Backtracking — explore + prune (N-Queens, Sudoku, subset sum).
- Branch and Bound — for optimisation problems (TSP).
- Randomised algorithms — randomised quicksort, Monte Carlo, Las Vegas.
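The dynamic programming entry above uses Fibonacci as its first example; a short sketch shows the paradigm's payoff, turning the O(2ⁿ) naive recursion from the Big-O table into O(n) by caching overlapping subproblems:

```python
# Naive recursion recomputes the same subproblems exponentially often;
# memoisation (dynamic programming) computes each fib(k) once.
from functools import lru_cache

def fib_naive(n):          # O(2^n)
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)   # cache results of overlapping subproblems
def fib_memo(n):           # O(n)
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)
```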
NP-completeness
The complexity-theoretic backdrop:
- P — problems solvable in polynomial time.
- NP — problems whose solutions can be verified in polynomial time.
- NP-Complete — hardest NP problems; if any one is in P, then P = NP. Examples: SAT, Travelling Salesman, Subset Sum.
- NP-Hard — at least as hard as every problem in NP; an NP-Hard problem need not itself be in NP.
The P vs. NP question, posed by Stephen Cook (1971), remains open and is one of the seven Millennium Prize problems.
For exam answers, learn to state worst, average and best case complexities of common algorithms by heart, and be ready to explain why quicksort's worst case is O(n²) — when the pivot is always the smallest or largest element.
Practical relevance
Pakistani fintech (Easypaisa, JazzCash), e-commerce (Daraz), telecom (PTCL, Jazz) and government systems (NADRA, FBR's IRIS portal) all run on optimised data structures and algorithms. A CSS officer interfacing with such systems benefits enormously from knowing why a query is slow, why an index helps, or why a graph traversal lies under social-network recommendations.