19 Algorithms and Computational Complexity
An array is a basic data structure. It stores a fixed number of elements of the same data type under a single variable name. Sorting these elements is one of the most fundamental tasks in computer science.
19.1 Selection Sort
Selection Sort works as follows:
- Go through the array to find the lowest value.
- Move the lowest value to the front of the unsorted part of the array.
- Repeat this process for the unsorted part as many times as there are values in the array.
In order to evaluate the time complexity of Selection Sort, it is necessary to analyse how the number of comparisons increases as a function of the number of elements being sorted. Due to the way Selection Sort operates, approximately \(\frac{1}{2}n^2\) comparisons are required for \(n\) elements. This means the number of operations increases quadratically relative to the input size.
Since we categorise an algorithm’s efficiency according to its order of magnitude, we refer to this as having quadratic time complexity. In this process, constant factors (like the \(\frac{1}{2}\)) and lower-order terms are disregarded, as the primary concern when evaluating an algorithm is how the time requirement scales with extremely large datasets.
19.2 Big O Notation
To formally capture this focus on the fundamental rate of growth, computer scientists use a specific shorthand known as Big O notation.
The ‘O’ stands for ‘Order of’, indicating the complexity class of an algorithm. Instead of meticulously counting every single operation, Big O notation describes the upper bound of the growth rate. For instance, stating that Selection Sort has a time complexity of \(O(n^2)\) tells us that if the input size doubles, the time required will roughly quadruple, regardless of the computer’s speed or any minor mathematical constants.
19.3 Merge Sort: A More Efficient Approach
To illustrate why choosing the right complexity class matters, let’s compare Selection Sort with Merge Sort. While Selection Sort works by repeatedly searching for the minimum, Merge Sort uses a “divide and conquer” strategy. It recursively splits the array into two halves until each part contains only one element, and then merges these parts back together in the correct order.
19.3.1 The Merge Step: Combining with Efficiency
The actual efficiency of Merge Sort lies in the merge step:
- Imagine you have two smaller piles of cards, both already sorted from lowest to highest.
- You only ever need to compare the two “top” cards of each pile.
- You pick the smaller one, place it into the new, larger array, and move to the next card in that pile.
Because this step only requires a single pass through the elements, merging two sorted lists of total size \(n\) takes linear time, or \(O(n)\). Since the “Divide” phase splits the problem into \(\log n\) levels and an \(O(n)\) merge operation is performed at each level, the total time complexity becomes:
\[n \cdot \log n = O(n \log n)\]
19.4 Comparison of Growth Rates
The “log-linear” growth of Merge Sort is significantly more efficient than the “quadratic” growth of Selection Sort as the input size \(n\) increases.
| Number of Elements (\(n\)) |
Selection Sort (\(O(n^2)\)) |
Merge Sort (\(O(n \log n)\)) |
Efficiency (Ratio) |
|---|---|---|---|
| 10 | 100 operations | ~33 operations | 3x faster |
| 1,000 | 1,000,000 | ~10,000 | 100x faster |
| 1,000,000 | 1,000,000,000,000 | ~20,000,000 | 50,000x faster |
For a small list, the difference is negligible. However, for large datasets, Selection Sort becomes unusable, while Merge Sort remains efficient. This is the power of understanding computational complexity: it allows us to predict how an algorithm will perform when data scales up.