Partition & Quick Sort
Reviewing what I studied, how this work will be explained as well.
Quick Sort is an efficient, comparison-based sorting algorithm that employs the divide and conquer strategy. Unlike Merge Sort which divides arrays into equal halves, Quick Sort creates unbalanced partitions based on a pivot element. This characteristic gives Quick Sort its name - it’s particularly “quick” for many real-world scenarios.
The Divide and Conquer Strategy in Quick Sort
- Divide: Select a pivot element and partition the array into two sub-arrays - elements less than the pivot and elements greater than the pivot.
- Conquer: Recursively sort the sub-arrays.
- Combine: Since the partitions are already sorted in place, no explicit combination step is needed. Let’s actually see how this works.
Partitioning
Partitioning is a crucial step in the Quick Sort algorithm where elements are rearranged around a pivot value. There are two main partitioning schemes: Lomuto and Hoare.
Lomuto Partition Scheme The Lomuto partition scheme typically selects the last element as the pivot. It uses a one-directional scanning technique that maintains three distinct partitions: elements less than the pivot, elements greater than the pivot, and the unknown section.
pair<int, int> lomutoPartition(int data[], int lo, int hi) {
int pivot = data[hi]; // alternatively you can select the pivot in the middle of the array.
int i = lo - 1; // int i = 0
for (int j = lo; j < hi; j++) {
if(data[j] <= pivot)
swap(data[++i], data[j]); // data[i++]
}
swap(data[++i], data[hi]); // data[i]
return make_pair(i - 1, i + 1);
}
Hoare Partition Scheme The Hoare partition scheme, developed by Sir Charles Antony Richard Hoare (the creator of Quick Sort), typically selects the first element as the pivot. It uses a two-directional scanning technique with two pointers moving in opposite directions.
pair<int, int> hoarePartition(int data[], int lo, int hi) {
int pivot = data[(lo + hi) / 2];
int i = lo - 1;
int j = hi + 1;
while(true){
do i++; while(data[i] < pivot);
do j--; while(data[j] > pivot);
if (i < j)
swap(data[i], data[j])
else
break;
}
return make_pair(j, j+1);
}
version 2:
pair<int, int> hoarePartition2(int data[], int lo, int hi) {
int pivot = data[(lo + hi) / 2];
int i = lo;
int j = hi;
while( i <= j ){
while(data[i] < pivot)
i++;
while(data[j] > pivot)
j--;
if (i < j)
swap(data[i++], data[j--];
else
i++;
}
return make_pair(j, j+1);
}
version 3: when pivot was selected in the first element
pair<int, int> hoarePartition3(int data[], int lo, int hi) {
int pivot = data[lo];
int i = lo;
int j = hi+1;
while(true) {
do i++; while( i <= hi && data[i] < pivot);
do j--; while( j > lo && data[i] > pivot);
if(i < j)
swap(data[i], data[j]);
else
break;
}
swap(data[lo], data[hi]);
return make_pair(j -1, j+1);
}
3-way Partitioning
This is very specific cases, where there are multiple or repetitive numbers:
Then we use 3-way partitioning
. There are two 3-way partitioning; hoare partition and Bentley-Macllroy.
pair<int, int> partition3wayHoare(int data[], int lo, int hi) {
int pivot = data[(lo + hi)/2];
int lt = lo;
int gt = hi;
for (int i = lo; i <= gt;) {
if(data[i] < pivot)
swap(data[lt++], data[i++])
else if (data[i] > pivot)
swap(data[i], data[gt--])
else
i++;
}
return make_pair(lt - 1, gt + 1);
}
pair<int, int> partitionBently3way(int data[], int lo, int hi) {
int pivot = data[(lo + hi) /2];
int i = lo - 1, j = hi + 1;
int p = lo - 1, q = hi + 1;
while(true) {
while(data[++i] < pivot);
while(data[--j] > pivot);
if (i < j) {
swap(data[i], data[j]);
if (data[i] == pivot)
swap(data[++p], data[i]);
if (data[j] == pivot)
swap(data[--q], data[j]);
} else {
if (i == j)
swap(data[++p], data[i]);
break;
}
}
i = j +1;
for (int k = lo; k <= p; k++)
swap(data[k], data[j--]);
for (int k = hi; k >= q; k--)
swap(data[k], data[i++]);
return make_pair(j, i);
}
So using one of those partition mechanism, we can use one of those above.
How to select right pivot value?
- first or last:
- random:
int pivot = data[rand() % (hi - lo + 1) + lo];
- middle
- MED3 (Median of Median)
Standard C++ Library: Select Pivot
int choosePivot(int data[], int lo, int hi)
{
int n = hi - lo + 1;
int mid = (lo + hi) / 2;
if (n > 7) {
if (n > 40) {
lo = MED3(data, lo, lo + d, lo + 2*d);
mid = MED3(data, mid - d, mid, mid + d);
hi = MED3(data, hi - 2*d, hi - d, hi);
}
mid = MED3(data, lo, mid, hi);
}
return mid;
}
int pivot = data[choosePivot(data, lo, hi)];
Algorithm Walkthrough
Let’s trace through the Quick Sort algorithm with a concrete example:
Initial Array: {5, 3, 8, 4, 9, 1, 6, 2, 7}
First Partition:
- Select pivot: Let’s choose the first element, 5
- Initialize pointers:
- Left pointer starts at index 1 (value 3)
- Right pointer starts at the last index (value 7)
- Partitioning process:
- Move left pointer rightward until finding an element ≥ pivot
- Move right pointer leftward until finding an element ≤ pivot
- Swap these elements if pointers haven’t crossed
- Continue until pointers cross
- Final swap: Place pivot at its correct position by swapping with the right pointer
- Result: {2, 3, 1, 4, 5, 9, 6, 8, 7}
Now we have two sub-arrays: {2, 3, 1, 4} (elements < 5) and {9, 6, 8, 7} (elements > 5).
Second Partition (Left Sub-array):
- Select pivot: 2
- Partitioning process: Similar to above
- Result: {1, 2, 3, 4} with sub-arrays {1} and {3, 4}
Third Partition (Right Sub-array):
- Select pivot: 9
- Partitioning process: Similar to above
- Result: {7, 6, 8, 9} with sub-arrays {7, 6, 8} and {}
The process continues recursively until all sub-arrays are sorted.
This can be done recursively. Let’s look at the code.
void quickSortRect(int data[], int lo, int hi) {
if (lo >= hi) return;
int leftLast, rightFirst;
auto [leftLast, rightFirst] = partition(data, lo, hi);
quickSortRec(data, lo, leftLast);
quickSortRec(data, rightFirst, hi);
}
void quickSort(int data[], int n) {
quickSortRec(data, 0, n -1)
}
Time Complexity
The time complexity is based on the pivot value, because this will make array to be unbalanced. The worst case would be O(n^2), but the best case would be O(nlogn)