Lecture 14.A: Sorting algorithms. 4/8

For practical purposes, use the Java library methods.

Sorting routines are standardly written in terms of arrays. (If the data is in the form of a linked list, copy it over into an array and sort there.)
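As an illustration of the two remarks above, here is a minimal sketch that sorts an array with the library routine and sorts a linked list by copying it into an array first. The class name LibrarySortDemo and its method names are made up for this example; Arrays.sort, List.toArray, and Arrays.asList are the standard java.util calls.

```java
import java.util.Arrays;
import java.util.LinkedList;

class LibrarySortDemo {
    // Sorts an int array in place with the library routine.
    static void sortArray(int[] a) {
        Arrays.sort(a);
    }

    // Copies the linked list into an array, sorts the array,
    // and copies the sorted values back into the list.
    static void sortLinkedList(LinkedList<Integer> list) {
        Integer[] tmp = list.toArray(new Integer[0]);
        Arrays.sort(tmp);
        list.clear();
        list.addAll(Arrays.asList(tmp));
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1};
        sortArray(a);
        System.out.println(Arrays.toString(a));

        LinkedList<Integer> list = new LinkedList<>(Arrays.asList(5, 2, 9, 1));
        sortLinkedList(list);
        System.out.println(list);
    }
}
```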

Simple, quadratic sorting algorithms

Selection sort

Find the smallest, then the second smallest, etc.
for (int i = 0; i < A.length-1; i++) {
   int ismallest = i;                  // index of the smallest element found so far in A[i..]
   for (int j = i+1; j < A.length; j++)
     if (A[j] < A[ismallest]) ismallest = j;
   swap(A[i], A[ismallest]);
}
"swap(A[i], A[j])" is an abbreviation for
     tmp = A[i];
     A[i] = A[j];
     A[j] = tmp;
A loop invariant is a statement that is true after each iteration of the loop. The loop invariant for the outer loop above is that, after the ith iteration, the first i elements in the array are the i smallest elements, in sorted order.

Time: O(N^2) in all cases.
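To see why the running time does not depend on the input, one can count comparisons. The sketch below is a runnable version of the loop above with a counter added; the class name SelectionSortDemo and the counter are assumptions made for this example. The inner loop always runs to the end of the array, so the count is N(N-1)/2 whether the data is sorted, reversed, or random.

```java
class SelectionSortDemo {
    // Sorts A in place and returns the number of element comparisons made.
    static long selectionSort(int[] A) {
        long comparisons = 0;
        for (int i = 0; i < A.length - 1; i++) {
            int ismallest = i;
            for (int j = i + 1; j < A.length; j++) {
                comparisons++;
                if (A[j] < A[ismallest]) ismallest = j;
            }
            // swap(A[i], A[ismallest])
            int tmp = A[i];
            A[i] = A[ismallest];
            A[ismallest] = tmp;
        }
        return comparisons;
    }
}
```

For N = 5 the count is 10 on both already-sorted and reversed input.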

Insertion sort

Intuitively, put the second card in its correct position relative to the first; then the third relative to the first two, and so on. Inserting an element involves sliding everything after its position down by one, so the processes of finding the right place and inserting can be combined: compare the new element to those already sorted, in backward order, swapping while it is less than its neighbor and stopping as soon as it is not.
for (int i = 1; i < A.length; i++) {
   int j = i;
   while (j > 0 && A[j] < A[j-1]) {   // slide A[j] back until it is in place
      swap(A[j], A[j-1]);
      j--;
   }
}
Time: O(N^2) in the worst case (data arrives in backward order) and in the average case. If the data arrives in forward sorted order, then this algorithm runs in O(N) time, which is optimal; it simply checks that each element is at least as large as the previous one.
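The gap between the best and worst cases can be checked by counting comparisons. This is a sketch only; the class name InsertionSortDemo and the counter are assumptions added for the example, and the loop is otherwise the one above.

```java
class InsertionSortDemo {
    // Sorts A in place and returns the number of element comparisons made.
    static long insertionSort(int[] A) {
        long comparisons = 0;
        for (int i = 1; i < A.length; i++) {
            int j = i;
            while (j > 0) {
                comparisons++;
                if (A[j] >= A[j - 1]) break;   // A[j] is in place; stop
                int tmp = A[j];                // swap(A[j], A[j-1])
                A[j] = A[j - 1];
                A[j - 1] = tmp;
                j--;
            }
        }
        return comparisons;
    }
}
```

On sorted input of length N this makes N-1 comparisons; on reversed input it makes N(N-1)/2.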

Loop invariant: After the ith iteration, the original first i+1 elements have been sorted (the first element counts as sorted before the loop starts).

This is the best sorting routine for small N (certainly for N < 5; perhaps for N < 8).

You could find the correct place to insert faster by doing a binary search, but it would not save much time because the insertion takes O(N) at each iteration.
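A sketch of that variant, for comparison. The class name BinaryInsertionSortDemo is made up for this example; System.arraycopy is the standard library call and handles the overlapping shift correctly. The binary search needs only O(log N) comparisons per element, but the shift that follows still moves up to i elements, so the total remains O(N^2) in the worst case.

```java
class BinaryInsertionSortDemo {
    static void binaryInsertionSort(int[] A) {
        for (int i = 1; i < A.length; i++) {
            int x = A[i];
            // Binary search in the sorted prefix A[0..i-1] for the first
            // position whose element is greater than x.
            int lo = 0, hi = i;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (A[mid] <= x) lo = mid + 1;
                else hi = mid;
            }
            // Shift A[lo..i-1] one slot to the right: this is the O(N) step.
            System.arraycopy(A, lo, A, lo + 1, i - lo);
            A[lo] = x;
        }
    }
}
```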

Bubble sort

No very intuitive description.
for (i=0; i < A.length-1; i++)
  for (j=A.length-1; j > i; j--) 
     if (A[j] < A[j-1]) swap(A[j],A[j-1]);
Time: O(N^2) in all cases.

Loop invariant: After the ith iteration, the first i elements in the array are the i smallest elements, in sorted order. (Same invariant as selection sort.)

The only advantage is that the code is the shortest.
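The loop invariant can be checked mechanically. The following is a sketch under the assumption that throwing an exception is an acceptable way to flag a violation; the class name BubbleSortDemo and the helper prefixIsSortedSmallest are hypothetical names introduced here.

```java
class BubbleSortDemo {
    static void bubbleSort(int[] A) {
        for (int i = 0; i < A.length - 1; i++) {
            for (int j = A.length - 1; j > i; j--) {
                if (A[j] < A[j - 1]) {
                    int tmp = A[j];            // swap(A[j], A[j-1])
                    A[j] = A[j - 1];
                    A[j - 1] = tmp;
                }
            }
            // After this pass the first i+1 elements should be the
            // i+1 smallest, in sorted order.
            if (!prefixIsSortedSmallest(A, i + 1)) {
                throw new IllegalStateException("invariant violated at i=" + i);
            }
        }
    }

    // True if A[0..k-1] is sorted and no later element is smaller than A[k-1].
    static boolean prefixIsSortedSmallest(int[] A, int k) {
        for (int p = 1; p < k; p++) {
            if (A[p - 1] > A[p]) return false;
        }
        for (int p = k; p < A.length; p++) {
            if (k > 0 && A[p] < A[k - 1]) return false;
        }
        return true;
    }
}
```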

HeapSort

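A MinHeap is a binary tree with numeric labels satisfying the following conditions.
Shape constraint: every level of the tree is full, except possibly the last, which is filled from left to right.
Order constraint: the value at each node is less than or equal to the values at its children, so the smallest value is at the root.
Example: (figure omitted)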

Because of the shape constraint, the height of the tree is always floor(log2(N)), where N is the number of nodes.
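As a small worked check of the height claim, the sketch below computes floor(log2(N)) with integer arithmetic. It assumes that height is counted in edges from the root to the deepest leaf, so a one-node tree has height 0; the class name HeapHeightDemo is made up for the example. Integer.numberOfLeadingZeros is the standard library method.

```java
class HeapHeightDemo {
    // Height of a heap with n >= 1 nodes, counted in edges: floor(log2(n)).
    static int height(int n) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        return 31 - Integer.numberOfLeadingZeros(n);
    }
}
```

For example, 7 nodes fill three levels exactly (height 2), and an 8th node starts a fourth level (height 3).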

Operations

A MinHeap supports two operations: add(x), which inserts the value x, and deleteMin(), which removes and returns the smallest value.

The algorithms for these are as follows:
add(x) {
   create a new node Q with value x, and put it at the next position in the 
     tree;
   while (Q is not the root && Q.value < Q.parent.value) {
      swap Q.value with Q.parent.value;
      Q = Q.parent;
     }
}
Runs in time O(height of tree) = O(log(N)) because each iteration of the loop climbs one step up the tree.
deleteMin() {
    m = Root.value;
    L = the last node in the tree;
    Root.value = L.value;
    delete L;
    Q = Root;
    while (Q has children &&
           Q.value > the smaller of its children's values) {
       W = the child of Q with the smaller value;
       swap Q.value with W.value;
       Q = W;
      }
    return m;
  }
Runs in time O(height of tree) = O(log(N)) because each iteration of the loop moves one step down the tree.

Heapsort

H = empty heap;
for (i=0; i < A.length; i++) H.add(A[i]);
for (i=0; i < A.length; i++) A[i]=H.deleteMin();
Each loop iterates N times and each iteration takes time O(log(N)), so the whole procedure runs in time O(N log(N)).
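The same two-loop procedure can be tried out with the library's heap. This is a sketch, not the implementation developed in these notes: java.util.PriorityQueue is a binary min-heap under natural ordering, its add plays the role of add, and its poll plays the role of deleteMin. The class name PriorityQueueHeapsortDemo is made up for the example.

```java
import java.util.PriorityQueue;

class PriorityQueueHeapsortDemo {
    static void heapsort(int[] A) {
        PriorityQueue<Integer> H = new PriorityQueue<>();   // empty heap
        for (int i = 0; i < A.length; i++) H.add(A[i]);     // N adds
        for (int i = 0; i < A.length; i++) A[i] = H.poll(); // N deleteMins
    }
}
```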

Array implementation

Because of the shape constraint, there is a very efficient implementation of a heap as an array: Put the items in the heap in breadth-first order into the array H.
Then the children of H[i] are H[2*i+1] and H[2*i+2] (zero-based indexing). The parent of H[i], for i > 0, is H[(i-1)/2], using integer division.

E.g. the children of H[0] are H[1] and H[2]. The children of H[1] are H[3] and H[4]. The children of H[2] are H[5] and H[6]. And so on.
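The index arithmetic can be packaged as three small helpers, together with a check of the heap order over an array. This is a sketch; the class name HeapIndexDemo and the method names left, right, parent, and isMinHeap are assumptions introduced for the example.

```java
class HeapIndexDemo {
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
    static int parent(int i) { return (i - 1) / 2; }   // valid for i >= 1

    // True if the first count entries of H satisfy the MinHeap order:
    // no element is smaller than its parent.
    static boolean isMinHeap(int[] H, int count) {
        for (int i = 1; i < count; i++) {
            if (H[parent(i)] > H[i]) return false;
        }
        return true;
    }
}
```

For instance, the array 1, 3, 2, 7, 4 is a valid heap: 1 has children 3 and 2, and 3 has children 7 and 4.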

So you get a very simple implementation of Heapsort:

int[] H;     // The heap
int count;   // Number of items in the heap = index of first empty slot.


void add(int x) {
    H[count] = x;                   // Put x in the first empty slot
    int q = count;
    count++;
    while (q > 0 && H[(q-1)/2] > H[q]) {   // x is smaller than its parent: move it up
        swap(H[q],H[(q-1)/2]);
        q = (q-1)/2;
      }
} // end add


int deleteMin() {
    int m = H[0];
    count--;
    H[0]=H[count];          // Move the last item to the root
    int q = 0;
    while (2*q+1 < count) { // While q has at least one child
      int c1 = 2*q+1;       // Two children
      int c2 = 2*q+2;
      int smaller;
      if (c2 == count)      // Only one child
         smaller = c1;
      else if (H[c1] < H[c2])
              smaller = c1;
      else smaller = c2;
      if (H[smaller] >= H[q]) break;   // Heap order restored
      swap(H[q],H[smaller]);
      q = smaller;
      } // end while loop
    return m;
 } // end deleteMin.
   

void heapsort(int[] A) {
    H = new int[A.length];
    count = 0;
    for (int i = 0; i < A.length; i++)
        add(A[i]);              // Build the heap
    for (int i = 0; i < A.length; i++)
        A[i] = deleteMin();     // Extract the items in increasing order
}

More tricks

1. You don't need to have a spare array H; you can work within A itself by taking items off the back of the array and building up H at the front of the array and then working in the opposite direction.

2. You can build the heap in time O(N) rather than O(N*log N) by building it from bottom up rather than top down.
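The two tricks can be combined in one routine. The sketch below is one common way to do it and makes two assumptions that differ from the code above: it uses a max-heap rather than a MinHeap, so that repeatedly moving the root to the back of the array leaves A in increasing order, and it builds the heap bottom-up by sifting down every non-leaf node, starting from the last one. The class name InPlaceHeapsortDemo and the helper siftDown are made up for the example.

```java
class InPlaceHeapsortDemo {
    // Restores max-heap order below position q within A[0..count-1].
    static void siftDown(int[] A, int q, int count) {
        while (2 * q + 1 < count) {
            int larger = 2 * q + 1;                       // left child
            if (larger + 1 < count && A[larger + 1] > A[larger]) {
                larger++;                                 // right child is larger
            }
            if (A[larger] <= A[q]) break;                 // heap order restored
            int tmp = A[q];
            A[q] = A[larger];
            A[larger] = tmp;
            q = larger;
        }
    }

    static void heapsort(int[] A) {
        int n = A.length;
        // Trick 2: build the heap bottom-up in O(N) total time.
        for (int q = n / 2 - 1; q >= 0; q--) {
            siftDown(A, q, n);
        }
        // Trick 1: the heap occupies A[0..count-1]; the sorted items
        // accumulate at the back of the same array.
        for (int count = n - 1; count > 0; count--) {
            int tmp = A[0];
            A[0] = A[count];
            A[count] = tmp;
            siftDown(A, 0, count);
        }
    }
}
```

The bottom-up build is linear because most nodes sit near the bottom of the tree and can only sift down a short distance.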