Recursion

Two statements are often made about recursion:

A. Recursion works by starting with a computation over a large example and computing the solution in terms of the same computation over 1 or more smaller examples.
B. Any iteration can be implemented as recursion. In fact, in ML, you have no choice, and in Scheme people will look down their noses if you make any other choice.

Unfortunately, these statements are mutually contradictory, since iterations often do not progress from large to small, they often go from small to large or equal size.

Example: Newton method for solving equations recurs from one value of X to another value of X. But the new value is only "smaller" in the sense that it is closer to the solution. There's nothing structurally simpler about it; it's just another floating point number.

(If your measure of the "size" of a problem is "How many more iterations until the loop exits?" then of course all iterations go from large to small. But that's circular.)


; Iterative Newton's method X[I+1] = X[I] - f(X[I])/f'(X[I])
; (newton F DF X0 EPS) uses Newton's method to compute a zero of 
; function F within EPS (i.e.  a value X such (abs (F X)) < EPS. 
; DF is the derivative of F.  X0 is the starting point.

(define (newton1 F DF X0 EPS)
  (do ((X X0 (- X (/ (F X) (DF X)))))  ; (variable initial update)
      ((< (F X) EPS) X) ; termination condition 
  )
)    

; compute sqrt 2; i.e. the zero of X^2 - 2.0.

(newton1 (lambda (X) (- (* X X) 2.0))
         (lambda (X) (* 2.0 X))
         10.0  0.00001)


; Same function defined recursively
(define (newton2 F DF X EPS)
   (if (< (F X) EPS) 
       X
       (newton2 F DF (- X (/ (F X) (DF X))) EPS)))


(newton2 (lambda (X) (- (* X X) 2.0))
         (lambda (X) (* 2.0 X))
         10.0  0.00001)

Extreme example: Interactive programs:

Iterative encoding:

procedure interactive1()
{ S = initialState;
  loop { I := read();
         if (quitInstruction(I)) then { quittingProtocol(S); return(); }
         S:=carryOut(I,S)
     }
}

Recursive encoding

function interactive2(S)
  { I := read();
    if (quitInstruction(I)) then { quittingProtocol(S); return(); }
    interactive2(carryOut(I,S));
  }

main() { interactive2(initialState); }

Tail recursion optimization

Besides the clumsiness of the recursive encoding of "interactive" is the fact that the calling sequence gets longer and longer; hence, the stack gets larger and larger and will eventually exhaust memory.

This can be fixed with tail-recursion optimization:

If the last step of function f is to call itself recursively, and if the value returned by the call to f is the value returned by the recursive call (as in newton2 and interactive2), then the activation record for the first call of f can be replaced by the activation record for the second call. (Note that their lexical context must be the same.)
The same is true of a call from f to g if the lexical context of f and g is the same.

So a compiler with tail-recursion optimization can execute interactive2 without ever blowing the stack. The only disadvantage is a slight overhead. Functional programming enthusiasts are very proud of this.

Sometimes a recursive program has to be twiddled a bit to make it fit the tail-recursion condition.

(define (length L) (if (null? L) 0 (+ 1 (length L))))
does not satisfy the condition, because after the recursive call returns, the function adds 1. However it can be rewritten
(define (length2 L N) (if (null? L) N (length2 L (+ N 1))))
(define (length1 L) (length2 L 0))
Here length2 does satisfy the condition.

My own opinion: If the only recursive call in a function is one that satisfies the tail-recursion condition, then generally it's easier to read if it's written iteratively.

When should you actually use recursion? (E. Davis' opinion)

(Assuming that the PL or the exam question gives you any choice in the matter.)

A. When the algorithm actually works by reducing a large problem to a smaller one of the same type or (especially) when it works by reducing a large problem to more than one smaller problems of the same type (divide and conquer).

B. Algorithms over recursively defined datastructures. E.g. traversing a tree.

function preOrder(T:tree)
{ doWhatever(T)
  for (C in T.children) preOrder(C);
}

C. When the algorithm has the following form:

function f(...)
 {  set up a lot of local state;
    recursive call to f;
    more computation involving the state set up before the recursive call;
 }

The advantage of the recursion in case C is that the saving and restoring of the local state is automatic. If you want to do this without recursion you have to set up your own stack, push the state onto it, and then pop it.

function f(...)
{ loop
   { if (base condition of recursion) then exitloop;
     set up local state;
     push local state onto stack;
   }  /* end loop
  loop
   { do end computation;
     if stack is empty then exitloop;
     state := pop stack;
   }
}

This is very rarely worth doing. Perhaps, if the first part of the algorithm creates a lot of state and only a little bit is used in the end computation, then if you use your own stack, you can save only the part you will need, whereas the recursive encoding will save the whole state. But you can usually accomplish this recursively with a little cleverness in the coding.

Example: quicksort satisfies both conditions A and C.

procedure quicksort(L,U : int; A: array)
{ if (L==U) then return;  /* Base case */
  else { M = partition(L,U,A);
         quicksort(L,M,A);
         quicksort(M,U,A);
       }
}

You sometimes see the second recursive call turned into a loop (note that this satisfies the "tail recursion" condition):

procedure quicksort(L,U : int; A: array)
{ if (L==U) then return;  /* Base case */
  while (L < U) {
      M = partition(L,U,A);
      quicksort(L,M,A);
      L = M;
    }
}

But that destroys the nice symmetry.

Getting rid of the first recursive call involves keeping a set of pairs of (M,U) values that you will work on later.

Bottom line: My original two statements should be reworded:

A. Recursion should be used when you are starting with a computation over a large example and computing the solution in terms of the same computation over 1 or more smaller examples.
B. Any iteration can be implemented as recursion, but that's not always a good idea.