[FOM] Simple Turing machines, Universality, Encodings, etc.

Fri Nov 2 00:59:24 EDT 2007

Alexander Smith wrote:
> The overall point I'm making is that universality is a hard concept 
> to pin down, and I suspect that there may never be a definition that 
> fits everyone's intuition as to whether a system should be considered
> universal or not. Likewise with defining what information should be
> included as part of a system, or group of systems, and what is part
> of its initial condition.

While some on this list might object to those who challenge the
authority of the accumulated conventional wisdom, I come down on the
side of those like you who do, who are likely in the long run not only
to understand its rationale at a deeper level but be in a better
position to improve on it.

I see two main issues underlying your concerns, the meaning of the
Church-Turing thesis, and what to do about nonterminating computation.

============

One statement of the former is that every effective procedure can be
carried out by a Turing machine.  The difficulty of defining "effective
procedure" is nicely brought out by the example of a Turing machine for
which the rest of the tape, beyond the finite portion designated as the
input for the present run, has as its n-th bit whether the n-th Turing
machine suitably enumerated halts on blank tape.  (Prize committee
member Gregory Chaitin would describe this as the binary representation
of his real number Omega.)  Such a machine cannot hope to be emulated by
a Turing machine with blank tape in place of that very useful information.

Like so many other definitional challenges, this one of defining
computability has yielded satisfactorily to the axiomatic method, in
this case as represented by recursion theory.  In the form to which it
stabilized during the three decades starting in the mid-1930s, recursion
theory defines a program to be a natural number denoting a partial
function f: N --> N, that is, a (total) function f: W --> N where W is a
subset of N called the *domain* of f.    The partial functions so
denoted are called the partial recursive functions, p.r. functions for
short.  A p.r. function with domain N is called simply a *recursive
function*, or total recursive function by those few who understand
"function" to include the partial functions.  The p.r. function denoted
by x is conventionally written phi_x.  Programmers call x a program,
mathematicians call it a Goedel number, but it comes to the same thing.

To be a member of the domain of phi_x is to be an input on which program
x *halts*; on nonmembers one says that program x *diverges*.  When y is
in the domain of phi_x, phi_x(y) is understood to be the value
eventually returned by the program x when started on input y.

While those not inclined to question authority may want to jump over
Chapter 1 of Hartley Rogers' 1967 text "Theory of Recursive Functions
and Computability" in order to get to the theorems, if you haven't
already read it I would expect you would greatly enjoy it as it
addresses many of the issues that concern you, as prologue to its
careful development of the above approach.

If recursion theory is the axiomatic answer to "what is computability?",
what are its axioms?  These were originally developed in the 1930s, most
notably by Kleene, Church, and Turing.  In 1936 Kleene defined the
partial recursive functions to be those obtainable by composition and
minimalization (an elegant abstraction of iteration) from a starting set
of basic functions all obviously well within the scope of the
Church-Turing thesis.  Martin Davis's 1958 book *Computability and
Unsolvability* replaces Kleene's basic functions with those of Julia
Robinson from 1950: successor, projection, addition, monus (max of 0 and 
minus), and multiplication.  This is then relativized to an arbitrary 
set A of natural numbers by including as well the characteristic 
function of A, allowing the notion of an A-recursive function and the 
development of a hierarchy of degrees of computability.

What is remarkable about the process of coming up with basic functions
is that it completes so quickly under a wide range of choices of
obviously simple-to-compute functions.  This was noticed by many in the
1930s, and amounts to an even stronger version of the phenomenon that
Wolfram has more recently called the Principle of Computational
Equivalence, PCE: that the first few functions thrown into the pool
produce trivial behavior, and then suddenly you have all the partial
recursive functions.  This phenomenon is a consequence of the choice of
composition and minimization as the closure operations: the more general
Church-Turing thesis makes no assumption about the closure operations
other than that they be effective in some sense, and does not of itself
entail PCE.  Wolfram's application of PCE is to cellular automata, which
I'll address shortly under the heading of what to do about
nonterminating computation.  Cellular automata incorporate the essence
of composition and minimalization, which is why PCE shows up in that
context.

More recently Manuel Blum (whose advisor *and* spouse are both on the
present prize committee) and others such as Yiannis Moschovakis in Logic
Colloquium'69, Sol Feferman in Logic Colloquium'76, and Jens Fensted in
his 1980 book *General recursion theory: An axiomatic approach*,
have since the mid-1960s advocated more abstract axioms, e.g.
by taking certain theorems of recursion theory such as the s-m-n theorem
as its axioms, in place of the customary listing of basic functions.
The justification here is that since few if any of the interesting
proofs refer to any of the customary basic functions, what's the point
of having them in an axiomatization of the subject?  Feferman makes the
additional argument that defining recursion theory in terms of
particular basic functions is like defining a vector space in terms of a
particular choice of basis.

==============

The second issue is what to do about nonterminating computations.  This
is particularly relevant to cellular automata such as Conway's Life.
Whereas partial recursive functions withdraw hermit-like until they
emerge to announce the answer, in Conway's case Life goes on.  With some
starting configurations Life eventually settles down into a periodic
pattern, with others not.  Wolfram has proposed a finer classification
of both these cases, distinguishing the periodic patterns according to
whether or not they disappear altogether, and distinguishing the
aperiodic ones according to whether they grow indefinitely at a fixed
rate or exhibit more complicated behavior.

It is not clear to me whether the case of the disappearing pattern
warrants separate attention.  The steadily growing vs. more varied ones
however seems on the face of it to be a worthwhile distinction given
that one might expect to find the latter a source of more interesting
behavior than the former.  In 1998 however Baldwin and Shelah in a paper
at http://www.math.uic.edu/~jbaldwin/pub/cabs7.ps give a result
suggesting that the latter distinction may not as be well-defined as
might seem at first sight.

In an earlier FOM post I suggested a definition of universality of a
particular cellular automaton or other nonhalting automaton A in terms
of two sets, an r.e.-complete set K of finite initial configurations
depending on A (to save A the additional decoding effort needed for a
fixed set K, in the spirit of trying to promote at least one 2,3 machine
to universality, the whole point of the present prize competition), and
a set H of configurations recursive in the size of the initial
configuration.  By "recursive in" I mean that the time needed to
determine whether A has entered a member of H is a recursive function of
the size of the initial configuration; this notion is quite robust in
that its meaning is not changed when "space" is substituted for "time".
Earlier I had suggested allowing H to be recursive in the size of the
whole computation but then realized via a two-PDA-type argument that my
criterion was *too* generous, admitting PDAs for example.  (I still
don't understand how Hector Zenil managed to interpret this PDA-based
criticism of my own criterion as a criticism of Smith's proof, which I 
objected to in terms of linear bounded automata.)

In the interest of lowering the bar for universality in order to admit
2,3 machines it has been proposed to allow nonblank medium (tape,
surface, volume, etc.) outside the finite initial configuration.  With
the caveat that this may make some machines universal that might not
otherwise have quite made the grade on a bureaucratic technicality, I
have no objection in principle to dispensations of this sort just so
long as they do not inadvertently admit provably nonuniversal machines
into the caste of recognized universal machines, especially not such
untouchables as LBAs and PDAs.  Thus a periodic ambient background
should be ok (since a Turing machine with a few more states could
generate that background on its own), but certainly not an encoding of
some r.e.-complete set such as Chaitin's Omega: taking your copy of
Chaitin into an automata theory exam would be like taking your copy of
Feynman into a physics exam.

Vaughan Pratt