[FOM] 183:Ideas in Proof Checking 2

Harvey Friedman friedman at math.ohio-state.edu
Sun Jun 22 17:48:58 EDT 2003


In #182, 10:50PM 6/21/03, I promised the following. Our IPC 
(interactive proof system) will be driven by

1) a truly appropriate language for the formalization of mathematics;
2) model trees (Beth tableaux);
3) trivial proofs;
4) interaction protocols for constructing nontrivial proofs.

In this posting I will talk about 2 and 4, at least to the extent 
that I can before I take up 1.

PS: This posting got too long, and some of 2,4 is spilling over into 
posting #184.

I think that the right way of dealing with 3 will only be apparent 
after 1 is specified, and I work out some examples. I have a lot of 
confidence that a good notion of 3 will emerge. I.e.,

THESIS. A naturally occurring mathematical statement either has a 
trivial proof, or there are appropriate naturally associated 
mathematical statements in the form of Lemmas that naturally precede 
it that cause it to have a trivial proof. Furthermore, trivial proofs 
can be trivially obtained from the computer, relative to obvious 
imports from the human.

Lemma rich exposition is not the norm among mathematicians. But some 
of us do practice this. It is a rather effective form of exposition, 
and leads to more error free publications. It does require extra 
effort, but it is well worth it for other humans. Now we have another 
use of it - the interactive creation of absolute certainty.

I'm not sure that an analysis of trivial proofs is so critical. My 
plan is to make the human/machine interaction in the proof system 
being designed so easy and simple, that it is actually a lot of fun, 
especially on theorems with trivial proofs.

I am well aware that the necessary presence of a lot of possibly 
undefined terms can make things more complicated than otherwise. 
However, I believe that I have enough user interface tricks to make 
this painless.

1. MODEL TREES (BETH TABLEAUX) - TOYS.

In this section, we give the theory behind the IPC proposal.

To make the ideas clear, I will confine the discussion to the toy 
system of ordinary propositional calculus, with multiple conjunction 
and disjunction. I will also continue with this toy case also in the 
next section, which is design o the IPC.

I will then redo everything for the semitoy system of ordinary 
predicate calculus with equality. Then I will address the semiserious 
system of free predicate calculus. Any further discussion will be 
postponed until 1) above is addressed: i.e., the truly appropriate 
language for the formalization of mathematics.

We have atoms pn, n >= 1. We use connectives not, and, or, implies, iff.

We define formulas as follows.

i. pn is a formula.
ii. if A is a formula then (notA) is a formula.
iii. if A1,...,An are formulas, n >= 2, then (A1 or ... or An) and 
(A1 and ... and An) are formulas.
iv. if A,B are formulas then (A implies B) and (A iff B) are formulas.

We will take up the matter of precedence and parentheses when we get 
to the real thing, later.

The extended formulas are

T(A)
F(A).

The first means that "we want A to be true". The second means that 
"we want A to be false".

Suppose that we want to prove A. We build a model tree for F(A).

This is an attempt to find a model (truth assignment) in which A is false.

If A is provable, then we are going to fail. This corresponds to the 
model tree closing off, in the sense that all of the leaves are 
absurd.

More specifically, we start with the root labeled F(A).

We are going to nondeterministically create a tree going upward from the root.

At any stage, we will have a nonempty tree in which every node is 
labeled with a finite set of extended formulas.

There are four kinds of moves that can be made at any stage.

i. Two or more new nodes are created which lie just above some 
existing terminal node. The previously terminal node is not terminal 
any more. One or more formulas are placed in each of these new nodes. 
This is called a splitting.

ii. No new nodes are created. Instead, for one of the nodes, some 
formulas are added.

iii. Certain terminal nodes are marked as absurd.

The model tree is declared ABSURD if and only if all terminal nodes 
are marked as absurd.

Generally, the strategy is to attempt to get halting with as few 
splits as possible. Generally, this will cut down the size of the 
tree.

Here are the basic rules.

For iii, a terminal node is marked absurd if and only if when looking 
along the path from the root up to and including the terminal node. 
one finds some T(B) and F(B). They may be from different nodes.

For ii. Any one of the following is a legal move.

a. Add F(B) if T(notB) is present.
b. Add T(B) if F(notB) is present.
c. Add any of T(A1),...,T(An) if T(A1 and ... and An) is present.
d. Add any of F(A1),...,F(An) if F(A1 or ... or An) is present.
e. Add any of T(A),F(B) if F(A implies B) is present.
f. Add  any of T(A implies B),T(B implies A) if T(A iff B) is present.

For i. We first identify a terminal node. Then we identify a nonempty 
list from the forms not covered in ii above, that lie on the path 
from the root to the terminal node:

T(A1 or ... or An)
F(A1 and ... and An)
T(A implies B)
F(A iff B).

If the list has k entries, then we will have a split whose size is 
the product of the arities of the k entries. The arity of the above 
entries are, respectively, n,n,2,2.

We will be content to give the details in a simple case, leaving the 
general case to the reader.

Suppose

T(A or B)
F(C iff D)

have been identified on the path from the root to the terminal node.

Then we split into four new terminal nodes, which are respectively labeled with

T(A),T(C),F(D)
T(A),T(D),F(C)
T(B),T(C),F(D)
T(B),T(D),F(C).

THEOREM 1.1. If there is an absurd model tree starting with the root 
labeled F(A), then A is provable (is a tautology). Conversely, if A 
is provable (is a tautology), then there is an absurd model tree 
starting with the root labeled F(A).

It is also useful to have the following sharper form.

THEOREM 1.2. A1,...,An tautologically implies B if and only if there 
is an absurd model tree starting with T(A1),...,T(An),F(B).

When giving examples of model trees, it is helpful to delete entries 
so the tree looks less cluttered.

EXAMPLE. ((p implies q) implies p) implies p.

1. F(((p implies q) implies p) implies p).
1. T((p implies q) implies p), F(p).
1.1. F(p implies q)).
1.2. T(p). ABSURD.
1.1. T(p), F(q). ABSURD.

EXPLANATION. Each line represents a move. The nodes are 1, 1.1, 1.2. 
The first line is the initialization of the problem. The second line 
adds a formula, and then deletes the original formula. (Any deletions 
are OK). The third and fourth lines are a binary split. The node 1.2 
is recognized as absurd. The fifth line adds a formula to node 1.1, 
and deletes the formula in line 3. Then node 1.1 is recognized as 
absurd. Finally, all terminal nodes are recognized as absurd.

There is an important additional optional move we add to the mix.

vi. Split at any time in any legal way on any terminal node.

Here the splitting is given by a list of distinct formulas, 
A1,...,An. There are 2^n new terminal nodes placed above the given 
terminal node. Each new node is labeled with a different set of n 
formulas, which are just the A's with T's and F's put in front.

THEOREM 1.3. Theorems 1.1 and 1.2 still hold with this new move vi allowed.

Rule vi is very useful - even ESSENTIAL - in the full blown real 
thing we are designing. Here is the reason.

The reason is to create manageable imports. There are also other 
important reasons.

I used to think that at the outset of the proof of a statement, one 
should first do all of the needed imports.

This is awkward, unfriendly, and difficult to do completely.

What instead you want to do is to go ahead and make these model tree 
moves. Then at some point you say

"gee, if only I knew this, I could do what I want now."

So you just simply enter what you need. Don't worry at all about the 
fact that what you need has not been established. Since the proof you 
are constructing interactively is supposed to be nearly trivial, you 
know that it is going to be easy to justify this entry when you feel 
like it at a later time. The order in which things are done is of no 
consequence, as long as they all get done.

And you don't have to remember that there is something left over. The 
computer keeps an inventory of model trees that you later need to 
close off. At some point, you just complete all these little tasks 
that the computer asks you to do, in the form of ABSURDITY CREATION. 
All organization of tasks is done by the computer.

The computer is even good at looking up whether you have entered 
something that really is just a legal import. Then the computer won't 
bother you about it later.

In fact, all the time you spend at the computer slowly making your 
mouse clicks and entering some terms, the computer is in the 
background trying to close off the model tree all by itself, and also 
justifying the stuff you have left over for later - also, possibly by 
closing off other model trees. You may be pleasantly surprised that 
the computer is working very effectively as you are working. Even 
many nodes - maybe not all nodes - that you thought you would have to 
work on, will have already been branded as absurd by the computer.

NOTE: Now it becomes clear that a proper workable notion of TRIVIAL 
PROOF is, after all, important. This is what the computer will look 
for emanating from any newly created terminal node.

2. IMPLEMENTATION OF MODEL TREES - TOYS.

It is nasty to look at a tree displayed on the screen. That is why, 
in my implementation of the previous section, I will protect the 
human entirely from actual model trees.

But at what cost? No cost!

The computer keeps a list of tasks for the human. While the human is 
working on these tasks, the computer tries to solve its inventory of 
tasks not yet solved on its own. Any task that it solves on its own 
is removed from the inventory. Between the human and the computer, 
hopefully the inventory is reduced to nothing.

All tasks have the following form. A list of signed formulas are 
given. The human is supposed to read them and see that they lead to 
absurdity. Of course, if the list contains a T(B),F(B), then the 
computer won't even bother to present this task (list). It will have 
already been solved.

So the human is presented with a nonempty list of signed formulas. 
The human can take any of the following actions.

1. I don't feel like working on this task now. Give me another one.

2. Click on T(A1 and ... and An). Then T(A1),...,T(An) appear on the list.

3. Click on F(A1 or ... or An). Then F(A1),...,F(An) appear on the list.

4. Click on F(A implies B). Then T(A), F(B) appear on the list.

5. Click on T(A iff B). Then T(A implies B), T(B implies A) appear on the list.

6. Click on T(notA). Then F(A) appears on the list.

7. Click on F(notA). Then T(A) appears on the list.

8. Click on T(A1 or ... or An). Then T(A1) appears on the list. Also 
the computer stores the present list appended with, separately, 
T(A2),...,T(An), thereby hiding from view n-1 tasks, all of which 
will be eventually sprung on the human at later times.

9. Click on F(A1 and ... and An).  Then T(A1) appears on the list. 
Also the computer stores the present list appended with, separately, 
F(A2),...,F(An), thereby hiding from view n-1 tasks, all of which 
will be eventually sprung on the human at later times.

10. Click on T(A implies B).  Then T(B) appears on the list. Also the 
computer stores the present list appended with F(A), separately, 
thereby hiding from view 1 task, which will be eventually sprung on 
the human at a later time.

11. Click on F(A iff B).  Then T(A),F(B) appears on the list. Also 
the computer stores the present list appended with F(A),T(B) 
separately, thereby hiding from view 1 task, which will be eventually 
sprung on the human at a later time.

12. Doubleclick on any entry. The entry disappears.

13. You can restore the disappearing entries from 8, one at a time.

The idea behind 12 is to make things less cluttered. The computer 
always considers the disappeared entries as part of the list. 
However, they will always remain marked to disappear from view, 
unless 13, 14 instructs them to be put back into view.

Tasks are considered completed when and only when an absurdity, 
T(B),F(B) appears.

There is one more move of essential importance.

14. The user can simply pick any extended formula out of the air, and 
put it on the list. Presumably this makes the task easier - that is 
the point of putting it on the list. The cost is that the computer 
then stores the following task to be sprung later on the human: the 
same list but with the new extended formula dualized.

I.e., the human wishes to add the new formula T(A) to the list L. 
Then the computer stores the task consisting of: L together with F(A) 
for later. Or the human adds the new formula F(A) to L. Then the 
computer stores the task: L together with T(A).

Of course this is an obvious waste of time if T(A) or F(A) is already 
on the list. The user is prevented from such foolishness.

This is also subject to the undo operation.

This last move occurs in the following typical situation. The human 
realizes that if only they had T(A) then they would know just how to 
proceed, and they have confidence that they can get T(A) when asked 
to. They will get asked to in the form of being presented with the 
associated task with F(A), but a later time when they are ready to 
think about that.

That's it. If the Lemma is A, then the original task is

F(A).

More generally, suppose that the Lemma takes the form:

Assume A1,...,An. Then B.

Then the original task is:

T(A1),...,T(An),F(B).

I want to indicate how the computer stores the not yet completed tasks.

At any stage, the computer has a list of one or more not yet 
completed tasks. Recall that a task gets completed when and only when 
an absurdity becomes present in the form of some T(B),F(B).

Each task is merely a nonempty finite set of signed formulas. 
However, we want more data. We mark some of them as disappearing from 
view, and we assign a number to them, so that we can implement 14 
above. This extra data is maintained as the list changes, so that 
when the list is revisited because of, say, 1, we have maintained the 
instructions on how to implement 14 above.

We indicate what the computer does according to moves 1-14.

First of all, the computer keeps a count of the tasks not yet 
completed, and displays this to the human. It also displays the 
position of the present task being displayed, in this list. E.g., the 
computer might have 18 incompleted tasks, and the one on display is 
number 7.

1. The present task is put at the end of the list of incomplete tasks 
(default), or at a position specified by the human by number.

2-7. The appropriate adjustment is made in the present task.

8-11. The new hidden tasks become the very next task (default), or is 
put into the list of incomplete tasks at the position specified by 
the human.

12. The entry is put at the end of the list of disappearing entries 
for this task.

13. Make them appear, one at a time, from the back of the list of 
disappearing entries.

14. The new hidden task becomes the very next task (default), or is 
put at a position specified by the human by number.

If a task becomes completed - i.e., gets an absurdity of the form 
T(B),F(B), then it is deleted from the list of incomplete tasks.

The object of the human is to get all tasks deleted.

3. MODEL TREES - PREDICATE CALCULUS.

We use variables xn, n >= 1. We use the usual connectives not, and, 
or, implies, iff. We use the quantifiers forall, therexists. We use 
constant symbols cn, n >= 1, n-ary relation symbols Rnm, n,m >= 1, 
and n-ary function symbols Fnm, n,m >= 1. No equality in this version.

More specifically, we start with the root labeled F(A).

We are going to nondeterministically create a tree going upward from the root.

At any stage, we will have a nonempty tree in which every node is 
labeled with a finite set of extended formulas.

There are four kinds of moves that can be made at any stage.

i. Two or more new nodes are created which lie just above some 
existing terminal node. The previously terminal node is not terminal 
any more. One or more formulas are placed in each of these new nodes. 
This is called a splitting.

ii. No new nodes are created. Instead, for one of the nodes, some 
formulas are added.

iii. Certain terminal nodes are marked as absurd.

The model tree is declared ABSURD if and only if all terminal nodes 
are marked as absurd.

Generally, the strategy is to attempt to get halting with as few 
splits as possible. Generally, this will cut down the size of the 
tree.

Here are the basic rules.

For iii, a terminal node is marked absurd if and only if when looking 
along the path from the root up to and including the terminal node. 
one finds some T(B) and F(B'), where B' is an alphabetic variant of B 
(change of bound variables). They may be from different nodes.

For ii. Any one of the following is a legal move.

a. Add F(B) if T(notB) is present.
b. Add T(B) if F(notB) is present.
c. Add T(A1),...,T(An) if T(A1 and ... and An) is present.
d. Add F(A1),...,F(An) if F(A1 or ... or An) is present.
e. Add T(A),F(B) if F(A implies B) is present.
f. Add T(A implies B),T(B implies A) if T(A iff B) is present.
g. Add T(A[x/t]) if T((forall x)(A)) is present and t is 
substitutable for x in A.
h. Add F(A[x/t]) if F((therexists x)(A)) is present and t is 
substitutable for x in A.
i. Add T(A[x/v]) if T((therexists x)(A)) is present and v is 
substitutable for x in A and v is not free in the formulas from the 
root up through the node.
j. Add F(A[x/v]) if F((forall x)(A)) is present and v is 
substitutable for x in A and v is not free in the formulas from the 
root up through the node.

For i. we first identify a terminal node. Then we identify a nonempty 
list from the forms not covered in ii above, that lie on the path 
from the root to the terminal node:

T(A1 or ... or An)
F(A1 and ... and An)
T(A implies B)
F(A iff B).

If the list has k entries, then we will have a split whose size is 
the product of the arities of the k entries. The arity of the above 
entries are, respectively, n,n,2,2.

THEOREM 3.1. A1,...,An logically implies B if and only if there is an 
absurd model tree starting with T(A1),...,T(An),F(B).

We can also allow splitting on an arbitrary formula at any stage, and 
things are still OK.

4. IMPLEMENTATION OF MODEL TREES - PREDICATE CALCULUS.

The computer keeps a list of tasks for the human. While the human is 
working on these tasks, the computer tries to solve its inventory of 
tasks not yet solved on its own. Any task that it solves on its own 
is removed from the inventory. Between the human and the computer, 
hopefully the inventory is reduced to nothing.

All tasks have the following form. A list of signed formulas are 
given. The human is supposed to read them and see that they lead to 
absurdity. Of course, if the list contains a T(B),F(B), then the 
computer won't even bother to present this task (list). It will have 
already been solved.

So the human is presented with a nonempty list of signed formulas. 
The human can take any of the following actions.

1. I don't feel like working on this task now. Give me another one.

2. Click on T(A1 and ... and An). Then T(A1),...,T(An) appear on the list.

3. Click on F(A1 or ... or An). Then F(A1),...,F(An) appear on the list.

4. Click on F(A implies B). Then T(A), F(B) appear on the list.

5. Click on T(A iff B). Then T(A implies B), T(B implies A) appear on the list.

6. Click on T(notA). Then F(A) appears on the list.

7. Click on F(notA). Then T(A) appears on the list.

8. Click on T(A1 or ... or An). Then T(A1) appears on the list. Also 
the computer stores the present list appended with, separately, 
T(A2),...,T(An), thereby hiding from view n-1 tasks, all of which 
will be eventually sprung on the human at later times.

9. Click on F(A1 and ... and An).  Then T(A1) appears on the list. 
Also the computer stores the present list appended with, separately, 
F(A2),...,F(An), thereby hiding from view n-1 tasks, all of which 
will be eventually sprung on the human at later times.

10. Click on T(A implies B).  Then T(B) appears on the list. Also the 
computer stores the present list appended with F(A), separately, 
thereby hiding from view 1 task, which will be eventually sprung on 
the human at a later time.

11. Click on F(A iff B).  Then T(A),F(B) appears on the list. Also 
the computer stores the present list appended with F(A),T(B) 
separately, thereby hiding from view 1 task, which will be eventually 
sprung on the human at a later time.

12. Click on T((forall x)(A)). Then the computer asks which term 
should be used. The human replies with t. The computer determines 
whether t is substitutable for x in A. If so, then T(A[x/t]) appears 
in the list. If not, the human is notified, and is asked to pick 
another term.

13. Click on F((therexists x)(A)). Then the computer asks which term 
should be used for x. The human replies with t. The computer 
determines whether t is substitutable for x in A. If so, then 
F(A[x/t]) appears in the list. If not, the human is notified, and is 
asked to pick another term. The default choice is x.

14. Click on T((therexists x)(A)). Then the computer asks for a 
variable v to be used for x. The human replies with v. The computer 
determines whether v is substitutable for x in A and v is not free in 
the other formulas in the list. If so, then T(A[x/v]) appears on the 
list. If not, the human is notified, and is asked to pick another 
variable. The default choice is x.

15. Click on F((forall x)(A)). Then the computer asks for a variable 
v to be used for x. The human replies with v. The computer determines 
whether v is substitutable for x in A and v is not free in the other 
formulas in the list. If so, then F(A[x/v]) appears on the list. If 
not, the human is notified, and is asked to pick another variable. 
The default choice is x.

16. Doubleclick on any entry. The entry disappears.

17. You can restore the disappearing entries from 16, one at a time.

Tasks are considered completed when and only when an absurdity, 
T(B),F(B') appears, where B,B' are alphabetic variants.

There is one more move of essential importance.

18. The user can simply pick any extended formula out of the air, and 
put it on the list. Presumably this makes the task easier - that is 
the point of putting it on the list. The cost is that the computer 
then stores the following task to be sprung later on the human: the 
same list but with the new extended formula dualized.

I.e., the human wishes to add the new formula T(A) to the list L. 
Then the computer stores the task consisting of: L together with F(A) 
for later. Or the human adds the new formula F(A) to L. Then the 
computer stores the task: L together with T(A).

Of course this is an obvious waste of time if T(A) or F(A) is already 
on the list. The user is prevented from such foolishness.

This is also subject to the undo operation.

This last move occurs in the following typical situation. The human 
realizes that if only they had T(A) then they would know just how to 
proceed, and they have confidence that they can get T(A) when asked 
to. They will get asked to in the form of being presented with the 
associated task with F(A), but a later time when they are ready to 
think about that.

That's it. If the Lemma is A, then the original task is

F(A).

More generally, suppose that the Lemma takes the form:

Assume A1,...,An. Then B.

Then the original task is:

T(A1),...,T(An),F(B).

I want to indicate how the computer stores the not yet completed tasks.

At any stage, the computer has a list of one or more not yet 
completed tasks. Recall that a task gets completed when and only when 
an absurdity becomes present in the form of some T(B),F(B).

Each task is merely a nonempty finite set of signed formulas. 
However, we want more data. We mark some of them as disappearing from 
view, and we assign a number to them, so that we can implement 14 
above. This extra data is maintained as the list changes, so that 
when the list is revisited because of, say, 1, we have maintained the 
instructions on how to implement 14 above.

We indicate what the computer does according to moves 1-18.

First of all, the computer keeps a count of the tasks not yet 
completed, and displays this to the human. It also displays the 
position of the present task being displayed, in this list. E.g., the 
computer might have 18 incompleted tasks, and the one on display is 
number 7.

1. The present task is put at the end of the list of incomplete tasks 
(default), or at a position specified by the human by number.

2-7, 12-15. The appropriate adjustment is made in the present task.

8-11. The new hidden tasks become the very next task (default), or is 
put into the list of incomplete tasks at the position specified by 
the human.

16. The entry is put at the end of the list of disappearing entries 
for this task.

17. Make them appear, one at a time, from the back of the list of 
disappearing entries.

18. The new hidden task becomes the very next task (default), or is 
put at a position specified by the human by number.

If a task becomes completed - i.e., gets an absurdity of the form 
T(B),F(B'), where B' is an alphabetic variant of B, then it is 
deleted from the list of incomplete tasks.

The object of the human is to get all tasks deleted.

In the next installment, we will consider free predicate calculus, 
which is predicate calculus with equality and undefined terms.

*********************************************

I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 183rd in a series of self contained numbered postings to 
FOM covering a wide range of topics in f.o.m. The list of previous 
numbered postings #1-149 can be found at 
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html  in the FOM 
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:

150:Finite obstruction/statistics  8:55AM  6/1/02
151:Finite forms by bounding  4:35AM  6/5/02
152:sin  10:35PM  6/8/02
153:Large cardinals as general algebra  1:21PM  6/17/02
154:Orderings on theories  5:28AM  6/25/02
155:A way out  8/13/02  6:56PM
156:Societies  8/13/02  6:56PM
157:Finite Societies  8/13/02  6:56PM
158:Sentential Reflection  3/31/03  12:17AM
159.Elemental Sentential Reflection  3/31/03  12:17AM
160.Similar Subclasses  3/31/03  12:17AM
161:Restrictions and Extensions  3/31/03  12:18AM
162:Two Quantifier Blocks  3/31/03  12:28PM
163:Ouch!  4/20/03  3:08AM
164:Foundations with (almost) no axioms, 4/22/0  5:31PM
165:Incompleteness Reformulated  4/29/03  1:42PM
166:Clean Godel Incompleteness  5/6/03  11:06AM
167:Incompleteness Reformulated/More  5/6/03  11:57AM
168:Incompleteness Reformulated/Again 5/8/03  12:30PM
169:New PA Independence  5:11PM  8:35PM
170:New Borel Independence  5/18/03  11:53PM
171:Coordinate Free Borel Statements  5/22/03  2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals  5/34/03  1:55AM
173:Borel/DST/PD  5/25/03  2:11AM
174:Directly Honest Second Incompleteness  6/3/03  1:39PM
175:Maximal Principle/Hilbert's Program  6/8/03  11:59PM
176:Count Arithmetic  6/10/03  8:54AM
177:Strict Reverse Mathematics 1  6/10/03  8:27PM
178:Diophantine Shift Sequences  6/14/03  6:34PM
179:Polynomial Shift Sequences/Correction  6/15/03  2:24PM
180:Provable Functions of PA  6/16/03  12:42AM
181:Strict Reverse Mathematics 2:06/19/03  2:06AM
182:Ideas in Proof Checking 1 10:50PM  6/21/03

Harvey Friedman


More information about the FOM mailing list