[FOM] Pratt on categorical foundations

Mon Feb 20 08:01:16 EST 2012

Is categorical foundations of mathematics philosophically coherent in  
the sense that set theoretic foundations of mathematics is, or finite  
set theory for finite mathematics, or PA is for restricted kinds of  
mathematics?

How does simplicity figure in to the discussion? How would you compare  
the simplicity of categorical foundations versus set theoretic  
foundations?

Harvey Friedman

On Feb 20, 2012, at 3:23 AM, Vaughan Pratt wrote:

On 2/18/2012 11:15 AM, Michael Lee Finney wrote:
> Category theory has the problem that it is based on functions which
> only exist inside of limited sets. Therefore, trying to apply category
> theory to larger areas has foundational issues. Even if you define
> functions using classes, you only move the problem back a step. There
> are some attempts to use Category theory as the foundations instead of
> Set Theory, but so far as I know they have either not been succesful
> or very awkward. I haven't gotten into category theory much, so I
> can't give you any references here. I only care about it relative to
> foundational issues.

I would say category theory suffers much less from these sorts of  
technical problems than from the pedagogical problem of being two  
steps removed from logic.  These steps are not necessarily easily  
taken by those accustomed to thinking of every mathematical object as  
a set that is uniquely determined by those entities of the  
mathematical universe that belong to it, as the heated debates here  
back in 1998 about the relevance to FOM of category theory made clear.

The first step away from quantificational logic is to algebra.  This  
step shifts the emphasis from truth-valued sentences to mathematical  
objects denoted by terms valued in one or more domains of individuals.  
Equality emerges as the dominant relation, and functions fill in for  
existential quantification.

Syntactically one can always skolemize, but the line between syntax  
and semantics is not as sharp in algebra as in logic.  Many operations  
arising in a given branch of mathematics that we take for granted as  
integral to the algebraic language of that branch might just as well  
be considered skolem functions derived from a formalization of that  
branch based on a logical language consisting of only a few  
fundamental relations.

The statement that f: A --> B is a bijection is a basic example for  
set theory as such a branch of mathematics.  In logic this property is  
expressed by saying that for all a,a', f(a) = f(a') implies a = a' (f  
is injective), and for all b there exists a such that f(a) = b (f is  
surjective).  Algebra skolemizes the existential quantifier in this  
definition of surjectivity by introducing the inverse f': B --> A into  
the language as an equal partner with f and restating the bijection  
property as the identities f'(f(a)) = a and f(f'(b)) = b.

The second step is from algebra to category theory, which dispenses  
with the distinctions between elements and functions and between  
application and composition.  It is hard to overestimate the value of  
this simplification, as I'll try to demonstrate with some examples  
from topos theory, which is just a small corner of category theory.   
My apologies in advance for the length, but my experience has been  
that merely listing the topos axioms conveys the subject about as well  
as E = mc^2 conveys general relativity.

The bijection example can be simplified to f'f = 1_A: A --> A and ff'  
= 1_B: B --> B where 1_A and 1_B denote the identity functions on  
respectively A and B.  Each equation constitutes a commutative diagram  
that eliminates elements and application altogether.

But these equations can also be written f'fa = a and ff'b = b where a:  
S --> A is understood as an element of A of sort S, and likewise b: T  
--> B as an element of B of sort T.  This looks more like algebra,  
except that a and b are now morphisms at the same level as f and g,  
application is realized as composition, and the shorter equations  
above can be recovered by canceling a and b on the right of their  
respective equations.

This way of thinking simplifies the type structure of algebra by  
eliminating the element-function distinction and the application- 
composition distinction.  But it does far more than just that, by  
cleanly introducing a notion of sort.

For set theory as a (rather degenerate) branch of mathematics there is  
only one sort of element, namely the atom, whose sort can be  
identified with the singleton 1.  Up to isomorphism every set is  
simply a sum (disjoint union, coproduct) of atoms, one for each element.

Graph theory as another branch of mathematics is a slight but very  
important extension of set theory.  Whereas sets contain only atoms,  
graphs contain vertices (which work like atoms) and edges (whose  
incidence relations are mediated by vertices).  There are sorts V for  
vertices and E for edges, identifiable with respectively the one- 
vertex graph with no edges and the two-vertex graph with a single edge  
connecting them, just as we identified the sort "atom" of set theory  
with the singleton set 1.  A vertex of G is a morphism v: V --> G  
while an edge is a morphism e: E --> G.  The two vertices of E  
manifest as the two morphisms from V to E, since vertices of G are  
morphisms from V to G.

As stipulated, V has no edges.  Hence there cannot be a morphism from  
E to V, making this the category of irreflexive graphs.  (To digress  
for a moment, for reflexive graphs V would have an edge in the form of  
a morphism to it from E, necessarily representing a self-loop in V.   
This morphism would have the evident compositions at each end with  
each of the two morphisms from V to E, namely the identity function  
1_V at one end and the two non-identity retractions on E at the other.)

But whereas a set is just the disjoint union of its atoms, a graph is  
more than just the disjoint union of its vertices and edges.  The  
notion of incidence in a graph demands some quotienting, namely that G  
be the result of identifying certain endpoints of its edges.  That is,  
whereas the category of sets up to any given size is just the closure  
of the category containing the singleton 1 under sums (coproducts) up  
to that size, the category of graphs must be the closure of V and E  
under colimits, not just coproducts.  Unlike coproducts, colimits must  
take the possibility of quotienting into consideration or the only  
graphs we would obtain would be isolated edges with twice as many  
vertices as edges, aka the *free* graphs generated by pairs of sets.

The singleton set 1 of set theory is not forgotten in graph theory  
however.  It has a counterpart among graphs, namely the *final* graph  
1 with one vertex and one edge, a self-loop.  For any graph G there is  
an evident unique morphism !_G: G --> 1.  (For the category of  
reflexive graphs, V coincides with 1, but not for that of irreflexive  
graphs.) The fact that 1 contains an edge will play a very important  
role in 11 paragraphs time (apologies that this message is so long).

The identification of elements with functions extends to yet other  
very convenient identifications, namely of pairs (more generally n- 
tuples) with functions, and likewise subsets with functions.

In the category of sets, an n-tuple of A is simply a morphism from n  
to A where n = 1 + 1 + ... + 1 is any n-element set.  The elements of  
n constitute the coordinates of the n-tuple.  A canonical choice of n  
is {0,1,...,n-1} as a subset (TBD below) of the set N of natural  
numbers.

Pairs arise as the case n = 2.  Writing Hom(A,B) for the (hom)set of  
morphisms from A to B, we have the bijections Hom(1+1, A) ~ Hom(1, A)  
x Hom(1, A) ~ Hom(1, AxA) (as a consequence of the homfunctor Hom:  
C^op x C --> Set preserving limits in both arguments, bearing in mind  
that the colimit 1+1 in C becomes the limit 1x1 in C^op).  These two  
bijections (however chosen) create a bijection between the elements of  
A of sort 1+1, i.e. the pairs of A, and the elements of AxA of sort 1,  
i.e. the atoms of AxA.

A subset B of A with n elements differs from an n-tuple in two  
respects: no element of A can be repeated in B, and order is  
immaterial (indexing is disabled).

The first property is easily achieved by taking an n-element subset of  
A to be an n-tuple of A, as a morphism from n to A, which is injective  
(more generally a monomorphism or monic, as a suitable generalization  
of "injective").

The second property is a little more delicate.  It can be achieved via  
the notion of *subobject* of A as an equivalence class of isomorphic  
injections to A.  Two injections are isomorphic when their domains and  
images are both isomorphic, a condition formalizable with the  
appropriate commutative diagram (exercise).  A set with m elements has  
2^m subobjects, aka subsets, with C(m,n) of them having domains with n  
elements where C(m,n) is a binomial coefficient.

Most of this carries over to graphs. It should come as no surprise  
that one can form n-tuples of graphs in the obvious way.  More  
interesting is how subobjects of graphs, aka subgraphs, are formed, by  
analogy with subgraphs.

One can form a subset B of A by deleting its complement relative to A.  
This process can be expressed via the characteristic function of B,  
namely the function χ_B: A --> 2 = {0,1} sending the elements for  
deletion to 0.

A subgraph G' of G is not so easily formed.  In order to delete any  
vertex one must first delete all edges incident on it.  The  
counterpart of the set 2 = {0,1} for graphs has the same vertices,  
namely {0,1}, but the corresponding set of two edges can only exist at  
vertex 1, since if either vertex of an edge is missing (the case of  
vertex 0), there is only one possibility for the edge, namely that it  
must be missing too.

It follows that the graph Omega (as we shall call the *subobject  
classifier*) playing the role of the set {0,1} has vertices {0,1}, and  
one edge between every pair of vertices (denoting a missing edge)  
except from 1 to 1, which permits two edges denoting presence or  
absence of an edge at the corresponding place in G'.  That is, Omega  
is the directed clique K_2 with a fifth edge from 1 to 1.

It follows that there are three morphisms from 1 to Omega.  The least  
one maps the vertex of 1 to the 0 vertex of Omega, and therefore maps  
the edge of 1 to the unique edge at that vertex.  The other two map  
the vertex of 1 to the 1 vertex of Omega, which has two edges to which  
the edge of 1 can be mapped, and these are ordered so that absent is  
below present for edges, as usual in the inclusion ordering.

These three graph morphisms play the role for graphs that the two  
morphisms from the set 1 to the set 2 = {0,1} play for sets.  Just as  
the latter defines a two-valued logic, so do the former define a three- 
valued logic forming a linear order.

The category of graphs is an instance of a presheaf category, which in  
turn is an instance of a topos.  In any topos the atoms (the 1-sorted  
elements, as distinct from the V-sorted and E-sorted elements) of  
Omega have an evident ordering by inclusion of subobjects so as to  
form a poset.  In every topos the atoms of Omega form a Heyting  
algebra under this ordering.  They constitute the truth values of the  
internal logic of that topos.

There is only one Heyting algebra with two elements, namely the two- 
element Boolean algebra, corresponding to the idea that the internal  
logic of set theory is Boolean.  There is likewise only one Heyting  
algebra with three elements, and it cannot be a Boolean algebra (which  
in the finite case must have cardinality a power of 2), corresponding  
to the idea that the internal logic of graph theory, as defined by the  
notion of subgraph by analogy with subset, is intuitionistic but not  
classical.

Statman has shown that general intuitionistic logic is PSPACE- 
complete.  The internal logic of graph theory however is essentially  
no harder than that of set theory, being coNP-complete.  This is  
because failure of P = TOP (the topmost of the three truth values) can  
be witnessed by assigning one of three values to each variable of P  
and then evaluating P in linear time.   This does not work for general  
intuitionistic logic, for which counterexamples and their truth tables  
are in general massively larger.

All this is one tiny little corner of category theory, a very rich and  
well-developed subject that comes nowhere near deserving Michael  
Finney's characterization of it as: "There are some attempts to use  
Category theory as the foundations instead of Set Theory, but so far  
as I know they have either not been succesful or very awkward."

Awkwardness is in the eye of the beholder.  Active users of category  
theory find it far from awkward, combining great conceptual economy  
with great expressive power.

Vaughan Pratt