[FOM] Fwd: Unreasonable effectiveness

Josef Urban josef.urban at gmail.com
Tue Nov 5 12:14:28 EST 2013


Hi Timothy,

Bas Spitters has forwarded Jacques' answer about our work (thanks
Jacques, Bas). Here are some more comments.

> I am not sure exactly how this might go, but here is a vague outline.
> Theorems are built on lemmas.  We want to construct some kind of model of
> the probability that Lemma X will be "useful" for proving Theorem Y.  This
> model would be time-dependent; that is, at any given time t, we would have
> a probabilistic model, trained on the corpus of mathematics known up to
> time t, that could be used to predict future uses of lemmas in theorems.
> This model would represent "reasonable effectiveness."  Then the thesis of
> "unreasonable effectiveness" would be that this model really does evolve
> noticeably over time---that the model at time t systematically
> underestimates uses of Lemma X in Theorem Y at times t' > t.

> I am wondering if anyone else has thought along these lines.  Also I am
> wondering if there is any plausible way of using the growing body of
> computerized proofs to make the above outline more precise.

Definitely. The main experiment is ten years old; see Section 4.2
("Mizar Proof Advisor") of the 2004 JAR paper [1], particularly
Figure 4. When a naive-Bayes predictor was trained on all previous
Mizar proofs (at that time about 30k proofs and 40k formulas in
total), on average 70% of the lemmas needed for the next proof were
ranked by the predictor in its top 100. This has been improved over
the last "machine-learning" decade: a kernel-based predictor on a
smaller Mizar-based corpus ranked 80% of the needed lemmas in the
top 30 ([2], Figure 1), and applying linguistic modelling such as
inverse document frequency (IDF) weighting recently raised the
combined learning/ATP success rate on Hales's Flyspeck to about
47% [3].
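
For concreteness, here is a minimal Python sketch of this kind of
naive-Bayes premise selection, with the chronological evaluation from
the outline above. The corpus format, the symbol-based features, and
the smoothing constants are my illustrative assumptions, not the
actual implementation behind [1]:

import math
from collections import defaultdict

def train(proofs):
    # proofs: chronological list of (conjecture_features, used_lemmas).
    n = 0
    lemma_count = defaultdict(int)                   # uses of each lemma
    cooccur = defaultdict(lambda: defaultdict(int))  # lemma -> feature -> count
    for features, lemmas in proofs:
        n += 1
        for lem in lemmas:
            lemma_count[lem] += 1
            for f in features:
                cooccur[lem][f] += 1
    return n, lemma_count, cooccur

def rank_lemmas(model, features, top_k=100):
    # Score every lemma seen so far against the new conjecture's features.
    n, lemma_count, cooccur = model
    scores = {}
    for lem, cnt in lemma_count.items():
        score = math.log(cnt / n)        # prior: how often the lemma is used
        for f in features:
            # Smoothed estimate of P(feature | this lemma is used).
            score += math.log((cooccur[lem][f] + 0.1) / (cnt + 0.2))
        scores[lem] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def coverage(proofs, top_k=100):
    # Train on everything before proof i, then measure how many of the
    # lemmas actually used in proof i land in the predictor's top_k.
    hits = total = 0
    for i in range(1, len(proofs)):
        model = train(proofs[:i])
        predicted = set(rank_lemmas(model, proofs[i][0], top_k))
        used = set(proofs[i][1])
        hits += len(used & predicted)
        total += len(used)
    return hits / total if total else 0.0

The 70%-in-top-100 figure above is essentially coverage() computed
over the whole Mizar library (with a better-tuned learner); IDF-style
weighting would enter by scaling each feature's contribution by its
rarity across the corpus.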

One caveat is that the "lemmas" and "proofs" in existing formal
corpora like Mizar and Flyspeck are often more fine-grained than what
"normal mathematicians" call "lemmas" and "proofs". Still, as
mentioned in [3], I think this shows that "real-world" math is in
this respect fairly non-random (though I admire Chaitin's worst-case
randomness results). I quite believe that Penrose-style arguments
about the inability of computers to do mathematical thinking will be
refuted in this century by a combination of learning from large math
corpora and smart learning-guided search techniques.

> There is of course the problem that the "ontogeny" of computerized
> proofs does not exactly recapitulate the "phylogeny" of how the
> theorems were arrived at historically, but nevertheless maybe
> something can still be done.

Alternative proofs are a very interesting issue.  Sometimes the
learning/deductive systems find quite different proofs; see
e.g. Section 6 in [4], or the paper [5]. We can then choose which
proofs to learn from, and how: they are all correct, but some may be
better for future problem solving in different contexts. This makes
math a pretty interesting AI domain [6].
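
As a toy illustration (my own simplification, not the actual method
of [4] or [5]), one simple policy is to keep, for each theorem, the
alternative proof with the smallest premise set as the training
example, on the heuristic that sparser premise lists give the learner
cleaner targets:

def pick_training_proof(alternatives):
    # alternatives: non-empty list of (premises, proof) pairs proving
    # the same theorem. Other policies (learning from all proofs, or
    # preferring ATP-reconstructible ones) are equally sensible.
    return min(alternatives, key=lambda alt: len(alt[0]))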

Best,
Josef

[1]: http://dx.doi.org/10.1007/s10817-004-6245-1
[2]: http://dx.doi.org/10.1007/978-3-642-31365-3_30
[3]: http://www.easychair.org/publications/?page=2069121719
[4]: http://arxiv.org/abs/1211.7012
[5]: http://dx.doi.org/10.1007/978-3-642-28717-6_6
[6]: http://link.springer.com/chapter/10.1007/978-3-642-36675-8_13

On Tue, Nov 5, 2013 at 9:10 AM, Bas Spitters <spitters at cs.ru.nl> wrote:
> I don't know whether you are on FOM.
>
>
> ---------- Forwarded message ----------
> From: Jacques Carette <carette at mcmaster.ca>
> Date: Sun, Nov 3, 2013 at 2:37 PM
> Subject: Re: [FOM] Unreasonable effectiveness
> To: tchow at alum.mit.edu, Foundations of Mathematics <fom at cs.nyu.edu>
>
>
> On 2013-11-02 11:40 , Timothy Y. Chow wrote:
>>
>> I am wondering if anyone else has thought along these lines.  Also I am wondering if there is any plausible way of using the growing body of computerized proofs to make the above outline more precise.  There is of course the problem that the "ontogeny" of computerized proofs does not exactly recapitulate the "phylogeny" of how the theorems were arrived at historically, but nevertheless maybe something can still be done.
>
>
> At least along these lines, you should dig into the work of Josef
> Urban [1][2] and Cezary Kaliszyk [3][4].  The recent preprint [5]
> might be of particular interest, but they have many papers in this
> vein.
>
> Jacques
>
> [1] http://cs.ru.nl/~urban/
> [2] http://www.informatik.uni-trier.de/~ley/pers/hd/u/Urban:Josef.html
> [3] http://cl-informatik.uibk.ac.at/users/cek/
> [4] http://www.informatik.uni-trier.de/~ley/pers/hd/k/Kaliszyk:Cezary.html
> [5] http://arxiv-web3.library.cornell.edu/abs/1310.2797v1
>

