|
The Compiler Validation Project: Developing tools for proving the correctness of compilation. |
|
Motivation:
There is a growing awareness in industry, academia, and government of
the crucial role of formally proving the correctness of
safety-critical portions of systems. Most verification methods deal
with the high-level specification of the system. However, if one is
to prove that the high-level specification is correctly implemented at
the lower level, one needs to verify the compiler which performs the
translations.
Goal:
Our goal is to develop a methodology, supported by
tools, which will provide practical means for ensuring
correct compilation by state-of-the art optimizing compliers.
The Methodology — Translation Validation:
Formally verifying a full-fledged optimizing compiler is not feasible
due to its size, rates of change, and proprietary considerations.
Translation validation is a novel approach that offers an
alternative to the verification of translators in general and
compilers in particular. According to the translation validation
approach, one constructs a validation tool that, after every
run of the compiler, formally confirms, via a proof script, that
the target code produced on that run is a correct translation
of the source program. The proof script can be independently checked,
for even greater assurance, by existing proof checkers.
Translation validation has already been shown to be effective in the
optimization restricted case, in which most compiler
optimizations are disabled. In this case, the proof that the target
code correctly implements the source program is reduced to the
proof of the validity of a set of automatically generated
verification conditions (proof obligations) which are implications
in first order logic. All that is required then is to establish the
validity of the set of these verification conditions. Under the
(provably realistic) assumption that only limited optimization is
applied to arithmetic expressions, the proof obligations are in a
restricted form of first order logic, called equational formulae,
using uninterpreted functions to represent all arithmetic operations.
Recent work has shown the feasibility of
building a tool for checking the validity of such formulae. This tool
is based upon finding small domain instantiations of equational
formulae and then using BDD-based representation to check for
validity.
New Challenges:
When the optimizations are enabled, it is no longer sufficient to use
the above verification conditions. The validation tool will need
additional information describing which optimizing transformations
have been performed. This information can only be supplied by the
compiler itself. We therefore extend the translation validation
approach to include compiler-generated instrumentation in the form of
program annotation. This annotation will be processed by the
validation tool to form invariant assertions at selected control
points. The verification conditions will then be augmented with these
assertions. However, the validator will also need to establish that
these compiler-generated assertions are indeed invariant.
Strategy:
A major component of our approach is to identify a minimal set of
program annotations that is both sufficient to deal with all
known optimizations and contains all the
information needed to generate the translation validation by
efficient, fully automatic, and simple to check proofs. Our early
results present such a set for classic optimizations (constant
folding, copy propagation, common-subexpression-elimination, strength
reduction, loop-invariant code motion, etc.) One of our immediate
goals is to extend the translation validation framework to handle
instruction scheduling transformations.
Ultimate Goal:
Ultimately, we envision integrating the validation tool into the
compiler. Thus, the result of compilation would be not only the
optimized target program, but also a simple, machine-checkable proof
script. This proof script could then be checked by a simple proof
checker that can be supplied with the compiler.
As part of this work, we intend to apply our results to optimizing
compilers for emerging architectures, including EPIC processors,
out-of-order superscalar machines, and, ultimately, highly scalable
parallel computing systems.
Who will benefit from the work:
We expect out work to be of great value to groups who develop
applications using a process in which analysis, verification, and
simulation are performed at the source level. It is therefore crucial
that whatever integrity has been achieved at the source level is not
jeopardized by a faulty translation from the source program to the
target program. The problem of the compiler being the weak link in an
otherwise high-integrity system development process is particularly
acute in safety-critical systems, where certification of any compiler
used is required by standards, as well as in compilation-into-silicon
methods where a translation error may be very expensive to
repair. Compiler developers will also benefit from this work, since it
will provide them with an extremely effective debugging tool.
|