| Project Goal |
The main goal of this research is to develop
techniques for applying aggressive machine level compiler
optimizations to high level, typed languages, such as ML, and Java.
SSA based scalar optimizations, global instruction scheduling algorithms,
and global register allocation algorithms are considered.
All optimizations that we consider are performed
in the granularity of individual instructions, and not on pseudo
instructions.
|
|
| Main Problems | High level languages, unlike their low level counterparts
such as C and C++, require non-trivial runtime system support such as
garbage collection.
Extensive compiler support is necessary to make the a back end work seamlessly
in such an environment. Such support includes ways of communicating
semantics information such as types and
dependence, and safety information such as
code motion constraints, with the back end.
For example, to support garbage collection-aware code generation, a back end has to provide hooks for generating garbage collection code, in the form of explicit garbage collection tests, and/or trace tables for locating live objects. To support precise garbage collection-safe optimization, the problem is even more intricate, since it is necessary for the front-end to specify the safety constraints on optimizations that perform code motion, and to provide a means of tracking transformations done by optimization phases, so that the root sets at gc safe-points can be determined. In a traditional framework, the goals of retargetability and extensibility are usually compromised given these difficult requirements. In many existing retargetable optimizers, specific source language features are frequently assumed, either explicitly or implicitly. For example, a C-oriented back end may hardwire the C calling convention assumption into its intermediate representation. This makes it very difficult to reuse existing compiler technologies for languages substantially different than C. |
|
| Contribution | As part of this research, I have developed an optimizer framework on top of the MLRISC customizable back end, to solves these problems. The framework is language agnostic: the optimizer makes no initial assumption about the source language semantics and the features of the target architectures. But instead it allows various components of the optimizer to be customized to fit the task at hand. The original MLRISC framework has already proven itself as a back end for various languages. My research aims to extend this flexibility to a high performance optimizer framework on top of the MLRISC system. | |
| Strategy |
MLRISC is implemented on top of the Standard ML language, and by
taking full advantage of the module system of Standard ML, the
intermediate representation and all aspects the optimization phases
can be parameterized. These include architecture information,
aliasing, dependence, semantics and type annotations from the front-end,
and even the intermediate representation itself.
For example, I use the following machine description driven strategy for implementing portable gc-safe optimizations.
An actual optimizer is obtained from specializing a generic optimizer which respect to (i) the client extensions to the immediate representation, (ii) architecture machine descriptions, and (iii) the gc type system. Given m architectures and n different sources languages with different sets of garbage collection safety constraints, only m different sets of (MLRISC supplied) machine descriptions, and n different sets of client-supplied typing rules are necessary to construct a multi-source/multi-target optimizer. |
|
| Validation | The optimizer is currently being developed and validated, as the back end for the SML/NJ compiler. | |
| Related Work | Here are some
current research that are related to this work.
|
|
| More Information | MLRISC documentation |
| Back to my home page |
|