How to Obtain MLRISC
Problem Statement
MLRISC Based Compiler
MLRISC Intermediate Representation
MLRisc Generation
Back End Optimizations
Register Allocation
Machine Description
Garbage Collection Safety
System Integration
Graphical Interface
Line Counts
Systems Using MLRISC
Future Work
Architecture of MLRISC
The MLTREE Language
MLTree Extensions
MLTree Utilities
Instruction Selection
Machine Code Emitters
Delay Slot Filling
Span Dependency Resolution
The Graph Library
The Graph Visualization Library
Basic Compiler Graphs
SSA Optimizations
ILP Optimizations
Optimizations for VLIW/EPIC Architectur...
Register Allocator
Back Ends
The Alpha Back End
The PA RISC Back End
The Sparc Back End
The Intel x86 Back End
The PowerPC Back End
The MIPS Back End
The TI C6x Back End
Basic Types
Client Defined Constants
Client Defined Pseudo Ops
Instruction Streams
Label Expressions

MLTree Extensions

MLTree Extensions
Why Extensions
Cyclic Dependency
Multiple Extensions
Pattern matching over the MLTREE intermediate representation may not be sufficient to provide access to all the registers or operations provided on a specific architecture. MLTREE extensions is a method of extending the MLTREE intermediate language so that it is a better match for the target architecture.

Why Extensions

Pattern matching over the MLTREE intermediate representation may not be sufficient to provide access to all the registers or operations provided on a specific architecture. MLTREE extensions is a method of extending the MLTREE intermediate language so that it is a better match for the target architecture.

For example there may be special registers to support the increment-and-test operation on loop indices, or support for complex mathematical functions such as square root, or access to hardware specific registers such as the current register window pointer on the SPARC architecture. It is not usually possible to write expression trees that would directly generate these instructions. Some complex operations can be generated by performing a peephole optimization over simpler instructions, however this is not always the most convenient or simple thing to do.

Cyclic Dependency

The easiest way to provide extensions is to parameterize the MLTREE interface with types that extend the various kinds of trees. Thus if the type sext represented statement extensions, we might define MLTREE statement trees as :
   datatype stm
     = ...
     | SEXT of sext * mlrisc list * stm list
   and mlrisc = GPR of rexp | FPR of fexp | CCR of ccexp
where the constructor SEXT applies the extension to a list of arguments. This approach is unsatisfactory in several ways, for example, if one wanted to extend MLTREEs with for-loops, then the following could be generated:
   SEXT(FORLOOP, [GPR from, GPR to, GPR step], body)
First, the loop arguments have to be wrapped up in GPR and there is little self documentation on the order of elements that are arguments to the for-loop. It would be better to be able to write something like:
   SEXT(FORLOOP{from=f, to=t, step=s, body=b}) 
Where f, t, and s are rexp trees representing the loop index start, end, and step size; b is a stm list representing the body of the loop. Unfortunately, there is a cyclic dependency as MLTREEs are defined in terms of sext, and { sext} is defined in terms of MLTREEs. The usual way to deal with cyclic dependencies is to use polymorphic type variables.


The statement extension type sext, is now a type constructor with arity four, i.e. ('s, 'r, 'f, 'c) sx where sx is used instead of { sext}, and 's, 'r, 'f, and 'c represents MLTREE statement expressions, register expressions, floating point expressions, and condition code expressions. Thus the for-loop extension could be declared using something like:
   datatype ('s,'r,'f,'c) sx 
     = FORLOOP of {from: 'r, to: 'r, step: 'r, body: 's}
and the MLTREE interface is defined as:
   signature MLTREE = sig
     type ('s, 'r, 'f, 'c) sx
     datatype stm =
       = ...
       | SEXT of sext
    withtype sext = (stm, rexp, fexp, cexp) sx
where sext is the user defined statement extension but the type variables have been instantiated to the final form the the MLTREE stm, rexp, fexp, and cexp components.


There are dedicated modules that perform pattern matching over MLTREEs and emit native instructions, and similar modules must be written for extensions. However, the same kinds of choices used in regular MLTREE patterns must be repeated for extensions. For example, one may define an extension for the Intel IA32 of the form:

   datatype ('s,'r,'f,'c) sx = PUSHL of 'r | POPL of 'r | ...
that translate directly to the Intel push and pop instructions; the operands in each case are either memory locations or registers, but immediates are allowed in the case of PUSHL. Considerable effort has been invested into pattern matching the extensive set of addressing modes for the Intel architecture, and one would like to reuse this when compiling extensions. The pattern matching functions are exposed by a set of functions exported from the instruction selection module, and provided in the MLTREE interface. They are:

   struture I : INSTRUCTIONS
   datatype reducer = 
     REDUCER of {
       reduceRexp    : rexp -> reg,
       reduceFexp    : fexp -> reg,
       reduceCCexp   : ccexp -> reg,
       reduceStm     : stm * an list -> unit,
       operand       : rexp -> I.operand,
       reduceOperand : I.operand -> reg,
       addressOf     : rexp -> I.addressing_mode,
       emit          : I.instr * an list -> unit,
       instrStream   : (I.instr, I.regmap, I.cellset) stream,
       mltreeStream  : (stm, I.regmap, mlrisc list) stream
where I is the native instruction set.
: reduces an MLTREE rexp to a register, and similarly for reduceFexp and reduceCCexp.
: reduces an MLTREE stm to a set of instructions that implement the set of statements.
: reduced an MLTREE rexp into an instruction operand --- usually an immediate or memory address.
: moves a native operand into a register.
: reduces an MLTREE rexp into a memory address.
: emits an instruction together with an annotation.
: is the native instruction output stream, and
: is the MLTREE output stream.

Each extension must provide a function compileSext that compiles a statement extension into native instructions. In the MLTREE_EXTENSION_COMP interface we have:

   val compileSext: reducer -> {stm: MLTREE.sexp, list} -> unit
The use of extensions must follow a special structure.
  1. A module defining the extension type using a type constructor of arity four. Let us call this structure ExtTy and must match the MLTREE_EXTENSION interface.
  2. The extension module must be used to specialize MLTREEs.
  3. A module that describes how to compile the extension must be created, and must match the MLTREE_EXTENSION_COMP interace. This module will typically be functorized over the MLTREE interface. Let us call the result of applying the functor, ExtComp.
  4. The extension compiler must be passed as a parameter to the instruction selection module that will invoke it whenever an extension is seen.

Multiple Extensions

Multiple extensions are handled in a similar fashion, except that the extension type used to specialize MLTREEs is a tagged union of the individual extensions. The functor to compile the extension dispatches to the compilation modules for the individual extensions.


Suppose you are in the process of writing a compiler for a digital signal processing(DSP) programming language using the MLRISC framework. This wonderful language that you are developing allows the programmer to specify high level looping and iteration, and aggregation constructs that are common in DSP applications. Furthermore, since saturated and fixed point arithmetic are common constructs in DSP applications, the language and consequently the compiler should directly support these operators. For simplicity, we would like to have a unified intermediate representation that can be used to directly represent high level constructs in our language, and low level constructs that are already present in MLTree. Since, MLTree does not directly support these constructs, it seems that it is not possible to use MLRISC for such a compiler infrastructure without substantial rewrite of the core components.

Let us suppose that for illustration that we would like to implement high level looping and aggregation constructs such as

    for i := lower bound ... upper bound
    x := sum{i := lower bound ... upper bound} expression
together with saturated arithmetic mentioned above.

Here is a first attempt:

 structure DSPMLTreeExtension
    structure Basis = MLTreeBasis
    datatype ('s,'r,'f,'c) sx = 
       FOR of Basis.var * 'r * 'r * 's
    and ('s,'r,'f,'c) rx = 
       SUM of Basis.var * 'r * 'r * 'r
     | SADD of 'r * 'r
     | SSUB of 'r * 'r
     | SMUL of 'r * 'r
     | SDIV of 'r * 'r
    type ('s,'r,'f,'c) fx = unit
    type ('s,'r,'f,'c) ccx = unit
 structure DSPMLTree : MLTreeF
     (structure Extension = DSPMLTreeExtension
In the above signature, we have defined two new datatypes sx and rx that are used for representing the DSP statement and integer expression extensions. Integer expression extensions include the high level sum construct, and the low levels saturated arithmetic operators. The recursive type definition is necessary to ``inject'' these new constructors into the basic MLTree definition.

The following is an example of how these new constructors that we have defined can be used. Suppose the source program in our DSP language is:

    for i := a ... b
    {  s := sadd(s, table[i]);
where sadd is the saturated add operator. For simplicity, let us also assume that all operations and addresses are in 32-bits. Then the translation of the above into our extended DSP-MLTree could be:
    EXT(FOR(i, REG(32, a), REG(32, b),
            MV(32, s, REXT(32, SADD(REG(32, s), 
                     ADD(32, REG(32, table), 
                         SLL(32, REG(32, i), LI 2)),
One potential short coming of our DSP extension to MLTree is that the extension does not allow any further extensions. This restriction may be entirely satisfactory if DSP-MLTree is only used in your compiler applications and no where else. However, if DSP-MLTree is intended to be an extension library for MLRISC, then we must build in the flexibility for extension. This can be done in the same way as in the base MLTree definition, like this:
 functor ExtensibleDSPMLTreeExtension
   (Extension : MLTREE_EXTENSION) =
    structure Basis = MLTreeBasis
    structure Extension = Extension
    datatype ('s,'r,'f,'c) sx = 
       FOR of Basis.var * 'r * 'r * 's
     | EXT of ('s,'r,'f,'c) 
    and ('s,'r,'f,'c) rx = 
       SUM of Basis.var * 'r * 'r * 'r
     | SADD of 'r * 'r
     | SSUB of 'r * 'r
     | SMUL of 'r * 'r
     | SDIV of 'r * 'r
     | REXT of ('s,'r,'f,'c) Extension.rx
         ('s,'r,'f,'c) fx   = ('s,'r,'f,'c) Extension.fx
    and  ('s,'r,'f,'c) ccx  = ('s,'r,'f,'c) Extension.ccx
As in MLTREE, we provide two new extension constructors EXT and REXT in the definition of DSP_MLTREE, which can be used to further enhance the extended MLTREE language.
Lal George
Allen Leung
SML/NJ Validate this page
Generated by mltex2html
Last modified: Thu Jan 9 19:38:15 EST 2003 by leunga@slinky