MLRISC
MLRISC
Contributors
Requirements
How to Obtain MLRISC
Overview
Problem Statement
Contributions
MLRISC Based Compiler
MLRISC Intermediate Representation
MLRisc Generation
Back End Optimizations
Register Allocation
Machine Description
Garbage Collection Safety
System Integration
Optimizations
Graphical Interface
Line Counts
Systems Using MLRISC
Future Work
System
Architecture of MLRISC
The MLTREE Language
MLTree Extensions
MLTree Utilities
Instruction Selection
Assemblers
Machine Code Emitters
Delay Slot Filling
Span Dependency Resolution
The Graph Library
The Graph Visualization Library
Basic Compiler Graphs
The MLRISC IR
SSA Optimizations
ILP Optimizations
Optimizations for VLIW/EPIC Architectur...
Register Allocator
Back Ends
The Alpha Back End
The PA RISC Back End
The Sparc Back End
The Intel x86 Back End
The PowerPC Back End
The MIPS Back End
The TI C6x Back End
Basic Types
Annotations
Cells
Cluster
Client Defined Constants
Client Defined Pseudo Ops
Instructions
Instruction Streams
Label Expressions
Labels
Regions
Regmap

Instruction Selection


Instruction Selection
Interface Definition
-Compiling Client Extensions
-Extension Example
Instruction Selection Modules
Instruction selection modules are reponsible for translating MLTree statements into a flowgraph consisting of target machine instructions. MLRISC decomposes this complex task into three components:
Instruction selection modules
which are responsible for mapping a sequence of MLTree statements into a sequence target machine code,
Flowgraph builders
which are responsible for constructing the graph representation of the program from a sequence of target machine instructions, and
Client Extender
which are responsible for compiling MLTree extensions (see also Section MLTree Extensions).
By detaching these components, extra flexiblity is obtained. For example, the MLRISC system uses two different internal representations. The first, cluster, is a light-weight representation which is suitable for simple compilers without extensive optimizations; the second, MLRISC IR, is a heavy duty representation which allows very complex transformations to be performed. Since the flowgraph builders are detached from the instruction selection modules, the same instruction selection modules can be used for both representations.

For consistency, the three components communicate to each other via the same stream interface.

Interface Definition

All instruction selection modules satisfy the following signature:

 signature MLTREECOMP = 
 sig
    structure T : MLTREE
    structure I : INSTRUCTIONS
    structure C : CELLS
       sharing T.LabelExp = I.LabelExp
       sharing I.C = C
 
    type instrStream = (I.instruction,C.regmap,C.cellset) T.stream
    type mltreeStream = (T.stm,C.regmap,T.mlrisc list) T.stream
 
    val selectInstructions : instrStream -> mltreeStream
 end
 
Intuitively, this signature states that the instruction selection module returns a function that can transform a stream of MLTree statements (mltreeStream) into a stream of instructions of the target machine (instrStream).

Compiling Client Extensions

Compilation of client extensions to MLTREE is controlled by the following signature:
 signature MLTREE_EXTENSION_COMP =
 sig
    structure T : MLTREE
    structure I : INSTRUCTIONS
    structure C : CELLS
       sharing T.LabelExp = I.LabelExp
       sharing I.C = C
 
    type reducer = 
      (I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer
 
    val compileSext : reducer -> {stm:T.sext, an:T.an list} -> unit
    val compileRext : reducer -> {e:T.ty * T.rext, rd:C.cell, an:T.an list} -> unit
    val compileFext : reducer -> {e:T.ty * T.fext, fd:C.cell, an:T.an list} -> unit
    val compileCCext : reducer -> {e:T.ty * T.ccext, ccd:C.cell, an:T.an list} -> unit
 end
 
Methods compileSext, compileRext, etc.~are callbacks that are responsible for compiling MLTREE extensions. The arguments to these callbacks have the following meaning:
reducer
The first argument is always the reducer, which contains internal methods for translating MLTree constructs into machine code. These methods are supplied by the instruction selection modules.
an
This is a list of annotations that should be attached to the generated code.
ty, fty
These are the types of the extension construct.
stm, e
This are the extension statement and expression.
rd, fd, cd
These are the target registers of the expression extension, i.e.~the callback should generate the appropriate code for the expression and writes the result to this target.

For example, when an instruction selection encounters a FOR(i,a,b,s) statement extension defined in Section MLTree Extensions, the callback

   
   compileStm reducer { stm=FOR(i,a,b,s), an=an }
 
is be involved.

The reducer type is defined in the signature MLTREE and has the following type:

   datatype reducer =
     REDUCER of
     { reduceRexp    : rexp -> reg,
       reduceFexp    : fexp -> reg,
       reduceCCexp   : ccexp -> reg,
       reduceStm     : stm * an list -> unit,
       operand       : rexp -> I.operand,
       reduceOperand : I.operand -> reg,
       addressOf     : rexp -> I.addressing_mode,
       emit          : I.instruction * an list -> unit,
       instrStream   : (I.instruction,C.regmap,C.cellset) stream,
       mltreeStream  : (stm,C.regmap,mlrisc list) stream
     }
 
The components of the reducer are
reduceRexp, reduceFexp, reduceCCexp
These functions take an expression of type integer, floating point and condition code, translate them into machine code and return the register that holds the result.
reduceStm
This function takes an MLTree statement and translates it into machine code. it also takes a second argument, which is the list of annotations that should be attached to the statement.
operand
This function translates an rexp into an operand of the machine architecture.
reduceOperand
This function takes an operand of the machine architecture and reduces it into an integer register.
addressOf
This function takes an rexp, treats it as an address expression and translates it into the appropriate addresssing_mode of the target architecture.
emit
This function emits an instruction with attached annotations to the instruction stream
instrStream, mltreeStream
These are the instruction stream and mltree streams that are currently bound to the extender.

Extension Example

Here is an example of how the extender mechanism can be used, using the DSP_MLTREE extensions defined in Section MLTree Extensions. We need supply two new functions, compileDSPStm for compiling the FOR construct, and compileDSPRexp for compiling the SUM, and saturated arithmetic instructions.

The first function, compileDSPStm, is generic and simply translates the FOR loop into the appropriate branches. Basically, we will translate FOR(i,start,stop,body) into the following loop in pseudo code:

         limit = stop
         i  = start
         goto test
   loop: body
         i = i + 1
   test: if i <= limit goto loop
 
This transformation can be implemented as follows:
  functor DSPMLTreeExtensionComp
     (structure I : DSP_INSTRUCTION_SET
      structure T : DSP_MLTREE
        sharing I.LabelExp = T.LabelExp
     ) =
  struct
    structure I = I
    structure T = T
    structure C = I.C
 
    type reducer = 
      (I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer
    
    fun mark(s, []) = s
      | mark(s, a::an) = mark(ANNOTATION(s, a), an)
    fun compileSext (REDUCER{reduceStm, ...}) 
       {stm=FOR(i, start, stop, body), an} =
    let val limit = C.newReg()
        val loop  = Label.newLabel ""
        val test  = Label.newLabel ""
    in  reduceStm(
          SEQ[MV(32, i, start),
              MV(32, limit, stop),
              JMP([], [LABEL(LabelExp.LABEL test)], []),
              LABEL loop,
              body,
              MV(32, i, ADD(32, REG(32, i), LI 1),
              LABEL test,
              mark(BCC([], 
                     CMP(32, LE, REG(32, i), REG(32, limit)), 
                       loop),
                   an),
             ]
       )
    end
 
    ...
 
In this transformation, we have chosen to proprogate the annotation an into the branch constructor.

Assuming the target architecture that we are translated into contains saturated arithmetic instructions SADD, SSUB, SMUL and SDIV, the DSP extensions SUM and saturated arithmetic expressions can be handled as follows.

    fun compileRext (REDUCER{reduceStm, reduceRexp, emit, ...}) 
        {ty, e, rd, an} =
    (case (ty,e) of
       (_,T.SUM(i, a, b, exp)) =>
         reduceStm(SEQ[MV(ty, rd, LI 0),
                       FOR(i, a, b, 
                          mark(MV(ty, rd, ADD(ty, REG(ty, rd), exp)), an))
                      ]
                  )
    | (32,T.SADD(x,y)) => emit(I.SADD{r1=reduceRexp x,r2=reduceRexp y,rd=rd},an)
    | (32,T.SSUB(x,y)) => emit(I.SSUB{r1=reduceRexp x,r2=reduceRexp y,rd=rd},an)
    | (32,T.SMUL(x,y)) => emit(I.SMUL{r1=reduceRexp x,r2=reduceRexp y,rd=rd},an)
    | (32,T.SDIV(x,y)) => emit(I.SDIV{r1=reduceRexp x,r2=reduceRexp y,rd=rd},an)
    | ...
    )
 
    fun compileFext _ _ = ()
    fun compileCCext _ _ = ()
 
   end
 
Note that in this example, we have simply chosen to reduce a SUM expression into the high level FOR construct. Clearly, other translation schemes are possible.

Instruction Selection Modules

Here are the actual code for the various back ends:
  1. Sparc
  2. PA-RISC
  3. Alpha
  4. Power PC
  5. X86
  6. C6xx

Lal George
Allen Leung
SML/NJ Validate this page
Generated by mltex2html
Last modified: Thu Jan 9 19:38:15 EST 2003 by leunga@slinky