Compiler Construction

G22.2130.01
Monday 5:00 - 6:50 pm
Room 102, Warren Weaver Hall
Professor Edmond Schonberg

Instructor: Edmond Schonberg (schonber@cs.nyu.edu) Office hours: WWH410, Monday and Tuesday 3-5 pm, and by appointment.

Teaching Assistant: Nikhil Sethi (nsethi@nyu.edu)
Office hours: WWH 605, Tuesday 5-7 pm.

Class list

All students should register themselves with the class list, which is used for all technical discussions concerning the course and the course project. To subscribe to this list visit the web page:

http://www.cs.nyu.edu/mailman/listinfo/g22_2130_001_fa05

where you will find full directions.

Please send all your questions to this list (not only to the instructor) so that everyone can participate.

Assignments

The course involves the construction of a compiler for a simple imperative language. The compiler will produce assembly language for the machine of your choice. This motivates Assignment 0 of the course. The first step in the project is to review the assembly language for that machine, so you know what your code generator is supposed to generate! For this zeroth assignment, you must write a small procedure in assembly language. For example, a function that computes the hash-code of a string. You should write a driver program in C, that calls your routine and displays the results. A reasonable hashing algorithm for strings uses the numeric value of each character, and computes a small (16-bit, say) natural number out of it, using multiplications and exclusive ors.
Hand in the source for the assembly program and for the driver, and the output of the program. Indicate the machine on which you are running and the assembler you used.

If you need to refresh your knowledge of assembly language, the following may prove useful:

Free Assembly Tutorial
direct download

On a Unix system you can run gcc with the -S switch to obtain the generated assembly code, which provides the quickest introduction to the conventions of the assembler on your machine.

Submission

Send listing of assembly program, driver program, and output, to me and to the graders.
Due date: September 26.


Textbooks

required:

Aho, Sethi and Ullman: Compilers, principles, Techniques and tools (Addison Wesley)

This is still a standard text, and even though more than 15 years old it covers most of the topics we want to discuss in sufficient detail. A new edition is to appear shortly, but not early enough for this semester.

Recommended:

Scott: Programming Language Pragmatics.
A nice balance of language issues and implementation issues, with substantial elementary information about compiler structures and organization. The required text for Programming Languages.

Additional recommended readings:

Cooper and Torczon: Engineering a Compiler (Morgan Kaufman 2004).
Excellent discussion of intermediate representations, and of optimization techniques.

Fraser and Hanson: a retargetable C compiler (Benjamin Cummings, 1996)
Good treatment of the machine-description techniques used among others by the GCC family of compilers.

Muchnick: advanced compiler design and implementation (Morgan Kaufman, 1997)
Excellent in-depth treatment of code generation and optimization issues for a variety of processors.

Appel : modern compiler implementation in C : basic techniques
Also available in ML and Java versions. Compact presentation, strongly influenced by compilation techniques for functional languages.

Dewar & Smosna : Microprocessors, a programmer's perspective (Mc-Graw Hill)
An excellent survey of what a software engineer needs to know about hardware.

Course outline

sample questions for final examination

The sources of GNAT

Throughout the course we will be making references to the sources of the GNAT compiler,The compact overview of GNAT provides a roadmap to its main components. The sources of the compiler can be browsed the GNAT sources here.

Bison Documentation

On-line documentation for the Bison parser-generator can be found here

Assignments

Assignment 1: lexical analysis
Assignment 2: top-down parsing
Assignment 3: Intermediate code generation

Project

Lexical analyser
Parser
Semantic analysis: visibility and typing
Code generation