Computer Systems Organization I - Prof. Grishman
Write a program in C which will interpret the statements of a simple
SIMPL. SIMPL has four statements, an assignment statement, while and endwhile statements, and a print statement. The
interpreter should print whatever is specified by the print statements. For
example, the program
num = 1
while num <
square = num * num
num = num + 1
num = 1
square = 1
num = 2
square = 4
num = 3
square = 9
SIMPL Specifications (minimum requirements)
There are five types of SIMPL tokens: variables, reserved words,
constants, operators, and punctuation.
A variable consists of 1 to 31 letters (upper or lower case).
Reserved words: the words "while", "endwhile", and "print" are
reserved and may not be used as variable names.
A (decimal) constant consists of 1 to 9 decimal digits.
There are six operators: +, -, *, /, < and >
There is only one punctuation token, =
Each SIMPL statement is on a separate line.
Tokens are separated by at least one whitespace character.
Statements may begin and end with zero or more whitespace characters.
An expression element is a
variable or constant.
An expression is either an
expression element or two expression elements separated by an operator.
The operators < and > return 1 if the relation is true, and 0 if
the relation is false.
A reference in an expression or a print statement to a variable which
has not been assigned a value is an error.
The assignment statement has the form 'variable
= expression' and
assigns to variable the
value of expression.
The while statement has
the form 'while expression' and executes the
statements up to the following 'endwhile'
statement as long as expression is
non-zero (true). while statements
cannot be nested.
The print statement has
the form 'print variable' and prints the name and
value of variable to standard
After the last statement in the SIMPL program has been executed, the
This assignment offers substantial opportunity for adding optional
features for extra credit. Some possibilities are listed here:
- Standard spacing: whitespace is required only following the
keywords while and print (easy)
- Additional comparison operators: support the operators ==, !=, <=, and >= (easy)
- Unary operators: support unary -, ++, and -- (medium)
- Expressions: support full expressions with parentheses
- Comments: allow comments following a '//' token (easy)
- Allow nested while statements (medium)
- Additional statements: if / else / endif; for (hard)
Suggestions for organizing the program
You may write the program any way you like, so long as the final
program meets the specifications just described. However, we
offer the following suggestions for data structures and program
organization to help you get started.
Global data structures
Note that, because we do not have something like a Java ArrayList in C,
we need global variables to keep track of the number of currently
filled elements in each array (program,
- an array of strings, each containing one line of the program ("program")
- an integer, the index of the statement currently being executed ("current")
- to keep track of the current value of each variable, a pair of
arrays, one containing the name of each variable ("variable") and the other the
current value of the variable ("value");
if you prefer, you can combine these into a structure
- for the statement currently being executed, an array of strings
containing its tokens ("token").
This design assumes that a statement is re-tokenized each time it is
executed; keeping the tokenized form of a statement is more
efficient but requires a bit more complex data structure.
- if in a while loop, the index of the while statement, otherwise 0
High-level program structure
The general strategy is to have the structure of the interpreter
reflect the structure of the language being interpreted, and to have
lots of relatively small functions, one to handle each language
construct. This makes it easier to debug the program, and to
understand the program once it is written.
- main: calls loadProgram, listProgram, and executeProgram
reads statements from standard input into program (modeled after
main/getline as done in class)
repeatedly does executeStatement
until we have executed the last statement of program
tokenizes the current statement, determines the type of statement
(based on the first token), and executes the corresponding function
- tokenize: puts
the tokens of the current statement into token
does evaluateExpression to
evaluate the right-hand side, and then assignVariable to assign that
value to the variable on the left-hand side
prints the name and value of the variable
evaluates the expression; if true, sets whileStatement and continues
with the next statement; if false, clears whileStatement and skips to
the following endwhile statement
if whileStatement is
non-zero, goes back to that statement
if the expression is a single token, gets its value with evaluateExpressionElement;
if it is three tokens, evaluates each operand with evaluateExpressionElement and
then applies the operator using applyOperator
determines whether the token is a variable (using isVariable) or constant (using isConstant); if a
variable, get its value using getVariableValue;
if a constant, get its value using intValue
returns true if the token is all alphabetic (a variable or reserved
returns true if the token is a decimal constant
returns the value of a decimal constant (similar to your Assignment #4,
but for decimal)
returns the value of a variable (or an error if the variable is not
computes the value of an expression given its two operands and operator
Stages of implementation
Altogether, this system, with minimal comments, should take about 300
lines of code. We will do this in three stages, each of which
(very roughly) requires 100 lines of new code and will be due in
(roughly) one week increments.
Stage 1: due Nov. 22 -- tokenization and token classification
Stage 1 will be the most straightforward because it builds on the
getline function we did in class and the gettoken function you did for
homework. It should read in the program and then print it, with
each line followed by a list of the tokens in the line and whether the
token is an integer, variable (or reserved word), or other.
Reserved words will be considered variables for this stage. In
terms of the functions listed above, it involves writing loadProgram, listProgram, tokenize, isVariable, and isConstant. A sample
output would be
0: a = 3
Line 1: while a < 60
Stage 2: due Dec. 2 -- assignment statements
The focus in stage 2 is on keeping track of variables and their
values. This version of the program should accept a SIMPL program
consisting only of assignment statements where the right-hand side is a
simple variable or constant. After each statement is executed, it
should print the name and value of the variable. This involves
writing executeProgram, a
simple version of executeStatement
(which always calls) executeAssignment,
intValue and getVariableValue. A
sample input would be
moo = quack
moo = woof
and output would be (in addition to a listing of the program)
quack = 3
Assign moo = 3
Error: undefined variable
Assign moo = 0
[default value for undefined variables]
Stage 3 (final version): due Dec. 8 -- complete system
The main things to be added in this stage are the code for expressions,
and the code for the other statements (print and while).
This project will be worth 16 points. Successful completion of
stage 1 on time gets 4 points; successful completion of stage 2
on time gets 5 points; and successful completion of the final
system gets 7 points. A final system without a getwhile will be
worth 5 points (2 point penalty). Credit will be based primarily
on correct execution of correct programs, but some credit will also be
given for comments (we expect at least a line before each function
explaining what it does) and error checks (if we can get your program
to crash with an invalid input, it won't get full credit; make
sure you don't overrun any arrays).
Extra credit will depend on features added, starting with 0.5 points
for 'easy' additions; we anticipate awarding up to 4 points of
extra credit for interesting combinations of features, but stand ready
to be wow-ed.
Label your program files for the three stages lastname6a.c, lastname6b.c, and lastname6.c.
Submit your program (.c file)
by email, as an attachment, to the
e-tutor, Ian Lau <email@example.com>,
by one minute before midnight on the specified date. (Late assignments
will be penalized 1/2 point
for each weekday late.) Label your email "CSO
Asgn 6a", "CSO Asgn6b", and "CSO Asgn 6". For the final version
It is important to keep up with the stages of the project. If you
are having trouble, contact me or Ian by email.
- send a copy to me <firstname.lastname@example.org>
at the same time that you submit it to Ian
- if you are doing any extra credit, describe in your mail and in
comments in the program what extra features you are including, and
provide at least one test file demonstrating these features