CS372H Spring 2011 Homework 1 Solutions

Problem 1

Define three styles of switching from user mode to supervisor mode.

Interrupts, when a device sends an interrupt to the CPU.
When a program executes a system call, which is usually implemented by a trap instruction.
When a program performs an operation that causes a hardware exception, such as divide by zero, illegal memory access or execution of an illegal opcode.

Problem 2

Which of the following components is responsible for loading the initial value in the program counter for an application program before it starts running:

        * Compiler
        * Linker
        * Loader
        * Boot module or boot ROM The loader. The linker groups several modules and resolves external references, but the linker really does not perform the actual loading of the program itself (except for dynamic linking, which we will study later in the course).

Problem 3

A hardware designer argues that there are enough transistors on the chip to provide 1024 integer registers and 512 floating point registers. You have been invited as the operating system guru to give opinion about the new design.

        1. What is the effect of having such a large number of registers on the operating system?
        2. What additional hardware features you would recommend added to the design above.
        3. What happens if the hardware designer also wants to add a 16-station pipeline into the CPU. How would that affect the context
            switching overhead?

The cost of a context switch or interrupt handling becomes very expensive. Also, the data structures that store the registers as part of a process context become large. This leads high overhead in memory and also may make the implementation of lightweight threads expensive.
A multithreaded architecture would be a good use of the abundance of registers. The idea is to partition the registers into a number of banks, for example, a set of 1024 registers could be divided into 32 banks of 32 registers each. Then, wewould be able to have the register sets of 32 processes without having to save it or restore it each time a context switchoccurs.
Since the pipeline has to be flushed on an interrupt, a deeper pipeline will take time to be flushed, increasing the latency of serving the interrupt and overhead. The penalty of a context switch increases.

Problem 4

Given the following piece of code:

main(int argc, char ** argv)

{

int child = fork();

int c = 5;

if(child == 0)
{

c += 5;

}

else

{

child = fork();

c += 10;

if(child)

c += 5;

}

How many different copies of the variable care there? What are their values?

The piece of code shown creates two processes. Therefore, we have a total of three processes, the parent, the first and second child. Each of these has its own private copy of the variable c. For the parent, the variable c be 20 before the end of the program. For the first child (the one created in the first program statement), the variable c will contain the value 10 before the end of the program. For the second child (the one created in the else clause), the variable c will contain the value 15 before the end of the program.

Problem 5

Given the following piece of code

main(int argc, char ** argv)

{

forkthem(5)

}

void forkthem(int n)

{

if(n > 0)

{

fork();

forkthem(n-1);

}

How many processes are created if the above piece of code is run? Hint: It may be easier to solve this problem by induction. To solve this problem, we compute the number of processes that get created by calling the function forkthem(). This can be given by the following equation:

n > 0: T(n) = 2 T(n-1) + 1
n = 0: T(0) = 0

where T(n) is the number of processes created by the function. To see why this is the case, consider what happens when the function is called. The first statement calls the system call fork() which creates a child in addition to the caller. Both the caller and the child then execute a recursive call to forkthem() with an argument set to n-1. Therefore, a call to forkthem() creates one process of its own, and then is responsible for all the children that will get created by the function with n-1. The solution to the recurrence equation is 2^n - 1.

Problem 6

What is the output of the following programs (inspect the manual for the system calls if you need more information, but please solve the problem without compiling and running the program).

Program 1:
main()
{
    val = 5;
    if(fork())
        wait(&val);
    val++;
    printf("%d\n", val);
    return val;
}

Program 2:
main()
{
    val = 5;
    if(fork())
        wait(&val);
    else
        exit(val);
    val++;
    printf("%d\n", val);
    return val;
}

In the first program, the parent process creates a child and then waits for the child to exit (through the system call "wait"). The child executes and prints out the value of val, which is "6" after the v++ statement. The child then returns the value of val to the parent, which receives it in the argument to "wait" (& val). The parent then prints out the value of val, which is now 7. Note that the parent and child have seperate copies of the variable "val".

Using similar reasoning, you can see that the parent in program 2 waits for the child to return, and the child exits immediately. In this case only one value gets printed out which is the number 6 (from the parent process).

Problem 7

A typical hardware architecture provides an instruction called return from interrupt, and abbreviated by something like iret. This instruction switches the mode of operation from supervisor mode to user mode. This instruction is usually only available while the machine is running in supervisor mode.

1. Explain where in the operating system this instruction would be used.
2. What happens if an application program executes this instruction?

The operating system calls "iret" whenever it wants to give control to a user program. This happens at the end of a contextswitch, or when an interrupt service routine is finished.
Since the "iret" instruction is usually a privileged instruction, an application program trying to execute it will cause a hardware exception, because it is illegal for user programs to execute privileged opcodes.

Problem 8

System Calls vs. Procedure Calls: How much more expensive is a system call than a procedure call? Write a simple test program to compare the cost of a simple procedure call to a simple system call ("getpid()" is a good candidate on UNIX; see the man page.) (Note: be careful to prevent the optimizing compiler from "optimizing out" your procedure calls. Do not compile with optimization on.)

Run your experiment on two different hardware architectures and report the results.
Explain the difference (if any) between the time required by your simple procedure call and simple system call by discussing what work each call must do (be specific). [Note: Do not provide the source code for your program, just the results].

Hint: You should use system calls such as gethrtime() or gettimeofday() for time measurements. Design your code such that the measurement overhead is negligible. Also, be aware that timer values in some systems have limited resolution (e.g., millisecond resolution).

A system call is expected to be significantly more expensive than a procedure call (provided that both perform very little actual computation). A system call involves the following actions, which do not occur during a simple procedure call, and thus entails a high overhead:

A context switch
A trap to a specific location in the interrupt vector
Control passes to a service routine, which runs in 'monitor' mode
The monitor determines what system call has occurred
Monitor verifies that the parameters passed are correct and legal

For your experiment, you should measure the total time for a large number of system/function calls, and then find the average time per call in order to overcome the course resolution of your timing functions. For example, here's a sample code for measuring the time taken for a simple system call and a simple function call:

#include <sys/time.h>
#include <unistd.h>
#include <assert.h>

int foo(){
return(10);
}

long nanosec(struct timeval t){ /* Calculate nanoseconds in a timeval structure */
return((t.tv_sec*1000000+t.tv_usec)*1000);
}

main(){
int i,j,res;
long N_iterations=1000000; /* A million iterations */
float avgTimeSysCall, avgTimeFuncCall;
struct timeval t1, t2;

/* Find average time for System call */
res=gettimeofday(&t1,NULL); assert(res==0);
for (i=0;i<N_iterations; i++){
j=getpid();
}
res=gettimeofday(&t2,NULL); assert(res==0);
avgTimeSysCall = (nanosec(t2) - nanosec(t1))/(N_iterations*1.0);

/* Find average time for Function call */
res=gettimeofday(&t1,NULL); assert(res==0);
for (i=0;i<N_iterations; i++){
j=foo();
}
res=gettimeofday(&t2,NULL); assert(res==0);
avgTimeFuncCall = (nanosec(t2) - nanosec(t1))/(N_iterations*1.0);

printf("Average time for System call getpid : %f\n",avgTimeSysCall);
printf("Average time for Function call : %f\n",avgTimeFuncCall);
}

Sample output on a linux machine :

> gcc -O0 testtime.c -o testtime
> ./testtime
Average time for System call getpid : 394.778015
Average time for Function call : 15.080000

Problem 9

When an operating system receives a system call from a program, a switch to the operating system code occurs with the help of the hardware. In such a switch, the hardware sets the mode of operation to supervisor mode, calls the operating system trap handler at a location specified by the operating system, and allows the operating system to return back to user mode after it finishes its trap handling. Now, consider the stack on which the operating system must run when it receives the system call. Should this be a different stack from the one that the application uses, or could it use the same stack as the application program? Assume that the application program is blocked while the system call runs.

Most compilers set the stack pointer to be at the beginning of the stack area when a function is called. Thus, automatic
variables are found underneath the location to which the stack pointer is pointing to (except for the HP PA-RISC, in which
the stack grows upward). Thus,

When an interrupt occurs, the kernel has no idea as to how many automatic variables the user program has allocated on the stack Therefore, it cannot just set the stack pointer past the current call frame of the use program, because it does not know where are the call frame limits. As such, the kernel must use a different stack to execute the interrupt handling routine.
On a multiprocessor, executing on the user stack allows another active user thread (on a different processor) to modify the stack and forcing the kernel to fail (maliciously, or unintentionally).
If the current stack is allocated on the user program's heap (such as in user-level thread packages), then the kernel code can cause a stack overflow and start destroying data in the user's address space.