CS202 Spring 2015 Lab 3: Multi-threaded programming

Released Thursday, February 12, 2015
Part A due Monday, February 23, 2015, 9:00 PM
Part B due Friday, February 27, 2015, 9:00 PM

Lab 3: Multi-threaded programming

Introduction

In this lab, you will implement a model of an online store. Your model will support safe parallel updates to a central inventory database, which maintains stock of all items available for sale from the online store. Updates to this inventory database will come from both suppliers (who add goods to the inventory and enable or disable deals on items, such as discounts and free shipping) and customers (who, by purchasing items from the online store, remove them from the inventory). In the model, customers and suppliers will be represented by different threads, all executing in parallel.

Though your store is only a model, the concepts you will learn and use throughout this lab could be applied similarly in a real-world setting. Suppliers and customers would be on different computers, talking to a central inventory database over the Internet. Requests sent over the Internet from suppliers and customers would be handled by different threads on the central inventory computer, and any operations on the database would have to be done with proper synchronization, just as in your model.

Please note the following expectations:

Because it is impossible to determine the correctness of a multithreaded programming via testing, grading on this project will primarily be based on reading your code. If your code is difficult to understand, then you will lose points (potentially lots of them), even if the program seems to work. Feel free to sit down with the TAs or instructor during office hours for code inspections before you turn in your project (note: to take advantage of this, start on time! [which is probably earlier than you think]).

Getting Started

We recommend doing this lab inside the development platform that you set up in the previous labs (the virtual devbox or the CIMS machines). However, this lab uses standard interfaces and tools, and as a consequence, you ought to be able to do the lab on OS X or even cygwin on Windows (though we have not tested this). As always, we will be grading on the standard lab platforms.

From within the development platform that you have chosen (which, in the recommended cases, will be the virtual devbox or the CIMS machines), change to your lab directory, use Git to commit changes you've made since handing in lab 2 (if any), obtain the latest version of the course repository, and create a local branch called lab3 based on origin/lab3 (which is our lab3 branch). The commands are as follows, but you may not need all of them (for example, if git status indicates that there are no changes to commit):

$ cd ~/cs202
$ git commit -am 'my solution to lab2'
Created commit 254dac5: my solution to lab2
 3 files changed, 31 insertions(+), 6 deletions(-)
$ git pull
Already up-to-date.
$ git checkout -b lab3 origin/lab3
Branch lab3 set up to track remote branch refs/remotes/origin/lab3.
Switched to a new branch "lab3"
$ cd lab3 

A Quick Introduction to C++

A very helpful way to think of shared state is as shared objects. We're therefore going to use C++, C's object-oriented descendant, for this project.

If you haven't used C++ before, don't worry. For the subset of C++ relevant to this project, the learning curve will be gentle (especially assuming you already know Java). Furthermore, we will provide template code to which you will add the details; this should largely insulate you from having to learn much C++ syntax.

Read A Quick Introduction to C++, and you should be good to go. Note that this document was written over a decade ago, so a few of the comments on the state of standards and tools are a bit out of date (for example, the document warns against using C++ templates because debuggers didn't understand them well back then; this warning is much less applicable today.) Nonetheless, it provides a good, quick overview of the key ideas to use (and some issues/pitfalls to avoid.)

Working with Threads

As noted earlier, you will need to read and follow Coding Standards for Programming with Threads, by Mike Dahlin. If you're shaky on monitors (we covered them in class, but you might want reinforcement), see this chapter from OSTEP.

To simplify your task, we supply a simple thread package (written by Mike Dahlin) on top of the standard POSIX thread library (which is known as pthreads). The idea is to shield you from irrelevant detail. This way, you use the standard package but you also focus on the project at hand. However, you are not required to use the wrapper; you may instead use pthreads if you so choose. The code for the simple thread package (which we will refer to hereafter as sthreads) we provide is in sthread.cpp and sthread.h.

The package provides threads (sthread_ts), mutex locks (smutex_ts), and condition variables (scond_ts) as well as some other utility functions that you may need. We suggest that you read over these functions to see how they are used. It may be helpful to write a couple of example programs using sthreads before starting this project. For more information, see the man pages for the pthreads library functions used in the sthread.cpp code.

You should keep the following in mind as you code these labs:

Part A: Task Queue and Coarse-Grained Store Synchronization

For our model, we will simulate a fixed number of customers and suppliers. For each customer and each supplier, there will be a unique thread representing that customer or supplier. We will often refer to these threads throughout this document as worker threads. These worker threads will get jobs to work on from a task queue, which you will implement. There will be one queue for all customer threads and one queue for all supplier threads. This task queue must allow for multiple worker threads to simultaneously attempt to add or remove jobs while still maintaining the integrity of the queue's internal data structures by using locks (i.e., your task queue must be thread-safe).

It is a common pattern in multi-threaded programming to have a single (or small number of) task queue(s) for a large number of threads. A thread-safe task queue makes the job of allocating work to threads (and having threads allocate work to other threads, should the case arise) easy to do and (relatively) easy to reason about.

Exercise 1. Look through the queue interface in TaskQueue.h and the documentation for its methods in TaskQueue.cpp. Fill out the rest of the members in class TaskQueue in TaskQueue.h to finish the definition of your task queue class, then implement the rest of the queue methods in TaskQueue.cpp. You are welcome to use any standard C++ container to help build the functionality of your queue (such as a std::queue or std::deque), or you can create your own data structures (like a linked list) and add any needed helper structures to do the same.

A task in the task queue is represented by a struct Task (also found in TaskQueue.h), which consists of a pointer to a function and an argument to be passed to that function. When a worker thread removes a task from the task queue, it should call the function given in the struct Task with the argument given in the Task.

Run make to make sure your queue code compiles. For now, there's no executable to run.

A task queue is little good to worker threads if there is no work available to be put on it. We have provided some code to generate work requests to be put on your task queues in RequestGenerator.h and RequestGenerator.cpp. Become familiar with this code. Specifically, there are two subclasses of the main class RequestGenerator that you should know the purpose of: class CustomerRequestGenerator and class SupplierRequestGenerator. The former CustomerRequestGenerator, as its name would suggest, generates requests for customer threads to perform (such as buying items). The latter SupplierRequestGenerator generates requests for supplier threads to perform (such as adding or removing items from inventory, or putting items on sale). You will use these generator classes to implement task generator threads, which will produce work for customer and supplier threads.

Before you start implementing any thread functions, however, you should actually have some threads running. In the main source file for our simulator, estoresim.cpp, is a function called startSimulation. This function kicks off all threads in the simulator (generators, customers, and suppliers) and then, after starting all threads, waits for them to finish. When all threads have finished working, the simulation is complete.

Exercise 2. Read the documentation for, and then implement, the startSimulation function in estoresim.cpp. Use sthread_create to create threads. The worker thread functions supplierGenerator, supplier, customerGenerator and customer all reside in estoresim.cpp. Use the provided class Simulation to keep track of task queues and the number of customers and suppliers. When you have finished writing your code, run make run-sim to run the simulator. You may want to have the threads produce some output to make sure that you are starting them correctly.

Don't worry about the meaning of the fineMode variable in Simulation for now, just make sure it is set to the value passed in the parameter useFineMode.

You may also find it useful in this exercise, and in others throughout this lab, to call printf in some choice places (where worker threads start, for instance). The code we provide you doesn't print anything to the terminal, and so when you run the simulator with make run-sim, you won't see any output unless you put the printfs in yourself.

At the moment, all of the threads you have created start and then immediately stop, as none of their associated functions are implemented (they just return when called). However, since you have actual threads running, you can proceed to implement the generator, customer, and supplier thread functions. If you put printfs in these functions as suggested, you should see all the threads print to the terminal.

Exercise 3. Implement the customer, supplier, supplierGenerator, and customerGenerator functions in estoresim.cpp. Read the code referred to above (e.g. RequestGenerator.cpp) and the comments for these functions to get an idea of what they should be doing. The generator functions should look similar to each other, as should the supplier and customer functions to each other. Produce some output from these threads to make sure they're running and that jobs are being pushed between them properly.

For the supplierGenerator and customerGenerator functions, you will also have to implement the enqueueStops method in RequestGenerator.cpp.

Run make run-sim to run the simulator. You should see the output of your worker threads. You can stop the simulator by pressing Control-C.

Now you have many customers and suppliers working on jobs generated by the generator threads. But there is no inventory for items in the store, nor are any of the work functions produced by the request generators implemented. So, for the moment, our simulator spawns many workers in parallel to do very little. For the workers to have anything to work on, there should be an inventory of items to purchase from and add to. To this end, we have provided the skeleton of this inventory for you as class EStore, in the files EStore.h and EStore.cpp.

Exercise 4. Design and implement the EStore class, filling in all the method skeletons we provide and adding any new methods you may find necessary. Read through all the provided comments in the files related to the EStore class (but don't worry about the buyManyItems method for now). The EStore will have to keep track of many Items and ensure that all modifications to Items in the EStore are synchronized.

You should implement the EStore as a monitor; that is, there should be a single lock on the entire store, which is acquired upon entering any of the store's methods and released upon exit.

As a reminder, the buyItem method in EStore should wait for an item to become available or on discount if necessary. So, if a customer comes in looking to buy an item outside of the given budget, the customer should wait until the item is being sold at a sufficiently high discount and then buy, instead of just immediately returning.

Run make run-sim to make sure your code compiles and doesn't have any segfault-inducing bugs in the constructor for class EStore. For the moment, no threads are actually interacting with the EStore, so if you put any printfs in its methods, you won't see them output anything just yet.

There is only one piece remaining to make the simulator work now: the handlers for the jobs created by the request generators and pushed to the worker threads. These jobs consist of adding and removing items from the EStore inventory or setting discounts on items (for suppliers) and of purchasing items (for customers). We've provided skeletons for these handlers in RequestHandlers.cpp. Before you implement these handlers, you may find it helpful to read through all the different kinds of requests that exist in Request.h.

Exercise 5. Go through and implement each handler in RequestHandlers.cpp one-by-one. Print a message at the beginning of each handler which contains the name of the handler and the fields of the request struct passed to it (you don't need to print the value of the store pointer). This will produce a trace of the work done by your threads as they process jobs.

Your handler functions should largely make use of the methods you implemented previously in EStore.

After implementing all the handlers (including stop_handler), your simulator may not terminate on its own. This will happen in the case where a customer is blocked on buying an item, but the item never gets discounted enough for the customer to buy, which will happen sometimes and should be expected. In this case, you will still need to kill the simulator with Control-C.

Run make run-sim after you implement each handler and make sure that requests are being dispatched to your new handlers.

Here's some example output from the staff solution to part A of the lab to give you an idea of what running your simulator should look like. You don't have to worry about matching our output format exactly.

Handling AddItemReq: item_id - 77, quantity - 94, price - $6370.15, discount - 0.00
Handling BuyItemReq: item_id - 83, budget - $14308.86
Handling BuyItemReq: item_id - 35, budget - $33853.86
Handling AddItemReq: item_id - 92, quantity - 22, price - $5167.49, discount - 0.00
Handling BuyItemReq: item_id - 62, budget - $9900.27
Handling AddItemReq: item_id - 90, quantity - 64, price - $5201.59, discount - 0.00
Handling BuyItemReq: item_id - 26, budget - $6805.40
Handling AddItemReq: item_id - 26, quantity - 37, price - $892.72, discount - 0.00
...
Handling ChangeItemPriceReq: item_id - 96, new_price - $7007.23
Handling BuyItemReq: item_id - 46, budget - $19934.51
Handling ShippingCostReq: new shipping cost - $29.21
Handling BuyItemReq: item_id - 79, budget - $30974.88
Handling AddItemReq: item_id - 64, quantity - 42, price - $8883.28, discount - 0.00
...
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.

You should now have a working store simulator. Run make run-sim a few times and watch the output of your workers as they process jobs. If you notice anything suspicious in the output, go back and check to make sure you did your synchronization correctly and that you are following the multi-threaded coding guidelines.

Handing in the lab

If you have not done so yet, commit your code.

$ git status  # see what files you changed
$ git diff    # examine your changes
$ git commit -am "My solutions to lab3a"
Then, follow these steps:

If you submit multiple times, we will take the latest submission and count slack hours accordingly.

This ends part A of the lab.

Part B: Fine-Grained Store Synchronization

In part A, you implemented a store simulator using coarse-grained synchronization: there was one lock on the entire EStore that kept concurrent accesses from stepping on each other, which effectively serialized execution of the program (as no two threads could modify the inventory in EStore at the same time). However, you might have noticed that you don't necessarily need to lock the entire inventory if two threads are trying to modify two different items in the inventory. Thus, in part B, we will explore an approach that exposes more parallelism, by using fine-grained locking. Instead of using one large lock on the entire EStore, you will use locks over smaller units of code. In this way, multiple threads that aren't touching the same data can operate on the inventory concurrently.

Keep in mind throughout the next exercises that you should be able to run the simulator using your old coarse-grained locking approach by running make run-sim and using your yet-to-be-implemented fine-grained locking approach by running make run-sim-fine. The latter command, as might be expected, runs the simulator in "fine-grained mode" (this is related to the mysterious fineMode variable you saw before, which is set to false with make run-sim, but set to true with make run-sim-fine). You may want to abstract some of your locking with methods whose function varies depending on the value of fineMode. To reiterate, your simulator must use your coarse-grained locking approach from part A when we run make run-sim and use your new fine-grained locking approach to be done in part B when we run make run-sim-fine!

Exercise 6. Change your EStore implementation to allow multiple threads to access different items at the same time. If two threads try to access the same item at the same time, then only one should be allowed to modify the item at a time.

Don't worry about modifying the implementation of buyItem. In fine-grained mode, buyItem is not called. You will implement the fine-grained mode version of buyItem, called buyManyItems, in the next exercise.

Run make run-sim-fine to make sure your code still runs. Since you are now running in fine-grained mode, BuyItem requests are not being generated; instead, BuyManyItems requests are generated. This function is still unimplemented, so only suppliers will be making any modifications to the state of the EStore at the moment.

To flex the new fine-grained locking features of the simulator, you must implement the handler and EStore functionality for the BuyManyItems request. This request requires that many items be looked at (with synchronization!) and, if the right conditions hold, modified (i.e. purchased).

Exercise 7. Implement the EStore method buyManyItems and its corresponding handler in RequestHandlers.cpp. Make sure that, as with previous request handlers, you print out the name of the request and the fields of the request struct passed to the handler at the beginning of the buyManyItems handler.

Run make run-sim-fine. You should now see buyManyItems requests being handled by threads. You should also make sure that your old code still functions correctly with make run-sim.

Here's some sample output from the staff solution to part B. As before, your output doesn't have to look exactly like ours, but this should give you an idea of what we'd like to see when we run your simulator.

Handling AddItemReq: item_id - 62, quantity - 91, price - $4901.27, discount - 0.00
Handling BuyManyItemsReq: item_ids - 15 35 49 77 86 92 93 
Handling BuyManyItemsReq: item_ids - 26 40 63 
Handling AddItemReq: item_id - 36, quantity - 69, price - $53.11, discount - 0.00
...
Handling BuyManyItemsReq: item_ids - 21 29 35 37 58 74 93 95 
Handling AddItemReq: item_id - 43, quantity - 30, price - $3360.28, discount - 0.00
Handling BuyManyItemsReq: item_ids - 4 11 43 63 76 
Handling BuyManyItemsReq: item_ids - 18 
...
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.

Honors supplement. Honors students (section 001) should choose ONE of the two exercises below; students in section 002 are free (even encouraged) to do the exercises, but there will be no credit given.

Do these problems on a new branch called lab3-extended (i.e., your commits for these exercises should go on this new branch). The lab3 branch should include your code only up until Exercise 7; we will grade under this assumption. When submitting, you can be on either branch. The submission script will pack your entire git repo, which includes both branches.

Exercise 9 (Honors). Implement a version of the EStore::buyManyItems method that will wait until the order can be fulfilled instead of giving up. The implementation should not wake up threads unecessarily. For instance, if an item decreases in price, only threads that are waiting to buy an order that includes that item should be signaled (though all such threads should be signaled).

Exercise 10 (Honors). Ensure that the shipping cost and store discount does not change while processing an order in EStore::buyManyItems.

Handing in the lab

If you have not done so yet, commit your code.

$ git status  # see what files you changed
$ git diff    # examine your changes
$ git commit -am "My solutions to lab3b"
Then, follow these steps:

If you submit multiple times, we will take the latest submission and count slack hours accordingly.

This ends part B of the lab.

Acknowledgements

Parts of the documentation for this lab, as well as the sthreads library, were taken from Mike Dahlin's Network Scheduler lab. The lab itself was written by Victor Vu and Isami Romanowski, with contributions from Sebastian Angel, Parth Upadhyay, and Navid Yaghmazadeh.


Last updated: Mon May 04 11:24:46 -0400 2015 [validate xhtml]