The APRAM: A model for asynchronous parallel computation

Candidate: Zajicek, Ofer

It is becoming increasingly clear that parallel computers will play a significant role in computer science and its applications. To develop parallel machines, and to take advantage of them as they become available, it is important to understand the issues underlying parallel computation. This thesis investigates one such issue: the synchronization costs of shared-memory parallel computation. It defines the APRAM model, an asynchronous variation of the PRAM model, and analyzes a number of fundamental algorithms in this model using three different complexity measures.

The first part of the thesis defines the rounds complexity, which describes the complexity of an algorithm as a function of the slowest process. It is used to measure the explicit costs of synchronization, that is, the cost of executing extra code in order to achieve synchronization. Three algorithms are analyzed under this complexity measure: a tree-based summation algorithm, a list-based recursive doubling algorithm, and an algorithm for computing the connected components of an undirected graph. In all three cases it is shown that global synchronization can be replaced by local synchronization, thereby reducing the explicit costs of synchronization. The connectivity algorithm is significantly more substantial than the other two. It avoids the need to synchronize the processes, which yields an algorithm whose behavior appears somewhat chaotic. This apparently chaotic behavior, together with the unpredictability of the asynchronous environment, makes its analysis quite challenging.

In an asynchronous environment processes may proceed at different speeds. The second part of the thesis models this non-uniformity by defining the speeds of the processes to be random variables with a known probability distribution. It then quantifies conditions under which asynchronous execution may have a significant advantage over lock-step execution, even if the explicit costs of lock-step execution are ignored. Both the summation algorithm and the recursive doubling algorithm are analyzed using two different probability distributions. In addition, the thesis quantifies conditions under which the list-based recursive doubling algorithm is significantly faster than the tree-based summation algorithm.
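
The comparison between lock-step and asynchronous execution of the tree-based summation can be illustrated with a small simulation. The sketch below is not taken from the thesis and does not implement the APRAM model itself. It assumes that each addition in a balanced binary tree takes an independent random amount of time, and it compares a global barrier after every tree level with a local rule in which a node waits only for its own two children. The function name `completion_times`, the `draw` parameter, and the exponential distribution used in the usage lines are all illustrative assumptions.

```python
import random


def completion_times(n_leaves, draw, seed=0):
    """Simulate one tree summation over n_leaves values.

    n_leaves is assumed to be a power of two, at least 2.
    draw(rng) returns the (random) duration of a single addition.
    Returns (lock_step_time, local_sync_time) for the same durations.
    """
    rng = random.Random(seed)
    lock_step = 0.0
    # ready[i] is the time at which the i-th value of the current level exists.
    ready = [0.0] * n_leaves
    width = n_leaves // 2
    while width >= 1:
        durations = [draw(rng) for _ in range(width)]
        # Global synchronization: every level lasts as long as its slowest addition.
        lock_step += max(durations)
        # Local synchronization: a node starts once its own two children are done.
        ready = [
            max(ready[2 * i], ready[2 * i + 1]) + durations[i]
            for i in range(width)
        ]
        width //= 2
    return lock_step, ready[0]


if __name__ == "__main__":
    # Assumed distribution: exponential step times with mean 1.
    trials = 1000
    totals = [0.0, 0.0]
    for s in range(trials):
        lock, local = completion_times(1024, lambda rng: rng.expovariate(1.0), seed=s)
        totals[0] += lock
        totals[1] += local
    print("mean lock-step time:", totals[0] / trials)
    print("mean local-sync time:", totals[1] / trials)
```

For any fixed set of durations the locally synchronized time in this sketch never exceeds the lock-step time, since every root-to-leaf path pays at most the per-level maximum at each level. How large the gap is depends on the assumed distribution, and the sketch ignores the explicit cost of the synchronization code itself.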