DOCTORAL DISSERTATION DEFENSE
PERFORMANCE EVALUATION OF SOLUTIONS TO THE TLB CONSISTENCY PROBLEM

Candidate: Patricia J. Teller
Advisor: Allan Gottlieb
1:20 p.m., Wednesday, April 24, 1991
room 402, Warren Weaver Hall

Abstract

To implement virtual memory efficiently, virtual-to-physical address translation information is stored in page tables and cached in translation-lookaside buffers (TLBs). In multiprocessors with multiple TLBs, page-table modifications can result in outdated TLB entries, the use of which can cause erroneous memory accesses.
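
As a minimal illustration of how an outdated entry arises, the toy model below gives two processors private TLBs that cache entries from one shared page table. All names (`TLB`, `translate`, the page and frame numbers) are hypothetical and chosen for illustration; they are not taken from the dissertation.

```python
# Toy model of the TLB consistency problem. Every name and number here is
# an illustrative assumption, not part of the dissertation.

page_table = {7: 3}  # shared page table: virtual page -> physical frame


class TLB:
    """A per-processor cache of page-table entries, with no invalidation."""

    def __init__(self):
        self.entries = {}

    def translate(self, page, table):
        # On a miss, reload the translation from the page table.
        if page not in self.entries:
            self.entries[page] = table[page]
        return self.entries[page]


tlb0, tlb1 = TLB(), TLB()

# Both processors reference page 7 and cache the mapping 7 -> 3.
tlb0.translate(7, page_table)
tlb1.translate(7, page_table)

# The page table is modified: page 7 now lives in frame 9.
page_table[7] = 9

# Processor 1 still hits its cached entry and gets the old frame.
stale_frame = tlb1.translate(7, page_table)
current_frame = page_table[7]
```

In this sketch `stale_frame` no longer matches `current_frame`, so a memory access through the second TLB would go to the wrong frame unless some consistency mechanism intervenes.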

We propose three new solutions to this TLB consistency problem. Unlike existing solutions for highly parallel shared-memory multiprocessors, they require no interprocessor synchronization or communication, and they neither interrupt processor execution nor introduce unnecessary serialization.

The cost of each of our solutions is embodied in the cost of TLB reloads, which load translation information for referenced pages into TLBs. Two of the solutions assume TLBs at processors, and one assumes TLBs at memory. We study their performance in scalable multiprocessor architectures via a trace-driven simulation system capable of simulating a range of systems using just one address trace.

Our results show that system performance improves if TLBs are located at memory, rather than processors, provided that memory is organized as multiple paging arenas, where the mapping of pages to arenas is fixed.

A class of parallel workloads can produce a number of TLB reloads, R, that grows linearly with N, the number of processors. A set of our simulations for processor-based TLBs validates this model.

A processor-based TLB reload costs O(log N) because of network transit. Thus, management of processor-based TLBs, whether or not it ensures consistency, has an overhead that grows as O(N log N).
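
The scaling argument above can be sketched numerically: if the reload count R grows linearly with N and each reload costs a number of network stages proportional to log N, then their product grows as N log N. The constants `reloads_per_processor` and `cost_per_stage`, and the choice of a base-2 logarithm, are assumptions made only for this illustration.

```python
import math


def reload_count(n, reloads_per_processor=1.0):
    """R = c * N: reload count growing linearly with N (c is assumed)."""
    return reloads_per_processor * n


def reload_cost(n, cost_per_stage=1.0):
    """Cost of one processor-based reload, proportional to log N.

    The base-2 logarithm is an assumption for this sketch; a different
    base only changes the constant factor.
    """
    return cost_per_stage * math.log2(n)


def total_overhead(n, reloads_per_processor=1.0, cost_per_stage=1.0):
    """Total reload overhead: R times the per-reload cost, O(N log N)."""
    return reload_count(n, reloads_per_processor) * reload_cost(n, cost_per_stage)


# Tabulate the overhead for a few hypothetical machine sizes.
overheads = {n: total_overhead(n) for n in (8, 16, 64, 256)}
```

With unit constants, doubling N from 8 to 16 raises the modeled overhead from 24 to 64, which is more than double: the extra factor is the growth of log N.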

The cost of a memory-based TLB reload within a paging arena can be made smaller than that of a processor-based TLB reload, since additional network transits are not required.

Simulation results show that when there is only one paging arena, memory-based TLBs exhibit generally larger miss rates than processor-based TLBs, and the related overhead is generally larger. When there are two paging arenas, memory-based TLBs produce smaller miss rates than processor-based TLBs of equal size, and the related overhead is generally smaller. To maintain low overhead for large machines, it is likely that the number of paging arenas must grow as O(N).