======== START LECTURE #22 ========

Most common for caches is an intermediate configuration called set associative or n-way associative (e.g., 4-way associative).

How do we find a memory block in an associative cache (with block size 1 word)?

Tag size and division of the address bits

We continue to assume a byte addressed machines with all references to a 4-byte word (lw and sw).

The 2 LOBs are not used (they specify the byte within the word but all our references are for a complete word). We show these two bits in dark blue. We continue to assume 32 bit addresses so there are 2**30 words in the address space.

Let's review various possible cache organizations and determine for each how large is the tag and how the various address bits are used. We will always use a 16KB cache. That is the size of the data portion of the cache is 16KB = 4 kilowords = 2**12 words.

  1. Direct mapped, blocksize 1 (word).
  2. Direct mapped, blocksize 8
  3. 4-way set associative, blocksize 1
  4. 4-way set associative, blocksize 8

Homework: 7.39, 7.40

Improvement: Multilevel caches

Modern high end PCs and workstations all have at least two levels of caches: A very fast, and hence not very big, first level (L1) cache together with a larger but slower L2 cache.

When a miss occurs in L1, L2 is examined, and only if a miss occurs there is main memory referenced.

So the average miss penalty for an L1 miss is

(L2 hit rate)*(L2 time) + (L2 miss rate)*(L2 time + memory time)
We are assuming L2 time is the same for an L2 hit or L2 miss. We are also assuming that the access doesn't begin to go to memory until the L2 miss has occurred.

Do an example

7.4: Virtual Memory

I realize this material was covered in operating systems class (V22.0202). I am just reviewing it here. The goal is to show the similarity to caching, which we just studied. Indeed, (the demand part of) demand paging is caching: In demand paging the memory serves as a cache for the disk, just as in caching the cache serves as a cache for the memory.

The names used are different and there are other differences as well.

Cache conceptDemand paging analogue
Memory blockPage
Cache blockPage Frame (frame)
TagNone (table lookup)
Word in blockPage offset
Valid bitValid bit
MissPage fault
HitNot a page fault
Miss ratePage fault rate
Hit rate1 - Page fault rate

Cache conceptDemand paging analogue
Placement questionPlacement question
Replacement questionReplacement question
AssociativityNone (fully associative)

Homework: 7.32

Write through vs. write back

Question: On a write hit should we write the new value through to (memory/disk) or just keep it in the (cache/memory) and write it back to (memory/disk) when the (cache-line/page) is replaced?

Translation Lookaside Buffer (TLB)

A TLB is a cache of the page table

Putting it together: TLB + Cache

This is the decstation 3100

Actions taken

  1. The page number is searched in the fully associative TLB
  2. If a TLB hit occurs, the frame number from the TLB together with the page offset gives the physical address. A TLB miss causes an exception to reload the TLB from the page table, which the figure does not show.
  3. The physical address is broken into a cache tag and cache index (plus a two bit byte offset that is not used for word references).
  4. If the reference is a write, just do it without checking for a cache hit (this is possible because the cache is so simple as we discussed previously).
  5. For a read, if the tag located in the cache entry specified by the index matches the tag in the physical address, the referenced word has been found in the cache; i.e., we had a read hit.
  6. For a read miss, the cache entry specified by the index is fetched from memory and the data returned to satisfy the request.

Hit/Miss possibilities

hithithit Possible, but page table not checked on TLB hit, data from cache
hithitmiss Possible, but page table not checked, cache entry loaded from memory
hitmisshit Impossible, TLB references in-memory pages
hitmissmiss Impossible, TLB references in-memory pages
misshithit Possible, TLB entry loaded from page table, data from cache
misshitmiss Possible, TLB entry loaded from page table, cache entry loaded from memory
missmisshit Impossible, cache is a subset of memory
missmissmiss Possible, page fault brings in page, TLB entry loaded, cache loaded

Homework: 7.31, 7.33

7.5: A Common Framework for Memory Hierarchies

Question 1: Where can/should the block be placed?

This question has three parts.

  1. In what slot are we able to place the block.
  2. If several possible slots are available, which one should be used?
  3. If no possible slots are available, which victim should be chosen?

Question 2: How is a block found?

AssociativityLocation methodComparisons Required
Direct mappedIndex1
Set AssociativeIndex the set, search among elements Degree of associativity
FullSearch all cache entries Number of cache blocks
Separate lookup table0

Typical sizes and costs

Feature Typical values
for caches
Typical values
for demand paging
Typical values
for TLBs
Size 8KB-8MB 16MB-2GB 256B-32KB
Block size 16B-256B 4KB-64KB 4B-32B
Miss penalty in clocks 10-100 1M-10M 10-100
Miss rate .1%-10% .000001-.0001% .01%-2%

The difference in sizes and costs for demand paging vs. caching, leads to different algorithms for finding the block. Demand paging always uses the bottom row with a separate table (page table) but caching never uses such a table.

Question 3: Which block should be replaced?

This is called the replacement question and is much studied in demand paging (remember back to 202).

Question 4: What happens on a write?

  1. Write-through

Homework: 7.41

  1. Write-back

Write miss policy (advanced)