The main goal of caching is to figure out how to build a memory hierarchy that is both big and fast by combining SRAM, DRAM, and disk.
Hard disk is huge but very slow,
DRAM is medium-sized but still slow compared to SRAM,
SRAM is small but fast. (A cache is small, fast memory used to store important data.)
When we combine them, we can create a system that is both fast and big.
Memory Hierarchy
A memory hierarchy organizes different forms of computer memory based on performance. Memory performance decreases and capacity increases for each level down the hierarchy.
Cache memory is placed in the middle of the hierarchy to bridge the processor-memory performance gap.
Cache
Cache is memory placed in between the processor and main memory. Cache is responsible for holding copies of main memory data for faster retrieval by the processor.
Cache memory consists of a collection of blocks. Each block can hold an entry from the main memory. Each entry has the following information:
- A tag that corresponds to the main memory location
- The data from the main memory location
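As a rough sketch (not tied to any real hardware), a cache block can be modeled as a tag plus the data it holds:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CacheBlock:
    """One cache block: a tag identifying the main-memory location, plus its data."""
    tag: Optional[int] = None    # None means the block is currently empty
    data: Optional[int] = None
```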
Cache Hit
A cache hit is when a computer processor finds the data it needs inside cache memory.
When a program requests data from memory, the processor will first look in the cache. If the memory location matches one of the tags in a cache entry, the result is a cache hit and the data is retrieved from the cache.
Cache hits improve performance by retrieving data from a smaller and faster memory source.
Cache Miss
A cache miss is when a computer processor does not find the data it needs in cache memory and must request it from the main memory. The memory location and its data are then placed as an entry in the cache, and the data is retrieved by the processor from the cache.
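Continuing the CacheBlock sketch above, a simplified lookup might look like the following; the function name and the "always evict block 0" choice are placeholders, since picking a victim is the job of the replacement policy covered later:

```python
def read(cache, address, main_memory):
    """Return (data, hit?); on a miss, fetch from main memory and fill the cache."""
    tag = address                        # simplification: use the whole address as the tag
    for block in cache:
        if block.tag == tag:             # cache hit: the data is already in the cache
            return block.data, True
    # Cache miss: request the data from main memory and place it in the cache
    data = main_memory[address]
    victim = cache[0]                    # placeholder victim; see Replacement Policy below
    victim.tag, victim.data = tag, data
    return data, False

# A tiny 2-block cache and an 8-word main memory
cache = [CacheBlock(), CacheBlock()]
main_memory = list(range(100, 108))
print(read(cache, 3, main_memory))   # (103, False): miss, then filled
print(read(cache, 3, main_memory))   # (103, True): hit
```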
Placement Policy
A placement policy, or associativity, is the process of mapping locations in main memory to specified blocks in the cache.
- A fully associative cache maps each memory location to any location in the cache.
– Hard to search
– Flexible
- A direct-mapped cache maps each memory location to one location in the cache. This associativity does not require a replacement policy since there is only one cache entry for each location in memory.
– Easy to search (with modulo addressing to find the one location)
– Not flexible because if two addresses map to the same block, we can only store one or the other
- A set-associative cache maps each memory location to a specified number of locations in the cache. It combines the easy searching of a direct-mapped cache with the flexibility of a fully associative cache. Each memory location maps to only one set, but each set can hold multiple blocks.
– Easy to search because the address determines the set, so you only have to check the few cache blocks in that set.
– Flexible because a block can be placed anywhere within its set, which reduces (but does not eliminate) conflicts.
– More complex than direct-mapped and less flexible than fully associative.
A 2-way set-associative cache has 2 blocks per set. A cache with 4 blocks that is 2-way set associative has 2 sets. Each main memory location maps to a set based on the location address.
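To illustrate the modulo addressing mentioned above, here is a sketch of how block addresses map to sets in a 4-block, 2-way set-associative cache (2 sets); a direct-mapped cache is the special case where each set holds exactly one block:

```python
NUM_BLOCKS = 4
WAYS = 2                        # 2-way set associative
NUM_SETS = NUM_BLOCKS // WAYS   # 2 sets

def set_index(block_address):
    """Each memory block maps to exactly one set, chosen by modulo addressing."""
    return block_address % NUM_SETS

# Block addresses 0..5 map to sets 0, 1, 0, 1, 0, 1
print([set_index(a) for a in range(6)])
```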
Comparison
In short, a fully associative cache is the most flexible but the hardest to search, a direct-mapped cache is the easiest to search but the least flexible, and a set-associative cache sits in between.
Write Policy
A write policy defines how data is written to the main memory once it is written to the cache.
- The write-through policy writes data to the cache and the main memory at the same time. This policy is easy to implement in the architecture but is not as efficient since every write to cache is a write to the slower main memory.
- The write-back policy writes data to the cache but only writes the data to the main memory when the data is about to be replaced in the cache. This policy is more difficult to implement but is more efficient since data is written to the main memory only when it absolutely needs to be.
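A minimal sketch contrasting the two write policies; the dirty flag is an assumed implementation detail used to remember which blocks have not yet been written to main memory:

```python
class Block:
    """Minimal cache block for illustrating write policies."""
    def __init__(self):
        self.data = None
        self.dirty = False   # True if the cache holds newer data than main memory

def write_through(block, main_memory, address, data):
    """Write-through: write to the cache and to main memory at the same time."""
    block.data = data
    main_memory[address] = data

def write_back(block, data):
    """Write-back: write only to the cache and mark the block dirty."""
    block.data = data
    block.dirty = True

def evict(block, main_memory, address):
    """A write-back cache writes a dirty block to main memory only when it is replaced."""
    if block.dirty:
        main_memory[address] = block.data
        block.dirty = False
```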
Replacement Policy
A replacement policy defines how data is replaced within the cache.
- Random: The data is replaced randomly. This policy is the easiest to implement within the architecture, but the resulting performance increase may be small: because the victim is chosen at random, you may evict the data you are about to reuse.
- First In First Out (FIFO): The data is replaced in the order it was placed in the cache. This can provide a moderate performance increase and is not as difficult to implement in the architecture as LRU. The FIFO policy requires a counter to keep track of which entry is the oldest and next to be replaced.
- Least Recently Used (LRU): The data that has not been accessed for the longest period of time is replaced. This can provide a higher performance increase as the data that is used often stays in the cache. Implementing this policy in the architecture is difficult and may not be worth the cost.
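As a sketch of how these policies differ in code, here is an LRU cache built on Python's OrderedDict (the class and method names are illustrative); a FIFO policy would evict in insertion order regardless of later accesses, and a random policy would simply evict a randomly chosen resident entry:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the entry that has not been accessed for the longest time."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # tag -> data, ordered from least to most recently used

    def access(self, tag, fetch):
        if tag in self.entries:
            self.entries.move_to_end(tag)       # hit: mark as most recently used
            return self.entries[tag]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # miss with full cache: evict the LRU entry
        self.entries[tag] = fetch(tag)          # fill from "main memory"
        return self.entries[tag]

# Usage: a 2-entry cache where main memory is simulated by a function
cache = LRUCache(2)
fetch = lambda tag: tag * 10
cache.access(1, fetch); cache.access(2, fetch)
cache.access(1, fetch)            # touch 1 so it becomes most recently used
cache.access(3, fetch)            # evicts 2, the least recently used
print(list(cache.entries))        # [1, 3]
```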