
Cache Mapping

COMP375 Computer Architecture and Organization

Goals
• Understand how the cache system finds a data item in the cache.
• Be able to break an address into the fields used by the different cache mapping schemes.

Cache Challenges
• The challenge in cache design is to ensure that the desired data and instructions are in the cache. The cache should achieve a high hit ratio.
• The cache system must be quickly searchable as it is checked on every memory reference.

Cache to RAM Ratio
• A processor might have 512 KB of cache and 512 MB of RAM.
• There may be 1000 times more RAM than cache.
• The cache algorithms have to carefully select the 0.1% of the memory that is likely to be most accessed.


Tag Fields
• A cache line contains two fields:
  – Data from RAM
  – The address of the block currently in the cache
• The part of the cache line that stores the address of the block is called the tag field.
• Many different blocks of RAM can map to any given cache line. The tag field specifies which address is currently in the cache line.
• Only the upper address bits are needed.

Cache Lines
• The cache memory is divided into blocks or lines. Currently lines can range from 16 to 64 bytes.
• Data is copied to and from the cache one line at a time.
• The lower log2(line size) bits of an address specify a particular byte in a line.

Line address | Offset
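
To make the two fields concrete, here is a minimal C sketch of one cache line; the line size and the valid bit are illustrative assumptions rather than anything specified in the slides.

    #include <stdbool.h>
    #include <stdint.h>

    #define LINE_SIZE 64                  /* bytes per line (assumed)          */

    struct cache_line {
        bool     valid;                   /* does the line hold real data?     */
        uint32_t tag;                     /* upper address bits of the block   */
        uint8_t  data[LINE_SIZE];         /* one block of data copied from RAM */
    };

An array of these structures is the model used in the lookup sketches later in these notes.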

Line Example
These boxes represent RAM addresses:
0110010100
0110010101
0110010110
0110010111
0110011000
0110011001
0110011010
0110011011
0110011100
0110011101
0110011110
0110011111
With a line size of 4, the offset is log2(4) = 2 bits.
The lower 2 bits specify which byte in the line.

Computer Science Search
• If you ask COMP285 students how to search for a data item, they should be able to tell you:
  – Linear Search – O(n)
  – Binary Search – O(log2 n)
  – Hashing – O(1)
  – Parallel Search – O(n/p)


Mapping
• The memory system has to quickly determine if a given address is in the cache.
• There are three popular methods of mapping addresses to cache locations.
  – Fully Associative – Search the entire cache for an address.
  – Direct – Each address has a specific place in the cache.
  – Set Associative – Each address can be in any of a small set of cache locations.

Direct Mapping
• Each location in RAM has one specific place in cache where the data will be held.
• Consider the cache to be like an array. Part of the address is used as an index into the cache to identify where the data will be held.
• Since a data block from RAM can only be in one specific line in the cache, it must always replace the one block that was already there. There is no need for a replacement algorithm.

Direct Cache Addressing
• The lower log2(line size) bits define which byte in the block.
• The next log2(number of lines) bits define which line of the cache.
• The remaining upper bits are the tag field.

Tag | Line | Offset

Cache Constants
• cache size / line size = number of lines
• log2(line size) = bits for offset
• log2(number of lines) = bits for cache index
• remaining upper bits = tag address bits
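
As a rough illustration of these constants, the following C sketch computes the field widths for an assumed 32 bit address, 32 KB cache, and 64 byte lines (the same numbers used in the next example).

    #include <stdio.h>

    /* Integer log2 for power-of-two values. */
    static unsigned ilog2(unsigned v) { unsigned b = 0; while (v >>= 1) b++; return b; }

    int main(void)
    {
        const unsigned addr_bits  = 32;          /* assumed address size  */
        const unsigned cache_size = 32 * 1024;   /* assumed 32 KB cache   */
        const unsigned line_size  = 64;          /* assumed 64 byte lines */

        unsigned num_lines   = cache_size / line_size;               /* 512 */
        unsigned offset_bits = ilog2(line_size);                     /* 6   */
        unsigned index_bits  = ilog2(num_lines);                     /* 9   */
        unsigned tag_bits    = addr_bits - index_bits - offset_bits; /* 17  */

        printf("lines=%u offset=%u index=%u tag=%u\n",
               num_lines, offset_bits, index_bits, tag_bits);
        return 0;
    }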


Example Direct Address
Assume you have
• 32 bit addresses (can address 4 GB)
• 64 byte lines (offset is 6 bits)
• 32 KB of cache
• Number of lines = 32 KB / 64 = 512
• Bits to specify which line = log2(512) = 9

Tag (17 bits) | Line (9 bits) | Offset (6 bits)

Example Address
• Using the previous direct mapping scheme with 17 bit tag, 9 bit index and 6 bit offset:
  01111101011101110001101100111000
  01111101011101110 001101100 111000
  Tag               Index     Offset
• Compare the tag field of line 001101100 (decimal 108) with the value 01111101011101110. If it matches, return byte 111000 (decimal 56) of the line.
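
A minimal C sketch of the same split, assuming the 17/9/6 layout above (the hex constant is just the example address rewritten):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t addr = 0x7D771B38;  /* 01111101011101110001101100111000 */

        uint32_t offset = addr & 0x3F;          /* lower 6 bits  -> 111000    = 56  */
        uint32_t line   = (addr >> 6) & 0x1FF;  /* next 9 bits   -> 001101100 = 108 */
        uint32_t tag    = addr >> 15;           /* upper 17 bits                    */

        /* The cache compares the tag stored in line 108 with 'tag';
           on a match it returns byte 56 of that line. */
        printf("tag=0x%X line=%u offset=%u\n",
               (unsigned)tag, (unsigned)line, (unsigned)offset);
        return 0;
    }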

How many bits are in the tag, line and offset fields?
Direct mapping
• 24 bit addresses
• 64K bytes of cache
• 16 byte cache lines
1. tag=4, line=16, offset=4
2. tag=4, line=14, offset=6
3. tag=8, line=12, offset=4
4. tag=6, line=12, offset=6

Associative Mapping
• In associative cache mapping, the data from any location in RAM can be stored in any location in cache.
• When the processor wants an address, all tag fields in the cache are checked to determine if the data is already in the cache.
• Each tag line requires circuitry to compare the desired address with the tag field.
• All tag fields are checked in parallel.


Associative Cache Mapping
• The lower log2(line size) bits define which byte in the block.
• The remaining upper bits are the tag field.
• For a 4 GB address space with 128 KB cache and 32 byte blocks:

Tag (27 bits) | Offset (5 bits)

Example Address
• Using the previous associative mapping scheme with 27 bit tag and 5 bit offset:
  01111101011101110001101100111000
  011111010111011100011011001 11000
  Tag                         Offset
• Compare all tag fields for the value 011111010111011100011011001. If a match is found, return byte 11000 (decimal 24) of the line.
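
A software model of this lookup might look like the C sketch below. It assumes the 128 KB / 32 byte example above; the loop only simulates what hardware does with parallel comparators.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define LINE_SIZE 32                        /* 32 byte blocks            */
    #define NUM_LINES (128 * 1024 / LINE_SIZE)  /* 128 KB cache = 4096 lines */

    struct line { bool valid; uint32_t tag; uint8_t data[LINE_SIZE]; };
    static struct line cache[NUM_LINES];

    /* Returns a pointer to the requested byte on a hit, NULL on a miss. */
    uint8_t *lookup(uint32_t addr)
    {
        uint32_t offset = addr & 0x1F;   /* lower 5 bits  */
        uint32_t tag    = addr >> 5;     /* upper 27 bits */

        /* Hardware compares every tag at once; the loop only models that. */
        for (size_t i = 0; i < NUM_LINES; i++)
            if (cache[i].valid && cache[i].tag == tag)
                return &cache[i].data[offset];
        return NULL;                     /* miss */
    }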

How many bits are in the tag and offset fields?
Associative mapping
• 24 bit addresses
• 128K bytes of cache
• 64 byte cache lines
1. tag=20, offset=4
2. tag=19, offset=5
3. tag=18, offset=6
4. tag=16, offset=8

Set Associative Mapping
• Set associative mapping is a mixture of direct and associative mapping.
• The cache lines are grouped into sets.
• The number of lines in a set can vary from 2 to 16.
• A portion of the address is used to specify which set will hold an address.
• The data can be stored in any of the lines in the set.


Set Associative Mapping
• When the processor wants an address, it indexes to the set and then searches the tag fields of all lines in the set for the desired address.
• n = cache size / line size = number of lines
• b = log2(line size) = bits for offset
• w = number of lines per set
• s = n / w = number of sets

Example Set Associative
Assume you have
• 32 bit addresses
• 32 KB of cache, 64 byte lines
• Number of lines = 32 KB / 64 = 512
• 4 way set associative
• Number of sets = 512 / 4 = 128
• Set bits = log2(128) = 7

Tag (19 bits) | Set (7 bits) | Offset (6 bits)
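
A C sketch of a lookup under this 19/7/6 scheme might look like the following; the data structure and names are illustrative assumptions, not the slides' notation.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define LINE_SIZE 64                    /* 64 byte lines         */
    #define WAYS      4                     /* 4 way set associative */
    #define NUM_SETS  128                   /* 512 lines / 4 ways    */

    struct line { bool valid; uint32_t tag; uint8_t data[LINE_SIZE]; };
    static struct line cache[NUM_SETS][WAYS];

    /* Returns a pointer to the requested byte on a hit, NULL on a miss. */
    uint8_t *lookup(uint32_t addr)
    {
        uint32_t offset = addr & 0x3F;          /* lower 6 bits      */
        uint32_t set    = (addr >> 6) & 0x7F;   /* next 7 bits       */
        uint32_t tag    = addr >> 13;           /* remaining 19 bits */

        /* Only the WAYS lines of this one set have to be searched. */
        for (int w = 0; w < WAYS; w++)
            if (cache[set][w].valid && cache[set][w].tag == tag)
                return &cache[set][w].data[offset];
        return NULL;                            /* miss */
    }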

Example Address
• Using the previous set associative mapping with 19 bit tag, 7 bit index and 6 bit offset:
  01111101011101110001101100111000
  0111110101110111000 1101100 111000
  Tag                 Index   Offset
• Compare the tag fields of lines 110110000 to 110110011 for the value 0111110101110111000. If a match is found, return byte 111000 (decimal 56) of that line.

How many bits are in the tag, set and offset fields?
2-way set associative mapping
• 24 bit addresses
• 128K bytes of cache
• 16 byte cache lines
1. tag=8, set=12, offset=4
2. tag=16, set=4, offset=4
3. tag=12, set=8, offset=4
4. tag=10, set=10, offset=4

Replacement Policy
• When a cache miss occurs, data is copied into some location in cache.
• With Set Associative or Fully Associative mapping, the system must decide where to put the data and what values will be replaced.
• Cache performance is greatly affected by properly choosing data that is unlikely to be referenced again.

Replacement Options
• First In First Out (FIFO)
• Least Recently Used (LRU)
• Pseudo LRU
• Random

LRU Replacement
• LRU is easy to implement for 2 way set associative.
• You only need one bit per set to indicate which line in the set was most recently used.
• LRU is difficult to implement for larger ways.
• For an N way mapping, there are N! different permutations of use orders.
• It would require log2(N!) bits to keep track.

Pseudo LRU
• Pseudo LRU is frequently used in set associative mapping.
• In pseudo LRU there is a bit for each half of a group indicating which half was most recently used.
• For 4 way set associative, one bit indicates whether the upper two or lower two lines were most recently used. For each half, another bit specifies which of the two lines was most recently used. 3 bits total.
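
One possible C sketch of this 3 bit scheme for a single 4 way set is shown below; the exact encoding of the bits is an assumption for illustration, since the slides only describe the idea.

    #include <stdint.h>

    typedef struct {
        uint8_t half_mru;    /* 1 if the upper half (ways 2-3) was most recently used */
        uint8_t lower_mru;   /* 1 if way 1 was used more recently than way 0          */
        uint8_t upper_mru;   /* 1 if way 3 was used more recently than way 2          */
    } plru4_t;

    /* Record an access to 'way' (0..3). */
    void plru4_touch(plru4_t *s, int way)
    {
        if (way < 2) {
            s->half_mru  = 0;            /* lower half is now MRU */
            s->lower_mru = (way == 1);
        } else {
            s->half_mru  = 1;            /* upper half is now MRU */
            s->upper_mru = (way == 3);
        }
    }

    /* Pick a victim by following the "not recently used" direction at each level. */
    int plru4_victim(const plru4_t *s)
    {
        if (s->half_mru == 1)                /* upper half is MRU, evict from lower half */
            return s->lower_mru ? 0 : 1;
        else                                 /* lower half is MRU, evict from upper half */
            return s->upper_mru ? 2 : 3;
    }

On a miss, the victim is found by walking this small tree of bits instead of keeping a full ordering of all four ways.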


Comparison of Mapping
Fully Associative
• Associative mapping works the best, but is complex to implement. Each tag line requires circuitry to compare the desired address with the tag field.
• Some special purpose caches, such as the virtual memory Translation Lookaside Buffer (TLB), are associative caches.

Comparison of Mapping
Direct
• Direct has the lowest performance, but is easiest to implement.
• Direct is often used for instruction cache.
• Sequential addresses fill a cache line and then go to the next cache line.
• The Intel Pentium level 1 instruction cache uses direct mapping.

Comparison of Mapping
Set Associative
• Set associative is a compromise between the other two. The bigger the “way” the better the performance, but the more complex and expensive.
• Intel Pentium uses 4 way set associative caching for level 1 data and 8 way set associative for level 2.
