
Commit 7308f34

committed
Quartz sync: Jan 26, 2026, 11:41 PM
1 parent fffdd79 commit 7308f34

3 files changed

Lines changed: 190 additions & 7 deletions

File tree

content/Computer Organization and Architecture/Cache Organization.md

Lines changed: 189 additions & 6 deletions
@@ -15,7 +15,7 @@
> - [[Cache Organization#No Write Allocate|No Write Allocate]]
>- [[Cache Organization#Questions|Questions]]
# Locality of Reference
-Programs tend to access the same memory location or nearby memory locations within short intervals of time.
+Locality of reference is the phenomenon whereby programs tend to access the same memory location, or nearby memory locations, within short intervals of time.

Types -
1. **Temporal Locality -** Recently accessed memory locations are likely to be accessed again.
@@ -24,6 +24,14 @@ Types -

This phenomenon allows for caching a **block of memory** to be efficient. Currently demanded localities are kept in a smaller and faster memory called **cache**.
# Working of Cache Memory
+**Caching** is temporarily storing copies of certain content of the main memory for ease of access.
+
+The memory used for caching is called **cache memory**. Cache memory has a faster access time than main memory but is typically smaller in size. On-chip cache memories sit inside the CPU itself.
+
+When the CPU accesses -
+- on-chip cache memory, it doesn't require the system buses; the cache can be accessed much like registers.
+- main memory, it needs the system buses.
+
![[Pasted image 20260125131640.png]]

Keywords -
@@ -48,7 +56,7 @@ $$
^a692b1
# Types of Cache Access
## Simultaneous Access
-The memory access request is **sent to both** the cache memory as well as the main memory. Hence this is also called a **parallel access**.
+The memory read operation is **performed on both** the cache memory and the main memory. Hence this is also called **parallel access**.

![[Pasted image 20260125134951.png]]

@@ -60,7 +68,7 @@ $$

Here $T_{cm}$ is the cache memory access time and $T_{mm}$ is the main memory access time.
## Hierarchical Access
-The memory access request is sent to the main memory **only when a Cache Miss** occurs. Hence this is also called as **serial access**.
+The memory read operation is performed on the main memory **only when a cache miss** occurs. Hence this is also called **serial access**.

The average memory access time is -
@@ -86,7 +94,7 @@ If a question has the words "cache memory access time" and "main memory access t

Otherwise, if the question just mentions "time for cache hit" and "time for cache miss", use the [[Cache Organization#^a692b1|generic formula]].

-See [[Cache Organization#^q1|Question 1]] for a simple example.
+See [[Cache Organization#^1007d7|Question 1]] for a simple example.
## Memory Access Time when Locality of Reference is used
In the previous cases we were only looking at cases where, on a cache miss, we retrieve the data directly from the main memory. But on a cache miss, the block to which the data belongs also needs to be brought into the cache memory for future use.

@@ -170,14 +178,157 @@ Write Through cache with No Write Allocate -
2. Write -
	- Hit - Perform write in cache and main memory simultaneously.
	- Miss - Perform write in main memory but do not bring the missing block to the cache.
+# Cache Mapping
+A cache line is the **smallest unit of data** that is transferred between the cache and the main memory. In simpler words, a block in the cache memory is called a cache line.
+
+Cache mapping is the rule/mechanism that dictates which **cache line** will hold which main memory block.
+## Direct Mapping
+In direct mapping, **each main-memory block is mapped to exactly one fixed cache line/index**.
+
+One index in direct mapping is one cache block. The formula below also applies to [[Cache Organization#Set Associative Mapping|Set Associative Mapping]] and [[Cache Organization#Fully Associative Mapping|Fully Associative Mapping]]; only the number of indices differs for those mappings.
+
+$$
+\begin{aligned}
+\text{m.m. block no.} &= \left\lfloor\frac{\text{m.m. address}}{\text{block size}} \right\rfloor\\[8pt]
+\text{c.m. block no.} &= (\text{m.m. block no.}) \,\,\% \,\,(\text{No. of indices}) \\[8pt]
+\end{aligned}
+$$
+
+Suppose the cache memory has 10 blocks and cache block 2 is holding main memory block 52. If we look up block 92 in the cache, it will be checked in cache block 2. Because that block is already holding block 52, looking up block 92 should be a cache miss. But how can we differentiate between the content in the cache block and the block being requested?
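The collision just described can be illustrated with a toy model. This is a hypothetical sketch, not actual hardware: it stores whole block numbers per index, which stands in for the tag comparison described next.

```python
# Toy direct-mapped cache with 10 indices; each index remembers which
# main-memory block it currently holds (illustrative model only).
cache = {}

def lookup(block_no, num_indices=10):
    index = block_no % num_indices       # c.m. block no. = m.m. block no. % indices
    hit = cache.get(index) == block_no   # comparing block numbers stands in for tags
    if not hit:
        cache[index] = block_no          # on a miss, bring the block in
    return hit

print(lookup(52))  # False - cold miss; block 52 now sits at index 2
print(lookup(92))  # False - index 2 holds block 52, so block 92 misses
print(lookup(92))  # True  - block 92 replaced block 52 at index 2
```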
+### Tag
+A tag is a part of the memory address that helps identify whether the requested block is present in its respective cache line or not.
+### Index
+The cache block no. is also known as the index in direct mapping.
+### Memory address layout
+The tag and cache memory block no. together make up the main memory block no.
+
+$$
+\begin{aligned}
+\text{m.m. block no.} &= \boxed{\strut \text{Tag}}\boxed{\strut \text{c.m. block no.}} \\[8pt]
+\text{block 92} &= \underbrace{\boxed{\strut \,\,\,9\,\,\,}}_{\text{Tag}} \underbrace{\boxed{\strut \,\,\,2\,\,\,}}_\text{c.m.} \\
+\end{aligned}
+$$
+
+But a memory address is byte/word addressable, not block addressable. A block is made up of some bytes/words, thus the memory address will have the layout -
+
+$$
+\begin{aligned}
+\text{memory address} &= \boxed{\strut \text{m.m. block no.}}\boxed{\strut \text{byte offset}} \\[8pt]
+&= \boxed{\strut \text{Tag}}\boxed{\strut \text{c.m. block no.}}\boxed{\strut \text{byte offset}} \\[8pt]
+\end{aligned}
+$$
+
+- No. of bits for byte no./byte offset - $\operatorname{log}_2(\text{block size})$
+- No. of bits for main memory block no. - $\operatorname{log}_2(\text{blocks in m.m.})$
+- No. of bits for cache memory block no./index - $\operatorname{log}_2(\text{blocks in c.m.})$
+- No. of tag bits - $(\text{No. of bits for m.m. block no.} - \text{No. of bits for c.m. block no.})$
+
+See [[Cache Organization#^cba289|Question 3]] for an example.
+
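The bit-count rules above can be checked with a short sketch (the helper name is illustrative; sizes are assumed to be in bytes and powers of two, using the numbers from Question 3 in this note):

```python
def direct_mapped_fields(addr_bits, cache_size, block_size):
    # Widths of the tag / index (c.m. block no.) / byte-offset fields
    # of a direct-mapped cache address; sizes in bytes, powers of two.
    offset_bits = (block_size - 1).bit_length()               # log2(block size)
    index_bits = (cache_size // block_size - 1).bit_length()  # log2(blocks in c.m.)
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

# Question 3 from this note: 1 MB main memory (20-bit address),
# 64 KB cache, 16-byte blocks -> 4 tag bits, 12 index bits, 4 offset bits.
print(direct_mapped_fields(20, 64 * 1024, 16))  # (4, 12, 4)
```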
+Suppose the cache memory size $= 2^i$ and the main memory size $= 2^m$, but the byte offset is unknown. Assume the byte offset spans $2^x$ bytes. The index bits would be $\operatorname{log}_2\left(\frac{2^i}{2^x}\right)=i-x$. The tag bits would be -
+
+$$
+\begin{aligned}
+\text{tag bits} &= m - ((i-x) + x) \\[8pt]
+&= m-i \\[8pt]
+&= \operatorname{log}_2(\text{m.m. size}) - \operatorname{log}_2(\text{c.m. size})
+\end{aligned}
+$$
+
+So we can say,
+
+$$
+\begin{aligned}
+\text{memory address} &= \boxed{\strut \text{Tag}}\boxed{\strut \text{cache line bits}} \\[8pt]
+\text{c.m. address bits} &= \text{c.m. block no. bits} + \text{byte offset bits}
+\end{aligned}
+$$
+
+### Cache Controller
+The cache controller is the device that acts as the control logic for all cache-related operations. The cache controller maintains a tag directory/metadata which holds all the tag bits and status bits.
+
+$$
+\text{Tag directory size} = (\text{Tag bits + Status bits}) * \text{No. of blocks in c.m.}
+$$
+
+^c1c03b
+
+Unlike the tag bits, the valid/invalid bit and the dirty bit are not part of the main memory address. Instead they are kept separately in the tag directory.
+
+The **formula** for the tag directory size doesn't depend upon the type of cache mapping used.
+#### Tag bits
+The bits required to denote a [[Cache Organization#Tag|block tag]].
+#### Valid/Invalid Bit
+Whenever we start our computer, the cache is initialized with garbage values. It is possible that when we look up a memory address in the cache, the tag of this memory address matches the garbage value in the tag bits and invalid content is sent to the CPU. To avoid this we use valid/invalid bits.
+- 0 means invalid
+- 1 means valid
+
+How it works -
+- When the cache is initialized, the valid/invalid bit of each cache line is set to 0, signifying that all tag bits are garbage values.
+- When a cache line is filled with a block from the main memory, this bit is set to 1 to signify that going forward the cache line contains valid content.
+#### Dirty bit
+In a [[Cache Organization#Write Back|write back]] cache, when a dirty/modified block is replaced from the cache, the content of the block needs to be written to the main memory. To keep track of modifications to a cache block, we use dirty bits.
+- 0 means the cache block has not been modified.
+- 1 means the cache block has been modified and upon replacement should be written to the main memory.
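The tag directory size formula can be sketched directly; the per-line status-bit count below is an assumption (1 valid/invalid bit + 1 dirty bit), since the formula counts whatever status bits a given design keeps:

```python
def tag_directory_size_bits(tag_bits, status_bits, blocks_in_cache):
    # Tag directory size = (tag bits + status bits) * no. of blocks in c.m.
    return (tag_bits + status_bits) * blocks_in_cache

# Direct-mapped cache from Question 3: 4 tag bits, 4096 cache blocks,
# assuming 2 status bits per line (valid/invalid + dirty).
print(tag_directory_size_bits(4, 2, 4096))  # 24576 bits
```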
+## Set Associative Mapping
+In set associative mapping, **each index corresponds to a set containing multiple cache lines**, and a memory block can be placed in **any line within its indexed set**.
+
+$$
+\text{no. of sets in k-way set associativity} = \frac{\text{no. of c.m. blocks}}{k}
+$$
+
+From here, using the no. of indices we can calculate the number of bits required for the index, and consequently the tag bits and the tag directory size. The memory address layout now looks like -
+
+$$
+\begin{aligned}
+\text{memory address} &= \boxed{\strut \text{m.m. block no.}}\boxed{\strut \text{byte offset}} \\[8pt]
+&= \boxed{\strut \text{Tag}}\boxed{\strut \text{set offset}}\boxed{\strut \text{byte offset}} \\[8pt]
+\end{aligned}
+$$
+
+No matter which type of mapping or associativity we use, each block in the cache memory still requires a tag. Thus the formula for the tag directory size stays the same as the one mentioned [[Cache Organization#^c1c03b|under Direct Mapping]].
+
+However, the tag bits required for an associative mapping will increase, which in turn increases the size of the tag directory.
+
+Each time we increase the associativity by a factor of 2, we increase the tag bits by 1 and decrease the index bits by 1. Thus if the set associativity is $k$, the index bits are decreased by $\operatorname{log}_2k$. In direct mapping, memory address size = Tag bits + cache line bits, but in $k$-way set associative mapping,
+
+$$
+\begin{aligned}
+\text{memory address} &= \boxed{\strut \text{Tag}}\boxed{\strut \operatorname{log}_2(\text{cache size}) - \operatorname{log}_2k} \\[8pt]
+\end{aligned}
+$$
+
+### How it's implemented
+
+![[Pasted image 20260126223006.png|550]]
+
+Assume that we are dealing with a 2-way set associative cache. In such a cache, each set has two tags, one for each block in the set. These tags are compared **in parallel** with the tag of the memory address being requested. If either tag matches, we have a cache hit; otherwise a cache miss.
+
+The higher the associativity, the more tags need to be compared in parallel, increasing the complexity and expense.
+## Fully Associative Mapping
+In fully associative mapping, **any main-memory block can be placed in any cache line**. There's no fixed mapping between memory blocks and cache lines.
+
+In a fully associative cache, the index has zero bits. The memory address layout looks like -
+
+$$
+\begin{aligned}
+\text{memory address} &= \boxed{\strut \text{m.m. block no.}}\boxed{\strut \text{byte offset}} \\[8pt]
+&= \boxed{\strut \text{Tag bits}}\boxed{\strut \text{byte offset}} \\[8pt]
+\end{aligned}
+$$
+
+The formula for the tag directory size stays the same as the one mentioned [[Cache Organization#^c1c03b|under Direct Mapping]]. Here the tag bits = m.m. block no. bits, so the same formula can be rewritten as,
+
+$$
+\text{Tag directory size} = (\text{m.m. block no. bits + Status bits}) * \text{No. of blocks in c.m.}
+$$
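The index/tag arithmetic for all three mappings can be sketched with one hypothetical helper: $k = 1$ gives direct mapping, and $k$ equal to the number of cache blocks gives fully associative mapping (sizes assumed to be in bytes and powers of two):

```python
def set_associative_fields(addr_bits, cache_size, block_size, k):
    # Tag / set-index / byte-offset widths for a k-way set-associative cache.
    # k = 1 is direct mapping; k = blocks-in-cache is fully associative.
    offset_bits = (block_size - 1).bit_length()
    num_sets = (cache_size // block_size) // k
    index_bits = (num_sets - 1).bit_length()  # 0 bits when there is one set
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

# 20-bit address, 64 KB cache, 16-byte blocks (Question 3's sizes):
print(set_associative_fields(20, 64 * 1024, 16, 1))     # (4, 12, 4)  direct
print(set_associative_fields(20, 64 * 1024, 16, 2))     # (5, 11, 4)  doubling k: tag +1, index -1
print(set_associative_fields(20, 64 * 1024, 16, 4096))  # (16, 0, 4)  fully associative
```

Note how doubling $k$ moves one bit from the index field to the tag field, matching the claim above.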

---
# Questions
-^q1
<h6 class="question">Q1) If in a two level memory hierarchy, the top level memory access time is 8ns and the bottom level memory access time is 60ns, the hit-rate required is __ for the average access time to be 10ns. What is __?</h6>

$\underline{\text{Sol}^n} -$
-Here as "memory hierarchy" and "two level" is mentioned, we are dealing with a hierarchical access cache organization. So,
+Here, as "memory hierarchy" and "two level" are mentioned, we are dealing with a hierarchical access cache organization. So, ^1007d7

$$
\begin{alignedat}{3}
@@ -186,4 +337,36 @@ $$
&\Rightarrow&\,\,60H&= 58 \\[8pt]
&\Rightarrow&H&= \boxed{0.967} \\[8pt]
\end{alignedat}
+$$
+---
+<h6 class="question">Q2) What is the size of data sent from the CPU to main memory when:</h6>
+1. For write hit, a write through cache is used
+2. For write miss, a write through cache is used
+3. For write hit, a write back cache is used
+4. For write miss, a write back cache is used
+
+$\underline{\text{Sol}^n} -$
+For any write operation, the CPU sends **1 data item** of 1 byte or 1 word to the cache/main memory, depending upon what type of cache is used.
+
+1. Write hit in Write Through - **1 data item**, because in a Write Through cache both the cache and the main memory are written simultaneously regardless of hit or miss.
+2. Write miss in Write Through - **1 data item**, same reasoning as above.
+3. Write hit in Write Back - **No data item**, because the write operation is performed in the cache memory and not the main memory.
+4. Write miss in Write Back -
+	1. If using **write allocate** - **No data item**, because the missing block is brought from the main memory to the cache memory and then the write operation is performed on the cache memory.
+	2. If using **no-write allocate** - **1 data item**, because the write operation is performed directly on the main memory without bringing the block into the cache.
+
+---
+<h6 class="question">Q3) If the main memory is of size 1MB with a block size of 16 bytes, and the cache memory is of size 64KB, how many bits are required for the Tag, cache block no., and byte no.?</h6>
+
+$\underline{\text{Sol}^n} -$ ^cba289
+- Here the memory is of $2^{20}$ bytes. Thus the memory address is made up of **20 bits**.
+- A block is of 16 bytes, thus **4 bits** are required for the byte no.
+- The number of blocks in the main memory is $\frac{2^{20}}{2^4}=2^{16}$. Thus **16 bits** are required for the main memory block no.
+- The main memory block no. is made up of the Tag and the cache memory block no. The number of blocks in the cache memory is $\frac{2^{16}}{2^4}=2^{12}$. Thus **12 bits** are required for the cache memory block no.
+- The remaining **4 bits** of the main memory block no. are for the block tag.
+
+Thus the layout is -
+
+$$
+\underbrace{\boxed{\strut \,\,\text{4 bits}\,\,}}_{\text{Tag bits}} \underbrace{\boxed{\strut \,\,\text{12 bits}\,\,}}_\text{c.m. block no.} \underbrace{\boxed{\strut \,\,\text{4 bits}\,\,}}_\text{byte no.}
$$

content/index.md

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ Includes:


---
-## 🧱 Computer Organization and Architecture
+## ⚙️ Computer Organization and Architecture
How a computer works.

Includes:
