content/OS/CPU/CPU Cache.md: 10 additions & 22 deletions
@@ -6,7 +6,7 @@ Author Profile:
 tags:
 - OS
 Creation Date: 2023-07-14T20:41:40+08:00
-Last Date: 2024-11-06T16:27:29+08:00
+Last Date: 2024-11-09T10:50:16+08:00
 References:
 ---
 ## Abstract
@@ -26,28 +26,16 @@ References:
 >[!important] Spatial locality
 > A cache line typically contains one or more [[Computer Data Representation#Word|words]]. When the CPU fetches data from memory, it retrieves an entire cache line, not just the specific bytes needed immediately. This takes advantage of [[Cache Locality#Spacial Locality|spatial locality]].
 - In the above example, the [[CPU Cache]] has $2^{10}$ [[#Cache Line|cache lines]], each containing $2^2$ words, and each [[Computer Data Representation#Word|word]] is $2^2$ bytes
-- Each cache line is indexed with a **cache index**. This allows a CPU cache with limited storage to cover the entire main memory because **multiple physical addresses can map to the same cache line**. However, this mapping also means that multiple physical addresses share the same cache line. To **distinguish between these different addresses**, each cache line includes a **cache tag** that **identifies the specific physical address** currently stored in that line
-
->[!question] How is cache line updated?
-> ![[cpu_cache_cache_line.png|600]]
+>[!question] How big should a cache line be?
+> ![[cache_line_size.png]]
 >
-> 1. We first use the **cache index to locate the cache line**
-> 2. We use the **valid bit** to check if the cache line contains data. If it does, and the **tag matches the given address**, we can select the word needed using the **word offset** with a help of a [[Multiplexer]]
-> 3. Otherwise, there is a cache miss.
-
+> The **larger the cache line**, the better we can **take advantage of spatial locality**, since more of the surrounding data is cached in the CPU cache.
+>
+> However, this brings a **larger miss penalty**, as it **takes longer to transfer** one cache line to the CPU cache.
+>
+> Furthermore, the CPU cache has a **very limited size**. The larger the cache line, the **fewer cache lines** fit in the CPU cache. Consequently, the cached data is concentrated in fewer regions of memory, and the **miss rate increases**.
+>
+> Therefore, we need to find a **sweet spot in the cache line size** that **exploits spatial locality** while keeping the **miss penalty and miss rate** low.
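The trade-off above can be illustrated with a small simulation. This is a sketch that is not part of the original note: `miss_rate` is a hypothetical helper that models a direct-mapped cache of fixed capacity and counts misses for a trace of byte addresses. The 64-byte capacity, the line sizes, and both traces are assumptions chosen for illustration, and the miss penalty is not modelled.

```python
def miss_rate(addresses, capacity_bytes, line_bytes):
    """Miss rate of a direct-mapped cache with the given capacity and line size."""
    num_lines = capacity_bytes // line_bytes
    cached_block = [None] * num_lines      # block number held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes         # which memory block the byte belongs to
        index = block % num_lines          # cache line that block maps to
        if cached_block[index] != block:   # empty line or a different block: miss
            misses += 1
            cached_block[index] = block
    return misses / len(addresses)


# Assumed traces: one walks memory word by word, the other
# keeps revisiting four locations that are far apart.
sequential = [4 * i for i in range(256)]
scattered = [0, 1040, 2080, 3120] * 16

for line_bytes in (4, 16, 64):
    print(line_bytes,
          miss_rate(sequential, 64, line_bytes),
          miss_rate(scattered, 64, line_bytes))
```

With these assumed traces, the sequential miss rate falls from 1.0 to 0.0625 as the line grows from 4 to 64 bytes, while the scattered miss rate jumps from 0.0625 to 1.0 at 64 bytes, because only a single cache line remains and the four locations keep evicting each other.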
content/OS/CPU/Cache Miss.md: 4 additions & 1 deletion
@@ -6,7 +6,7 @@ Author Profile:
 tags:
 - computer_organisation
 Creation Date: 2024-11-06, 16:30
-Last Date: 2024-11-06T17:42:43+08:00
+Last Date: 2024-11-09T11:39:02+08:00
 References:
 draft:
 description:
@@ -24,6 +24,9 @@ description:
 - Also known as **collision miss** or **interference miss**
 - When multiple pieces of data map to the same [[CPU Cache#Cache Line]]
 
+>[!important]
+> This can be reduced with a [[Set Associative Cache]]. As a rule of thumb, a [[Direct Mapped Cache]] of size $N$ has about the same miss rate as a [[Set Associative Cache|2-way set associative cache]] of size $N/2$.
+
 ### Capacity Miss
 - When data is evicted from the [[CPU Cache]] because the CPU cache has run out of space
- In the above example, the [[CPU Cache]] has $2^{10}$ [[#Cache Line|cache lines]], each containing $2^2$ words, and each [[Computer Data Representation#Word|word]] is $2^2$ bytes
+- Each cache line is indexed with a **cache index**. Because **multiple physical addresses can map to the same cache line**, a CPU cache with limited storage can cover the entire main memory. To **distinguish between these different addresses**, each cache line includes a **cache tag** that **identifies the specific physical address** currently stored in that line
+
+>[!question] How is data read?
+> ![[cpu_cache_cache_line.png|600]]
+>
+> 1. We first use the **cache index to locate the cache line**
+> 2. We use the **valid bit** to check if the cache line contains data. If it does, and the **tag matches the tag of the given address**, we can select the word needed using the **word offset** with the help of a [[Multiplexer]]
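The read steps above can be sketched in Python. This is an illustrative model, not the hardware circuit: the bit widths follow the example ($2^{10}$ cache lines, $2^2$ words per line, $2^2$ bytes per word), while `split_address`, `DirectMappedCache`, and its `fill` method are hypothetical names introduced here.

```python
BYTE_OFFSET_BITS = 2   # 2^2 bytes per word
WORD_OFFSET_BITS = 2   # 2^2 words per cache line
INDEX_BITS = 10        # 2^10 cache lines


def split_address(addr):
    """Split a byte address into (tag, index, word_offset, byte_offset)."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    addr >>= BYTE_OFFSET_BITS
    word_offset = addr & ((1 << WORD_OFFSET_BITS) - 1)
    addr >>= WORD_OFFSET_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS
    return tag, index, word_offset, byte_offset


class DirectMappedCache:
    def __init__(self):
        num_lines = 1 << INDEX_BITS
        self.valid = [False] * num_lines
        self.tags = [0] * num_lines
        self.lines = [[0] * (1 << WORD_OFFSET_BITS) for _ in range(num_lines)]

    def read(self, addr):
        """Return the cached word on a hit, or None on a miss."""
        tag, index, word_offset, _ = split_address(addr)
        # Step 1: the cache index selects exactly one cache line.
        # Step 2: the line must be valid and its tag must match.
        if self.valid[index] and self.tags[index] == tag:
            # The word offset plays the role of the multiplexer select input.
            return self.lines[index][word_offset]
        return None

    def fill(self, addr, words):
        """Load a whole line (hypothetical helper used to set up the example)."""
        tag, index, _, _ = split_address(addr)
        self.valid[index] = True
        self.tags[index] = tag
        self.lines[index] = list(words)
```

Two addresses that differ only in their tag bits select the same cache line, so the tag comparison is what tells them apart.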
- One way to design a [[CPU Cache|CPU cache]] is to have it consist of a number of sets, each containing $n$ [[CPU Cache#Cache Line|cache lines]]. Within a set, a memory block can be placed in any of the $n$ cache lines.
+- In the above example, we have a 2-way set associative cache. The [[CPU Cache]] has $2^{1}$ sets, each containing $2$ cache lines. Each cache line contains $2^{1}$ words, and each [[Computer Data Representation#Word|word]] is $2^2$ bytes
+
+>[!question] What is the benefit?
+> ![[2-way_set_associative_cache.png]]
+>
+> Set associative caches reduce the likelihood of [[Cache Miss#Conflict Miss|conflict misses]] compared to [[Direct Mapped Cache|direct-mapped caches]]. In a direct-mapped cache, if two **frequently accessed memory locations map to the same cache index, they will constantly evict each other,** causing repeated conflict misses. A set associative cache provides multiple cache lines within each set, allowing these memory locations to **coexist in the cache simultaneously**, minimising conflict misses and improving performance.
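The eviction ping-pong described above can be reproduced with a small miss counter. This is a sketch under stated assumptions: `count_misses` is a hypothetical helper, the trace works at cache-line granularity, and LRU replacement is assumed because the note does not name a replacement policy. Both configurations hold four cache lines in total.

```python
from collections import OrderedDict


def count_misses(line_addresses, num_sets, ways):
    """Count misses for a trace of line addresses, assuming LRU replacement."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for line in line_addresses:
        index = line % num_sets        # set the line maps to
        tag = line // num_sets         # remaining bits identify the line
        cached = sets[index]
        if tag in cached:
            cached.move_to_end(tag)    # mark as most recently used
        else:
            misses += 1
            if len(cached) == ways:
                cached.popitem(last=False)   # evict the least recently used
            cached[tag] = True
    return misses


# Assumed trace: two lines that collide in both configurations.
trace = [0, 4] * 8

direct = count_misses(trace, num_sets=4, ways=1)    # direct-mapped, 4 lines
two_way = count_misses(trace, num_sets=2, ways=2)   # 2-way, 2 sets
print(direct, two_way)  # prints "16 2"
```

In the direct-mapped configuration every access misses, whereas the 2-way configuration only takes the two initial misses because both lines fit in the same set.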
+
+
+>[!question] How is data read?
+> ![[set_associative_cache_read_circuit.png|600]]
+>
+> 1. We first use the **set index to locate the set**
+> 2. We simultaneously compare the **valid bits** and **tags** of all cache lines in the set. If one of the cache lines is valid and its **tag matches the tag of the given address**, we can select the word needed with the help of a [[Multiplexer]]
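The set associative read can be modelled along the same lines. This is a minimal sketch, not the circuit itself: the bit widths follow the 2-way example ($2^{1}$ sets, $2$ cache lines per set, $2^{1}$ words per line, $2^2$ bytes per word), and `split_address`, `SetAssociativeCache`, and `fill` are hypothetical names. `fill` takes an explicit way index because the note does not specify a replacement policy.

```python
BYTE_OFFSET_BITS = 2   # 2^2 bytes per word
WORD_OFFSET_BITS = 1   # 2^1 words per cache line
SET_INDEX_BITS = 1     # 2^1 sets
WAYS = 2               # 2 cache lines per set


def split_address(addr):
    """Split a byte address into (tag, set_index, word_offset)."""
    addr >>= BYTE_OFFSET_BITS
    word_offset = addr & ((1 << WORD_OFFSET_BITS) - 1)
    addr >>= WORD_OFFSET_BITS
    set_index = addr & ((1 << SET_INDEX_BITS) - 1)
    tag = addr >> SET_INDEX_BITS
    return tag, set_index, word_offset


class SetAssociativeCache:
    def __init__(self):
        self.sets = [
            [{"valid": False, "tag": 0, "words": [0] * (1 << WORD_OFFSET_BITS)}
             for _ in range(WAYS)]
            for _ in range(1 << SET_INDEX_BITS)
        ]

    def read(self, addr):
        """Return the cached word on a hit, or None on a miss."""
        tag, set_index, word_offset = split_address(addr)
        # Hardware compares every way in parallel; this loop does it in sequence.
        for way in self.sets[set_index]:
            if way["valid"] and way["tag"] == tag:
                return way["words"][word_offset]
        return None

    def fill(self, addr, words, way_index):
        """Load a line into a chosen way (hypothetical helper for the example)."""
        tag, set_index, _ = split_address(addr)
        self.sets[set_index][way_index] = {
            "valid": True, "tag": tag, "words": list(words)}


cache = SetAssociativeCache()
cache.fill(0x00, [11, 12], way_index=0)   # tag 0, set 0
cache.fill(0x10, [21, 22], way_index=1)   # tag 1, set 0
print(cache.read(0x04), cache.read(0x10), cache.read(0x08))  # prints "12 21 None"
```

Addresses `0x00` and `0x10` map to the same set but live in different ways, so both stay readable; `0x08` maps to the other set, which is still empty.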