You are responsible for running all experiments required for a research paper evaluating an RL-based cache eviction algorithm against classical eviction algorithms. Run the following experiments, collect the specified metrics, generate the tables, and prepare all plots. Use the current RL implementation as-is unless otherwise specified.
1. Baseline Performance Evaluation
Run these algorithms on identical workloads:
LRU
LFU (optional)
ARC (optional)
RL-Hybrid (current RL model)
For each, record:
Hit rate (%)
Total runtime (seconds)
Average per-request latency (microseconds)
Latency p50, p90, p99
Memory usage (MB)
Output:
A baseline comparison table covering all metrics.
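As a starting point, here is a minimal benchmark-harness sketch for the comparison above. It assumes each cache exposes a hypothetical access(key) -> bool method returning True on a hit; adapt the call to the actual LRU/LFU/ARC/RL-Hybrid interfaces.

    import time
    import tracemalloc
    import numpy as np

    def run_baseline(cache, workload):
        latencies_us = []
        hits = 0
        tracemalloc.start()
        t_start = time.perf_counter()
        for key in workload:
            t0 = time.perf_counter()
            if cache.access(key):      # hypothetical interface: True on hit
                hits += 1
            latencies_us.append((time.perf_counter() - t0) * 1e6)
        total_runtime = time.perf_counter() - t_start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        lat = np.array(latencies_us)
        return {
            "hit_rate_pct": 100.0 * hits / len(workload),
            "runtime_s": total_runtime,
            "latency_avg_us": lat.mean(),
            "latency_p50_us": np.percentile(lat, 50),
            "latency_p90_us": np.percentile(lat, 90),
            "latency_p99_us": np.percentile(lat, 99),
            "memory_mb": peak_bytes / 1e6,
        }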
2. Microbenchmark Breakdown of RL
Break down RL-Hybrid runtime into the following components:
Time to construct candidate states
Time for model inference (1 candidate and k candidates)
Time for choosing eviction
Time for updating recency/frequency metadata
Output:
A table comparing each operation vs LRU’s equivalent O(1) operations.
A bar chart showing time distribution per component.
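One way to collect this breakdown is a shared timer wrapper around each step of the RL eviction path. A sketch follows; the step names (build_candidate_states, model.infer, pick_eviction, update_metadata) are hypothetical stand-ins for whatever the current implementation calls them.

    import time
    from collections import defaultdict

    timers = defaultdict(float)   # accumulated wall time per component

    def timed(name, fn, *args):
        t0 = time.perf_counter()
        out = fn(*args)
        timers[name] += time.perf_counter() - t0
        return out

    # Inside the RL eviction path (per miss), roughly:
    #   states = timed("candidate_states", build_candidate_states, cache, k)
    #   scores = timed("inference", model.infer, states)  # time k=1 and k candidates separately
    #   victim = timed("choose_eviction", pick_eviction, scores)
    #   timed("metadata_update", update_metadata, cache, key)
    #
    # Afterwards, the bar chart is e.g.:
    #   import matplotlib.pyplot as plt
    #   plt.bar(list(timers), list(timers.values())); plt.savefig("rl_breakdown.png")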
3. Sensitivity Analysis
3.1 Cache Size Variation
Run all algorithms on caches of size:
10, 20, 30, 50, 100, 200
Record for each:
Hit rate
Runtime
RL overhead vs LRU (%)
3.2 Candidate Size Variation
For RL, run with:
k = 4, 8, 16, 32
Record:
Hit rate
Runtime
Scaling of inference time
3.3 Workload Distribution Variation
Test the following workloads:
Zipf distributions with α = 0.5, 0.8, 1.0, 1.2, 1.5
Uniform random
Gaussian phase workloads
Bursty workloads
Any real-world trace if available
For each workload, record:
Hit rate
Runtime
RL slowdown factor vs LRU
Output:
Sensitivity tables for cache size, k, and workloads.
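A sketch of the workload generators and the sweep loop, reusing the run_baseline harness from section 1. np.random.zipf only supports α > 1, so the bounded Zipf over a fixed key universe is sampled explicitly; CACHE_FACTORIES is a hypothetical name-to-constructor registry.

    import numpy as np

    def zipf_workload(alpha, num_keys, n_requests, seed=0):
        rng = np.random.default_rng(seed)
        ranks = np.arange(1, num_keys + 1, dtype=float)
        probs = ranks ** -alpha
        probs /= probs.sum()
        return rng.choice(num_keys, size=n_requests, p=probs)

    def uniform_workload(num_keys, n_requests, seed=0):
        return np.random.default_rng(seed).integers(0, num_keys, n_requests)

    results = []
    for cache_size in (10, 20, 30, 50, 100, 200):
        for alpha in (0.5, 0.8, 1.0, 1.2, 1.5):
            workload = zipf_workload(alpha, num_keys=1000, n_requests=100_000)
            for name, make_cache in CACHE_FACTORIES.items():  # hypothetical registry
                stats = run_baseline(make_cache(cache_size), workload)
                results.append({"algo": name, "size": cache_size,
                                "alpha": alpha, **stats})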
4. Workload Behavior Tests
4.1 Long-Term Patterns
Use periodic and phase-shift workloads to test adaptation.
Record:
RL hit rate trend over time
Whether RL adapts or collapses after each phase shift
4.2 Cold-Start Behavior
Record:
Hit rate during first 5k accesses
RL stabilization time
4.3 Adversarial Workloads
Run workloads known to break classical heuristics and record RL vs LRU performance.
Output:
One table summarizing behavioral responses for each workload type.
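A possible phase-shift workload generator for the adaptation test: the hot set jumps to a disjoint key range every phase_len requests, which invalidates stale frequency counts and stresses RL adaptation. The 90/10 hot/noise split is an assumption to be tuned.

    import numpy as np

    def phase_shift_workload(n_phases, phase_len, hot_keys, num_keys, seed=0):
        rng = np.random.default_rng(seed)
        reqs = []
        for p in range(n_phases):
            # Hot set shifts to a fresh key range each phase.
            hot = (np.arange(hot_keys) + p * hot_keys) % num_keys
            hot_part = rng.choice(hot, size=int(0.9 * phase_len))
            noise = rng.integers(0, num_keys, size=phase_len - len(hot_part))
            phase = np.concatenate([hot_part, noise])
            rng.shuffle(phase)
            reqs.append(phase)
        return np.concatenate(reqs)

Tracking the hit rate over a rolling window (e.g. 1k requests) across the resulting trace gives the adapt-or-collapse trend directly.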
5. RL Model Internal Behavior Analysis
5.1 Weight Inspection
Extract RL model weights.
Determine:
Relative significance of recency vs frequency vs rank.
Whether RL behaves like LRU or LFU.
5.2 Q-Value Analysis
Collect:
Histogram of Q-values
Q-value variance
Q-value collapse indicators
5.3 Training Dynamics
Record:
Loss curve over training
Hit rate during training
5.4 Feature Correlation
Compute correlation between:
RL predicted scores
True LRU recency ranks
Output:
Weight inspection table
Correlation table
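A sketch of the introspection utilities, assuming the RL model is a small PyTorch MLP over a [recency, frequency, rank] state vector; all names here are hypothetical.

    import numpy as np
    import torch
    from scipy.stats import spearmanr

    def first_layer_importance(model, features=("recency", "frequency", "rank")):
        # Mean absolute first-layer weight per input feature is a crude
        # proxy for how much each feature drives the Q-value.
        w = next(model.parameters()).detach().abs().mean(dim=0).numpy()
        return dict(zip(features, w))

    def q_value_stats(model, states):
        with torch.no_grad():
            q = model(torch.as_tensor(states, dtype=torch.float32)).numpy().ravel()
        return {"mean": q.mean(), "var": q.var(),
                "collapsed": q.std() < 1e-4}  # near-constant Q-values

    # Feature correlation (5.4), Spearman rank correlation:
    #   rho, _ = spearmanr(rl_scores, lru_ranks)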
6. Ablation Studies
Run the following RL variants:
6.1 Remove Frequency Feature
State = [recency, rank]
6.2 Remove Rank Feature
State = [recency, frequency]
6.3 Linear Model Instead of MLP
Replace MLP with single linear layer.
6.4 Smaller Networks
Test architectures:
32-32-1
16-16-1
8-8-1
6.5 Pure Weighted Heuristic
Use:
score = α*recency + β*frequency + γ*rank
(no RL)
Record for each:
Hit rate
Runtime
Parameter count
Output:
Ablation table comparing all variants.
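The weighted heuristic of 6.5 needs no learning at all. A minimal sketch, assuming all three features are normalized to [0, 1] so the weights are comparable:

    def heuristic_victim(items, alpha, beta, gamma):
        # items: iterable of (key, recency, frequency, rank) tuples.
        # Evict the cached item with the lowest combined score.
        def score(item):
            _, recency, frequency, rank = item
            return alpha * recency + beta * frequency + gamma * rank
        return min(items, key=score)[0]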
7. Statistical Significance
Run every experiment 10 times
For each:
Compute mean
Standard deviation
95% confidence intervals
Perform t-tests comparing:
LRU vs RL hit rate
LRU vs RL latency
Output:
Statistical significance table
Boxplots or error-bar graphs
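A sketch of the per-metric summary and comparison, using Welch's t-test (no equal-variance assumption) and a t-based 95% confidence interval, which is appropriate at n = 10:

    import numpy as np
    from scipy import stats

    def summarize(samples):
        a = np.asarray(samples, dtype=float)
        mean, sd = a.mean(), a.std(ddof=1)
        half = stats.t.ppf(0.975, len(a) - 1) * sd / np.sqrt(len(a))
        return {"mean": mean, "std": sd, "ci95": (mean - half, mean + half)}

    def compare(lru_samples, rl_samples):
        t, p = stats.ttest_ind(lru_samples, rl_samples, equal_var=False)
        return {"t": t, "p": p}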
8. Required Plots
You must generate all the following:
Hit rate vs cache size (line graph)
Runtime vs cache size
Runtime vs k
Hit rate vs Zipf α
Q-value distribution histogram
Training loss curve
Feature weight bar chart
Latency CDF plot (LRU vs RL)
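A sketch of the latency CDF plot, assuming the per-request latencies (in microseconds) collected by the section 1 harness:

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_latency_cdf(series_by_algo):
        # series_by_algo: {"LRU": [...], "RL-Hybrid": [...]} latency lists
        for name, lat_us in series_by_algo.items():
            xs = np.sort(np.asarray(lat_us))
            ys = np.arange(1, len(xs) + 1) / len(xs)
            plt.plot(xs, ys, label=name)
        plt.xscale("log")
        plt.xlabel("per-request latency (us)")
        plt.ylabel("fraction of requests")
        plt.legend()
        plt.savefig("latency_cdf.png", dpi=200)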
9. Final Summary Tables
9.1 Performance Summary Table
Columns:
Algorithm | Hit Rate | Runtime | p99 Latency | Memory Usage | Cache Size
9.2 Sensitivity Table
Cache Size | k | LRU Hit | RL Hit | RL Latency | Notes
9.3 Workload Behavior Table
Workload Type | LRU Hit | RL Hit | RL Slowdown | Behavior Summary
9.4 Ablation Table
Variant | Hit Rate | Runtime | Params | Notes
9.5 Model Explanation Table
Feature | Weight | Importance | Interpretation
10. Values to Extract for Paper
You must extract:
Hit rate
Runtime
Latency percentiles
Q-value variance
Model parameter count
Inference time (microseconds)
Correlation with LRU ranking
Warm-up time
Memory overhead
Scaling factors