## Problem

Different attack counts across runs (34/14/12) make comparisons misleading.

## Tasks

- [ ] Create canonical attack subsets (e.g., core-14, full-61)
- [ ] Add --attack-set flag to CLI
- [ ] Warn when comparing runs with different attack sets
- [ ] Add comparison tooling that checks alignment

## Acceptance Criteria

- CLI prevents accidental apples-to-oranges comparisons
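
A minimal sketch of how the `--attack-set` flag and the alignment check could fit together, assuming a Python CLI built on `argparse`. Everything here is hypothetical: the `ATTACK_SETS` registry, the placeholder attack names, the `evaluate` program name, the run-record layout (`attack_set` and `attacks` keys), and the `check_alignment` helper are illustrations, not the project's existing API.

```python
import argparse

# Hypothetical registry of canonical subsets. The attack names are
# placeholders; the real sets would list the project's actual attacks.
ATTACK_SETS = {
    "core-14": frozenset(f"attack_{i:02d}" for i in range(14)),
    "full-61": frozenset(f"attack_{i:02d}" for i in range(61)),
}


def build_parser():
    """Build a parser that only accepts known attack-set names."""
    parser = argparse.ArgumentParser(prog="evaluate")  # assumed program name
    parser.add_argument(
        "--attack-set",
        choices=sorted(ATTACK_SETS),
        default="core-14",
        help="canonical attack subset to run",
    )
    return parser


def check_alignment(run_a, run_b):
    """Return a list of problems; an empty list means the runs are comparable.

    Each run is assumed to be a dict with an "attack_set" name and an
    "attacks" list recording what was actually executed.
    """
    problems = []
    if run_a["attack_set"] != run_b["attack_set"]:
        problems.append(
            f"attack sets differ: {run_a['attack_set']} vs {run_b['attack_set']}"
        )
    only_a = set(run_a["attacks"]) - set(run_b["attacks"])
    only_b = set(run_b["attacks"]) - set(run_a["attacks"])
    if only_a or only_b:
        problems.append(
            f"{len(only_a)} attack(s) only in first run, "
            f"{len(only_b)} only in second"
        )
    return problems


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Running {len(ATTACK_SETS[args.attack_set])} attacks ({args.attack_set})")
```

Comparing the recorded attack lists as well as the set names catches runs that share a label but executed different attacks, for example when a set definition changed between runs. A comparison command could print the returned problems as warnings, or exit non-zero to meet the acceptance criterion.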