perf(store): lazy CacheMultiStore — defer cachekv alloc until access #2806
Conversation
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
Codecov Report: ❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2806 +/- ##
===========================================
+ Coverage 46.28% 56.79% +10.51%
===========================================
Files 2005 2070 +65
Lines 163018 168396 +5378
===========================================
+ Hits 75449 95640 +20191
+ Misses 81047 64263 -16784
- Partials 6522 8493 +1971
Flags with carried forward coverage won't be shown.
```go
for k, parent := range cms.storeParents {
	if _, exists := cms.stores[k]; !exists {
		cms.stores[k] = handler(k, parent.(types.KVStore))
	}
}
```
Check warning (Code scanning / CodeQL): Iteration over map
```go
for k := range cms.storeParents {
	delete(cms.storeParents, k)
}
```
Check warning (Code scanning / CodeQL): Iteration over map
```go
for k, parent := range cms.gigaStoreParents {
	if _, exists := cms.gigaStores[k]; !exists {
		cms.gigaStores[k] = handler(k, parent)
	}
}
```
Check warning (Code scanning / CodeQL): Iteration over map
```go
for k := range cms.gigaStoreParents {
	delete(cms.gigaStoreParents, k)
}
```
Check warning (Code scanning / CodeQL): Iteration over map
```go
for k := range cms.storeParents {
	cms.ensureStoreLocked(k)
}
```
Check warning (Code scanning / CodeQL): Iteration over map
```go
for k := range cms.gigaStoreParents {
	cms.ensureGigaStoreLocked(k)
}
```
Check warning (Code scanning / CodeQL): Iteration over map
Force-pushed from d52d4a1 to cc0301f
Force-pushed from 7abf219 to 6ac2e41
…l access

CacheMultiStore previously created cachekv.NewStore wrappers eagerly for all ~21 module stores on every snapshot. This was wasteful because:

1. The OCC scheduler's prepareTask creates a CMS then immediately replaces all stores with VersionIndexedStores via SetKVStores — the cachekv stores allocated in CMS were discarded unused.
2. EVM transactions only access ~3 of 21 modules, so ~85% of cachekv stores per statedb snapshot were never touched.

This change defers cachekv.NewStore allocation by storing parent references in storeParents/gigaStoreParents maps and materializing cachekv wrappers lazily on first GetKVStore/GetStore/GetGigaKVStore access.

Benchmark results (M4 Max, 1000 EVM transfers/block):
- TPS: 8,000-8,400 → 8,000-8,800 (~5% median uplift)
- cachekv.NewStore allocations: -20,614 MB (-360M objects)
- cachekv.Write allocations: -8,058 MB
- memclrNoHeapPointers CPU: -9.94s
- mallocgc CPU: -7.23s cumulative

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
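A minimal sketch of that scheme, using simplified stand-in types (`StoreKey`, `KVStore`, `newCacheKVStore`); the real store also tracks traces, listeners, and giga stores, so this is illustrative rather than the actual sei-cosmos code:

```go
package cachemulti

// Simplified stand-ins for the store interfaces used in this sketch.
type StoreKey interface{ Name() string }

type KVStore interface {
	Get(key []byte) []byte
	Set(key, value []byte)
}

// newCacheKVStore stands in for cachekv.NewStore; to keep the sketch
// self-contained it just returns the parent instead of wrapping it.
func newCacheKVStore(parent KVStore) KVStore { return parent }

// Store is a pared-down CacheMultiStore: materialized cachekv wrappers
// live in stores, raw parents that have not been wrapped yet live in
// storeParents.
type Store struct {
	stores       map[StoreKey]KVStore
	storeParents map[StoreKey]KVStore
}

// newFromParents records parent references instead of eagerly calling
// cachekv.NewStore for every module store.
func newFromParents(parents map[StoreKey]KVStore) *Store {
	return &Store{
		stores:       make(map[StoreKey]KVStore, len(parents)),
		storeParents: parents,
	}
}

// ensureStore materializes the cachekv wrapper for key on first access.
func (s *Store) ensureStore(key StoreKey) {
	if _, ok := s.stores[key]; ok {
		return
	}
	if parent, ok := s.storeParents[key]; ok {
		s.stores[key] = newCacheKVStore(parent)
	}
}

// GetKVStore materializes lazily, so modules a transaction never touches
// never pay for a cachekv allocation.
func (s *Store) GetKVStore(key StoreKey) KVStore {
	s.ensureStore(key)
	return s.stores[key]
}
```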
When CacheMultiStore branches (CacheContext, CacheTxContext), lazy parents were passed directly to the child CMS. This caused child.Write() to write directly to the raw commit store, bypassing the parent's caching layer — leading to state leaks (e.g., Simulate leaking account sequence increments to deliverState).

Fix: force-materialize all lazy parents on the parent CMS before creating the child branch. This preserves proper cachekv isolation while keeping lazy init for the OCC hot path (SetKVStores).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
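A sketch of that fix on the same simplified Store (extending the snippet in the previous commit message); `CacheMultiStore` here stands in for the CacheContext/CacheTxContext branching path:

```go
// materializeAll forces every lazy parent into a cachekv wrapper so a
// child branch caches on top of this store's wrappers rather than the
// raw commit store underneath.
func (s *Store) materializeAll() {
	for k := range s.storeParents {
		s.ensureStore(k)
	}
}

// CacheMultiStore branches this store. Without materializeAll, a child
// built on raw parents would Write() straight through to the commit
// store, e.g. Simulate leaking account sequence increments.
func (s *Store) CacheMultiStore() *Store {
	s.materializeAll()
	parents := make(map[StoreKey]KVStore, len(s.stores))
	for k, st := range s.stores {
		parents[k] = st // the child's parents are our cachekv wrappers
	}
	return newFromParents(parents)
}
```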
The lazy ensureStore/ensureGigaStore methods write to the stores and storeParents maps on first access. When multiple goroutines share a CMS (e.g., slashing BeginBlocker's HandleValidatorSignatureConcurrent), concurrent calls to GetKVStore race on these map operations.

Add *sync.RWMutex with double-checked locking: the fast path (store already materialized) takes only an RLock; the slow path (first access) takes a full Lock. After all stores are materialized, every call hits the RLock-only fast path with ~10-20ns overhead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
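A sketch of that locking scheme, reusing `StoreKey`, `KVStore`, and `newCacheKVStore` from the earlier snippet and assuming `import "sync"`; field and method names are illustrative:

```go
// lockedStore guards the lazy maps with an RWMutex so concurrent
// GetKVStore calls do not race on first-access materialization.
type lockedStore struct {
	mtx          sync.RWMutex
	stores       map[StoreKey]KVStore
	storeParents map[StoreKey]KVStore
}

func (s *lockedStore) GetKVStore(key StoreKey) KVStore {
	// Fast path: once the store is materialized, only an RLock is taken.
	s.mtx.RLock()
	st, ok := s.stores[key]
	s.mtx.RUnlock()
	if ok {
		return st
	}

	// Slow path: take the write lock and re-check, since another
	// goroutine may have materialized the store in the meantime.
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if st, ok := s.stores[key]; ok {
		return st
	}
	st = newCacheKVStore(s.storeParents[key])
	s.stores[key] = st
	return st
}
```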
Force-pushed from cc0301f to 8755c45
Force-pushed from f8b8fc9 to 12550b7
Summary
- Defer `cachekv.NewStore` allocation in `CacheMultiStore` until the store is actually accessed via `GetKVStore`/`GetStore`/`GetGigaKVStore`
- Keep parent references in `storeParents`/`gigaStoreParents` maps; materialize cachekv wrappers on-demand via `ensureStore()`/`ensureGigaStore()`
- Worth doing because OCC's `prepareTask` discards them immediately, and EVM txs only touch ~3 of 21 modules

Problem
After lazy-init sortedCache (#2804), the next largest allocator was `cachekv.NewStore` itself — still called eagerly for all ~21 module stores per `CacheMultiStore`, even though EVM transactions only access ~3 modules. OCC's `prepareTask` creates a CMS then immediately replaces stores with VersionIndexedStores, discarding the cachekv wrappers. Pure waste.

Profiling after #2804 (30s, 1000 EVM transfers/block):
- `cachekv.NewStore`
- `cachekv.Write` (re-creates MemDB)
- `cachemulti.newCacheMultiStoreFromCMS`
- `memclrNoHeapPointers`

Changes
`sei-cosmos/store/cachemulti/store.go`:
- `newCacheMultiStoreFromCMS`: store parent KVStore references in `storeParents`/`gigaStoreParents` maps instead of eagerly wrapping with `cachekv.NewStore`
- `ensureStore()`/`ensureGigaStore()`: lazy materializers that create the cachekv wrapper on first access
- `GetKVStore`/`GetStore`/`GetGigaKVStore`: call `ensureStore()`/`ensureGigaStore()` before returning
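For the OCC hot path, the win is that `prepareTask`'s wholesale store replacement never materializes anything. A sketch of what that looks like on the simplified Store from the commit-message snippets above, assuming (as the flagged delete loops suggest) that `SetKVStores` swaps in the VersionIndexedStores and clears the lazy-parent maps; the body here is illustrative, not the actual sei-cosmos method:

```go
// SetKVStores replaces every store wholesale (OCC's prepareTask wraps
// them in VersionIndexedStores). Because nothing has been materialized
// yet, the cachekv wrappers that the eager design allocated and then
// discarded are simply never created.
func (s *Store) SetKVStores(stores map[StoreKey]KVStore) *Store {
	s.stores = stores
	for k := range s.storeParents {
		delete(s.storeParents, k) // drop lazy parents; they were never wrapped
	}
	return s
}
```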
Benchmark Results (M4 Max, 1000 EVM transfers/block, 30s profile)

- `cachekv.NewStore` alloc
- `cachekv.Write` alloc
- `memclrNoHeapPointers` CPU
- `mallocgc` CPU (cum)
- `runtime.(*mspan).heapBitsSmallForAddr` CPU

Note: total CPU samples increase because workers complete faster → more goroutines actively executing instead of blocked on allocator contention.
`pprof -top -diff_base` highlights (alloc_space)

`pprof -top -diff_base` highlights (CPU)

Test plan
- `cd giga/tests && go test ./... -count=1`
- `cd sei-cosmos/store/cachemulti && go test ./... -count=1`
- `gofmt -s -l` clean

🤖 Generated with Claude Code