
feat: pluggable index cache via CacheBackend trait#6222

Open
wjones127 wants to merge 12 commits into lance-format:main from wjones127:feat/pluggable-index-cache

Conversation


@wjones127 wjones127 commented Mar 18, 2026

Previously the Session's index cache was hardcoded to Moka. This adds a CacheBackend trait so users can provide their own cache implementation. There is a default MokaCacheBackend that works the same as the existing cache.

flowchart LR
    LanceCache --> backend["dyn CacheBackend"]
    backend --> MokaCacheBackend
    backend --> CustomCacheBackend

Cache key construction is handled at the LanceCache layer, so CacheBackend implementations receive opaque bytes as keys. They can optionally call parse_cache_key to recover a unique u64 type id, which a backend can use to decide how to downcast and serialize/deserialize the entry.

flowchart LR
    CacheKey --> key["&[u8]"]
    key --"parse_cache_key"--> typeid
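A minimal sketch of this key layout, assuming the type id is appended as eight little-endian bytes after a NUL separator (helper names mirror the PR description, but the exact signatures are assumptions, not the PR's actual code):

```rust
// Hypothetical helpers illustrating the opaque key layout:
// `user_key \0 <8-byte little-endian type id>`.
fn make_cache_key(user_key: &[u8], type_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(user_key.len() + 9);
    key.extend_from_slice(user_key);
    key.push(0); // NUL separator between user key and type tag
    key.extend_from_slice(&type_id.to_le_bytes());
    key
}

fn parse_cache_key(key: &[u8]) -> (&[u8], u64) {
    // The type tag is fixed-width at the end, so split off the last 9 bytes.
    if key.len() < 9 {
        return (&[], 0); // lenient fallback for malformed keys
    }
    let (user_key, tail) = key.split_at(key.len() - 9);
    let mut id = [0u8; 8];
    id.copy_from_slice(&tail[1..]);
    (user_key, u64::from_le_bytes(id))
}
```

Because the tag is fixed-width and at the end, user keys may themselves contain NUL bytes without ambiguity.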

The Session's index cache was hardcoded to use Moka. This adds a
CacheBackend trait so users can provide their own cache implementation
(e.g. Redis-backed, disk-backed, shared across processes).

Two-layer design:
- CacheBackend: object-safe async trait with opaque byte keys. This is
  what plugin authors implement (get, insert, invalidate_prefix, clear,
  num_entries, size_bytes).
- LanceCache: typed wrapper handling key construction (prefix + type
  tag), type-safe get/insert, DeepSizeOf size computation, hit/miss
  stats, and concurrent load deduplication.

MokaCacheBackend is the default, preserving existing behavior. Custom
backends are wired through Session::with_index_cache_backend() or
DatasetBuilder::with_index_cache_backend().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
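To make the two-layer design concrete, here is a guessed sketch of what an object-safe async CacheBackend could look like: async methods expressed as boxed futures so `dyn CacheBackend` is usable, with a toy in-memory backend standing in for MokaCacheBackend. Method names follow the PR description; the signatures, the `Entry` alias, and the mini executor are assumptions for illustration only.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;
type Entry = Arc<dyn Any + Send + Sync>;

/// Guessed shape of the object-safe trait from the PR description.
trait CacheBackend: Send + Sync {
    fn get(&self, key: &[u8]) -> BoxFuture<'_, Option<Entry>>;
    fn insert(&self, key: Vec<u8>, value: Entry, size_bytes: u64) -> BoxFuture<'_, ()>;
    fn invalidate_prefix(&self, prefix: &[u8]) -> BoxFuture<'_, ()>;
    fn clear(&self) -> BoxFuture<'_, ()>;
    fn num_entries(&self) -> u64;
}

/// Toy in-memory backend standing in for the default MokaCacheBackend.
#[derive(Default)]
struct MemBackend {
    map: Mutex<HashMap<Vec<u8>, Entry>>,
}

impl CacheBackend for MemBackend {
    fn get(&self, key: &[u8]) -> BoxFuture<'_, Option<Entry>> {
        let hit = self.map.lock().unwrap().get(key).cloned();
        Box::pin(async move { hit })
    }
    fn insert(&self, key: Vec<u8>, value: Entry, _size_bytes: u64) -> BoxFuture<'_, ()> {
        self.map.lock().unwrap().insert(key, value);
        Box::pin(async {})
    }
    fn invalidate_prefix(&self, prefix: &[u8]) -> BoxFuture<'_, ()> {
        self.map.lock().unwrap().retain(|k, _| !k.starts_with(prefix));
        Box::pin(async {})
    }
    fn clear(&self) -> BoxFuture<'_, ()> {
        self.map.lock().unwrap().clear();
        Box::pin(async {})
    }
    fn num_entries(&self) -> u64 {
        self.map.lock().unwrap().len() as u64
    }
}

/// Minimal executor, sufficient for the immediately-ready futures above.
fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
    fn clone(p: *const ()) -> RawWaker { RawWaker::new(p, &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}
```

Storing `Arc<dyn Any + Send + Sync>` entries keeps the trait object-safe while the typed LanceCache layer performs the downcast on the way out.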
@github-actions github-actions bot added the enhancement New feature or request label Mar 18, 2026
@github-actions

PR Review: feat: pluggable index cache via CacheBackend trait

P0: TOCTOU race in get_or_insert — dedup is broken

The check-then-register pattern in get_or_insert has a race between the two separately-locked critical sections:

// Block 1: check (drops lock at end of block)
{
    let map = self.in_flight.lock().await;
    if let Some(rx) = map.get(&cache_key) { ... }
}
// <-- gap: another task can also see an empty map here

// Block 2: register as leader (re-acquires lock)
let (tx, rx) = tokio::sync::watch::channel(None);
{
    let mut map = self.in_flight.lock().await;
    map.insert(cache_key.clone(), rx);
}

Between dropping the lock in block 1 and re-acquiring it in block 2, N tasks can all see an empty in-flight map and all decide they are the "leader." The last one to insert overwrites the earlier receivers, so:

  1. The loader runs N times (dedup completely defeated).
  2. Tasks that cloned an earlier receiver get RecvError when that sender drops without the value propagating through the map.

Fix: Merge the check and register into a single critical section:

let mut map = self.in_flight.lock().await;
if let Some(rx) = map.get(&cache_key) {
    let mut rx = rx.clone();
    drop(map);
    // ... wait for leader ...
} else {
    let (tx, rx) = tokio::sync::watch::channel(None);
    map.insert(cache_key.clone(), rx);
    drop(map);
    // ... run loader ...
}

The test_get_or_insert_dedup test passes by luck because tokio::task::yield_now() with a broadcast barrier doesn't reliably interleave into this gap. A test with a longer sleep or explicit synchronization at the gap point would expose this.
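The suggested fix can be illustrated with a blocking, thread-based analogue (std Mutex/Condvar standing in for tokio's async mutex and watch channel; all names are illustrative, not the PR's code). The essential property is that "check for an in-flight load" and "register as leader" happen under a single lock acquisition, leaving no gap for a second leader:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Per-key slot that followers wait on while the leader runs the loader.
#[derive(Default)]
struct Slot {
    value: Mutex<Option<u64>>,
    ready: Condvar,
}

struct SingleFlight {
    in_flight: Mutex<HashMap<String, Arc<Slot>>>,
}

impl SingleFlight {
    fn new() -> Self {
        Self { in_flight: Mutex::new(HashMap::new()) }
    }

    fn get_or_insert(&self, key: &str, loader: impl FnOnce() -> u64) -> u64 {
        // Single critical section: either observe an existing slot
        // (follower) or install one (leader). No gap in between.
        let (slot, leader) = {
            let mut map = self.in_flight.lock().unwrap();
            match map.get(key) {
                Some(slot) => (slot.clone(), false),
                None => {
                    let slot = Arc::new(Slot::default());
                    map.insert(key.to_string(), slot.clone());
                    (slot, true)
                }
            }
        };
        if leader {
            let v = loader();
            *slot.value.lock().unwrap() = Some(v);
            slot.ready.notify_all();
            self.in_flight.lock().unwrap().remove(key);
            v
        } else {
            // Follower: wait until the leader publishes the value.
            let mut guard = slot.value.lock().unwrap();
            while guard.is_none() {
                guard = slot.ready.wait(guard).unwrap();
            }
            guard.unwrap()
        }
    }
}
```

Followers hold their own `Arc<Slot>`, so the leader can safely remove the map entry once the value is published.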

P1: type_tag relies on unspecified pointer identity

fn type_tag<T: 'static>() -> [u8; 8] {
    (std::any::type_name::<T>().as_ptr() as u64).to_le_bytes()
}

The Rust spec does not guarantee that type_name::<T>().as_ptr() returns the same address across calls, nor that distinct types produce distinct addresses (string deduplication/interning is compiler-implementation-defined). The old code used TypeId as a key component, which has correct identity semantics.

If two different types happen to collide (same pointer or same string literal address), the cache will silently return a wrong type and the downcast::<T>().unwrap() will panic at runtime.

Consider using TypeId bytes instead. TypeId implements Hash and Eq, so you could hash it to get a stable discriminator, or use transmute to extract its bytes (it's currently 128 bits).
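A sketch of that suggestion, hashing TypeId down to eight bytes (hypothetical helper, not the PR's code):

```rust
use std::any::TypeId;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive the 8-byte discriminator by hashing TypeId, which has well-defined
// identity semantics, instead of taking the address of type_name()'s string
// data. The hash is only stable within one build, which is acceptable for
// an in-memory cache tag.
fn type_tag<T: 'static>() -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    TypeId::of::<T>().hash(&mut hasher);
    hasher.finish().to_le_bytes()
}
```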

P1: invalidate_prefix is fire-and-forget

pub fn invalidate_prefix(&self, prefix: &str) {
    let prefix_bytes = self.make_prefix(prefix);
    let cache = self.cache.clone();
    tokio::spawn(async move {
        cache.invalidate_prefix(&prefix_bytes).await;
    });
}

This means invalidation is not guaranteed to complete before subsequent cache reads. A caller that invalidates then immediately reads could get stale data. The old code was synchronous. If this must remain sync (non-async signature), at minimum document this caveat prominently, or consider changing the signature to async fn.

Minor notes

  • DeepSizeOf for LanceCache now returns 0, which is a silent behavioral regression for any code that relies on deep_size_of() for memory accounting. Consider at least delegating to approx_size_bytes().
  • WeakLanceCache::get_or_insert lost all dedup behavior — concurrent loads for the same key will all run the loader independently. The old code used moka's optionally_get_with which handled this.

wjones127 and others added 2 commits March 18, 2026 20:04
Add type_name()/type_id() to CacheKey and UnsizedCacheKey traits so
backends can identify the type of cached entries. Add parse_cache_key()
utility for backends to extract (user_key, type_id) from opaque key
bytes.

CacheKey-based methods now pipe the key's type_id through to the
backend. Non-CacheKey methods use type_id_of::<T>() as a sentinel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Remove #[cfg(test)] convenience methods; tests now use CacheKey
   via a TestKey helper, eliminating the parallel method hierarchy.

2. Fix dedup race condition: re-check the cache while holding the
   in-flight lock so no two tasks can both become leader for the
   same key.

3. Use Arc::try_unwrap on the leader error path to preserve the
   original error type when possible.

4. Make invalidate_prefix async instead of fire-and-forget spawn.

5. Replace type_name().as_ptr() with a hash of std::any::TypeId for
   stable type discrimination. Defined once in type_id_of() and used
   by CacheKey::type_id() default.

6. Add dedup to WeakLanceCache::get_or_insert, sharing the in-flight
   map from the parent LanceCache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address feedback:

1. Move get_or_insert() onto CacheBackend. The method takes a pinned
   future (not a closure), so LanceCache can type-erase the user's
   non-'static loader before passing it to the backend. Default impl
   does simple get-then-insert; MokaCacheBackend uses moka's built-in
   optionally_get_with for dedup. This eliminates duplicated dedup
   logic and the manual watch-channel machinery.

2. Restore type_name().as_ptr() for type_id derivation on CacheKey.
   Remove standalone type_id_of() function. The derivation lives in
   one place: CacheKey::type_id()/UnsizedCacheKey::type_id().

3. Remove approx_size_bytes from CacheBackend trait and Session debug
   output. Only approx_num_entries remains.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wjones127 wjones127 force-pushed the feat/pluggable-index-cache branch from 56a3273 to 00867ad Compare March 19, 2026 16:56
wjones127 and others added 3 commits March 19, 2026 10:05
Remove all methods that bypass CacheKey from WeakLanceCache (get,
insert, get_or_insert, get_unsized, insert_unsized). Remove
insert_unsized/get_unsized from LanceCache. Remove type_tag helper.
All cache access now goes through CacheKey/UnsizedCacheKey.

Make parse_cache_key return (empty, 0) instead of panicking on short
keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restore approx_size_bytes on CacheBackend so DeepSizeOf on LanceCache
reports actual cache memory usage (used by Session::size_bytes). Fixes
test_metadata_cache_size Python test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wjones127 wjones127 force-pushed the feat/pluggable-index-cache branch from 369afe2 to 1ba4ac3 Compare March 19, 2026 18:29
@wjones127 wjones127 marked this pull request as ready for review March 19, 2026 19:05
wjones127 and others added 5 commits March 19, 2026 17:07
The type_name().as_ptr() approach for type discrimination was unstable
across crate boundaries due to monomorphization. Replace with an
explicit fn type_id() -> &'static str that each CacheKey impl provides
as a short human-readable literal (e.g. 'Vec<IndexMetadata>', 'Manifest').

Key format changes from user_key\0<8 LE bytes> to user_key\0<type_id str>.
parse_cache_key() now returns (&[u8], &str).
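The revised layout and lenient parsing might look like this sketch (helper names mirror the commit message; signatures are assumptions):

```rust
// Hypothetical helpers for the string-tagged layout: `user_key \0 <type_id str>`.
fn make_cache_key(user_key: &[u8], type_id: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(user_key.len() + 1 + type_id.len());
    key.extend_from_slice(user_key);
    key.push(0);
    key.extend_from_slice(type_id.as_bytes());
    key
}

fn parse_cache_key(key: &[u8]) -> (&[u8], &str) {
    // Split at the last NUL so user keys containing NUL still round-trip;
    // malformed keys yield (empty, "") instead of panicking.
    match key.iter().rposition(|&b| b == 0) {
        Some(i) => (&key[..i], std::str::from_utf8(&key[i + 1..]).unwrap_or("")),
        None => (&[], ""),
    }
}
```

Unlike the fixed-width u64 tag, the string tag is variable-length, so parsing must scan for the separator rather than split at a known offset.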
Add IvfIndexState struct and serialization to lance-index, enabling
IVFIndex to export its reconstructable state (IVF model, quantizer
metadata) without non-serializable handles. Add reconstruct_vector_index()
which rebuilds an IVFIndex from cached state by re-opening FileReaders
(cheap with warm metadata cache) instead of re-fetching global buffers
from object storage.

Also adds IvfQuantizationStorage::from_cached() to skip global buffer
reads during reconstruction, and Session::file_metadata_cache() to
expose the metadata cache for the reconstruction context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reconstructed VectorIndex instances need the original cache key prefix
to share partition entries with the two-tier cache backend. Also adds
LanceCache::with_backend_and_prefix() and WeakLanceCache::prefix().

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>