feat: add serialize/deserialize for IVF PQ partition cache entries #6223
wjones127 wants to merge 2 commits into lance-format:main
Add zero-copy serialization for PartitionEntry<S, ProductQuantizer> used in IVF_PQ V3 indices. Uses Arrow IPC FileDecoder so deserialized arrays reference the original Bytes buffer directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
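To illustrate what "reference the original buffer directly" means, here is a minimal std-only sketch. It is not the PR's code: `SectionView` is a hypothetical type, and `Arc<[u8]>` stands in for the reference-counted `Bytes` buffer. The point is only that a decoded section can hold a shared handle plus a byte range instead of its own copy of the data.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical view over one section of a shared buffer.
/// Holds a reference-counted handle to the whole buffer plus a byte range,
/// so creating a view never copies the section's bytes.
#[derive(Clone)]
struct SectionView {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SectionView {
    /// Returns `None` if the range is inverted or runs past the buffer.
    fn new(buf: &Arc<[u8]>, range: Range<usize>) -> Option<SectionView> {
        if range.start <= range.end && range.end <= buf.len() {
            Some(SectionView { buf: Arc::clone(buf), range })
        } else {
            None
        }
    }

    /// Borrows the section's bytes straight out of the shared buffer.
    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}
```

The view keeps the whole buffer alive for as long as any section is in use, which is the usual trade-off of this style of zero-copy decoding.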
PR Review: feat: add serialize/deserialize for IVF PQ partition cache entries

Clean implementation overall: the zero-copy IPC approach is well-suited for cache serde. A few issues to flag:

P1: Integer overflow in deserialization offset arithmetic

In

    let sub_index_end = sub_index_start + header.sub_index_len as usize;
    let codebook_end = codebook_start + header.codebook_len as usize;
    let storage_end = storage_start + header.storage_len as usize;

If the header contains corrupted or adversarial values (e.g., lengths close to

Use

P1: Missing
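A minimal sketch of how the offset arithmetic flagged in the review could be hardened, using only the standard library. The `Header` struct, `LayoutError` enum, and both helper functions are hypothetical names for illustration. The sketch also assumes the three sections are stored back to back, which the excerpt above does not confirm.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical header mirroring the three length fields named in the review.
struct Header {
    sub_index_len: u64,
    codebook_len: u64,
    storage_len: u64,
}

#[derive(Debug, PartialEq)]
enum LayoutError {
    /// A length did not fit in `usize`, or `start + len` wrapped around.
    Overflow,
    /// The computed end lies past the end of the buffer.
    OutOfBounds,
}

/// Computes `start..start + len` without wrapping and checks it against `buf_len`.
fn section(start: usize, len: u64, buf_len: usize) -> Result<Range<usize>, LayoutError> {
    let len = usize::try_from(len).map_err(|_| LayoutError::Overflow)?;
    let end = start.checked_add(len).ok_or(LayoutError::Overflow)?;
    if end > buf_len {
        return Err(LayoutError::OutOfBounds);
    }
    Ok(start..end)
}

/// Assumes the sections are contiguous: sub-index, then codebook, then storage.
fn section_ranges(
    header: &Header,
    body_start: usize,
    buf_len: usize,
) -> Result<[Range<usize>; 3], LayoutError> {
    let sub_index = section(body_start, header.sub_index_len, buf_len)?;
    let codebook = section(sub_index.end, header.codebook_len, buf_len)?;
    let storage = section(codebook.end, header.storage_len, buf_len)?;
    Ok([sub_index, codebook, storage])
}
```

Returning an error rather than panicking lets a cache treat a corrupted entry as a miss instead of crashing the process.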
Extends partition entry serialization to cover all IVF quantizer types:

- FlatQuantizer (IVF_FLAT): single IPC section for storage batch
- ScalarQuantizer (IVF_SQ): multiple chunks concatenated into one IPC section
- RabitQuantizer (IVF_RABIT): Fast rotation signs in JSON header; Matrix rotation matrix as a separate IPC section; storage batch stored pre-packed to skip re-packing on deserialization

All impls are generic over S: IvfSubIndex, covering both FlatIndex and HNSW sub-index types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
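The layouts listed in this commit all follow a "header, then one or more byte sections" shape. The sketch below shows that shape in a generic way and is not the PR's format: the PR describes a JSON header and Arrow IPC sections, whereas this uses a little-endian length table purely for illustration. `encode_sections` and `decode_sections` are hypothetical names.

```rust
use std::convert::TryFrom;

/// Writes a section count (u32), a table of section lengths (u64 each),
/// and then the section bodies back to back. All integers are little-endian.
fn encode_sections(sections: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(sections.len() as u32).to_le_bytes());
    for s in sections {
        out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    }
    for s in sections {
        out.extend_from_slice(s);
    }
    out
}

/// Returns borrowed slices into `buf`, or `None` if the header is truncated,
/// a length overflows, or a section runs past the end of the buffer.
fn decode_sections(buf: &[u8]) -> Option<Vec<&[u8]>> {
    let mut count_bytes = [0u8; 4];
    count_bytes.copy_from_slice(buf.get(..4)?);
    let count = u32::from_le_bytes(count_bytes) as usize;

    // Validate that the length table fits before allocating anything.
    let header_end = count.checked_mul(8)?.checked_add(4)?;
    let lens = buf.get(4..header_end)?;

    let mut pos = header_end;
    let mut out = Vec::with_capacity(count);
    for chunk in lens.chunks_exact(8) {
        let mut len_bytes = [0u8; 8];
        len_bytes.copy_from_slice(chunk);
        let len = usize::try_from(u64::from_le_bytes(len_bytes)).ok()?;
        let end = pos.checked_add(len)?;
        out.push(buf.get(pos..end)?);
        pos = end;
    }
    Some(out)
}
```

Because the decoder returns slices that borrow from `buf`, a caller can hand each section to its own parser without copying.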
Summary
- Adds serialize/deserialize for PartitionEntry<S, ProductQuantizer>, the type cached in LanceCache for IVF_PQ V3 indices.
- Uses Arrow IPC FileDecoder for zero-copy deserialization: Arrow arrays reference slices of the original Bytes buffer directly.
- Generic over S: IvfSubIndex, so it works for both FlatIndex+PQ and HNSW+PQ variants.

Test plan
🤖 Generated with Claude Code