feat: add manifest namespace extended properties #6215

Open

wojiaodoubao wants to merge 1 commit into lance-format:main from wojiaodoubao:manifest-extended-properties

Conversation

@wojiaodoubao (Contributor)

This is a sub-task of #5741

@github-actions github-actions bot added enhancement New feature or request python java labels Mar 17, 2026
@github-actions (Contributor)

PR Review: feat: add manifest namespace extended properties

Performance: query_manifest reads all columns (P1)

The new unified query_manifest() method removes the .project() call that the old query_manifest_for_table and query_manifest_for_namespace had. Previously, table lookups only read object_id + location, and namespace lookups only read object_id + metadata. Now every query reads all columns including all extended property columns.

As the number of extended properties grows, this becomes increasingly wasteful for simple existence checks and lookups. Consider keeping a projection — at minimum project the basic columns plus the extended columns, or re-add targeted projections for the two call sites.
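A minimal sketch of the projection idea, using plain (name, values) pairs in place of Arrow arrays; the column names mirror the ones above, the representation itself is illustrative only (in arrow-rs terms this is what RecordBatch::project does with column indices):

```rust
/// Toy model of per-call-site projection: each lookup keeps only the columns
/// it actually needs instead of carrying every extended-property column.
/// Columns are (name, values) pairs standing in for Arrow arrays.
fn project<'a>(
    columns: &'a [(&'a str, Vec<&'a str>)],
    wanted: &[&str],
) -> Vec<&'a (&'a str, Vec<&'a str>)> {
    wanted
        .iter()
        .filter_map(|name| columns.iter().find(|(n, _)| n == name))
        .collect()
}
```

A table lookup would then ask only for object_id and location, and a namespace lookup only for object_id and metadata, regardless of how many extended columns the manifest grows.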

Performance: Row-at-a-time slicing in query_manifest (P1)

for row_idx in 0..batch.num_rows() {
    let sliced_columns: Vec<Arc<dyn Array>> = batch
        .columns()
        .iter()
        .map(|col| col.slice(row_idx, 1))
        .collect();
    let row = RecordBatch::try_new(batch.schema(), sliced_columns)?;
    objects.push(parse_manifest_object(&row)?);
}

This creates a new RecordBatch per row. For queries that match many rows (e.g., listing all tables), this is O(rows × columns) allocations. Consider indexing into the batch directly in parse_manifest_object using a row index parameter instead of slicing.
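A stdlib-only sketch of the indexed alternative, with plain slices standing in for the batch's Arrow columns (the struct and field names are illustrative, not the PR's exact types):

```rust
#[derive(Debug, PartialEq)]
struct ManifestRow {
    object_id: String,
    location: String,
}

/// Index-based parsing: row `index` is read straight out of each column,
/// so no per-row one-element batch is ever allocated.
fn parse_row(object_ids: &[&str], locations: &[&str], index: usize) -> ManifestRow {
    ManifestRow {
        object_id: object_ids[index].to_string(),
        location: locations[index].to_string(),
    }
}

/// The caller loops over row indices instead of slicing a batch per row.
fn parse_all(object_ids: &[&str], locations: &[&str]) -> Vec<ManifestRow> {
    (0..object_ids.len())
        .map(|i| parse_row(object_ids, locations, i))
        .collect()
}
```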

Bug: Unused f32, f64 imports (P0 — clippy will fail)

use std::{
    collections::HashMap,
    f32, f64,
    hash::{DefaultHasher, Hash, Hasher},

These module imports (std::f32, std::f64) appear unused. This will likely trigger a clippy warning/error with -D warnings.

Bug: is_multiple_of requires nightly (P0)

if !s.len().is_multiple_of(2) {

usize::is_multiple_of was until recently a nightly-only API (the unsigned_is_multiple_of feature), so depending on the crate's MSRV this may not compile on stable Rust. Use s.len() % 2 != 0 instead.
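A stable-Rust sketch of the suggested replacement:

```rust
/// MSRV-safe parity check, equivalent to the gated
/// `s.len().is_multiple_of(2)` flagged above.
fn has_even_len(s: &str) -> bool {
    s.len() % 2 == 0
}
```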

Design: Duplicated merge logic (P1)

The pattern of matching (extended_record, &request.properties) with 4 arms is copy-pasted across create_namespace_extended, declare_table_extended, and create_table_extended. Consider extracting a helper like fn resolve_extended_batch(...) to reduce duplication and the risk of the branches diverging.
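A stdlib-only sketch of such a helper; the name resolve_extended and the last-write-wins merge policy are assumptions for illustration, not the PR's exact semantics:

```rust
use std::collections::HashMap;

type Props = HashMap<String, String>;

/// One place that handles all four (existing record, requested properties)
/// combinations, so the three call sites cannot diverge.
fn resolve_extended(existing: Option<&Props>, requested: Option<&Props>) -> Option<Props> {
    match (existing, requested) {
        (None, None) => None,
        (Some(e), None) => Some(e.clone()),
        (None, Some(r)) => Some(r.clone()),
        (Some(e), Some(r)) => {
            // Requested values overwrite existing ones on key collision.
            let mut merged = e.clone();
            merged.extend(r.clone());
            Some(merged)
        }
    }
}
```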

Minor

  • build_metadata_json silently drops properties that fail to serialize (.ok()? on serde_json::to_string). HashMap<String, String> serialization cannot fail in practice, but the silent swallow is worth noting.
  • Comment typo: "Collect basic columns to excluded" → "to exclude"
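A stdlib-only illustration of the silent-swallow pattern from the first bullet; to_json is a hypothetical fallible serializer standing in for serde_json::to_string (here it only rejects NUL bytes in keys, whereas the real call effectively cannot fail for HashMap<String, String>):

```rust
/// Hypothetical serializer standing in for `serde_json::to_string`.
fn to_json(pairs: &[(&str, &str)]) -> Result<String, String> {
    let mut out = String::from("{");
    for (i, (k, v)) in pairs.iter().enumerate() {
        if k.contains('\0') {
            return Err(format!("invalid key: {:?}", k));
        }
        if i > 0 {
            out.push(',');
        }
        out.push_str(&format!("{:?}:{:?}", k, v));
    }
    out.push('}');
    Ok(out)
}

/// Silent variant: an error becomes `None`, indistinguishable from
/// "no properties" at the call site.
fn build_metadata_silent(pairs: &[(&str, &str)]) -> Option<String> {
    to_json(pairs).ok()
}

/// Propagating variant: the caller sees why serialization failed.
fn build_metadata_strict(pairs: &[(&str, &str)]) -> Result<String, String> {
    to_json(pairs)
}
```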

🤖 Generated with Claude Code

@wojiaodoubao wojiaodoubao mentioned this pull request Mar 17, 2026
7 tasks

codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@wjones127 (Contributor) left a comment

I'm feeling a little uncertain about the efficiency of this code, particularly around the handling of batches. Though I also think the bot is right about projecting many columns.

Comment on lines +415 to +418
return Err(Error::io(format!(
"Extended properties key {} must start with prefix: {}",
name, EXTENDED_PREFIX
)));

This error isn't about IO, but is incorrect input:

Suggested change (Error::io replaced with Error::invalid_input):

    return Err(Error::invalid_input(format!(
        "Extended properties key {} must start with prefix: {}",
        name, EXTENDED_PREFIX
    )));

})
.collect::<Result<Vec<_>>>()?
.into_iter()
.filter(|f| full_schema.column_with_name(f.name()).is_none())

We just throw out names that collide? Why not error? Or at least warn?

Or is this just making the migration idempotent? Even if this is the case, I'm curious if there are other columns that it could collide with that we should disallow.

.filter(|f| full_schema.column_with_name(f.name()).is_none())
.collect();

let schema = Schema::new(fields);

Should we early return if fields is empty?

Comment on lines +1275 to +1276
/// Parse one row of __manifest table into manifest object.
fn parse_manifest_object(batch: &RecordBatch) -> Result<ManifestObject> {

I agree with the bot comment, instead of creating slice just pass in the index you want to extract:

Suggested change (add a row index parameter):

    /// Parse one row of __manifest table into manifest object.
    fn parse_manifest_object(batch: &RecordBatch, index: usize) -> Result<ManifestObject> {

Comment on lines +2464 to +2469
fn build_extended_column_array(
col_name: &str,
data_type: &DataType,
extended_records: &[Option<RecordBatch>],
props_vec: &[Option<HashMap<String, String>>],
) -> Result<ArrayRef> {

I'm a bit confused as to how this is implemented. It seems like sometimes you are using RecordBatch as a row, which is very inefficient.

Perhaps you should write some function like:

fn records_to_batch(
    records: &HashMap<String, String>,
    schema: &Schema
) -> Result<RecordBatch> { ... }

And then you can do the merges with other data in a columnar fashion. I imagine that function might be useful in multiple places.
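A stdlib-only sketch of that columnar conversion, with vectors of Option<String> standing in for nullable Arrow arrays; the name records_to_columns and the signature are illustrative (the suggested records_to_batch would build a real RecordBatch against a schema instead):

```rust
use std::collections::HashMap;

/// Turn a slice of property maps into column vectors following `schema`
/// order, one `Option<String>` column per schema field, so subsequent merges
/// operate column-wise instead of treating each map as a one-row batch.
fn records_to_columns(
    records: &[HashMap<String, String>],
    schema: &[&str],
) -> Vec<Vec<Option<String>>> {
    schema
        .iter()
        .map(|col| records.iter().map(|r| r.get(*col).cloned()).collect())
        .collect()
}
```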

