fix: like queries with a prefix should be accelerated by btree and zonemap#6188

Open
jackye1995 wants to merge 6 commits intolance-format:mainfrom
jackye1995:like-pruning

@jackye1995 jackye1995 commented Mar 13, 2026

This allows LIKE "foo%" predicates to be pruned by any index that can prune a range of string prefixes, including btree and zonemap.
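
To illustrate the idea, here is a minimal sketch of how a prefix match can be turned into a half-open byte range that a range-capable index could prune on. It is not the code in this PR; the function names `prefix_upper_bound` and `matches_prefix_range` are hypothetical, and the sketch assumes plain byte-wise ordering of the indexed values.

```rust
/// Hypothetical helper: returns the exclusive upper bound of the byte range
/// that contains every value starting with `prefix`, or `None` when the
/// range has no upper bound (empty prefix, or a prefix of only 0xFF bytes).
fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut upper = prefix.to_vec();
    // Walk from the right: a 0xFF byte cannot be incremented, so drop it
    // and carry into the byte to its left.
    while let Some(last) = upper.pop() {
        if last < u8::MAX {
            upper.push(last + 1);
            return Some(upper);
        }
    }
    None
}

/// Hypothetical helper: true when `value` falls in [prefix, upper_bound),
/// which is equivalent to `value` starting with `prefix`.
fn matches_prefix_range(value: &[u8], prefix: &[u8]) -> bool {
    let above_lower = value >= prefix;
    let below_upper = match prefix_upper_bound(prefix) {
        Some(upper) => value < upper.as_slice(),
        None => true,
    };
    above_lower && below_upper
}
```

With this sketch, LIKE "foo%" corresponds to the range from "foo" (inclusive) to "fop" (exclusive), so "foobar" falls inside the range and "fop" does not.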

I ended up adding a LikePrefix instead of StartsWith because it seems like DataFusion converts StartsWith to LIKE.

@github-actions github-actions bot added the bug Something isn't working label Mar 13, 2026
@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

github-actions bot commented Mar 13, 2026

PR Review: LIKE prefix acceleration for btree and zonemap

Overall this is a well-structured change with good test coverage. One minor issue is worth addressing:

Minor: Escape handling in has_wildcard is approximate

The has_wildcard check (expression.rs, around lines 283-298) only accounts for one or two preceding escape characters, so longer escape runs such as \\\\% (4 backslashes + %) may be misclassified. The actual prefix extraction loop handles escapes correctly, so this only affects the early-exit path. A false positive in has_wildcard is safe (we just attempt prefix extraction on a literal pattern, and it correctly falls through), and a false negative only means we skip the optimization for exotic patterns. This is low priority, but a code comment noting the limitation would be worthwhile.
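
For reference, a single left-to-right pass handles escape runs of any length without counting preceding backslashes. The following is only a sketch of that approach, not the extraction loop in this PR; the name `like_prefix` and its signature are hypothetical, and it assumes a dangling escape character at the end of the pattern should disable the optimization.

```rust
/// Hypothetical helper: if `pattern` is a literal prefix followed by exactly
/// one trailing unescaped `%`, return the unescaped prefix. Any other
/// pattern returns `None`.
fn like_prefix(pattern: &str, escape: char) -> Option<String> {
    let mut prefix = String::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if c == escape {
            // The escape character makes the next character literal.
            // A dangling escape at the end yields `None` (assumption).
            prefix.push(chars.next()?);
        } else if c == '%' {
            // Unescaped `%`: a prefix pattern only if nothing follows it.
            return if chars.next().is_none() { Some(prefix) } else { None };
        } else if c == '_' {
            // Unescaped single-character wildcard: not a pure prefix.
            return None;
        } else {
            prefix.push(c);
        }
    }
    // No unescaped wildcard at all: this is an equality match, not a prefix.
    None
}
```

Because every escape character consumes the character after it, four backslashes followed by % leave the % unescaped (a wildcard), while three backslashes followed by % leave it escaped (a literal).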

Tests

Good coverage of zonemap pruning, btree range conversion, expression parsing, starts_with conversion, and escape handling. The tests verify correctness of zone inclusion/exclusion with well-chosen boundary cases.
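
As an illustration of the zone inclusion/exclusion boundary cases, here is a rough sketch of an overlap check between a zone's min/max statistics and a prefix range. The `Zone` struct and `zone_may_match` function are hypothetical and do not mirror the zonemap types in this PR; null handling is omitted, and the exclusive upper bound is assumed to be computed elsewhere.

```rust
/// Hypothetical zone statistics: the smallest and largest value in a zone.
struct Zone<'a> {
    min: &'a str,
    max: &'a str,
}

/// Hypothetical check: true when the zone's [min, max] interval overlaps
/// the prefix range [lower, upper). `upper` is `None` when the range is
/// unbounded above. A `true` result means the zone may contain a match and
/// must be scanned; `false` means it can be skipped.
fn zone_may_match(zone: &Zone, lower: &str, upper: Option<&str>) -> bool {
    // The zone is not entirely below the range.
    let reaches_lower = zone.max >= lower;
    // The zone is not entirely at or above the exclusive upper bound.
    let below_upper = upper.map_or(true, |u| zone.min < u);
    reaches_lower && below_upper
}
```

For LIKE "foo%" (lower "foo", upper "fop"), a zone spanning "apple" to "fig" is skipped, a zone whose max is exactly "foo" is kept, and a zone whose min is exactly "fop" is skipped.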

@jackye1995 jackye1995 changed the title from "fix: LIKE queries with a prefix should be accelerated by btree and zonemap" to "fix: like queries with a prefix should be accelerated by btree and zonemap" Mar 13, 2026
@jackye1995
Contributor Author

Looks like I did not handle AND properly; will take another pass tomorrow.

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 92.32053% with 46 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-index/src/scalar/expression.rs | 88.88% | 31 Missing ⚠️ |
| rust/lance-index/src/scalar.rs | 80.00% | 5 Missing and 1 partial ⚠️ |
| rust/lance-index/src/scalar/zonemap.rs | 97.34% | 5 Missing ⚠️ |
| rust/lance-index/src/scalar/btree.rs | 96.07% | 3 Missing and 1 partial ⚠️ |


2 participants