Skip to content

fix: VLP WITH clause property resolution and path_relationships array#206

Merged
genezhang merged 4 commits intomainfrom
fix/vlp-with-property-resolution
Mar 11, 2026
Merged

fix: VLP WITH clause property resolution and path_relationships array#206
genezhang merged 4 commits intomainfrom
fix/vlp-with-property-resolution

Conversation

@genezhang
Copy link
Copy Markdown
Owner

Summary

  • Fix VLP property requirements to match both Cypher and DB column names during pruning
  • Fix DB-to-Cypher property name translation in VLP WITH CTE body expressions
  • Fix path_relationships to always use proper arrays when path variable is bound (not UInt8 placeholder)

Details

Bug 1 — Property pruning mismatch: PropertyRequirementsAnalyzer records DB column names (full_name) after view resolution, but VLP CTE property pruning in cte_extraction.rs only checked Cypher property names (name). Result: VLP CTE had zero property columns for WITH u1.name as start_name queries. Fix: check both prop_name (Cypher) and prop_value.raw() (DB) in the pruning loop.

Bug 2 — Column name translation: When WITH uses specific property access (WITH u1.name as start_name), the CTE body generated t.start_full_name (using DB column name) instead of t.start_name (using Cypher property name that VLP CTE columns use). Fix: added translate_db_columns_to_cypher_properties() helper that pre-translates PropertyAccessExp column names using CTE metadata before VLP rewriting.

Bug 3 — UInt8 path_relationships: VLP CTEs used CAST(0 AS UInt8) as placeholder for path_relationships because plan_uses_relationships_fn checked context.root_plan which only contained the VLP subtree — it couldn't see relationships(path) calls in parent WITH clauses. Fix: needs_path_data() now returns true whenever a path variable is bound, since the overhead of tracking a relationship type array is minimal.

Test plan

  • 1586 Rust tests pass (0 failures, 261 integration tests)
  • 35 LDBC SQL generation tests pass
  • New regression tests: test_vlp_with_specific_property_access, test_vlp_path_relationships_is_array
  • Manual SQL generation + execution testing against ClickHouse
  • cargo fmt --all && cargo clippy --all-targets clean

🤖 Generated with Claude Code

Three bugs fixed in VLP (variable-length path) + WITH clause queries:

1. VLP property requirements matching: PropertyRequirementsAnalyzer records
   DB column names (e.g., "full_name") after view resolution, but VLP CTE
   property pruning only checked Cypher property names (e.g., "name").
   Fix: check both names in the pruning loop (cte_extraction.rs).

2. DB-to-Cypher property name translation: When WITH uses specific property
   access (e.g., `WITH u1.name as start_name`), the CTE body generated
   `t.start_full_name` instead of `t.start_name`. Fix: pre-translate DB
   column names to Cypher property names before VLP rewriting
   (plan_builder_utils.rs).

3. path_relationships UInt8 placeholder: VLP CTEs used `CAST(0 AS UInt8)`
   for path_relationships because `plan_uses_relationships_fn` couldn't
   detect `relationships(path)` calls in parent WITH clauses (root_plan
   only covered the VLP subtree). Fix: always generate proper relationship
   arrays when a path variable is bound (variable_length_cte.rs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
genezhang added a commit that referenced this pull request Mar 11, 2026
Remove xfail from 5 tests now passing after JOIN rewriting fix:
- test_vlp_with_count_distinct_basic
- test_vlp_with_multiple_aggregates
- test_vlp_with_order_by_aggregate
- test_vlp_with_nested_property_aggregate
- test_vlp_only_vlp_nodes_in_aggregate

Update remaining xfail reasons with specific root causes:
- test_vlp_with_sum_aggregate: needs VLP property columns (PR #206)
- test_vlp_no_aggregation: needs VLP property columns (PR #206)
- test_vlp_bidirectional_with_aggregate: CTE column resolution
- test_vlp_with_having_equivalent: CTE column resolution
- test_vlp_different_hop_counts: *1..3 timeout on test data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@genezhang genezhang requested review from Copilot March 11, 2026 14:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes variable-length path (VLP) SQL generation edge cases around WITH-clause property pruning/name resolution and ensures path_relationships is emitted as an array whenever a path variable is bound (to support relationships(path) usage above the VLP subtree).

Changes:

  • Fix VLP property pruning to match both Cypher property names and resolved DB column names.
  • Pre-translate resolved DB column names back to Cypher property names for VLP WITH CTE body rewriting.
  • Treat bound path variables as always requiring path_relationships array data.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
tests/rust/integration/complex_feature_tests.rs Adds regression tests covering VLP WITH specific property access and path_relationships non-UInt8 placeholder behavior.
src/render_plan/plan_builder_utils.rs Adds helper to translate DB column names back to Cypher property names before VLP expression rewriting in WITH CTE bodies.
src/render_plan/cte_extraction.rs Updates VLP property-pruning logic to check required properties against both Cypher and DB column identifiers.
src/clickhouse_query_generator/variable_length_cte.rs Changes path-data requirement logic so bound path variables always trigger path_relationships array population.

Open issues to address before merge:

  • src/render_plan/plan_builder_utils.rs: The doc comment for rewrite_render_expr_for_vlp_with_from_alias mentions a db_to_cypher parameter/mapping, but the function signature doesn’t take it; this is misleading documentation and should be updated to reflect the current design (translation happens via a separate pre-pass).
  • tests/rust/integration/complex_feature_tests.rs: test_vlp_path_relationships_is_array’s doc comment says “even without explicit relationships() call” but the query does call relationships(path). Also, the assertion matches an exact, case-sensitive SQL substring, which is more brittle than the surrounding tests’ to_lowercase()-based checks.
  • src/clickhouse_query_generator/variable_length_cte.rs: needs_path_data() no longer consults needs_path_relationships, making that flag (and its associated comments/plumbing elsewhere) effectively unused/misleading; either remove the flag and update docs, or keep the flag and apply it consistently.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

Comment thread tests/rust/integration/complex_feature_tests.rs Outdated
Comment thread tests/rust/integration/complex_feature_tests.rs
Comment thread src/render_plan/plan_builder_utils.rs Outdated
Comment thread src/render_plan/plan_builder_utils.rs
Comment thread src/clickhouse_query_generator/variable_length_cte.rs
genezhang and others added 3 commits March 11, 2026 09:58
…aversal, type consistency

1. Normalize SQL to lowercase before checking for UInt8 placeholder in test
2. Fix doc comment to match test query (uses relationships(path) in WITH)
3. Update rewrite_render_expr_for_vlp_with_from_alias doc to reflect
   that db_to_cypher translation now happens before this function is called
4. Add recursion into InSubquery, MapLiteral, ArraySubscript, ArraySlicing,
   and ReduceExpr variants in translate_db_columns_to_cypher_properties
5. Change CAST(0 AS UInt8) to CAST([] AS Array(String)) for
   path_relationships in BFS and weighted VLP branches for type consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@genezhang genezhang merged commit e51e901 into main Mar 11, 2026
4 checks passed
@genezhang genezhang deleted the fix/vlp-with-property-resolution branch March 11, 2026 18:34
genezhang added a commit that referenced this pull request Mar 12, 2026
Remove @pytest.mark.xfail from tests that pass after recent VLP,
shortestPath, denormalized schema, and multi-type property fixes
(PRs #206-#210).

Affected areas: VLP aggregation, shortest paths, path variables,
edge constraints, multi-type property extraction, denormalized VLP,
VLP relationship return, VLP WITH comprehension, pattern matrix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants