[fix](fe) Fix MATCH crash on alias slots and push down as virtual column #61584
Open
airborne12 wants to merge 1 commit into apache:master
run buildall
TPC-H: Total hot run time: 26656 ms
TPC-DS: Total hot run time: 168112 ms
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
When a MATCH expression references an alias slot that has lost its column
metadata (e.g., `CAST(variant_col['subkey'] AS VARCHAR) AS fn`), and the MATCH
sits in a predicate that cannot be pushed below a join (because it is OR-ed with
a join-dependent condition such as an EXISTS mark or a LEFT JOIN null check),
ExpressionTranslator's `visitMatch()` crashes with "SlotReference in Match
failed to get Column".
Root cause: `Alias.toSlot()` only preserves originalColumn/originalTable when
its child is a direct SlotReference. When the child is wrapped in Cast/ElementAt,
all metadata is lost. Because the OR prevents filter pushdown, the MATCH stays
at the join layer and references a metadata-less slot.
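The root cause can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `AliasMetadataSketch` and its nested classes are hypothetical stand-ins, not the actual Nereids classes, and a `null` `originalColumn` is used here to model "metadata lost".

```java
// Hypothetical, simplified stand-ins for the FE expression classes.
// They only model the behavior described in the root cause; they are not the Doris API.
final class AliasMetadataSketch {
    interface Expr {}

    static final class SlotRef implements Expr {
        final String name;
        final String originalColumn; // null models "column metadata lost"

        SlotRef(String name, String originalColumn) {
            this.name = name;
            this.originalColumn = originalColumn;
        }
    }

    static final class Cast implements Expr {
        final Expr child;

        Cast(Expr child) {
            this.child = child;
        }
    }

    static final class Alias {
        final Expr child;
        final String name;

        Alias(Expr child, String name) {
            this.child = child;
            this.name = name;
        }

        // Metadata is carried over only when the child is a direct slot reference.
        SlotRef toSlot() {
            if (child instanceof SlotRef) {
                return new SlotRef(name, ((SlotRef) child).originalColumn);
            }
            return new SlotRef(name, null); // wrapped in Cast (or similar): metadata dropped
        }
    }

    public static void main(String[] args) {
        SlotRef base = new SlotRef("variant_col", "variant_col");
        // Direct alias keeps the column: prints true
        System.out.println(new Alias(base, "fn").toSlot().originalColumn != null);
        // Alias over a Cast loses it: prints false
        System.out.println(new Alias(new Cast(base), "fn").toSlot().originalColumn != null);
    }
}
```

In this model, any consumer that requires `originalColumn` to be present fails for the second slot, which corresponds to the situation `visitMatch()` runs into.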
Reproducer:
This PR fixes the issue with two changes:
1. **Graceful fallback in visitMatch()**: When the slot has lost column/table
   metadata, fall back to `invertedIndex = null` instead of throwing. The BE
   evaluates MATCH correctly via slow-path expression evaluation, or the
   virtual column mechanism (below) provides fast-path index evaluation.
2. **New rewrite rule PushDownMatchPredicateAsVirtualColumn**: Extracts MATCH
   from join/filter predicates, traces the alias slot back through the Project
   to find the original column expression, and creates a virtual column on
   OlapScan. The BE evaluates the virtual column via inverted index using
   `fast_execute()`, and the join layer references the boolean result.
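The two changes above can be sketched roughly as follows. This is a toy model, not the code in this PR: `MatchPushdownSketch`, `resolveInvertedIndex`, `pushDownMatch`, `Scan`, and the `__match_vc_` naming scheme are all hypothetical, and expressions are represented as plain strings for brevity.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the two changes; none of these names are Doris classes or methods.
final class MatchPushdownSketch {

    // Change 1: return null instead of throwing when the slot has no column metadata.
    static String resolveInvertedIndex(String originalColumn, Map<String, String> indexByColumn) {
        if (originalColumn == null) {
            return null; // fallback: no index hint, MATCH is evaluated on the slow path
        }
        return indexByColumn.get(originalColumn);
    }

    // Stand-in for a scan node that can carry virtual columns (name -> expression text).
    static final class Scan {
        final Map<String, String> virtualColumns = new LinkedHashMap<>();
    }

    // Change 2: trace the alias slot back through the Project's alias map and register
    // the MATCH as a virtual column on the scan. Returns the virtual column name that
    // the join/filter predicate should reference, or null if the alias cannot be traced.
    static String pushDownMatch(String aliasSlot, String query,
                                Map<String, String> projectAliases, Scan scan) {
        String sourceExpr = projectAliases.get(aliasSlot);
        if (sourceExpr == null) {
            return null; // leave the predicate untouched
        }
        String name = "__match_vc_" + scan.virtualColumns.size(); // assumed naming scheme
        scan.virtualColumns.put(name, sourceExpr + " MATCH_ANY '" + query + "'");
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> indexes = new HashMap<>();
        indexes.put("body", "idx_body");
        System.out.println(resolveInvertedIndex(null, indexes));

        Map<String, String> project = new HashMap<>();
        project.put("fn", "CAST(variant_col['subkey'] AS VARCHAR)");
        Scan scan = new Scan();
        String ref = pushDownMatch("fn", "hello", project, scan);
        System.out.println(ref + " := " + scan.virtualColumns.get(ref));
    }
}
```

In this sketch the upper predicate would end up referencing the returned boolean column name rather than the MATCH itself, which mirrors the idea that the join layer only consumes the result computed at the scan.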
Plan transformation:
### Release note
Fix MATCH expressions crashing when used with CTE aliases involving type casts
combined with EXISTS/LEFT JOIN and OR conditions. Also enables inverted index
evaluation for such MATCH expressions via virtual column pushdown.
### Check List (For Author)
- Test: Manual test
- Behavior changed: No
- Does this need documentation: No
### Check List (For Reviewer who merge this PR)