Skip to content

Core: Fix rewrite_position_delete_files failure with array/map columns#15079

Closed
bk-mz wants to merge 6 commits intoapache:mainfrom
bk-mz:fix/rewrite-position-deletes-array-columns
Closed

Core: Fix rewrite_position_delete_files failure with array/map columns#15079
bk-mz wants to merge 6 commits intoapache:mainfrom
bk-mz:fix/rewrite-position-deletes-array-columns

Conversation

@bk-mz
Copy link

@bk-mz bk-mz commented Jan 19, 2026

Summary

Fixes rewrite_position_delete_files failing with ValidationException: Invalid partition field parent when tables contain array or map columns.

Root Cause

ExpressionUtil.identitySpec() attempted to create identity partition fields for all primitive fields in the schema, including those nested inside arrays and maps. Since commit 9fb80b7, PartitionSpec.checkCompatibility() validates that partition field parents must be struct types, causing the failure.

Fix

identitySpec() now filters out fields that cannot be partition sources by checking that all ancestor types are structs, using TypeUtil.indexParents() - the same approach used in PartitionSpec.checkCompatibility().

Testing

Added test case that reproduces the issue with a table containing ARRAY<STRUCT<...>> column.

Fixes #15080

… columns

This test demonstrates a regression in Iceberg 1.10.0 where
rewrite_position_delete_files fails with ValidationException when the
table contains array columns with primitive fields.

The error occurs because ExpressionUtil.identitySpec() creates a
temporary partition spec from schema field IDs, and the new validation
added in commit 9fb80b7 rejects fields whose parent type is a list.
ExpressionUtil.identitySpec() now skips fields that cannot be partition
sources. Fields nested inside arrays or maps are filtered out by checking
that all ancestor types are structs, matching the validation logic in
PartitionSpec.checkCompatibility().

Fixes apache#15080
@bk-mz bk-mz changed the title Add test reproducing rewrite_position_delete_files failure with array columns Core: Fix rewrite_position_delete_files failure with array/map columns Jan 19, 2026
}

@TestTemplate
public void testRewritePositionDeletesWithArrayColumns() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Shall we also add a similar test for Map column?

@szehon-ho
Copy link
Member

Yea I think this code is suggested by @aokolnychyi , to be honest I am not sure what happens when we have a filter on a nested column. I am wondering, what happens when we have a filter like `where map_col = map('a', 1)', was it passed before 9fb80b7 as a residual filter to the file, or not? This will help me understand if we should do another approach like bypass the PartitionSpec check here

We could potentially test this at the lower level TestPositionDeletesReader.

@yogevyuval
Copy link
Contributor

@bk-mz Hi!
Any update on this?

@bk-mz
Copy link
Author

bk-mz commented Feb 26, 2026

@yogevyuval IDK, this PR is sound to me, but looks like maintainers want to do additional research?

@szehon-ho
Copy link
Member

guys, this actually wont work here. The residualWithoutConstants now has no filters for map/array columns, which lead to wrong results for PositionDeletes metadata table.

@szehon-ho
Copy link
Member

i propose a cleaner solution : #15632

@bk-mz bk-mz closed this Mar 16, 2026
@bk-mz bk-mz deleted the fix/rewrite-position-deletes-array-columns branch March 16, 2026 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rewrite_position_delete_files fails with ValidationException for tables with array columns

4 participants