Skip to content

Core: close entries iterable in ManifestFilterManager#15713

Open
kinolaev wants to merge 1 commit intoapache:mainfrom
kinolaev:fix-unclosed-entries-iterable
Open

Core: close entries iterable in ManifestFilterManager#15713
kinolaev wants to merge 1 commit intoapache:mainfrom
kinolaev:fix-unclosed-entries-iterable

Conversation

@kinolaev
Copy link

This PR prevents deadlocks while filtering manifest entries. The problem is ManifestFilterManager.filterManifest in some cases reads each manifest twice: first in manifestHasDeletedFiles and then in filterManifestWithDeletedFiles. If manifestHasDeletedFiles returns in the middle of entries iterable, the underlying connection is open until the ManifestReader is closed. filterManifest method is called for all manifests in parallel. When number of simultaneous connections is limited (for example by http-client.apache.max-connections) it can lead to a deadlock because all connections are held by manifestHasDeletedFiles.

The problem can be reproduced using spark-sql with S3FileIO and http-client.apache.max-connections=1:

create table manifestfiltermanager(id bigint)
  partitioned by (truncate(1, id))
  tblproperties('write.delete.mode'='merge-on-read');
-- create a data manifest with 100 entries
insert into manifestfiltermanager select id from range(100);
-- create a delete manifest with 50 entries
delete from manifestfiltermanager where id in (select id from range(0, 100, 2));
-- make delete files dangling (fails without https://github.com/apache/iceberg/pull/15712)
call system.rewrite_data_files('manifestfiltermanager', options => map('rewrite-all', 'true'));
-- ManifestFilterManager.manifestHasDeletedFiles reads first block of the delete manifest
-- and returns true before the end of liveEntries() iterable without closing it.
-- ManifestFilterManager.filterManifestWithDeletedFiles fails with ConnectionPoolTimeoutException
call system.rewrite_data_files('manifestfiltermanager', options => map('rewrite-all', 'true'));

@github-actions github-actions bot added the core label Mar 21, 2026
Signed-off-by: Sergei Nikolaev <kinolaev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant