Iceberg tables require periodic maintenance to ensure optimal performance and manage storage costs. IceFrame provides simple methods for common maintenance tasks.
Remove old table snapshots to free up space and keep metadata size manageable.
# Remove snapshots older than 7 days, keeping at least the last 1
ice.expire_snapshots("my_table", older_than_days=7, retain_last=1)

Clean up data files that are no longer referenced by any snapshot (e.g., from failed writes).
# Remove orphan files older than 3 days
ice.remove_orphan_files("my_table", older_than_days=3)

Combine small data files into larger ones to improve read performance (compaction).
# Compact files to target size of 512 MB
ice.compact_data_files("my_table", target_file_size_mb=512)

Tip: Run compaction regularly on tables with frequent small updates (streaming ingestion).
- Schedule Maintenance: Run these operations periodically (e.g., daily or weekly) via a scheduler like Airflow.
- Order of Operations:
  1. expire_snapshots
  2. remove_orphan_files
  3. compact_data_files
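
The three steps can be wrapped in one helper so that a scheduler task only has to make a single call. The following is a minimal sketch, not part of IceFrame itself: `run_maintenance` and its parameter names are hypothetical, and `ice` is assumed to be an already-configured IceFrame instance exposing the three methods shown earlier. The defaults simply mirror the values used in the examples above.

```python
def run_maintenance(
    ice,
    table_name,
    snapshot_days=7,
    retain_last=1,
    orphan_days=3,
    target_file_size_mb=512,
):
    """Run the maintenance steps in the order listed in this guide.

    `ice` is assumed to be a configured IceFrame instance; this helper
    is an illustrative wrapper, not an IceFrame API.
    """
    # Step 1: expire old snapshots, always keeping the most recent ones.
    ice.expire_snapshots(
        table_name, older_than_days=snapshot_days, retain_last=retain_last
    )

    # Step 2: remove files that no snapshot references any more.
    ice.remove_orphan_files(table_name, older_than_days=orphan_days)

    # Step 3: compact small data files toward the target size.
    ice.compact_data_files(table_name, target_file_size_mb=target_file_size_mb)
```

A scheduled job (for example, an Airflow task) could then call `run_maintenance(ice, "my_table")` once per run. If any step raises, the later steps are skipped, which keeps a failed run from compacting a table whose cleanup did not finish.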