Add support for Iceberg tables #207

@nwheeler81

Is your feature request related to a problem? Please describe.
This issue specifies the requirements for migrating CQLReplicator's snapshot storage from Parquet-based head/tail files to Apache Iceberg tables with built-in snapshot versioning. The current approach keeps separate Parquet directories (tile_N.head and tile_N.tail) in S3 and relies on Spark DataFrame joins (leftanti, inner) to detect inserts, updates, and deletes between snapshots. The migration replaces this with a single Iceberg table per source table and uses Iceberg's time-travel and incremental-read capabilities for change detection. The replication target remains Amazon Keyspaces.
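
To make the join-based change detection concrete, here is a minimal sketch in plain Python rather than Spark. It is not CQLReplicator's code: the function name `diff_snapshots`, the `SnapshotDiff` type, and the use of an integer "version" per key (standing in for something like a write timestamp) are all assumptions for illustration. A snapshot is modeled as a dict from primary key to version; the left-anti joins become key-membership checks and the inner join becomes a comparison on shared keys.

```python
from typing import Dict, List, NamedTuple, Tuple

# Hypothetical model: a primary key is a tuple of column values.
Key = Tuple[str, ...]


class SnapshotDiff(NamedTuple):
    inserts: List[Key]
    updates: List[Key]
    deletes: List[Key]


def diff_snapshots(previous: Dict[Key, int], current: Dict[Key, int]) -> SnapshotDiff:
    """Compare two snapshots keyed by primary key (assumed data model)."""
    # Equivalent of a left-anti join of current against previous:
    # keys present only in the current snapshot are inserts.
    inserts = sorted(k for k in current if k not in previous)
    # Left-anti join in the other direction:
    # keys present only in the previous snapshot are deletes.
    deletes = sorted(k for k in previous if k not in current)
    # Equivalent of an inner join on the key, keeping rows whose
    # version value changed between the two snapshots.
    updates = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return SnapshotDiff(inserts, updates, deletes)


# Illustrative data only; keys and versions are made up.
previous = {("a",): 100, ("b",): 100, ("c",): 100}
current = {("a",): 100, ("b",): 250, ("d",): 300}

result = diff_snapshots(previous, current)
print(result)
```

With the sample data, key `("d",)` is reported as an insert, `("b",)` as an update, and `("c",)` as a delete. The proposed migration would obtain the same three sets from snapshots of a single Iceberg table instead of from two separately stored Parquet directories.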

Labels

bug, documentation, enhancement
