IceFrame makes it easy to create Apache Iceberg tables with various schema formats.
ice.create_table("my_table", schema)

You can define schemas in several ways:
The simplest form is a dictionary that maps column names to type names.
schema = {
    "id": "long",
    "name": "string",
    "price": "double",
    "active": "boolean",
    "created_at": "timestamp",
    "birth_date": "date"
}
ice.create_table("products", schema)

Supported types: string, int, long, float, double, boolean, timestamp, date.
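Because the dictionary form only accepts the type names listed above, it can help to check a schema before calling create_table. The following is a minimal sketch using only the standard library. The helper `find_unsupported_columns` is hypothetical and not part of IceFrame, and it assumes type names must match the listed lowercase spellings exactly.

```python
# Type names copied from the "Supported types" list above.
SUPPORTED_TYPES = frozenset(
    {"string", "int", "long", "float", "double", "boolean", "timestamp", "date"}
)


def find_unsupported_columns(schema: dict[str, str]) -> dict[str, str]:
    """Return the columns whose type name is not in SUPPORTED_TYPES.

    Hypothetical helper for illustration; IceFrame may validate differently.
    """
    return {
        column: type_name
        for column, type_name in schema.items()
        if type_name not in SUPPORTED_TYPES
    }


candidate = {
    "id": "long",
    "name": "string",
    "price": "decimal",  # not in the supported list
}

bad = find_unsupported_columns(candidate)
if bad:
    print(f"Unsupported column types: {bad}")
```

Columns that need a type outside this list are probably better expressed with a PyArrow schema.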
For more control over types and nullability, pass a PyArrow schema.
import pyarrow as pa

schema = pa.schema([
    pa.field("id", pa.int64(), nullable=False),
    pa.field("name", pa.string()),
    pa.field("tags", pa.list_(pa.string()))
])
ice.create_table("users", schema)

Infer schema from an existing DataFrame.
import polars as pl

df = pl.DataFrame({"id": [1], "name": ["test"]})
ice.create_table("inferred_table", df)

You can specify namespaces (databases/schemas) in the table name:
# Creates table 'sales' in 'marketing' namespace
ice.create_table("marketing.sales", schema)

If the namespace doesn't exist, IceFrame will attempt to create it.
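The dotted naming convention can be sketched as a split on the last dot: everything before it is the namespace, and the final segment is the table name. The helper `split_table_name` below is hypothetical and only illustrates the convention. How IceFrame actually parses names, and which namespace it uses when none is given, are not specified here.

```python
def split_table_name(full_name: str) -> tuple[tuple[str, ...], str]:
    """Split a dotted name into (namespace parts, table name).

    Hypothetical helper for illustration; IceFrame's own parsing may differ.
    """
    # The last segment is the table; any preceding segments form the namespace.
    *namespace, table = full_name.split(".")
    return tuple(namespace), table


namespace, table = split_table_name("marketing.sales")
print(namespace, table)
```

A name without a dot yields an empty namespace tuple, which a caller would need to map to whatever default namespace the catalog uses.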
Partition data for better query performance.
# Not yet fully exposed in high-level API, use underlying PyIceberg table object
# or pass partition_spec to create_table (requires PyIceberg PartitionSpec object)

Set table properties like compression codec.
properties = {
    "write.parquet.compression-codec": "zstd"
}
ice.create_table("optimized_table", schema, properties=properties)