Creating Tables

IceFrame makes it easy to create Apache Iceberg tables with various schema formats.

Basic Usage

ice.create_table("my_table", schema)

Schema Formats

You can define schemas in several ways:

Dictionary Schema

Simple key-value pairs of column names and types.

schema = {
    "id": "long",
    "name": "string",
    "price": "double",
    "active": "boolean",
    "created_at": "timestamp",
    "birth_date": "date"
}
ice.create_table("products", schema)

Supported types: string, int, long, float, double, boolean, timestamp, date.

PyArrow Schema

For more control over types and nullability.

import pyarrow as pa

schema = pa.schema([
    pa.field("id", pa.int64(), nullable=False),
    pa.field("name", pa.string()),
    pa.field("tags", pa.list_(pa.string()))
])
ice.create_table("users", schema)

Polars DataFrame

Infer schema from an existing DataFrame.

import polars as pl

df = pl.DataFrame({"id": [1], "name": ["test"]})
ice.create_table("inferred_table", df)

Namespaces

You can specify namespaces (databases/schemas) in the table name:

# Creates table 'sales' in 'marketing' namespace
ice.create_table("marketing.sales", schema)

If the namespace doesn't exist, IceFrame will attempt to create it.

Advanced Options

Partitioning

Partition data for better query performance.

# Not yet fully exposed in high-level API, use underlying PyIceberg table object
# or pass partition_spec to create_table (requires PyIceberg PartitionSpec object)

Table Properties

Set table properties like compression codec.

properties = {
    "write.parquet.compression-codec": "zstd"
}
ice.create_table("optimized_table", schema, properties=properties)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Creating Tables

Basic Usage

Schema Formats

Dictionary Schema

PyArrow Schema

Polars DataFrame

Namespaces

Advanced Options

Partitioning

Table Properties

FilesExpand file tree

creating_tables.md

Latest commit

History

creating_tables.md

File metadata and controls

Creating Tables

Basic Usage

Schema Formats

Dictionary Schema

PyArrow Schema

Polars DataFrame

Namespaces

Advanced Options

Partitioning

Table Properties