Commit c6d63f4

Merge pull request #21 from patterninc/copilot/sub-pr-20
Fix `publish_pandas` docstring to match actual signature and behavior
2 parents 0764094 + 98895ce commit c6d63f4

1 file changed

Lines changed: 6 additions & 6 deletions

src/ds_platform_utils/metaflow/pandas.py

Lines changed: 6 additions & 6 deletions
@@ -56,8 +56,8 @@ def publish_pandas( # noqa: PLR0913 (too many arguments)
 :param add_created_date: When true, will add a column called `created_date` to the DataFrame with the current
 timestamp in UTC.
 
-:param chunk_size: Number of rows to be inserted once. If not provided, all rows will be dumped once.
-Default to None normally, 100,000 if inside a stored procedure.
+:param chunk_size: Number of rows to be inserted once. If not provided, the chunk size will be
+automatically estimated based on the DataFrame's memory usage.
 
 :param compression: The compression used on the Parquet files: gzip or snappy.
 Gzip gives supposedly a better compression, while snappy is faster. Use whichever is more appropriate.
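
Review note on `chunk_size`: the diff says the chunk size is estimated from the DataFrame's memory usage but does not show how. The following is a minimal sketch of one way such an estimate could work, not the implementation in `pandas.py`. The helper name `estimate_chunk_size` and the 64 MiB target are assumptions made for illustration; with pandas, the `total_bytes` input could come from `int(df.memory_usage(deep=True).sum())`.

```python
import math


def estimate_chunk_size(
    total_bytes: int,
    n_rows: int,
    target_chunk_bytes: int = 64 * 1024 * 1024,  # assumed target, not from the source
) -> int:
    """Hypothetical helper: pick a row count so each chunk is about target_chunk_bytes."""
    if n_rows <= 0:
        # An empty frame still needs a valid, positive chunk size.
        return 1
    # Average in-memory size of one row, floored at 1 byte to avoid dividing by zero.
    bytes_per_row = max(total_bytes / n_rows, 1.0)
    rows = math.floor(target_chunk_bytes / bytes_per_row)
    # Never return fewer than 1 row or more rows than the frame holds.
    return max(1, min(n_rows, rows))


# A 1 GB frame with 1,000,000 rows and a 100 MB target gives 100,000 rows per chunk.
print(estimate_chunk_size(1_000_000_000, 1_000_000, 100_000_000))
```

Clamping to `n_rows` means a small DataFrame is sent as a single chunk, which matches the intent of an automatic default.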
@@ -69,9 +69,9 @@ def publish_pandas( # noqa: PLR0913 (too many arguments)
 
 :param parallel: Number of threads to be used when uploading chunks. See details at parallel parameter.
 
-:param quote_identifiers: By default, identifiers, specifically database, schema, table and column names
-(from df.columns) will be quoted. If set to False, identifiers are passed on to Snowflake without quoting.
-I.e. identifiers will be coerced to uppercase by Snowflake. (Default value = True)
+:param quote_identifiers: If set to True, identifiers, specifically database, schema, table and column names
+(from df.columns) will be quoted. If set to False (default), identifiers are passed on to Snowflake without
+quoting, i.e. identifiers will be coerced to uppercase by Snowflake.
 
 :param auto_create_table: When true, will automatically create a table with corresponding columns for each column in
 the passed in DataFrame. The table will not be created if it already exists.
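
Review note on `quote_identifiers`: the practical effect of the corrected default is easiest to see on a lowercase column name. The sketch below only illustrates the quoting rule the docstring describes; `render_identifier` and `resolved_name` are hypothetical helpers, not functions from this repository or from the Snowflake connector.

```python
def render_identifier(name: str, quote_identifiers: bool) -> str:
    """Hypothetical helper: the identifier text as it would appear in SQL."""
    if quote_identifiers:
        # Quoted identifiers keep their case; embedded double quotes are doubled.
        return '"' + name.replace('"', '""') + '"'
    return name


def resolved_name(name: str, quote_identifiers: bool) -> str:
    """Hypothetical helper: the name Snowflake ends up storing."""
    # Unquoted identifiers are coerced to uppercase by Snowflake.
    return name if quote_identifiers else name.upper()


for flag in (True, False):
    print(flag, render_identifier("my_col", flag), resolved_name("my_col", flag))
```

With `quote_identifiers=False` (the default per this fix), `my_col` resolves to `MY_COL`; with `True` it stays `my_col` and must be quoted in later queries.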
@@ -87,7 +87,7 @@ def publish_pandas( # noqa: PLR0913 (too many arguments)
 
 :param use_s3_stage: Whether to use the S3 stage method to publish the DataFrame, which is more efficient for large DataFrames.
 
-:param table_schema: Optional list of tuples specifying the column names and types for the Snowflake table.
+:param table_definition: Optional list of tuples specifying the column names and types for the Snowflake table.
 This is only used when `use_s3_stage` is True, and is required in that case. The list should be in the format: `[(col_name1, col_type1), (col_name2, col_type2), ...]`, where `col_type` is a valid Snowflake data type (e.g., 'STRING', 'NUMBER', 'TIMESTAMP_NTZ', etc.).
 """
 if not isinstance(df, pd.DataFrame):
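
Review note on `table_definition`: the renamed parameter takes a list of `(col_name, col_type)` tuples. As a rough illustration of that shape, the sketch below builds such a list and renders it as a column clause. `render_column_defs` is a hypothetical helper for illustration only; how `publish_pandas` actually consumes the list is not shown in this diff.

```python
from typing import List, Tuple


def render_column_defs(table_definition: List[Tuple[str, str]]) -> str:
    """Hypothetical helper: join (name, type) pairs into a column clause."""
    if not table_definition:
        raise ValueError("table_definition must contain at least one (name, type) pair")
    return ", ".join(f"{name} {col_type}" for name, col_type in table_definition)


# Example value in the format the docstring describes; the column names are made up.
table_definition = [
    ("id", "NUMBER"),
    ("name", "STRING"),
    ("created_date", "TIMESTAMP_NTZ"),
]
print(render_column_defs(table_definition))
```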
