DONT-MERGE GSOC26: feat: export from_pyarrow as top-level function and add docstring example #474

Draft
tanisha-raha wants to merge 3 commits into lincc-frameworks:main from tanisha-raha:fix/export-from-pyarrow

Conversation

@tanisha-raha

Fixes #432

Changes

  • Added from_pyarrow to the top-level __init__.py imports so it can be used as npd.from_pyarrow()
  • Added from_pyarrow to __all__
  • Added a docstring example to from_pyarrow in io.py
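The first two changes follow a common Python packaging pattern: re-export a function from a submodule in the package's `__init__.py`, then list it in `__all__`. The sketch below imitates that pattern with in-memory modules so it runs without nested-pandas installed. The names `demo_pkg`, `demo_pkg.io`, and `internal_util` are hypothetical, and the `from_pyarrow` here is only a stand-in, not the library's implementation.

```python
import sys
import types


def from_pyarrow(table):
    """Hypothetical stand-in for the real I/O function."""
    return ("wrapped", table)


def internal_util():
    """A public-looking helper that is deliberately left out of __all__."""
    return None


# Stand-in for the submodule that defines the function (like nestedframe/io.py).
io_mod = types.ModuleType("demo_pkg.io")
io_mod.from_pyarrow = from_pyarrow

# Stand-in for the package's __init__.py.
pkg = types.ModuleType("demo_pkg")
pkg.io = io_mod
pkg.from_pyarrow = io_mod.from_pyarrow  # same effect as `from .io import from_pyarrow`
pkg.internal_util = internal_util
pkg.__all__ = ["from_pyarrow"]          # controls what `import *` exposes

# Register the fake modules so normal import statements can find them.
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.io"] = io_mod

import demo_pkg

# Top-level access now works, and it is the same object as the submodule's.
top_level_result = demo_pkg.from_pyarrow("table")

# A star-import only picks up the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))
```

In this sketch `star_imported` contains only `from_pyarrow`, while `internal_util` stays reachable as `demo_pkg.internal_util` but is not part of the advertised public surface.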

Testing

All 412 existing tests pass with no failures.

@delucchi-cmu
Contributor

Is this PR intended as a submission for Google Summer of Code? If so, see relevant notes from our guidelines:

Title your pull request "DONT-MERGE GSOC26: <title>", make it "draft".

@tanisha-raha tanisha-raha changed the title feat: export from_pyarrow as top-level function and add docstring example DONT-MERGE GSOC26: feat: export from_pyarrow as top-level function and add docstring example Mar 24, 2026
@tanisha-raha tanisha-raha marked this pull request as draft March 24, 2026 15:33
@tanisha-raha
Author

Thank you for the guidance! I've renamed the PR and converted it to draft as per the GSoC guidelines. This is indeed intended as a GSoC 2026 contribution.

@hombit hombit added the GSOC26: WIP In-progress PRs for Google Summer of Code 2026 applicants label Mar 30, 2026
Comment thread src/nested_pandas/nestedframe/io.py Outdated
>>> import nested_pandas as npd
>>> import pyarrow as pa
>>> table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
>>> nf = npd.from_pyarrow(table)
Contributor

Suggested change
- >>> nf = npd.from_pyarrow(table)
+ >>> npd.from_pyarrow(table)

The docstring should show the output of the interesting call.

Author

Done! Removed nf = and added the expected output to the docstring.
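The reasoning behind this suggestion can be sketched with the standard-library `doctest` module: an assignment line produces no output, so a docstring example written that way shows the reader nothing and gives the doctest runner nothing to compare, whereas a bare call displays the repr and has it checked. The function `load_table` below is a hypothetical stand-in for the documented call, not part of nested-pandas.

```python
import doctest


def load_table():
    """Return a small mapping (hypothetical stand-in for the documented call).

    >>> result = load_table()
    >>> load_table()
    {'a': [1, 2, 3]}
    """
    return {"a": [1, 2, 3]}


parser = doctest.DocTestParser()

# Each example pairs a source line with the output doctest expects ("want").
examples = parser.get_examples(load_table.__doc__)
expected_outputs = [example.want for example in examples]

# Run the docstring examples to confirm that they pass as written.
test = parser.get_doctest(
    load_table.__doc__, {"load_table": load_table}, "load_table", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

Here the first entry of `expected_outputs` is an empty string, because the assignment form has nothing to show, and only the second example carries a checked output.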

Comment thread src/nested_pandas/nestedframe/io.py Outdated
--------
>>> import nested_pandas as npd
>>> import pyarrow as pa
>>> table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
Contributor

Use test data that contains some nested fields, to better showcase the feature.

Author

Done. Updated the example to use nested astronomical data with obj_id and nested time/flux fields.

  from ._version import __version__
  from .nestedframe import NestedFrame
- from .nestedframe.io import read_parquet
+ from .nestedframe.io import read_parquet, from_pyarrow
Contributor

This should also be exposed in the I/O section of the API Reference in the nested-pandas documentation:

docs/reference/nestedframe.rst

Author

Added from_pyarrow to the I/O section in docs/reference/nestedframe.rst

@codecov

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.03%. Comparing base (509562f) to head (e015b1d).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #474   +/-   ##
=======================================
  Coverage   96.03%   96.03%           
=======================================
  Files          20       20           
  Lines        2247     2247           
=======================================
  Hits         2158     2158           
  Misses         89       89           


@github-actions

Before [509562f]  After [81e92ab]  Ratio  Benchmark (Parameter)
598±200ms         712±100ms        ~1.19  benchmarks.ReadFewColumnsHTTPS.time_run
665±20ms          876±100ms        1.32   benchmarks.ReadFewColumnsS3.time_run
44.5±0.2ms        45.9±0.7ms       1.03   benchmarks.ReassignHalfOfNestedSeries.time_run
257M              259M             1.01   benchmarks.AssignSingleDfToNestedSeries.peakmem_run
27.9±1ms          28.3±1ms         1.01   benchmarks.AssignSingleDfToNestedSeries.time_run
65.2±1ms          65.9±2ms         1.01   benchmarks.CountNestedBy.time_run
1.18G             1.19G            1.01   benchmarks.ReadFewColumnsS3.peakmem_run
137M              137M             1.00   benchmarks.CountNestedBy.peakmem_run
105M              105M             1.00   benchmarks.NestedFrameAddNested.peakmem_run
110M              110M             1.00   benchmarks.NestedFrameQuery.peakmem_run



Development

Successfully merging this pull request may close these issues.

Properly seat from_pyarrow as an I/O function

3 participants