Skip to content

ENH: skip redundant station coord reads for subsequent sweeps in DataTree#337

Merged
kmuehlbauer merged 33 commits intoopenradar:mainfrom
aladinor:feat/skip-redundant-station-coords-clean
Mar 25, 2026
Merged

ENH: skip redundant station coord reads for subsequent sweeps in DataTree#337
kmuehlbauer merged 33 commits intoopenradar:mainfrom
aladinor:feat/skip-redundant-station-coords-clean

Conversation

@aladinor
Copy link
Copy Markdown
Member

@aladinor aladinor commented Feb 27, 2026

Summary

Move station coordinates (latitude, longitude, altitude) to the root node as coordinates and drop them entirely from sweep nodes. This ensures the root is the single authoritative source for station location in a DataTree.

  • Rename boolean kwarg from site_coords to site_as_coords for clarity (per review)
  • Extract shared _apply_site_as_coords() helper into common.py, replacing duplicated if/else blocks across all backends
  • Refactor _assign_root() to return (root, cleaned_sweeps) — no hidden side effect
  • Drop station vars in _attach_sweep_groups() for backends using that path
  • Fix read-only array bug in _ipol_time using da.where instead of in-place mutation

Closes #334

Dependency

Do not merge until pydata/xarray#11230 is merged.

Xarray's DataTree currently does not inherit non-index coordinates from parent nodes. The xarray PR adds inherit="all" support, which will allow sweeps to access latitude, longitude, and altitude from the root via to_dataset(inherit="all").

Follow-up PR (after xarray PR merges)

Update the georeferencing module to use to_dataset(inherit="all") so ds.latitude, ds.longitude, ds.altitude are accessible on sweep datasets:

  • xradar/georeference/projection.py:129 — uses ds.latitude.values and ds.longitude.values
  • xradar/georeference/transforms.py:255 — uses ds.latitude.values and ds.altitude.values

Changes

  • xradar/io/backends/common.py: Add _apply_site_as_coords() helper; refactor _assign_root() to return cleaned sweeps; drop station vars in _attach_sweep_groups()
  • All backend open_dataset methods: Use _apply_site_as_coords(ds, site_as_coords)
  • All open_*_datatree() functions: Pass site_as_coords=False for sweep datasets
  • xradar/io/backends/cfradial1.py: Always assign station vars to sweeps before calling _apply_site_as_coords
  • xradar/io/backends/nexrad_level2.py, xradar/io/backends/uf.py: Use zip() to build sweep entries from cleaned ls_ds
  • xradar/util.py: Fix read-only array in _ipol_time with da.where
  • docs/history.md: Add changelog entry

Test plan

  • All 440 tests pass
  • TestStationCoordsOnRoot verifies root has station coords as coordinates and sweeps have none (ODIM, NEXRAD, GAMIC, CfRadial1)
  • Test data_vars counts updated to reflect 3 fewer vars per sweep
  • ruff check and black --check pass clean

…y_site_coords helper

Pass site_coords=False for all sweeps when building a DataTree so that
latitude/longitude/altitude are only placed as coordinates on the root
node. Sweeps keep them as plain data variables, avoiding redundant I/O
and coordinate promotion. Extracts the repeated if/else site_coords
logic into a shared _apply_site_coords() helper in common.py, used by
all 10 applicable backends.

Closes openradar#334
CfRadial1's conform_cfradial2_sweep_group strips station variables
(latitude, longitude, altitude) from sweep datasets. Previously these
were only re-added when site_coords=True. With the DataTree optimization
(site_coords=False), sweeps had no station vars at all, breaking
georeference functions that access ds.longitude on sweep datasets.

Now station vars are always assigned to sweeps, and _apply_site_coords
controls whether they are promoted to coordinates or kept as data vars.

Also adds a CfRadial1 test to TestStationCoordsOnRoot.
@aladinor aladinor force-pushed the feat/skip-redundant-station-coords-clean branch from 038f59a to 1eebc0e Compare February 27, 2026 11:45
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 98.24561% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 93.51%. Comparing base (e59d9e6) to head (51a6fe7).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
xradar/io/backends/common.py 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #337      +/-   ##
==========================================
- Coverage   93.52%   93.51%   -0.02%     
==========================================
  Files          27       27              
  Lines        5717     5721       +4     
==========================================
+ Hits         5347     5350       +3     
- Misses        370      371       +1     
Flag Coverage Δ
notebooktests 0.00% <0.00%> (ø)
unittests 93.51% <98.24%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@aladinor aladinor requested a review from kmuehlbauer March 2, 2026 00:22
@aladinor aladinor self-assigned this Mar 2, 2026
Comment thread xradar/io/backends/common.py Outdated
Per reviewer feedback from @kmuehlbauer, rename the parameter
across all backends, tests, and documentation for explicitness.
@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@aladinor aladinor requested a review from kmuehlbauer March 2, 2026 17:07
Copy link
Copy Markdown
Collaborator

@kmuehlbauer kmuehlbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are occasions of site_coords replaced which are related to actual site_coords tuple, not the boolean kwarg.

Please revert these part.

Comment thread tests/io/test_furuno.py Outdated
Comment thread tests/io/test_furuno.py Outdated
Comment thread tests/io/test_io.py Outdated
Comment thread tests/io/test_io.py Outdated
Comment thread tests/io/test_io.py Outdated
Comment thread xradar/io/backends/iris.py Outdated
Comment thread xradar/io/backends/odim.py Outdated
Comment thread xradar/io/backends/odim.py Outdated
Comment thread xradar/io/backends/rainbow.py Outdated
Comment thread xradar/io/backends/rainbow.py Outdated
The site_as_coords rename should only apply to the boolean kwarg,
not to the site_coords property that returns the (lon, lat, alt) tuple.
NumPy/xarray may return non-writable arrays from .values and .loc,
causing ValueError when _ipol_time tries to modify them in-place.
@aladinor
Copy link
Copy Markdown
Member Author

Hi @kmuehlbauer, sorry about that. I overlooked the tuple property vs. the boolean kwarg distinction during the find & replace. I've reverted all the site_coords property/tuple references back to their original names. Only the boolean kwarg uses site_as_coords now.

@aladinor aladinor requested a review from kmuehlbauer March 13, 2026 02:52
@aladinor aladinor marked this pull request as draft March 13, 2026 04:10
@aladinor aladinor marked this pull request as ready for review March 13, 2026 04:14
@aladinor aladinor marked this pull request as draft March 13, 2026 04:36
Comment thread xradar/util.py Outdated
Replace da.copy() + in-place mutation with da.where(da.notnull(), interpolated)
to avoid full copy of the DataArray per review suggestion.
…vars

- _assign_root returns (root, cleaned_sweeps) instead of mutating input list
- _attach_sweep_groups drops station vars before attaching to DataTree
- Use drop_vars(_STATION_VARS, errors="ignore") for idiomatic xarray
Comment thread xradar/util.py Outdated
@kmuehlbauer
Copy link
Copy Markdown
Collaborator

@aladinor It would make sense, to get this in before #332. Could you spare some time to finish this up?

- Use to_dataset(inherit="all_coords") in georef tree functions so
  sweeps inherit lat/lon/alt from root
- Pin xarray to git main (unreleased inherit support)
- Bump requires-python to >=3.11 (matches xarray dev)
- Add explicit setuptools package discovery
Sweep datasets no longer carry lat/lon/alt directly — they must
inherit them from the root via to_dataset(inherit="all_coords").
@aladinor aladinor requested a review from kmuehlbauer March 25, 2026 18:35
@aladinor aladinor marked this pull request as ready for review March 25, 2026 18:36
Station coords now live on the root node only. Sweep datasets must
use to_dataset(inherit="all_coords") to access lat/lon/alt.
Station coords are now scalar on root and inherited by sweeps.
fix_sitecoords skips coords that are missing or already scalar.
@kmuehlbauer
Copy link
Copy Markdown
Collaborator

@aladinor You'd need to have the pin of xarray in environment.yml, too. It's used in RTD, without the docs build will always fail, or do I miss something?

Pin xarray to git main in environment.yml (RTD), ci/unittests.yml,
and ci/notebooktests.yml so inherit='all_coords' is available
everywhere.
@aladinor
Copy link
Copy Markdown
Member Author

@kmuehlbauer you are right. I was trying to find the root of the failures. I've added the xarray git pin to environment.yml so RTD picks up inherit='all_coords'. Also pinned it in ci/unittests.yml and ci/notebooktests.yml via the pip section in the conda env files. These pins are temporary until xarray cuts a release

The @map_over_sweeps decorator previously used xr.map_over_datasets
which passes DatasetView without inherited coordinates. Now it uses
apply_to_sweeps which calls to_dataset(inherit="all_coords") so
sweep datasets include lat/lon/alt from root.

Also update Assign_GeoCoords notebook to fix per-ray station coords
on root for mobile radar (DOW) before georeferencing.
@kmuehlbauer
Copy link
Copy Markdown
Collaborator

@aladinor Green lights. Great work! Shall I go ahead and merge?

@aladinor
Copy link
Copy Markdown
Member Author

aladinor commented Mar 25, 2026

yeap. we are ready to go. Still dont have superpowers to merge :)

@kmuehlbauer kmuehlbauer merged commit 5133922 into openradar:main Mar 25, 2026
11 checks passed
@aladinor aladinor deleted the feat/skip-redundant-station-coords-clean branch March 25, 2026 21:25
aladinor added a commit to aladinor/xradar that referenced this pull request Mar 27, 2026
Pop `inherit` from kwargs before forwarding to func, defaulting to
"all_coords" for backward compatibility. This lets callers control
whether root coords (lat/lon/alt) are inherited onto sweep datasets.

Using inherit=False prevents station coords from leaking onto sweeps
after map_over_sweeps, preserving the root-only storage from PR openradar#337.

Closes openradar#343
aladinor added a commit that referenced this pull request Mar 27, 2026
* FIX: allow passing inherit parameter to apply_to_sweeps

Pop `inherit` from kwargs before forwarding to func, defaulting to
"all_coords" for backward compatibility. This lets callers control
whether root coords (lat/lon/alt) are inherited onto sweep datasets.

Using inherit=False prevents station coords from leaking onto sweeps
after map_over_sweeps, preserving the root-only storage from PR #337.

Closes #343

* FIX: address PR review — document inherit values, use fixture

- Document accepted inherit values with xarray cross-reference
- Add cfradial1_sgp_dtree session fixture to conftest.py
- Use fixture in new inherit tests instead of inline fetch+open

* DOC: add PR #344 to changelog
@aladinor aladinor mentioned this pull request Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

ENH: skip redundant station coord reads for subsequent sweeps in DataTree

2 participants