ENH: skip redundant station coord reads for subsequent sweeps in DataTree#337
Conversation
…y_site_coords helper Pass site_coords=False for all sweeps when building a DataTree so that latitude/longitude/altitude are only placed as coordinates on the root node. Sweeps keep them as plain data variables, avoiding redundant I/O and coordinate promotion. Extracts the repeated if/else site_coords logic into a shared _apply_site_coords() helper in common.py, used by all 10 applicable backends. Closes openradar#334
CfRadial1's conform_cfradial2_sweep_group strips station variables (latitude, longitude, altitude) from sweep datasets. Previously these were only re-added when site_coords=True. With the DataTree optimization (site_coords=False), sweeps had no station vars at all, breaking georeference functions that access ds.longitude on sweep datasets. Now station vars are always assigned to sweeps, and _apply_site_coords controls whether they are promoted to coordinates or kept as data vars. Also adds a CfRadial1 test to TestStationCoordsOnRoot.
038f59a to
1eebc0e
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #337 +/- ##
==========================================
- Coverage 93.52% 93.51% -0.02%
==========================================
Files 27 27
Lines 5717 5721 +4
==========================================
+ Hits 5347 5350 +3
- Misses 370 371 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Per reviewer feedback from @kmuehlbauer, rename the parameter across all backends, tests, and documentation for explicitness.
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
The site_as_coords rename should only apply to the boolean kwarg, not to the site_coords property that returns the (lon, lat, alt) tuple.
NumPy/xarray may return non-writable arrays from .values and .loc, causing ValueError when _ipol_time tries to modify them in-place.
|
Hi @kmuehlbauer, sorry about that. I overlooked the tuple property vs. the boolean kwarg distinction during the find & replace. I've reverted all the site_coords property/tuple references back to their original names. Only the boolean kwarg uses site_as_coords now. |
Replace da.copy() + in-place mutation with da.where(da.notnull(), interpolated) to avoid full copy of the DataArray per review suggestion.
…vars - _assign_root returns (root, cleaned_sweeps) instead of mutating input list - _attach_sweep_groups drops station vars before attaching to DataTree - Use drop_vars(_STATION_VARS, errors="ignore") for idiomatic xarray
- Use to_dataset(inherit="all_coords") in georef tree functions so sweeps inherit lat/lon/alt from root - Pin xarray to git main (unreleased inherit support) - Bump requires-python to >=3.11 (matches xarray dev) - Add explicit setuptools package discovery
Sweep datasets no longer carry lat/lon/alt directly — they must inherit them from the root via to_dataset(inherit="all_coords").
Station coords now live on the root node only. Sweep datasets must use to_dataset(inherit="all_coords") to access lat/lon/alt.
Station coords are now scalar on root and inherited by sweeps. fix_sitecoords skips coords that are missing or already scalar.
|
@aladinor You'd need to have the pin of xarray in environment.yml, too. It's used in RTD, without the docs build will always fail, or do I miss something? |
Pin xarray to git main in environment.yml (RTD), ci/unittests.yml, and ci/notebooktests.yml so inherit='all_coords' is available everywhere.
|
@kmuehlbauer you are right. I was trying to find the root of the failures. I've added the xarray git pin to environment.yml so RTD picks up inherit='all_coords'. Also pinned it in ci/unittests.yml and ci/notebooktests.yml via the pip section in the conda env files. These pins are temporary until xarray cuts a release |
The @map_over_sweeps decorator previously used xr.map_over_datasets which passes DatasetView without inherited coordinates. Now it uses apply_to_sweeps which calls to_dataset(inherit="all_coords") so sweep datasets include lat/lon/alt from root. Also update Assign_GeoCoords notebook to fix per-ray station coords on root for mobile radar (DOW) before georeferencing.
|
@aladinor Green lights. Great work! Shall I go ahead and merge? |
|
yeap. we are ready to go. Still dont have superpowers to merge :) |
Pop `inherit` from kwargs before forwarding to func, defaulting to "all_coords" for backward compatibility. This lets callers control whether root coords (lat/lon/alt) are inherited onto sweep datasets. Using inherit=False prevents station coords from leaking onto sweeps after map_over_sweeps, preserving the root-only storage from PR openradar#337. Closes openradar#343
* FIX: allow passing inherit parameter to apply_to_sweeps Pop `inherit` from kwargs before forwarding to func, defaulting to "all_coords" for backward compatibility. This lets callers control whether root coords (lat/lon/alt) are inherited onto sweep datasets. Using inherit=False prevents station coords from leaking onto sweeps after map_over_sweeps, preserving the root-only storage from PR #337. Closes #343 * FIX: address PR review — document inherit values, use fixture - Document accepted inherit values with xarray cross-reference - Add cfradial1_sgp_dtree session fixture to conftest.py - Use fixture in new inherit tests instead of inline fetch+open * DOC: add PR #344 to changelog
Summary
Move station coordinates (
latitude,longitude,altitude) to the root node as coordinates and drop them entirely from sweep nodes. This ensures the root is the single authoritative source for station location in a DataTree.site_coordstosite_as_coordsfor clarity (per review)_apply_site_as_coords()helper intocommon.py, replacing duplicated if/else blocks across all backends_assign_root()to return(root, cleaned_sweeps)— no hidden side effect_attach_sweep_groups()for backends using that path_ipol_timeusingda.whereinstead of in-place mutationCloses #334
Dependency
Xarray's
DataTreecurrently does not inherit non-index coordinates from parent nodes. The xarray PR addsinherit="all"support, which will allow sweeps to accesslatitude,longitude, andaltitudefrom the root viato_dataset(inherit="all").Follow-up PR (after xarray PR merges)
Update the georeferencing module to use
to_dataset(inherit="all")sods.latitude,ds.longitude,ds.altitudeare accessible on sweep datasets:xradar/georeference/projection.py:129— usesds.latitude.valuesandds.longitude.valuesxradar/georeference/transforms.py:255— usesds.latitude.valuesandds.altitude.valuesChanges
xradar/io/backends/common.py: Add_apply_site_as_coords()helper; refactor_assign_root()to return cleaned sweeps; drop station vars in_attach_sweep_groups()open_datasetmethods: Use_apply_site_as_coords(ds, site_as_coords)open_*_datatree()functions: Passsite_as_coords=Falsefor sweep datasetsxradar/io/backends/cfradial1.py: Always assign station vars to sweeps before calling_apply_site_as_coordsxradar/io/backends/nexrad_level2.py,xradar/io/backends/uf.py: Usezip()to build sweep entries from cleanedls_dsxradar/util.py: Fix read-only array in_ipol_timewithda.wheredocs/history.md: Add changelog entryTest plan
TestStationCoordsOnRootverifies root has station coords as coordinates and sweeps have none (ODIM, NEXRAD, GAMIC, CfRadial1)ruff checkandblack --checkpass clean