Skip to content

feat: split xml ipd#167

Open
seanmcculloch wants to merge 6 commits intosplit_dataset_r2rfrom
feat/squashed-split-xml-ipd
Open

feat: split xml ipd#167
seanmcculloch wants to merge 6 commits intosplit_dataset_r2rfrom
feat/squashed-split-xml-ipd

Conversation

@seanmcculloch
Copy link
Copy Markdown
Collaborator

No description provided.

* feat: split xml IP detection

- Add split tile shape support to overlap_detection.py
- Add split path construction and crop passthrough to metadata_builder.py
- Add crop slicing to image_reader.py
- Add fetch_local_xml utility function to pipelines/utils.py
- Update xml_to_dataframe.py for split XML support
- Add uv.lock to gitignore

Tests for this feature are in a separate PR.

* test: add split XML IP detection tests

- Add test_xml_to_dataframe tests for split XML parsing
- Add test_image_reader tests for crop slicing
- Add test_metadata_builder tests for split metadata handling
- Add dataset_split.xml test fixture

These tests verify the split XML support added in feat/split-xml-ipd.

* Initial plan

* Update Rhapso/data_prep/xml_to_dataframe.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Rhapso/data_prep/xml_to_dataframe.py

descriptive error with bad split xmls.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix channel parsing in parse_image_loader_split_zarr to use .ome.zarr suffix

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

* Update Rhapso/detection/metadata_builder.py

ceil crop_max when downsampling

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Rhapso/data_prep/xml_to_dataframe.py

descriptive error upon bad split xml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Initial plan

* Fix channel extraction, crop validation, path handling, and test signatures

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

* Move import to top level and use calculated level instead of hardcoded '0'

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

* fix: handling of multiscale levels

* fix: prevent skipping ip detection with 1 split tile only

* feat: add overlapping_only flag (default true) to ipd

* Fix non-split zarr path construction in overlap detection (#164)

* Initial plan

* Fix dim_other path to use proper path joining and include level

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for split XML format in interest point detection (IPD), allowing large images to be virtually subdivided into overlapping tiles for more efficient processing. The split format is used in BigDataViewer/BigStitcher workflows where a single source image is divided into multiple view setups with crop bounds.

Changes:

  • Implements split XML parsing that extracts tile definitions with crop bounds from nested ImageLoader structures
  • Adds crop bound handling throughout the detection pipeline (overlap detection, metadata building, image reading)
  • Introduces an overlapping_only parameter to control whether to process only overlapping regions (stitching) or full volumes (registration)
  • Adds comprehensive test coverage for split mode functionality

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/test_detection/test_overlap_detection.py Resolves merge conflict markers, keeps HEAD version of tests
tests/test_detection/test_metadata_builder.py New tests for split zarr path construction, crop bounds passing, and level-based scaling
tests/test_detection/test_image_reader.py New tests for crop application in zarr and tiff modes
tests/test_data_prep/test_xml_to_dataframe.py Updates test expectations for parse_view_interest_points, adds tests for split XML parsing
tests/XML_test_data/dataset_split.xml New test fixture with 4-tile split configuration
Rhapso/pipelines/utils.py New utility function for reading local XML files with proper error handling
Rhapso/pipelines/ray/interest_point_detection.py Adds overlapping_only parameter with default=True for backward compatibility
Rhapso/detection/overlap_detection.py Adds _split_tile_shape method, implements overlapping_only logic, uses crop bounds when available
Rhapso/detection/metadata_builder.py Implements split mode path construction using zarr_base_path, scales crop bounds by pyramid level
Rhapso/detection/image_reader.py Applies crop bounds after transpose and before downsampling
Rhapso/data_prep/xml_to_dataframe.py Implements parse_image_loader_split_zarr, updates parse_view_setups XPath to avoid inner ViewSetups
.gitignore Adds uv.lock to ignored files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +97 to +107
if crop_min is not None and crop_max is not None:
if len(crop_min) != 3 or len(crop_max) != 3:
raise ValueError(
f"crop_min and crop_max must both be length 3 for 3D cropping; "
f"got crop_min={crop_min}, crop_max={crop_max}"
)
dask_array = dask_array[
crop_min[0]:crop_max[0] + 1,
crop_min[1]:crop_max[1] + 1,
crop_min[2]:crop_max[2] + 1
]
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No validation that crop bounds are within the array dimensions. If crop_min or crop_max values exceed the array shape after transpose, Dask will raise an IndexError. Consider adding validation to provide a clearer error message, for example: check that crop_max[i] < dask_array.shape[i] for each dimension before applying the crop.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

seanmcculloch and others added 3 commits February 14, 2026 16:30
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI commented Feb 15, 2026

@seanmcculloch I've opened a new pull request, #168, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Use os.path.join for path construction in metadata_builder

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

* Use os.path.join consistently for all path construction

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI commented Feb 15, 2026

@seanmcculloch I've opened a new pull request, #169, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Add crop bounds validation with comprehensive tests

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

* Enhance test to verify complete error message with shape

Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: seanmcculloch <86432671+seanmcculloch@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants