Fix/edf d discontinuous detection#13583
Conversation
Detect and raise NotImplementedError when loading EDF+D or BDF+D files that contain actual gaps between data records, instead of silently treating them as continuous data. This prevents incorrect time alignment. Fixes mne-tools#13429
|
Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴 |
scott-huberty
left a comment
There was a problem hiding this comment.
Hey @Arnav1709 thanks for getting this started!
-
This PR adds about 400 LOC, and so more rigorous testing of the added code would be helpful to reviewers. From a quick check, it looks like only 5% of the LOC added in this PR are covered by tests . In MNE-Python we try to keep code coverage at 95% minimum.
-
This PR doesn't actually add a feature to read EDF+D files, rather it detects if such files contain acquisition gaps, and raises an error? It's a step in the right direction, but I would be interested in hearing from other devs wrt whether it is worth merging this check, or whether we should wait for a PR that actually implements reading EDF+D files.
-
Out of curiosity, why did you attribute commit 694fcaf to @larsoner ?
- Validate n_records matches len(record_times), raise ValueError on mismatch - Validate record_times are numeric (no strings or invalid types) - Check for non-finite values (NaN, inf) and raise ValueError - Warn and sort if record_times are not monotonically increasing - Add comprehensive tests for all validation scenarios
cf0ef6b to
dbec29c
Compare
|
Also, was this PR created by / with assistance from an LLM? Please disclose the nature of LLM usage |
Reference issue (if any)
Fixes #13429.
What does this implement/fix?
This PR addresses a critical bug where MNE-Python would silently load EDF+D (discontinuous) and BDF+D files with acquisition gaps between data records, treating them as continuous data. This could lead to:
RawEDF.timesarray would show continuous timestamps, ignoring gapsChanges made:
Detection of EDF+D/BDF+D files: Modified
_read_edf_header()to read and check the 44-byte reserved header field forEDF+DorBDF+Dmarkers.Gap detection: Added two helper functions:
_get_tal_record_times(): Extracts onset times of each data record from TAL (Time-stamped Annotation List) annotations_check_edf_discontinuity(): Compares expected vs. actual record timestamps to detect gapsError handling: Modified both
RawEDFandRawBDFclasses to check for discontinuities after reading TAL data. When gaps are detected, aNotImplementedErroris raised with:Backward compatibility: EDF+D/BDF+D files that are marked as discontinuous but have no actual gaps (e.g., from some Nihon Kohden systems) continue to load normally.
Documentation: Updated docstrings for
read_raw_edf()andread_raw_bdf()to document this limitation.Tests: Added comprehensive tests covering:
Additional information
This is a breaking change in behavior: files that previously loaded (incorrectly) will now raise an error. However, this is intentional as the previous behavior was incorrect and could lead to data analysis errors.
The error message provides actionable guidance, suggesting specialized tools (luna/lunapi) for handling discontinuous files or converting to continuous format if gaps are not significant.
The implementation follows the EDF+ specification for detecting discontinuous files and uses TAL annotations (which are required in EDF+ files) to detect actual gaps.
Related discussion in the GitHub issue suggests that full support for EDF+D would require filling gaps with NaN data (similar to how Eyelink handles gaps), which is a more complex feature that could be implemented in a future PR.