Fix: Rounding error in get_data with tmin/tmax #13635
git-gigi wants to merge 6 commits into mne-tools:main
Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴
)
_on_missing(on_empty, msg, error_klass=RuntimeError)

def _handle_tmin_tmax(self, tmin, tmax):
Adding a new private method to Epochs that has the same name as an existing utility function is not the right way to go about this. It duplicates code and introduces the possibility of Epochs behaving differently than Raw/Evoked, for example.
Thanks for the review @drammock.
I understand the concern about code duplication. I initially tried modifying the shared _handle_tmin_tmax in mixin.py, but adding use_rounding=True there caused regressions in Raw tests (shifting data by one sample on Windows environments due to float precision). That's why I attempted the override in Epochs.
The core issue linked (#13634) is that epochs.crop(tmin=t) includes a sample that epochs.get_data(tmin=t) excludes. Since crop uses rounding internally, users expect get_data to match that behavior for consistency.
If modifying the global mixin.py is risky for Raw backward compatibility, and overriding in Epochs is discouraged, do you have a suggestion on how to reconcile get_data with crop for Epochs specifically? Maybe passing a round_tmin argument to get_data?
> The core issue linked (#13634) is that epochs.crop(tmin=t) includes a sample that epochs.get_data(tmin=t) excludes. Since crop uses rounding internally, users expect get_data to match that behavior for consistency.
Yes, that was clear from the issue description.
> If modifying the global mixin.py is risky for Raw backward compatibility, and overriding in Epochs is discouraged, do you have a suggestion on how to reconcile get_data with crop for Epochs specifically? Maybe passing a round_tmin argument to get_data?
I suspect that the problem is not limited to Epochs, but also affects Raw and Evoked (and TFR... anything that inherits the mixin). I don't have a suggestion off the top of my head; as I said before, this will need some discussion. Changing how get_data() works (to accord with crop()), or vice versa, has potentially wide-reaching consequences. I know it's a single sample, but as you've seen, it's enough to break our tests, and it's also enough to change the results of users' existing analysis code, or even cause that code to crash if re-run. We don't take that lightly.
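As a rough illustration of the one-sample effect discussed here, the following is a sketch in plain Python, not MNE code. The `window` helper is hypothetical, and the 100 Hz sampling rate and the time values are assumptions chosen because they trigger the floating-point issue.

```python
def window(data, tmin, tmax, sfreq, use_rounding):
    """Hypothetical helper: slice `data` between tmin and tmax (seconds from sample 0)."""
    to_index = round if use_rounding else int  # int() truncates toward zero
    start = to_index(tmin * sfreq)
    stop = to_index(tmax * sfreq)
    return data[start:stop]


sfreq = 100.0  # assumed sampling rate in Hz
data = list(range(100))  # sample i sits at time i / sfreq

truncated = window(data, 0.29, 0.57, sfreq, use_rounding=False)
rounded = window(data, 0.29, 0.57, sfreq, use_rounding=True)

# Both windows have the same length, but one is shifted by a single sample.
print(truncated[0], rounded[0])
```

Any downstream code that compares values sample by sample would see different numbers under the two conventions, which is why a change of default can alter existing results.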
start = 0 if tmin is None else self.time_as_index(tmin, use_rounding=True)[0]
stop = (
    n_times if tmax is None else self.time_as_index(tmax, use_rounding=True)[0]
)
This will need some discussion. When use_rounding was introduced, it was determined that we shouldn't change the default:
mne/epochs.py (outdated)
# handle tmin/tmax as start and stop indices into data array
n_times = self.times.size
# QUI c'è la fix specifica per le Epochs
Code comments in English, please. Also, code comments should be more specific and useful than "here is the fix for Epochs". This one isn't really needed at all, IMO.
Sorry about that! I left a debug comment in by mistake. I will remove it in the next commit.
Thanks @drammock. I get why this is tricky regarding backward compatibility. I'll stop working on the code for now and wait for the discussion. Just let me know if you want me to check Raw or Evoked in the meantime - happy to help if needed :)
Description
This PR fixes an off-by-one inconsistency between epochs.crop() and epochs.get_data() caused by floating-point truncation when converting time to indices.

Analysis
Previously, _handle_tmin_tmax used the default behavior of time_as_index (which performs a floor operation/truncation). When passing a float time t (e.g., 0.77) that is represented internally as slightly less (e.g., 0.76999...), get_data(tmin=t) would return the sample at 0.76 instead of 0.77. crop() correctly uses use_rounding=True, leading to inconsistent results for the same input time.

Fix
- Added use_rounding=True to time_as_index calls inside mne/utils/mixin.py.
- Added a test in mne/tests/test_epochs.py reproducing the issue reported in "In crop(tmin) versus get_data(tmin), tmin has a different meaning" #13634.

Closes #13634
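The truncation described in the Analysis can be reproduced without MNE. The following is a minimal sketch, assuming a 100 Hz sampling rate and a first sample at t = 0; `time_to_index` is a hypothetical stand-in, not MNE's time_as_index, and 0.29 is used here only because its product with 100 is known to fall just below an integer.

```python
def time_to_index(t, sfreq, t0=0.0, use_rounding=False):
    """Hypothetical stand-in for a time-to-sample conversion (not MNE's code)."""
    idx = (t - t0) * sfreq
    # int() truncates toward zero; round() picks the nearest integer.
    return round(idx) if use_rounding else int(idx)


sfreq = 100.0  # assumed sampling rate in Hz
t = 0.29  # 0.29 * 100.0 evaluates to 28.999999999999996 in binary floating point

truncated_index = time_to_index(t, sfreq)
rounded_index = time_to_index(t, sfreq, use_rounding=True)
print(truncated_index, rounded_index)  # prints "28 29"
```

With truncation the requested time maps to the sample before the one the user named, while rounding recovers the intended sample, which is the mismatch between get_data() and crop() reported in #13634.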