
Accelerate collect() by switching to lazy loading #340

Merged
dschwoerer merged 3 commits into master from lazy-load-collect
Apr 15, 2026


@mikekryjak
Collaborator

This is another change that Claude came up with after #337. It rewrites xbout.load.collect() to use the new lazyload.lazy_open_boutdataset() from #336, and falls back on the original method if the lazy path fails. It also makes minor improvements to the code, such as no longer relying on the coordinates being in a specific order.
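For readers skimming the PR, here is a minimal sketch of the lazy-first, fall-back-on-failure pattern described above. It is not the code in this PR: the loader callables and `collect_with_fallback` are hypothetical stand-ins, and the real signatures of `lazy_open_boutdataset()` and `collect()` are not reproduced here.

```python
import warnings


def collect_with_fallback(varname, lazy_loader, eager_loader):
    """Try the fast lazy path first; fall back to the original path on failure.

    `lazy_loader` and `eager_loader` are hypothetical callables that take a
    variable name and return its data. They stand in for the lazy-loading
    route and the original collect() implementation respectively.
    """
    try:
        return lazy_loader(varname)
    except Exception as err:  # deliberately broad: any lazy-path failure triggers the fallback
        warnings.warn(
            f"Lazy loading of {varname!r} failed ({err}); using original method"
        )
        return eager_loader(varname)


# Stand-in loaders, for illustration only
def _lazy(varname):
    if varname == "broken":
        raise KeyError(varname)
    return f"lazy:{varname}"


def _eager(varname):
    return f"eager:{varname}"


fast = collect_with_fallback("n", _lazy, _eager)
slow = collect_with_fallback("broken", _lazy, _eager)
```

Emitting a warning on fallback is an assumption of this sketch; it keeps a silent slowdown from going unnoticed if the lazy path starts failing.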

Test results:

  • xbout.collect() before change: 24.1s
  • xbout.collect() after change: 0.12s (!!)
  • boutdata.collect(): 0.20s

So we are now about 200x faster than before, and also take 40% less time than boutdata.collect(). I suppose the difference could be due to using Dask.
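The ratios can be checked directly from the timings listed in the test results. This is only arithmetic on the three quoted numbers; the variable names are illustrative.

```python
# Timings quoted in the test results (seconds)
t_before = 24.1    # xbout.collect() before the change
t_after = 0.12     # xbout.collect() after the change
t_boutdata = 0.20  # boutdata.collect()

# Speedup expressed as a ratio of wall-clock times
speedup_vs_before = t_before / t_after
speedup_vs_boutdata = t_boutdata / t_after

# Fraction of wall-clock time saved relative to boutdata.collect()
time_saved_vs_boutdata = 1 - t_after / t_boutdata

print(f"{speedup_vs_before:.0f}x faster than before")
print(f"{speedup_vs_boutdata:.2f}x faster than boutdata.collect()")
print(f"{time_saved_vs_boutdata:.0%} less time than boutdata.collect()")
```

Note that "40%" refers to the reduction in wall-clock time; expressed as a ratio, the same comparison is roughly a 1.7x speedup.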

For completeness, the same dataset takes 2.5s to load in its entirety using the latest lazy loading.

This PR contains #336 and should be merged after it.

Falls back to the original method if this doesn't work.
@mikekryjak mikekryjak added the enhancement New feature or request label Mar 6, 2026
@dschwoerer dschwoerer merged commit 6d58963 into master Apr 15, 2026
13 checks passed
@dschwoerer dschwoerer deleted the lazy-load-collect branch April 15, 2026 12:25

2 participants