n_jobs=None triggers complete parallel setup with ~200x overhead #13770
Description of the problem
According to the docstring, setting n_jobs=None is equivalent to setting n_jobs=1, unless a joblib context is given that overrides it:
n_jobs : int | None
The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the
joblib package. None (default) is a marker for 'unset' that will be interpreted as n_jobs=1
(sequential execution) unless the call is performed under a joblib.parallel_config
context manager that sets another value for n_jobs.
However, when n_jobs=None, the whole mne.parallel.parallel_func chain is triggered internally, including a disk read for the MNE config. In most cases this is unnecessary, since it only matters when a joblib.parallel_config context manager is active.
This creates significant overhead, especially when calling functions several times in a row, or on a system with a slow hard drive or network drive (see https://mne.discourse.group/t/lock-file-access-error-define-location-of-lockfiles/11759).
In my case, I was computing a power contour plot with 1000 function calls per cell on an 80x60 grid, and the config file read was a massive blocker: CPU utilization saturated at 50% even with each cell running in its own process. The problem resolved when setting n_jobs=1 explicitly.
PS: Let me know if this problem is too niche to solve. As far as I can see there is quite a bit of logic involved in figuring out the right n_jobs when None is passed. I assume an additional elif clause that short-circuits to sequential execution when no joblib context is active could speed things up.
If it's just the joblib context detection, I can submit a PR :)
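To make the idea concrete, here is a minimal sketch of what such a fast path could look like. All names here (`parallel_func_sketch`, `_joblib_context_active`) are illustrative assumptions, not MNE's actual internals, and the joblib-context check is stubbed out so the example stays self-contained:

```python
import multiprocessing


def _joblib_context_active():
    """Hypothetical check for an active joblib parallel_config /
    parallel_backend context that sets n_jobs. Stubbed to False here
    so the sketch is self-contained."""
    return False


def parallel_func_sketch(func, n_jobs=None):
    """Sketch of the proposed fast path (not MNE's real implementation)."""
    if n_jobs is None and not _joblib_context_active():
        # Fast path: resolve None -> 1 immediately, skipping the
        # MNE config file read and any joblib.Parallel setup.
        n_jobs = 1
    if n_jobs == 1:
        # Sequential execution: a plain list over a generator, no pool.
        return list, func, 1
    # Slow path: real code would build a joblib.Parallel object here.
    if n_jobs == -1:
        n_jobs = multiprocessing.cpu_count()
    raise NotImplementedError("parallel path omitted in this sketch")
```

With something like this, `parallel_func(f, n_jobs=None)` would behave identically to `parallel_func(f, n_jobs=1)` unless a joblib context is active.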
Steps to reproduce
```python
import time

from mne import parallel

start = time.time()
for i in range(100):
    parallel.parallel_func(lambda x: x, n_jobs=1)
print('n_jobs=1', time.time() - start)
# n_jobs=1 0.00015974044799804688

start = time.time()
for i in range(100):
    parallel.parallel_func(lambda x: x, n_jobs=None)
print('n_jobs=None', time.time() - start)
# n_jobs=None 0.032094478607177734
```
Expected results
n_jobs=None resolves quickly to n_jobs=1 while retaining the ability to be overridden via a joblib context manager. No Parallel pool is started and execution proceeds sequentially without joblib.
Actual results
n_jobs=None adds overhead via a file read. joblib.Parallel(1) is spawned instead of the plain-list approach used when explicitly setting n_jobs=1, and the MNE config is read from disk on every function call.
Additional information
mne.sys_info()
Platform Linux-6.8.0-106-generic-x86_64-with-glibc2.39
Python 3.14.0 | packaged by Anaconda, Inc. | (main, Oct 22 2025, 09:06:03) [GCC 11.2.0]
Executable /home/simon/anaconda3/envs/default/bin/python
CPU AMD Ryzen 7 5700X 8-Core Processor (16 cores)
Memory 31.3 GiB
Core
├☑ mne 1.11.0 (latest release)
├☑ numpy 2.3.5 (OpenBLAS 0.3.30 with 16 threads)
├☑ scipy 1.17.0
└☑ matplotlib 3.10.8 (backend=qtagg)
Numerical (optional)
├☑ sklearn 1.8.0
├☑ numba 0.63.1
├☑ nibabel 5.3.3
├☑ pandas 3.0.0
├☑ h5io 0.2.5
├☑ h5py 3.15.1
└☐ unavailable nilearn, dipy, openmeeg, cupy
Visualization (optional)
├☑ vtk 9.6.0.rc2
├☑ qtpy 2.4.3 (PySide6=6.10.1)
└☐ unavailable pyvista, pyvistaqt, ipympl, pyqtgraph, mne-qt-browser, ipywidgets, trame_client, trame_server, trame_vtk, trame_vuetify
Ecosystem (optional)
├☑ mne-bids 0.18.0
├☑ mne-bids-pipeline 1.9.0
├☑ eeglabio 0.1.3
├☑ edfio 0.4.12
├☑ curryreader 0.1.2
├☑ pybv 0.7.6
├☑ defusedxml 0.7.1
└☐ unavailable mne-nirs, mne-features, mne-connectivity, mne-icalabel, neo, mffpy, antio