Remove multiprocessing from NearestNeighbor metric #4363
Conversation
I was checking the failed tests (test_compute_pc_metrics_multi_processing). Using algorithm='auto' (which will usually choose 'brute' for small data sizes, say, 10K samples by ~100 features) with mp_context = fork hangs forever: a thread deadlock inside BLAS/OpenMP, where forked child processes inherit locks held by threads that no longer exist in the child, so they wait indefinitely for locks that will never be released. I changed it in 20c1550 to spawn (rather than fork) new processes. I also tested performance on a smallish recording (225 units, ~700 spikes per unit). Here's the summary:
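For context, a minimal sketch of the spawn fix (not the actual spikeinterface code; compute_unit_metric is a made-up stand-in for the per-unit work):

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def compute_unit_metric(seed):
    # stand-in for the per-unit nearest-neighbor computation (illustrative only)
    rng = np.random.default_rng(seed)
    pcs = rng.normal(size=(1000, 10))
    return float(np.linalg.norm(pcs))  # any BLAS-backed operation


if __name__ == "__main__":
    # "spawn" starts fresh interpreters, so children do not inherit the
    # parent's BLAS/OpenMP thread state -- the source of the fork deadlock
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=4, mp_context=ctx) as executor:
        results = list(executor.map(compute_unit_metric, range(8)))
```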
There does not seem to be much advantage to using multiprocessing (at least with 2-8 processes on my 4-core machine). I would just use algorithm='brute' (or let sklearn choose) and not use multiprocessing at all, to avoid any possible future issues with fork/spawn (and the kind of hidden spawn default added here); 'brute' is fast enough to be comparable to other quality metrics, and it does not seem to benefit much from multiprocessing anyway, mainly because the numerical libraries used behind the curtains are already multi-threaded. d44542e has this setup (no multiprocessing); you can pick the one you prefer.
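As a rough sketch of the no-multiprocessing alternative (made-up data shapes, not the PR's code), the brute-force path in sklearn already exploits the threading of the underlying BLAS:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# made-up data: ~10K spikes in a 30-dimensional PC space
pcs = np.random.default_rng(0).normal(size=(10_000, 30))

# 'brute' computes pairwise distances via BLAS matrix products, which are
# already multi-threaded, so process-level parallelism adds little on top
nn = NearestNeighbors(n_neighbors=5, algorithm="brute").fit(pcs)
distances, indices = nn.kneighbors(pcs)
```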
alejoe91 left a comment
This is perfect! Thanks @ecobost
@samuelgarcia @chrishalcrow this is good to merge for me! Getting rid of the parallelization without any performance hit is great, plus @ecobost made the code much cleaner!
Co-authored-by: Chris Halcrow <57948917+chrishalcrow@users.noreply.github.com>
Hello, tested on real 1 hr NP2 data on a Linux machine using 8 cores. Computation went from 45 minutes to 2 minutes - thanks @ecobost!!
chrishalcrow left a comment
LGTM, and works well on real data
Give me a chance to read it quickly.
Sure! You can merge it after you read it!
ComputeQualityMetrics has a peak_sign argument (similar to the one in ComputeTemplateMetrics); this PR forwards it to get_template_extremum_channel in prepare_data. This affects pca_metrics (if set to "pos" or "both") because it changes how the neighborhood of each unit is defined (l.154). The default behavior, peak_sign='neg', is unchanged.
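For illustration, roughly how the argument gets forwarded (a sketch assuming the public get_template_extremum_channel signature; sorting_analyzer stands in for an existing SortingAnalyzer):

```python
from spikeinterface.core import get_template_extremum_channel

# sorting_analyzer is assumed to be a previously computed SortingAnalyzer.
# peak_sign="pos" or "both" can select a different extremum channel per unit,
# which changes the channel neighborhood used by the PC metrics.
extremum_channels = get_template_extremum_channel(
    sorting_analyzer, peak_sign="pos", outputs="id"
)
```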
As I was checking pca_metrics, I slightly refactored them for legibility (hopefully the line-to-line equivalence is easy to see). Results are all the same (except for nn_miss_rate and nn_hit_rate, see below).
The only "major" change was to the nearest_neighbor metrics: