Redesign Buscar by axiomcura · Pull Request #74 · WayScience/buscar

axiomcura · 2026-02-13T22:52:22Z

This PR introduces a redesign of the data processing workflow.

The clustering component of the pipeline has been removed: Earth Mover's Distance (EMD) already accounts for the spread of each distribution, so explicit clustering is unnecessary.

As a result, Buscar becomes significantly faster.
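
To illustrate the claim that EMD is sensitive to the spread of a distribution, here is a minimal sketch on synthetic data. It is not Buscar's code: `emd_1d` is a hypothetical helper that computes the 1-D EMD for equal-sized samples by pairing sorted values, and the Gaussian samples are made up for the example. In practice, a library routine such as `scipy.stats.wasserstein_distance` computes the same 1-D quantity.

```python
import random
import statistics


def emd_1d(u, v):
    """1-D Earth Mover's Distance between two equal-sized samples.

    Hypothetical helper for illustration only. For equal-sized samples,
    the optimal transport plan pairs the i-th smallest value of `u`
    with the i-th smallest value of `v`.
    """
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal-sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


rng = random.Random(0)  # fixed seed so the example is reproducible

# Synthetic single-feature populations (assumed data, not from Buscar).
narrow = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # tight population
wide = [rng.gauss(0.0, 3.0) for _ in range(2000)]    # same center, wider spread
shifted = [x + 2.0 for x in narrow]                  # same shape, shifted center

# A mean-based comparison barely separates `narrow` from `wide`...
mean_gap = abs(statistics.fmean(narrow) - statistics.fmean(wide))

# ...while EMD responds both to a change in spread and to a shift.
spread_emd = emd_1d(narrow, wide)
shift_emd = emd_1d(narrow, shifted)

print(f"difference of means: {mean_gap:.3f}")
print(f"EMD (spread change): {spread_emd:.3f}")
print(f"EMD (pure shift):    {shift_emd:.3f}")
```

Because the distance is computed over whole distributions rather than summary statistics, a population that becomes more heterogeneous registers as a change even when its mean does not move.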

The retained modules are signatures.py, checks.py, and metrics.py, which have been reorganized accordingly.

Once this PR is merged, all downstream analyses will be re-run; documentation updates will follow in separate PRs.

Copilot

Pull request overview

This PR implements a major redesign of the Buscar data processing workflow by removing the clustering component and reorganizing the codebase. The change is motivated by the observation that EMD (Earth Mover's Distance) inherently accounts for distribution spread, making explicit clustering unnecessary. This redesign significantly improves performance.

Changes:

- Removed clustering-related modules (refinement.py, heterogeneity.py, identify_hits.py) from utils/
- Reorganized core modules (signatures, metrics, checks) from utils/ to new buscar/ directory
- Updated measure_phenotypic_activity API to work without clustering, now comparing treatments directly to reference/target states
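
The last change above can be sketched as follows. This is a hedged illustration of comparing a treatment directly to the reference and target states, not the actual `measure_phenotypic_activity` implementation or signature: `emd_1d`, `activity_sketch`, the nested-dict layout of `profiles`, the feature names, and the toy values are all assumptions made for the example.

```python
from bisect import bisect_right
from statistics import fmean


def emd_1d(u, v):
    """1-D EMD between two samples of any size (hypothetical helper).

    Integrates the absolute difference between the two empirical CDFs.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(set(u_sorted) | set(v_sorted))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def activity_sketch(profiles, treatment, reference, target):
    """Mean per-feature EMD from one treatment to each state.

    No clustering step: every cell of the treatment is compared, feature
    by feature, against every cell of the reference and target states.
    """
    features = list(profiles[treatment])
    return {
        "to_reference": fmean(
            emd_1d(profiles[treatment][f], profiles[reference][f]) for f in features
        ),
        "to_target": fmean(
            emd_1d(profiles[treatment][f], profiles[target][f]) for f in features
        ),
    }


# Toy single-cell values per state and feature (assumed data).
profiles = {
    "reference": {"area": [1.0, 2.0, 3.0], "intensity": [0.0, 1.0]},
    "target": {"area": [4.0, 5.0, 6.0], "intensity": [3.0, 4.0]},
    "compound_a": {"area": [4.0, 5.0, 6.0, 5.0], "intensity": [3.0, 4.0]},
}

scores = activity_sketch(profiles, "compound_a", "reference", "target")
print(scores)
```

In this toy setup `compound_a` sits much closer to the target state than to the reference state, which is the kind of direct comparison the change describes.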

Reviewed changes

Copilot reviewed 13 out of 16 changed files in this pull request and generated 15 comments.

Show a summary per file

| File | Description |
| --- | --- |
| utils/refinement.py | Deleted - clustering refinement utilities no longer needed |
| utils/metrics.py | Deleted - replaced by redesigned buscar/metrics.py |
| utils/identify_hits.py | Deleted - compound hit identification removed with clustering |
| utils/heterogeneity.py | Deleted - clustering functionality removed from pipeline |
| buscar/signatures.py | New file - moved from utils/signatures.py with statistical tests for feature significance |
| buscar/metrics.py | New file - redesigned phenotypic activity measurement without clustering dependency |
| buscar/checks.py | New file - moved from utils/checks.py with data validation utilities |
| tests/test_hit_identifcation.py | Updated imports to reference buscar._identify_hits (non-existent module) |
| notebooks/4.cpjump1-analysis/nbconverted/4.run_buscar_rankings_base_on_moa.py | Updated imports with incorrect paths |
| notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-scores.py | Updated imports with incorrect paths |
| notebooks/1.compound-prioritization/nbconverted/4.measure-phenotypic-activity.py | Updated imports with incorrect paths |
| notebooks/1.compound-prioritization/nbconverted/1.signatures.py | Correctly updated import to buscar.signatures |
| notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py | Updated imports with incorrect paths and obsolete heterogeneity reference |
| notebooks/4.cpjump1-analysis/nbconverted/1.generate-on-off-signatures.py | Correctly updated import to buscar.signatures |
| .pre-commit-config.yaml | Minor version update for ruff-pre-commit (v0.15.0 → v0.15.1) |
| .gitignore | Added .vscode/ directory to ignore list |
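
Several rows above flag imports that still point at removed or relocated modules. One way to catch these before review is a small static check, sketched below with the standard-library `ast` module. The dotted module names are derived from the deleted file paths listed in the summary; how the notebooks actually import them is an assumption, and `find_stale_imports` is a hypothetical helper, not part of Buscar.

```python
import ast

# Dotted names derived from the deleted files in the summary (assumed import style).
REMOVED_MODULES = frozenset(
    {
        "utils.refinement",
        "utils.heterogeneity",
        "utils.identify_hits",
        "utils.metrics",
    }
)


def find_stale_imports(source, removed=REMOVED_MODULES):
    """Return sorted (line_number, module) pairs for imports of removed modules."""
    stale = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import utils.metrics`
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from utils.metrics import x` and `from utils import metrics`
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        stale.extend((node.lineno, name) for name in names if name in removed)
    return sorted(stale)


# Hypothetical script contents used only to exercise the check.
example = (
    "from utils.metrics import measure_phenotypic_activity\n"
    "from buscar import signatures\n"
    "import utils.heterogeneity\n"
)

for lineno, module in find_stale_imports(example):
    print(f"line {lineno}: import of removed module {module}")
```

Running a check like this over the nbconverted scripts would surface leftover references to the deleted utils/ modules without executing any notebook code.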


axiomcura added 8 commits February 12, 2026 11:27

removed cluster refinement module

0094350

removed heterogeneity

7b940d2

updated metrics

a51b553

created buscar module (separated notebook utils and buscar software)

2661eec

updated test module

38d4d33

updated imports to reflect module changes

36abe31

ignore vscode cached files

f6f96ac

updated metrics

5590b4d

axiomcura requested review from Copilot and wli51 and removed request for wli51 February 13, 2026 22:52

Copilot started reviewing on behalf of axiomcura February 13, 2026 22:53

Copilot AI reviewed Feb 13, 2026

axiomcura and others added 9 commits February 13, 2026 16:00

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

039cd78

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

a927cb4

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update buscar/metrics.py

580ec7e

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/4.cpjump1-analysis/nbconverted/4.run_buscar_rankings…

3a8c5fe

…_base_on_moa.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

71411f1

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update buscar/metrics.py

3a699c4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/1.compound-prioritization/nbconverted/4.measure-phen…

77cade9

…otypic-activity.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-sc…

87a34a9

…ores.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

applied copilot's changes

76e3b77

axiomcura wants to merge 17 commits into WayScience:main from axiomcura:buscar-redesign
