Generalize conflict detection to all classification fields by NoopDog · Pull Request #84 · DataBiosphere/meta-disco

NoopDog · 2026-03-27T08:24:42Z

Summary

Extends conflict detection from reference_assembly only to all 5 classification fields (data_modality, data_type, platform, reference_assembly, assay_type)
Tier-aware: higher-tier rules can still override lower-tier values (tier 3 header refining tier 2 filename is expected). Only same-tier disagreements trigger conflicts.
Tracks which tier set each field via _field_set_by_tier dict
Generalizes conflict evidence rule ID from hardcoded conflicting_reference_rules to conflicting_{field}_rules

Fixes the minimap2-vs-basecall-model bug where program_minimap2 (genomic, tier 3) silently overwrote ont_basecall_rna (transcriptomic, tier 3). Now correctly produces not_classified.
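The tier-aware behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/meta_disco/rule_engine.py`: the `Rule` and `Classification` classes and `apply_rule` are hypothetical, and only `_field_set_by_tier`, `conflicting_{field}_rules`, `not_classified`, and the two rule IDs come from the PR text. It also assumes that the genomic/transcriptomic values belong to `data_modality`, that a lower-tier rule arriving after a higher-tier one is ignored, and that a higher-tier rule may replace a conflicted value.

```python
from dataclasses import dataclass, field

NOT_CLASSIFIED = "not_classified"


@dataclass
class Rule:
    rule_id: str
    tier: int
    sets: dict  # classification field name -> value this rule asserts


@dataclass
class Classification:
    values: dict = field(default_factory=dict)
    evidence: list = field(default_factory=list)
    # Which tier last set each field (name taken from the PR description).
    _field_set_by_tier: dict = field(default_factory=dict)


def apply_rule(result: Classification, rule: Rule) -> None:
    """Apply one rule, flagging same-tier disagreements per field."""
    result.evidence.append(rule.rule_id)
    for name, value in rule.sets.items():
        prev_tier = result._field_set_by_tier.get(name)
        if prev_tier is None or rule.tier > prev_tier:
            # First assignment, or a higher tier refining a lower one.
            result.values[name] = value
            result._field_set_by_tier[name] = rule.tier
        elif rule.tier == prev_tier and result.values[name] not in (value, NOT_CLASSIFIED):
            # Same tier, different value: a conflict rather than an overwrite.
            result.values[name] = NOT_CLASSIFIED
            result.evidence.append(f"conflicting_{name}_rules")
        # Assumption: a lower-tier rule never overrides a higher-tier value.


result = Classification()
apply_rule(result, Rule("ont_basecall_rna", 3, {"data_modality": "transcriptomic"}))
apply_rule(result, Rule("program_minimap2", 3, {"data_modality": "genomic"}))
print(result.values["data_modality"])  # not_classified
```

Keeping the per-field tier in a separate dict means the refinement check and the conflict check share one lookup, which is probably why the PR tracks it that way.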

HPRC validation: identical results — zero accuracy regressions.

Also filed #83 for a pre-existing missing-evidence bug found during investigation (assay_type inference records no evidence, resulting in 12,913 files at confidence 0.0).

Closes #81

Test plan

325 tests pass (3 new + 1 updated)
HPRC validation unchanged: 100% accuracy on all dimensions
FASTA assembly override still works (tier 1 → tier 2 refinement)
Reference assembly conflict still works (same-tier, existing behavior)
minimap2 + RNA basecall → not_classified (new, was broken)
Illumina + CCS platform conflict → not_classified (new)
Reports regenerated

🤖 Generated with Claude Code

Previously only reference_assembly had conflict detection. Now all 5 classification fields (data_modality, data_type, platform, reference_assembly, assay_type) detect same-tier conflicts.

Higher-tier rules can still override lower-tier values: a tier 3 header rule refining a tier 2 filename rule is expected behavior. Two same-tier rules disagreeing on a field produces not_classified.

Fixes the minimap2-vs-basecall-model bug where program_minimap2 (genomic, tier 3) silently overwrote ont_basecall_rna (transcriptomic, tier 3).

Closes #81

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

No accuracy changes: same HPRC validation results as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR extends the rule engine’s conflict detection from reference_assembly to all classification fields (data_modality, data_type, platform, reference_assembly, assay_type) with tier awareness, and updates tests/reports to reflect the new conflict behavior (e.g., minimap2 vs ONT RNA basecall).

Changes:

Generalize same-tier conflict detection to all classification fields and emit conflicting_{field}_rules evidence IDs.
Track which tier last set each field via _field_set_by_tier to allow higher-tier refinement without triggering conflicts.
Add/adjust tests for new conflict cases and regenerate validation/coverage report artifacts.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/meta_disco/rule_engine.py` | Implements tier-aware, per-field conflict detection and conflict evidence IDs. |
| `tests/test_rule_engine.py` | Updates reference conflict evidence expectation and adds new conflict-related tests. |
| `tests/test_header_classifier.py` | Adds header-based regression test for ONT RNA basecall vs minimap2; updates platform conflict expectation. |
| `docs/validation-report.md` | Updates run timestamp. |
| `docs/validation-dashboard.html` | Updates embedded validation data timestamp. |
| `docs/anvil-coverage-report.md` | Updates run timestamp and conflict reason wording. |

…engine tests

Conflict detection now appends the conflict marker to existing evidence instead of replacing it, so both the original rule and the conflict are visible. Unit tests now exercise the actual rule engine with real filenames that trigger same-tier conflicts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Only change: conflict reason text for .xg and .chain files now shows the first rule's reason instead of the conflict marker, since evidence is preserved rather than replaced. Classification values unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
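The append-instead-of-replace behavior from the two commits above might look something like this. It is only a sketch: the evidence entry shape (`rule_id` and `reason` keys), the helper name `append_conflict`, and the sample rule and reason strings are assumptions for illustration, not taken from the repository.

```python
def append_conflict(evidence: list[dict], field_name: str, reason: str) -> list[dict]:
    """Return evidence with a conflict marker added after the existing entries."""
    marker = {"rule_id": f"conflicting_{field_name}_rules", "reason": reason}
    # Appending keeps the original rule's entry first, so a report that shows
    # the first evidence reason displays that rule rather than the marker.
    return evidence + [marker]


# Hypothetical evidence recorded by the first matching rule.
evidence = [{"rule_id": "first_rule", "reason": "reason recorded by the first rule"}]
evidence = append_conflict(evidence, "reference_assembly", "same-tier rules disagree")
```

With the earlier replace behavior the list would have held only the marker, which is consistent with the report wording change noted for .xg and .chain files.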

Copilot

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.

NoopDog and others added 2 commits March 27, 2026 01:04

Regenerate reports after generalizing conflict detection

d20db54

Copilot AI review requested due to automatic review settings March 27, 2026 08:24

github-project-automation bot added this to meta-disco Mar 27, 2026

Copilot started reviewing on behalf of NoopDog March 27, 2026 08:25

Copilot AI reviewed Mar 27, 2026

Comment thread src/meta_disco/rule_engine.py Outdated

Comment thread tests/test_rule_engine.py Outdated

NoopDog and others added 2 commits March 27, 2026 07:50

Copilot AI review requested due to automatic review settings March 27, 2026 17:36

Copilot started reviewing on behalf of NoopDog March 27, 2026 17:37

Copilot AI reviewed Mar 27, 2026

Comment thread src/meta_disco/rule_engine.py

NoopDog mentioned this pull request Mar 27, 2026

Allow higher-tier rules to resolve conflicts from lower tiers #85

Closed

NoopDog merged commit 953ce1d into main Mar 27, 2026
4 checks passed

github-project-automation bot moved this to Done in meta-disco Mar 27, 2026

This was referenced Mar 29, 2026

Surface conflict reason when classification is not_classified due to rule disagreement #88

Open

Consider removing or refactoring consistency checking system #92

Closed

NoopDog merged 4 commits into main from noopdog/81-generalize-conflict-detection
