Skip to content

ADR-029: External Intelligence Providers for SONA Learning#190

Open
grparry wants to merge 825 commits intoruvnet:mainfrom
grparry:proposal/adr-029-intelligence-providers
Open

ADR-029: External Intelligence Providers for SONA Learning#190
grparry wants to merge 825 commits intoruvnet:mainfrom
grparry:proposal/adr-029-intelligence-providers

Conversation

@grparry
Copy link
Copy Markdown

@grparry grparry commented Feb 20, 2026

Summary

This proposes and implements a trait-based extension point — IntelligenceProvider — that lets external systems feed quality signals into ruvllm's SONA learning loops, embedding classifier, and model router without modifying ruvllm core.

Resolves #191

What's implemented

File Description
src/intelligence/mod.rs IntelligenceProvider trait, FileSignalProvider, IntelligenceProviderLoader registry, signal/weight types
claude_flow/model_router.rs calibration_bias() on TaskComplexityAnalyzer (requires 10+ feedback records)
lib.rs Public re-exports at crate root
docs/adr/ADR-029-* Status updated to "Accepted (Implemented)"

The trait

pub trait IntelligenceProvider: Send + Sync {
    fn name(&self) -> &str;
    fn load_signals(&self) -> Result<Vec<QualitySignal>>;
    fn quality_weights(&self) -> Option<QualityWeights> { None }
}
  • Providers are registered with IntelligenceProviderLoader and called during load_all()
  • A built-in FileSignalProvider covers the common case (non-Rust systems write JSON)
  • Zero overhead when no providers are registered
  • Backward compatible — existing loading is unchanged

Design decisions

  • Follows LlmBackend pattern — trait object behind Box<dyn IntelligenceProvider>
  • Non-fatal provider errors — one provider failing doesn't block others
  • Two JSON formats — bare array [...] or wrapped {"signals": [...]}
  • calibration_bias() requires 10+ records — avoids noisy corrections from sparse data

Test plan

  • cargo check -p ruvllm compiles
  • 7 intelligence module tests pass (file parsing, loader, fault tolerance)
  • 2 existing integration tests still pass
  • Manual: Write a signal JSON file, register FileSignalProvider, verify load_all() returns signals

🤖 Generated with Claude Code

claude and others added 30 commits January 2, 2026 14:42
…ge networks

- Add networks.js with NetworkGenesis, NetworkRegistry, and MultiNetworkManager
- Support for public, private (invite-only), and consortium networks
- Each network has its own genesis block, QDAG ledger, and peer registry
- Network IDs derived from genesis hash for tamper-evident identity
- Invite code generation for private networks with base64url encoding

New CLI options:
  --networks       List all known networks
  --discover       Discover available networks
  --create-network Create a new network with custom name/type
  --network-type   Set network type (public/private/consortium)
  --switch         Switch active network for contributions
  --invite         Provide invite code for private networks

Security features:
- Network isolation with separate storage per network
- Cryptographic network identity from genesis hash
- Invite codes for access control on private networks
- Ed25519 signatures for network announcements

Well-known networks:
- mainnet: Primary public compute network
- testnet: Testing and development network
…nhancements

Analysis module:
- Add complexity analysis (cyclomatic, cognitive, Halstead metrics)
- Add security scanning (SQL injection, XSS, command injection detection)
- Add pattern detection (code smells, design patterns)

Workers module:
- Add native worker implementation for parallel processing
- Add benchmark worker for performance testing
- Add worker type definitions

Core improvements:
- Add adaptive embedder with dynamic model selection
- Add ONNX optimized embeddings with caching
- Update intelligence engine with enhanced learning
- Update parallel workers with better concurrency

Dashboard enhancements:
- Add relay client service for Edge-Net communication
- Update network stats and specialized networks components
- Update network store with improved state management
- Update type definitions

Configuration:
- Add custom workers skill
- Add agentic-flow and ruvector fast scripts
- Update settings and gitignore

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(dashboard): Edge-Net Time Crystal Dashboard
  Built from commit 282273a

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
Merging Edge-Net join CLI with multi-contributor support
  Built from commit 73a1bea

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
…net#104)

* feat: Add comprehensive dataset discovery framework for RuVector

This commit introduces a powerful dataset discovery framework with
integrations for three high-impact public data sources:

## Core Framework (examples/data/framework/)
- DataIngester: Streaming ingestion with batching and deduplication
- CoherenceEngine: Min-cut based coherence signal computation
- DiscoveryEngine: Pattern detection for emerging structures

## OpenAlex Integration (examples/data/openalex/)
- Research frontier radar: Detect emerging fields via boundary motion
- Cross-domain bridge detection: Find connector subgraphs
- Topic graph construction from citation networks
- Full API client with cursor-based pagination

## Climate Integration (examples/data/climate/)
- NOAA GHCN and NASA Earthdata clients
- Sensor network graph construction
- Regime shift detection using min-cut coherence breaks
- Time series vectorization for similarity search
- Seasonal decomposition analysis

## SEC EDGAR Integration (examples/data/edgar/)
- XBRL financial statement parsing
- Peer network construction
- Coherence watch: Detect fundamental vs narrative divergence
- Filing analysis with sentiment and risk extraction
- Cross-company contagion detection

Each integration leverages RuVector's unique capabilities:
- Vector memory for semantic similarity
- Graph structures for relationship modeling
- Dynamic min-cut for coherence signal computation
- Time series embeddings for pattern matching

Discovery thesis: Detect emerging patterns before they have names,
find non-obvious cross-domain bridges, and map causality chains.

* feat: Add working discovery examples for climate and financial data

- Fix borrow checker issues in coherence analysis modules
- Create standalone workspace for data examples
- Add regime_detector.rs for climate network coherence analysis
- Add coherence_watch.rs for SEC EDGAR narrative-fundamental divergence
- Add frontier_radar.rs template for OpenAlex research discovery
- Update Cargo.toml dependencies for example executability
- Add rand dev-dependency for demo data generation

Examples successfully detect:
- Climate regime shifts via min-cut coherence analysis
- Cross-regional teleconnection patterns
- Fundamental vs narrative divergence in SEC filings
- Sector fragmentation signals in financial data

* feat: Add working discovery examples for climate and financial data

- Add RuVector-native discovery engine with Stoer-Wagner min-cut
- Implement cross-domain pattern detection (climate ↔ finance)
- Add cosine similarity for vector-based semantic matching
- Create cross_domain_discovery example demonstrating:
  - 42% cross-domain edge connectivity
  - Bridge formation detection with 0.73-0.76 confidence
  - Climate and finance correlation hypothesis generation

* perf: Add optimized discovery engine with SIMD and parallel processing

Performance improvements:
- 8.84x speedup for vector insertion via parallel batching
- 2.91x SIMD speedup for cosine similarity (chunked + AVX2)
- Incremental graph updates with adjacency caching
- Early termination in Stoer-Wagner min-cut

Statistical analysis features:
- P-value computation for pattern significance
- Effect size (Cohen's d) calculation
- 95% confidence intervals
- Granger-style temporal causality detection

Benchmark results (248 vectors, 3 domains):
- Cross-domain edges: 34.9% of total graph
- Domain coherence: Climate 0.74, Finance 0.94, Research 0.97
- Detected climate-finance temporal correlations

* feat: Add discovery hunter and comprehensive README tutorial

New features:
- Discovery hunter example with multi-phase pattern detection
- Climate extremes, financial stress, and research data generation
- Cross-domain hypothesis generation
- Anomaly injection testing

Documentation:
- Detailed README with step-by-step tutorial
- API reference for OptimizedConfig and patterns
- Performance benchmarks and best practices
- Troubleshooting guide

* feat: Complete discovery framework with all features

HNSW Indexing (754 lines):
- O(log n) approximate nearest neighbor search
- Configurable M, ef_construction parameters
- Cosine, Euclidean, Manhattan distance metrics
- Batch insertion support

API Clients (888 lines):
- OpenAlex: academic works, authors, topics
- NOAA: climate observations
- SEC EDGAR: company filings
- Rate limiting and retry logic

Persistence (638 lines):
- Save/load engine state and patterns
- Gzip compression (3-10x size reduction)
- Incremental pattern appending

CLI Tool (1,109 lines):
- discover, benchmark, analyze, export commands
- Colored terminal output
- JSON and human-readable formats

Streaming (570 lines):
- Async stream processing
- Sliding and tumbling windows
- Real-time pattern detection
- Backpressure handling

Tests (30 unit tests):
- Stoer-Wagner min-cut verification
- SIMD cosine similarity accuracy
- Statistical significance
- Granger causality
- Cross-domain patterns

Benchmarks:
- CLI: 176 vectors/sec @ 2000 vectors
- SIMD: 6.82M ops/sec (2.06x speedup)
- Vector insertion: 1.61x speedup
- Total: 44.74ms for 248 vectors

* feat: Add visualization, export, forecasting, and real data discovery

Visualization (555 lines):
- ASCII graph rendering with box-drawing characters
- Domain-based ANSI coloring (Climate=blue, Finance=green, Research=yellow)
- Coherence timeline sparklines
- Pattern summary dashboard
- Domain connectivity matrix

Export (650 lines):
- GraphML export for Gephi/Cytoscape
- DOT export for Graphviz
- CSV export for patterns and coherence history
- Filtered export by domain, weight, time range
- Batch export with README generation

Forecasting (525 lines):
- Holt's double exponential smoothing for trend
- CUSUM-based regime change detection (70.67% accuracy)
- Cross-domain correlation forecasting (r=1.000)
- Prediction intervals (95% CI)
- Anomaly probability scoring

Real Data Discovery:
- Fetched 80 actual papers from OpenAlex API
- Topics: climate risk, stranded assets, carbon pricing, physical risk, transition risk
- Built coherence graph: 592 nodes, 1049 edges
- Average min-cut: 185.76 (well-connected research cluster)

* feat: Add medical, real-time, and knowledge graph data sources

New API Clients:
- PubMed E-utilities for medical literature search (NCBI)
- ClinicalTrials.gov v2 API for clinical study data
- FDA OpenFDA for drug adverse events and recalls
- Wikipedia article search and extraction
- Wikidata SPARQL queries for structured knowledge

Real-time Features:
- RSS/Atom feed parsing with deduplication
- News aggregator with multiple source support
- WebSocket and REST polling infrastructure
- Event streaming with configurable windows

Examples:
- medical_discovery: PubMed + ClinicalTrials + FDA integration
- multi_domain_discovery: Climate-health-finance triangulation
- wiki_discovery: Wikipedia/Wikidata knowledge graph
- realtime_feeds: News feed aggregation demo

Tested across 70+ unit tests with all domains integrated.

* feat: Add economic, patent, and ArXiv data source clients

New API Clients:
- FredClient: Federal Reserve economic indicators (GDP, CPI, unemployment)
- WorldBankClient: Global development indicators and climate data
- AlphaVantageClient: Stock market daily prices
- ArxivClient: Scientific preprint search with category and date filters
- UsptoPatentClient: USPTO patent search by keyword, assignee, CPC class
- EpoClient: Placeholder for European patent search

New Domain:
- Domain::Economic for economic/financial indicator data

Updated Exports:
- Domain colors and shapes for Economic in visualization and export

Examples:
- economic_discovery: FRED + World Bank integration demo
- arxiv_discovery: AI/ML/Climate paper search demo
- patent_discovery: Climate tech and AI patent search demo

All 85 tests passing. APIs tested with live endpoints.

* feat: Add Semantic Scholar, bioRxiv/medRxiv, and CrossRef research clients

New Research API Clients:
- SemanticScholarClient: Citation graph analysis, paper search, author lookup
  - Methods: search_papers, get_citations, get_references, search_by_field
  - Builds citation networks for graph analysis

- BiorxivClient: Life sciences preprints
  - Methods: search_recent, search_by_category (neuroscience, genomics, etc.)
  - Automatic conversion to Domain::Research

- MedrxivClient: Medical preprints
  - Methods: search_covid, search_clinical, search_by_date_range
  - Automatic conversion to Domain::Medical

- CrossRefClient: DOI metadata and scholarly communication
  - Methods: search_works, get_work, search_by_funder, get_citations
  - Polite pool support for better rate limits

All clients include:
- Rate limiting respecting API guidelines
- Retry logic with exponential backoff
- SemanticVector conversion with rich metadata
- Comprehensive unit tests

Examples:
- biorxiv_discovery: Fetch neuroscience and clinical research
- crossref_demo: Search publications, funders, datasets

Total: 104 tests passing, ~2,500 new lines of code

* feat: Add MCP server with STDIO/SSE transport and optimized discovery

MCP Server Implementation (mcp_server.rs):
- JSON-RPC 2.0 protocol with MCP 2024-11-05 compliance
- Dual transport: STDIO for CLI, SSE for HTTP streaming
- 22 discovery tools exposing all data sources:
  - Research: OpenAlex, ArXiv, Semantic Scholar, CrossRef, bioRxiv, medRxiv
  - Medical: PubMed, ClinicalTrials.gov, FDA
  - Economic: FRED, World Bank
  - Climate: NOAA
  - Knowledge: Wikipedia, Wikidata SPARQL
  - Discovery: Multi-source, coherence analysis, pattern detection
- Resources: discovery://patterns, discovery://graph, discovery://history
- Pre-built prompts: cross_domain_discovery, citation_analysis, trend_detection

Binary Entry Point (bin/mcp_discovery.rs):
- CLI arguments with clap
- Configurable discovery parameters
- STDIO/SSE mode selection

Optimized Discovery Runner:
- Parallel data fetching with tokio::join!
- SIMD-accelerated vector operations (1.1M comparisons/sec)
- 6-phase discovery pipeline with benchmarking
- Statistical significance testing (p-values)
- Cross-domain correlation analysis
- CSV export and hypothesis report generation

Performance Results:
- 180 vectors from 3 sources in 7.5s
- 686 edges computed in 8ms
- SIMD throughput: 1,122,216 comparisons/sec

All 106 tests passing.

* feat: Add space, genomics, and physics data source clients

Add exotic data source integrations:
- Space clients: NASA (APOD, NEO, Mars, DONKI), Exoplanet Archive, SpaceX API, TNS Astronomy
- Genomics clients: NCBI (genes, proteins, SNPs), UniProt, Ensembl, GWAS Catalog
- Physics clients: USGS Earthquakes, CERN Open Data, Argo Ocean, Materials Project

New domains: Space, Genomics, Physics, Seismic, Ocean

All 106 tests passing, SIMD benchmark: 208k comparisons/sec

* chore: Update export/visualization and output files

* docs: Add API client inventory and reference documentation

* fix: Update API clients for 2025 endpoint changes

- ArXiv: Switch from HTTP to HTTPS (export.arxiv.org)
- USPTO: Migrate to PatentSearch API v2 (search.patentsview.org)
  - Legacy API (api.patentsview.org) discontinued May 2025
  - Updated query format from POST to GET
  - Note: May require API authentication
- FRED: Require API key (mandatory as of 2025)
  - Added error handling for missing API key
  - Added response error field parsing

All tests passing, ArXiv discovery confirmed working

* feat: Implement comprehensive 2025 API client library (11,810 lines)

Add 7 new API client modules implementing 35+ data sources:

Academic APIs (1,328 lines):
- OpenAlexClient, CoreClient, EricClient, UnpaywallClient

Finance APIs (1,517 lines):
- FinnhubClient, TwelveDataClient, CoinGeckoClient, EcbClient, BlsClient

Geospatial APIs (1,250 lines):
- NominatimClient, OverpassClient, GeonamesClient, OpenElevationClient

News & Social APIs (1,606 lines):
- HackerNewsClient, GuardianClient, NewsDataClient, RedditClient

Government APIs (2,354 lines):
- CensusClient, DataGovClient, EuOpenDataClient, UkGovClient
- WorldBankGovClient, UNDataClient

AI/ML APIs (2,035 lines):
- HuggingFaceClient, OllamaClient, ReplicateClient
- TogetherAiClient, PapersWithCodeClient

Transportation APIs (1,720 lines):
- GtfsClient, MobilityDatabaseClient
- OpenRouteServiceClient, OpenChargeMapClient

All clients include:
- Async/await with tokio and reqwest
- Mock data fallback for testing without API keys
- Rate limiting with configurable delays
- SemanticVector conversion for RuVector integration
- Comprehensive unit tests (252 total tests passing)
- Full error handling with FrameworkError

* docs: Add API client documentation for new implementations

Add documentation for:
- Geospatial clients (Nominatim, Overpass, Geonames, OpenElevation)
- ML clients (HuggingFace, Ollama, Replicate, Together, PapersWithCode)
- News clients (HackerNews, Guardian, NewsData, Reddit)
- Finance clients implementation notes

* feat: Implement dynamic min-cut tracking system (SODA 2026)

Based on El-Hayek, Henzinger, Li (SODA 2026) subpolynomial dynamic min-cut algorithm.

Core Components (2,626 lines):
- dynamic_mincut.rs (1,579 lines): EulerTourTree, DynamicCutWatcher, LocalMinCutProcedure
- cut_aware_hnsw.rs (1,047 lines): CutAwareHNSW, CoherenceZones, CutGatedSearch

Key Features:
- O(log n) connectivity queries via Euler-tour trees
- n^{o(1)} update time when λ ≤ 2^{(log n)^{3/4}} (vs O(n³) Stoer-Wagner)
- Cut-gated HNSW search that respects coherence boundaries
- Real-time cut monitoring with threshold-based deep evaluation
- Thread-safe structures with Arc<RwLock>

Performance (benchmarked):
- 75x speedup over periodic recomputation
- O(1) min-cut queries vs O(n³) recompute
- ~25µs per edge update

Tests & Benchmarks:
- 36+ unit tests across both modules
- 5 benchmark suites comparing periodic vs dynamic
- Integration with existing OptimizedDiscoveryEngine

This enables real-time coherence tracking in RuVector, transforming
min-cut from an expensive periodic computation to a maintained invariant.

---------

Co-authored-by: Claude <noreply@anthropic.com>
  Built from commit b07fb3e

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit 39277a4

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
  Built from commit b5b4858

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
  Built from commit ae4d5db

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
…es (ruvnet#106)

## Summary
- Add PowerInfer-style sparse inference engine with precision lanes
- Add memory module with QuantizedWeights and NeuronCache
- Fix compilation and test issues
- Demonstrated 2.9-8.7x speedup at typical sparsity levels
- Published to crates.io as ruvector-sparse-inference v0.1.30

## Key Features
- Low-rank predictor using P·Q matrix factorization for fast neuron selection
- Sparse FFN kernels that only compute active neurons
- SIMD optimization for AVX2, SSE4.1, NEON, and WASM SIMD
- GGUF parser with full quantization support (Q4_0 through Q6_K)
- Precision lanes (3/5/7-bit layered quantization)
- π integration for low-precision systems

🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Built from commit 76cec56

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
…ions

Key optimizations in v0.1.31:
- W2 matrix stored transposed for contiguous row access during sparse accumulation
- SIMD GELU/SiLU using AVX2+FMA polynomial approximations
- Cached SIMD feature detection with OnceLock (eliminates runtime CPUID calls)
- SIMD axpy for vectorized weight accumulation

Benchmark results (512 input, 2048 hidden):
- 10% active: 130µs (83% reduction, 52× vs dense)
- 30% active: 383µs (83% reduction, 18× vs dense)
- 50% active: 651µs (83% reduction, 10× vs dense)
- 70% active: 912µs (83% reduction, 7× vs dense)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit 253faf3

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
…mbeddings (ruvnet#107)

## New Features
- HNSW Integration: O(log n) similarity search replaces O(n²) brute force (10-50x speedup)
- Similarity Cache: 2-3x speedup for repeated similarity queries
- Batch ONNX Embeddings: Chunked processing with progress callbacks
- Shared Utils Module: cosine_similarity, euclidean_distance, normalize_vector
- Auto-connect by Embeddings: CoherenceEngine creates edges from vector similarity

## Performance Improvements
- 8.8x faster batch vector insertion (parallel processing)
- 10-50x faster similarity search (HNSW vs brute force)
- 2.9x faster similarity computation (SIMD acceleration)
- 2-3x faster repeated queries (similarity cache)

## Files Changed
- coherence.rs: HNSW integration, new CoherenceConfig fields
- optimized.rs: Similarity cache implementation
- utils.rs: New shared utility functions
- api_clients.rs: Batch embedding methods (embed_batch_chunked, embed_batch_with_progress)
- README.md: Documented all new features and configuration options

Published as ruvector-data-framework v0.3.0 on crates.io

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit 1a8ab83

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
)

Merge PR ruvnet#109: feat(math): Add ruvector-math crate with advanced algorithms

Includes:
- ruvector-math: Optimal Transport, Information Geometry, Product Manifolds, Tropical Algebra, Tensor Networks, Spectral Methods, Persistent Homology, Polynomial Optimization
- ruvector-attention: 7-theory attention mechanisms
- ruvector-math-wasm: WASM bindings
- publish-all.yml: Build & publish workflow for all platforms

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit 4489e68

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
- Badges (npm, crates.io, license, WASM)
- Feature overview
- Installation instructions
- Quick start examples (Browser & Node.js)
- Use cases: Distribution comparison, Vector search, Image comparison, Natural gradient
- API reference
- Performance benchmarks
- TypeScript support
- Build instructions
- Related packages

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit 1da4ff9

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
- Rename npm package from ruvector-math-wasm to @ruvector/math-wasm
- Update README with correct scoped package name
- Update workflow to publish with scoped name
- Add scripts/test-wasm.mjs for WASM package testing
- Consistent with @ruvector/attention-* naming convention

Published:
- @ruvector/math-wasm@0.1.31 on npm

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  Built from commit ab97151

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
  Built from commit 5834cd0

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
claude and others added 25 commits February 11, 2026 05:03
Complete README rewrite reflecting the final state of the project:
- Added "What It Does" section showing actual 8-stage demo output
- Added RVDNA AI-native format section with format comparison table
- Added real gene data section (HBB, TP53, BRCA1, CYP2D6, INS)
- Added actual Criterion benchmark numbers (155ns SNP, 12ms full pipeline)
- Fixed Quick Start to match working binary commands
- Added collapsible module guides with accurate line counts
- Added test suite summary (87 tests, zero mocks)
- Added project structure tree with all 13 source files
- Added 13 ADR index table
- Updated architecture diagram to include RVDNA output stage

https://claude.ai/code/session_013B6stXbYwAkWHbE16sjUrq
- Affine gap scoring: 3-matrix Smith-Waterman (H/E/F) with flat 1D
  arrays for cache-friendly access, direct slice indexing
- Indel detection: call_indel() for insertion/deletion from pileup data
- VCF output: VCFv4.3 format with proper CHROM/POS/REF/ALT/QUAL columns
- CYP2C19 pharmacogenomics: star allele calling (*1/*2/*3/*17),
  phenotype prediction, drug recommendations (clopidogrel, voriconazole)
- Cancer signal detection: methylation entropy + extreme ratio scoring,
  CancerSignalDetector with configurable risk threshold
- Molecular weight: monoisotopic Da for all 20 amino acids
- Isoelectric point: Henderson-Hasselbalch bisection with sidechain pKa
- K-mer encoding: zero-allocation canonical hashing (hash both strands,
  take min) eliminates O(n) Vec allocs per sliding window
- CRC32: lookup table replaces bit-by-bit (~8x faster header checksums)
- Benchmarks: added RVDNA, epigenomics, protein analysis groups

95 tests pass (54 lib + 12 kmer + 17 pipeline + 12 security)

https://claude.ai/code/session_013B6stXbYwAkWHbE16sjUrq
Smith-Waterman: rolling 2-row DP replaces 3 full (Q+1)*(R+1) matrices.
Only prev+curr rows for H/E, single scalar for F. Memory drops from
~600KB to ~12KB for 100x500bp alignment, fitting L1 cache. Traceback
matrix retained (tb==0 encodes stop condition, no full H needed).

K-mer encoding: zero-allocation canonical hashing eliminates Vec alloc
per k-mer in MinHash::sketch() via dual MurmurHash3 (fwd + rc strands).

types.rs to_kmer_vector: rolling polynomial hash computes O(1) per
k-mer instead of O(k). Removes leading nucleotide, shifts, adds
trailing in constant time using precomputed 5^(k-1).

Benchmarks (100bp query x 500bp ref / k=11):
  kmer/encode_1kb:    4.1µs → 2.3µs  (1.78x)
  kmer/encode_100kb:  364µs → 199µs  (1.83x)
  smith_waterman:     416µs → 386µs  (1.08x, 10x less memory)
  full pipeline:      1.98ms → 1.52ms (1.30x end-to-end)

95 tests pass, zero failures.

https://claude.ai/code/session_013B6stXbYwAkWHbE16sjUrq
  Built from commit b427e9c

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
  Built from commit 7549158

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
Rename dna-analyzer-example to rvdna across all source files, tests,
and benchmarks. Add crates.io metadata (repository, docs, keywords).
Publish rvdna v0.1.0 to crates.io and @ruvector/rvdna v0.1.0 to npm
with NAPI-RS platform loader, JS fallbacks, and TypeScript definitions.

Also publishes workspace deps at v2.0.2: ruvector-math, ruvector-core,
ruvector-filter, ruvector-collections, ruvector-graph, ruvector-gnn.

Co-Authored-By: claude-flow <ruv@ruv.net>
Tailored README with JS/TS code examples, API reference tables,
WASM tutorial (browser + bundler setup), platform support table,
format comparison, and speed benchmarks. Bump to v0.1.1.

Co-Authored-By: claude-flow <ruv@ruv.net>
- Add "Why This Exists" section: AI for instant, private, free
  genomic diagnostics available to everyone
- Add install table with crates.io and npm links
- Add full npm API table with JS examples and NAPI-RS platform matrix
- Replace ASCII architecture with 4 mermaid diagrams in collapsed
  sections: pipeline, .rvdna format layout, data flow, WASM deployment
- Add collapsed rvDNA section to root README.md

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit 2c68b14

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
- Capabilities: new "Genomics & Health" section (items 22-25)
- Installation table: cargo add rvdna, npm install @ruvector/rvdna
- npm Packages: @ruvector/rvdna under "Genomics & Health"
- Rust Crates: rvdna with crates.io badge and feature summary
- Updated capability count from 30+ to 34

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit 4a020a0

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
  Built from commit 2239d7c

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
Crates added:
- ruvector-delta-core, delta-graph, delta-index, delta-consensus,
  delta-wasm (behavioral change tracking subsystem)
- profiling (real-time coherence diagnostics)

Examples added:
- dna (rvDNA genomic analysis)
- delta-behavior (change tracking math)
- data (dataset discovery framework)
- prime-radiant (coherence engine demos)
- benchmarks (temporal reasoning benchmarks)
- vwm-viewer (visual vector world model viewer)

Updated counts: 70 crates, 34 examples, 34 capabilities.

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit 1fc3beb

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
Transforms ruqu from classical coherence monitor into full-stack quantum execution intelligence engine (~2K to ~24K lines).

New: StateVector, Stabilizer, TensorNetwork, Clifford+T, and Hardware simulation backends. Cost-model planner, surface code decoder (union-find O(n*alpha(n))), QEC scheduler, noise models, OpenQASM 3.0 export, deterministic replay, and cross-backend verification.

PR ruvnet#161
  Built from commit 1fc6a1a

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
Published to crates.io: ruqu-core, ruqu-algorithms, ruqu-exotic, ruqu-wasm
Published to npm: @ruvector/ruqu-wasm@2.0.4

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit 384e67e

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
- Updated ruqu-core README with 5 simulation backends, cost-model planner,
  QEC control plane, OpenQASM 3.0, cryptographic witnesses, transpiler
- Fixed ruqu-wasm npm badge and imports to use @ruvector/ruqu-wasm scope
- Published to crates.io: ruqu-core, ruqu-algorithms, ruqu-exotic, ruqu-wasm
- Published to npm: @ruvector/ruqu-wasm@2.0.5

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit bc9ba4d

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
…vnet#163)

* feat(ospipe): implement OSpipe screenpipe integration with WASM + TypeScript SDK

Adds the OSpipe crate providing a quantum-enhanced screenpipe integration layer:
- Rust core library (7 modules): capture, storage, search, pipeline, safety, config, wasm
- WASM bindings via wasm-bindgen for browser deployment
- TypeScript SDK (@ruvector/ospipe) with SSE streaming and hybrid search
- Frame deduplication, PII safety gate, query routing, cosine similarity search
- 56 tests passing (24 unit + 32 integration), builds for native + wasm32
- Comprehensive ADR with Windows/macOS/Linux/WASM integration plans
- CI stub for cross-platform matrix builds (Linux, Windows, macOS, WASM)

Co-Authored-By: claude-flow <ruv@ruv.net>

* chore(ospipe): add README, fix clippy warnings, optimize dedup and pipeline

- Add comprehensive README.md with features, comparison tables, quick
  start guides, collapsed configuration reference, and API docs
- Fix all default clippy warnings (auto-fix + manual)
- Replace Vec with VecDeque in FrameDeduplicator for O(1) eviction
- Remove redundant frame.clone() in ingestion pipeline (move instead)
- Add is_empty() to WASM OsPipeWasm type
- Fix broken intra-doc link for cfg-gated bindings module
- Remove unused imports in integration tests (FrameContent, SearchConfig)

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(ospipe): integrate graph, attention, GNN, and quantum crates (Phase 2-4)

Add four new OSpipe modules integrating RuVector crates:

- graph: KnowledgeGraph wrapping ruvector-graph with heuristic entity
  extraction (URLs, emails, @mentions, capitalized phrases), entity/
  relationship CRUD, and frame entity ingestion
- search/reranker: AttentionReranker using ruvector-attention scaled
  dot-product attention for result re-ranking (0.6*attention + 0.4*cosine)
- learning: SearchLearner with EWC (ruvector-gnn) for continual learning
  without catastrophic forgetting, ReplayBuffer for feedback, and
  EmbeddingQuantizer for age-based vector compression
- quantum: QuantumSearch using ruqu-algorithms QAOA for diversity selection,
  Grover-inspired amplitude boosting, and optimal iteration estimation

All modules use cfg-gated dual implementations (native + WASM stub).
60 tests passing (59 integration + 1 doc-test), native + WASM builds clean.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(ospipe): complete all 15 gap items — HNSW, persistence, REST API, MMR, safety fixes

Implements all remaining OSpipe features from the gap analysis:

High — Core functionality:
- HNSW indexing via ruvector-core with O(log n) ANN search (HnswVectorStore)
- EmbeddingModel trait + RuvectorEmbeddingModel for pluggable embedding backends
- JSON-file persistence layer (PersistenceLayer) for frames and config
- Axum REST API server matching TypeScript SDK endpoints (/search, /graph, /health, /stats, /route)
- Enhanced search pipeline wired into ingestion (router -> rerank -> quantum diversity)

Medium — Correctness:
- WASM/native routing consistency (aligned keyword sets and priority order)
- WASM/native safety consistency (email detection, deny keywords, CC/SSN patterns)
- MMR (Maximal Marginal Relevance) reranker for diversity vs relevance tradeoff
- Delete and update_metadata APIs on VectorStore and HnswVectorStore
- Email redaction preserves surrounding whitespace (tabs, newlines, multi-space)

Lower — Polish:
- TypeScript SDK: fetchWithRetry with exponential backoff, timeout, AbortSignal
- console_error_panic_hook init in WASM module
- WASM test scaffold (tests/wasm.rs)
- Quantization tiers in config (None -> Scalar -> Product -> Binary by age)
- All clippy warnings resolved (0 warnings)

82 tests passing, 1 doc-test passing, 0 clippy warnings.

Co-Authored-By: claude-flow <ruv@ruv.net>

* chore: update Cargo.lock after OSpipe dependency changes

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(ospipe): add server binary, WASM build, version-pin deps for publishing

- Add ospipe-server binary with CLI args (--port, --data-dir, --help, --version)
- Add tracing-subscriber for structured logging
- Version-pin all 9 path dependencies for crates.io readiness
- Fix ref -> ref mut for KnowledgeGraph mutable borrow in pipeline
- Fix redundant rustdoc link in embedding.rs
- Update ospipe-wasm package.json to match wasm-pack output filenames
- WASM build produces 145KB binary with full browser API

Build artifacts (not committed, in dist/):
- ospipe-server-linux-x86_64 (1.8MB)
- ospipe-server-linux-arm64 (1.6MB)
- ospipe-server-windows-x86_64.exe (3.9MB)
- ospipe_bg.wasm (145KB)
- @ruvector/ospipe npm tarball (13.9KB)

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: add OSpipe to root README, publish ospipe + deps to crates.io

Add OSpipe personal AI memory section to root README with features,
comparison table, install commands, and Rust quickstart.

Published to registries:
- ospipe v0.1.0 (crates.io)
- ruvector-delta-core v0.1.0 (crates.io)
- ruvector-cluster v2.0.2 (crates.io)
- ruvector-router-core v2.0.2 (crates.io)
- @ruvector/ospipe v0.1.0 (npm)
- @ruvector/ospipe-wasm v0.1.0 (npm)

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: add uuid dev-dep for tests, bump rvlite to 0.2.1

- Add uuid to OSpipe dev-dependencies to fix version mismatch in
  integration tests
- Bump rvlite npm package to 0.2.1 (0.2.0 blocked by npm)

Co-Authored-By: claude-flow <ruv@ruv.net>
  Built from commit 5b2edc4

  Platforms updated:
  - linux-x64-gnu
  - linux-arm64-gnu
  - darwin-x64
  - darwin-arm64
  - win32-x64-msvc

  🤖 Generated by GitHub Actions
ADR-029 proposes a trait-based extension point that lets external systems
(workflow engines, CI/CD pipelines, coding assistants) feed quality signals
into ruvllm's learning loops without modifying ruvllm core.

Motivated by the gap between ADR-002's vision (quality data feeding SONA)
and the current reality (no ingestion interface for external systems).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ruvnet added a commit that referenced this pull request Feb 21, 2026
…Learning

Implement trait-based IntelligenceProvider extension point for external
quality signals. Addresses PR #190 proposal (renumbered from ADR-029 to
avoid collision with existing ADR-029-rvf-canonical-format).

- IntelligenceProvider trait with load_signals() and quality_weights()
- FileSignalProvider built-in for JSON file-based signal exchange
- IntelligenceLoader for multi-provider registration and aggregation
- QualitySignal, QualityFactors, ProviderQualityWeights types
- calibration_bias() on TaskComplexityAnalyzer for router feedback
- 12 unit tests (all passing)

Co-Authored-By: claude-flow <ruv@ruv.net>
Implements the trait-based extension point proposed in ADR-029:

- IntelligenceProvider trait (name, load_signals, quality_weights)
- FileSignalProvider built-in (reads JSON array or {signals:[...]} format)
- IntelligenceProviderLoader registry with load_all() and per-provider stats
- QualitySignal, QualityFactors, QualityWeights types with serde support
- calibration_bias() on TaskComplexityAnalyzer (requires 10+ feedback records)
- 7 unit tests covering file parsing, loader orchestration, and fault tolerance

ADR status updated from "Proposal (Draft)" to "Accepted (Implemented)".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ADR-029: External Intelligence Providers for SONA Learning

3 participants