Make RemoteRegistryClient backoff non-blocking via shared scheduler #4199

Open
Copilot wants to merge 4 commits into develop from copilot/ml-based-anomaly-detection

Copilot AI commented Mar 13, 2026

Description

RemoteRegistryClient retries were blocking the calling thread with sleep_for, conflicting with the async back-off requirement for registry access. This PR introduces a lightweight shared backoff scheduler so retries yield the caller thread while preserving jittered back-off timing.

  • Backoff scheduler: add a singleton BackoffScheduler (background jthread, priority-queued one-shot tasks) to schedule retry wakeups without per-retry threads.
  • Async backoff integration: asyncBackoffSleep now queues a wakeup on the scheduler and waits on a shared promise/future, so the retry logic is unchanged and the direct sleep_for calls are gone.
  • Maintenance: document scheduler lifecycle, ensure stop-aware wakeups, and add required headers.

Example backoff scheduling:

// inside httpGet retry loop
const int sleep_ms = std::min(backoff_ms, remaining_ms);
asyncBackoffSleep(sleep_ms); // now queues on BackoffScheduler instead of blocking sleep_for
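For context, one common way to produce the `backoff_ms` value used above is exponential back-off with full jitter. The helper below is a sketch under that assumption: `nextBackoffMs`, `base_ms`, and `cap_ms` are hypothetical names, and the PR does not state which jitter strategy RemoteRegistryClient uses.

```cpp
#include <algorithm>
#include <random>

// Hypothetical helper (not from the PR): full-jitter exponential back-off,
// clamped to the remaining time budget. Expects non-negative arguments.
int nextBackoffMs(int attempt, int base_ms, int cap_ms, int remaining_ms, std::mt19937& rng) {
    // Exponential ceiling base_ms * 2^attempt, limited to cap_ms.
    // The shift is bounded so the 64-bit intermediate cannot overflow.
    const int shift = std::min(attempt, 20);
    const long long ceiling =
        std::min<long long>(static_cast<long long>(base_ms) << shift, cap_ms);

    // Full jitter: pick uniformly in [0, ceiling] so concurrent clients spread out.
    std::uniform_int_distribution<int> jitter(0, static_cast<int>(ceiling));
    const int backoff_ms = jitter(rng);

    // Same clamp as in the retry loop: never sleep past the remaining budget.
    return std::max(0, std::min(backoff_ms, remaining_ms));
}
```

The result would be passed to `asyncBackoffSleep` in place of a fixed delay. Seeding the generator once per client, rather than per call, keeps successive delays independent.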

Type of Change

  • Refactoring
  • Bug fix
  • New feature
  • Documentation
  • Other:

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed

📚 Research & Knowledge (if applicable)

  • Is this PR based on scientific paper(s) or best practices?
    • If YES: research files created in /docs/research/?
    • If YES: linked in the module README under "Scientific Foundations" ("Wissenschaftliche Grundlagen")?
    • If YES: recorded in /docs/research/implementation_influence/?

Relevant sources:

  • Paper:
  • Best Practice:
  • Architecture Decision:

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Documentation updated (if needed)
  • No new warnings introduced

Original prompt

This section details the original issue you should resolve.

<issue_title>ML-Based Anomaly Detection</issue_title>
<issue_description>### Context

This issue implements the roadmap item 'ML-Based Anomaly Detection' for the observability domain. It is sourced from the consolidated roadmap under 🟠 High Priority — Near-term (v1.5.0 – v1.8.0) and targets milestone v1.7.0.

Primary detail section: Machine Learning-Based Anomaly Detection

Goal

Deliver the scoped changes for ML-Based Anomaly Detection in src/observability/ and complete the linked detail section in a release-ready state for v1.7.0.

Detailed Scope

Machine Learning-Based Anomaly Detection

Priority: High
Target Version: v1.7.0

Automated detection of performance anomalies using ML models.

Features:

  • Time-series forecasting (ARIMA, Prophet)
  • Outlier detection (Isolation Forest, DBSCAN)
  • Seasonal pattern recognition
  • Change point detection
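To make one of these features concrete, here is a minimal change point detection sketch using a two-sided CUSUM test. CUSUM is chosen only because it is short. The issue does not name an algorithm for change point detection, and `firstChangePoint`, `target`, `slack`, and `threshold` are hypothetical names rather than part of the proposed API.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Illustrative two-sided CUSUM detector (an assumption, not the issue's design).
// target:    expected mean of the series under normal operation
// slack:     per-sample deviation that is tolerated without accumulating evidence
// threshold: accumulated deviation at which an alarm is raised
// Returns the index at which the alarm fires, or std::nullopt if it never does.
std::optional<std::size_t> firstChangePoint(const std::vector<double>& values,
                                            double target, double slack, double threshold) {
    double upper = 0.0;  // evidence for an upward shift of the mean
    double lower = 0.0;  // evidence for a downward shift of the mean
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double deviation = values[i] - target;
        upper = std::max(0.0, upper + deviation - slack);
        lower = std::max(0.0, lower - deviation - slack);
        if (upper > threshold || lower > threshold) {
            return i;
        }
    }
    return std::nullopt;
}
```

The alarm index lags the true onset of the shift by however many samples it takes to accumulate `threshold` worth of deviation. A larger `slack` suppresses noise at the cost of slower detection.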

Implementation:

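#include <chrono>
#include <string>
#include <vector>

// Forward declarations. MLConfig, TimeSeries, and AnomalyExplanation are
// assumed to be defined elsewhere in src/observability/.
struct MLConfig;
struct TimeSeries;
struct Anomaly;
struct AnomalyExplanation;

class MLAnomalyDetector {
public:
    explicit MLAnomalyDetector(const MLConfig& config);

    // Train model on historical data
    void train(const std::vector<TimeSeries>& training_data);

    // Detect anomalies in real-time
    std::vector<Anomaly> detectAnomalies(const TimeSeries& current_data);

    // Predict future values
    TimeSeries forecast(std::chrono::hours horizon);

    // Explain anomaly (feature importance)
    AnomalyExplanation explainAnomaly(const Anomaly& anomaly);
};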

struct Anomaly {
    std::chrono::system_clock::time_point timestamp;
    std::string metric_name;
    double actual_value;
    double expected_value;
    double confidence_score;  // 0-1
    std::string severity;     // low, medium, high, critical
    std::vector<std::string> contributing_factors;
};

// Example usage
MLAnomalyDetector detector(config);
detector.train(historical_query_latencies);

auto anomalies = detector.detectAnomalies(current_query_latencies);
for (const auto& anomaly : anomalies) {
    if (anomaly.confidence_score > 0.8) {
        alertmanager.sendAlert({
            .alert_name = "MLAnomalyDetected",
            .severity = AlertSeverity::WARNING,
            .message = "Unusual pattern detected: " + anomaly.metric_name,
            .annotations = {
                {"expected", std::to_string(anomaly.expected_value)},
                {"actual", std::to_string(anomaly.actual_value)},
                {"confidence", std::to_string(anomaly.confidence_score)}
            }
        });
    }
}

Acceptance Criteria

  • Time-series forecasting (ARIMA, Prophet)
  • Outlier detection (Isolation Forest, DBSCAN)
  • Seasonal pattern recognition
  • Change point detection

Relationships

References

  • src/ROADMAP.md
  • src/observability/FUTURE_ENHANCEMENTS.md#machine-learning-based-anomaly-detection
  • Source key: roadmap:83:observability:v1.7.0:machine-learning-based-anomaly-detection

Generated from the consolidated source roadmap. Keep the roadmap and issue in sync when scope changes.

</issue_description>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Copilot AI requested a review from makr-code March 13, 2026 19:20
Copilot AI and others added 2 commits March 13, 2026 20:20
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement ML-based anomaly detection for observability" to "Make RemoteRegistryClient backoff non-blocking via shared scheduler" Mar 13, 2026
@makr-code makr-code marked this pull request as ready for review March 14, 2026 06:15
@makr-code makr-code requested a review from Copilot March 14, 2026 06:15

Development

Successfully merging this pull request may close these issues.

ML-Based Anomaly Detection

2 participants