3 changes: 1 addition & 2 deletions .gitignore
@@ -9,8 +9,7 @@ venv/
ENV/
.venv

# Test files and outputs
tests/
# Test outputs
*_updated*.xml
*_applied*.xml
*_diffs/
128 changes: 128 additions & 0 deletions docs/agents_monitoring_handoff.md
@@ -0,0 +1,128 @@
# Long-Running Agents Monitoring Handoff

## Summary

This handoff documents the implemented V1 monitoring capability:

- New **Agents** tab in Electron UI for starting/stopping long-running monitoring.
- Continuous Python worker (`anomaly_monitor.py`) with:
  - deterministic historical-deviation scoring,
  - quality/staleness gates,
  - optional LLM triage,
  - Neo4j persistence for `AgentRun` and `AnomalyEvent`,
  - event dedup and retention cleanup.
- IPC surface and stream channels from Electron main to renderer:
  - `agents:start`, `agents:status`, `agents:stop`,
  - `agents:list-events`, `agents:get-event`, `agents:ack-event`, `agents:cleanup`,
  - channels: `agent-status`, `agent-event`, `agent-error`, `agent-complete`.
- Graph drill-down integration with anomaly node support.
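
The deterministic scoring step can be sketched as below. This is a minimal illustration, not the code in `anomaly_rules.py`: the function names and the rule that either score alone flags a candidate are assumptions. The default thresholds (Z = 3, MAD = 3.5, 30 minimum points) are taken from the Agents tab defaults.

```python
import statistics


def z_score(history, value):
    """Standard score of `value` against the mean/stdev of `history`."""
    if len(history) < 2:
        return 0.0
    std = statistics.stdev(history)
    if std == 0:
        return 0.0
    return (value - statistics.fmean(history)) / std


def mad_score(history, value):
    """Modified z-score built on the median absolute deviation (MAD)."""
    if not history:
        return 0.0
    median = statistics.median(history)
    mad = statistics.median([abs(x - median) for x in history])
    if mad == 0:
        return 0.0
    # 0.6745 rescales MAD so the score is comparable to a z-score
    # for normally distributed data.
    return 0.6745 * (value - median) / mad


def is_candidate(history, value, z_threshold=3.0, mad_threshold=3.5, min_points=30):
    """Flag `value` when either score reaches its threshold (assumed OR rule)."""
    if len(history) < min_points:
        return False  # not enough history to judge
    return (abs(z_score(history, value)) >= z_threshold
            or abs(mad_score(history, value)) >= mad_threshold)
```

A perfectly flat history yields zero for both scores in this sketch, which is one reason a separate flatline rule is useful.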

## Files Changed

### Electron

- `electron-ui/index.html`
  - Added **Agents** nav button.
  - Added `tab-agents` page shell with controls, filters, feed, and detail panel.
  - Added graph filter option for anomaly layer.

- `electron-ui/styles.css`
  - Added Agents tab styles (`agents-*`, `status-chip`, feed cards, detail panel).

- `electron-ui/preload.js`
  - Added `agents*` API bridge methods.
  - Added event listeners for `agent-status/event/error/complete`.

- `electron-ui/main.js`
  - Added background agent runtime management (`activeAgentRun`).
  - Added stream parser for monitor stdout markers (`[AGENT_STATUS]`, etc.).
  - Added full `agents:*` IPC handlers.
  - Added graceful stop handling on app shutdown.

- `electron-ui/renderer.js`
  - Added Agents tab state management.
  - Added start/stop/refresh/cleanup/ack handlers.
  - Added realtime feed updates from agent channels.
  - Added event detail rendering and graph drill-down action.
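
The stdout marker protocol between the worker and `main.js` can be illustrated roughly as follows. The real parser lives in JavaScript; this Python sketch only shows the idea. The line format (`[AGENT_<KIND>]` followed by a JSON payload) and the mapping from marker to channel name are assumptions, since only `[AGENT_STATUS]` and the channel names are documented here.

```python
import json
import re

# Assumed line shape: "[AGENT_<KIND>] <json payload>"
MARKER_RE = re.compile(r"^\[AGENT_([A-Z_]+)\]\s*(.*)$")


def parse_marker_line(line):
    """Return (channel, payload) for a marker line, or None for ordinary log output."""
    match = MARKER_RE.match(line.strip())
    if match is None:
        return None
    kind, body = match.groups()
    # e.g. STATUS -> "agent-status" (hypothetical mapping)
    channel = "agent-" + kind.lower().replace("_", "-")
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        # Keep malformed payloads visible instead of dropping them.
        payload = {"raw": body}
    return channel, payload
```

Lines that do not match the marker pattern are treated as plain log output, so ordinary `print` statements in the worker do not reach the renderer as events.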

### Python backend

- `scripts/anomaly_rules.py` (new)
  - Deterministic scoring logic (`z`, `MAD`, rate, drift trend, flatline).
  - Quality/staleness helpers and dedup key generator.

- `scripts/anomaly_monitor.py` (new)
  - Long-running monitoring worker with CLI subcommands:
    - `run`, `status`, `list-events`, `get-event`, `ack-event`, `cleanup`, `replay-fixtures`.
  - Neo4j persistence + dedup + retention cleanup.
  - Optional LLM triage with structured JSON fallback.

- `scripts/ignition_api_client.py`
  - Added `query_tag_history(...)` and local-time-to-UTC conversion helper.

- `scripts/neo4j_ontology.py`
  - Added monitoring schema constraints/indexes for `AgentRun` / `AnomalyEvent`.
  - Added helper methods: list/get/cleanup anomaly events.
  - Added CLI commands:
    - `init-agent-schema`
    - `list-anomaly-events`
    - `get-anomaly-event`
    - `cleanup-anomaly-events`

- `scripts/graph_api.py`
  - Added node groups/colors for `AgentRun` and `AnomalyEvent`.
  - Extended neighbor center-node lookup to support `event_id` and `run_id`.
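
The staleness gate and dedup key mentioned above might look like the sketch below. The function names, the key fields, and the 300-second bucket are hypothetical. The 120-second staleness default matches the Agents tab default.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def is_stale(last_sample_at, now, staleness_sec=120):
    """True when the newest sample is older than the staleness window."""
    return (now - last_sample_at).total_seconds() > staleness_sec


def dedup_key(tag_path, category, observed_at, bucket_seconds=300):
    """Stable key so repeated detections in one time bucket collapse to one event.

    `bucket_seconds` is an assumed dedup window, not a documented setting.
    """
    bucket = int(observed_at.timestamp()) // bucket_seconds
    raw = f"{tag_path}|{category}|{bucket}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

Hashing the tag, category, and time bucket keeps the key short and deterministic, which suits a uniqueness constraint on the event node.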

### Fixtures

- `scripts/fixtures/anomaly_replay_cases.json` (new)
  - Deterministic replay cases:
    - normal baseline,
    - sudden spike,
    - slow drift,
    - flatline/stuck.
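
For the slow-drift and flatline cases, the detection idea can be sketched as follows. This is a hedged illustration: the function names, the `min_run` window, and the use of a least-squares slope are assumptions, not the rules actually implemented in `anomaly_rules.py`.

```python
def is_flatline(values, min_run=10, tolerance=0.0):
    """True when the last `min_run` samples vary by no more than `tolerance`."""
    if len(values) < min_run:
        return False
    tail = values[-min_run:]
    return max(tail) - min(tail) <= tolerance


def drift_slope(values):
    """Least-squares slope of `values` against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

A sustained non-zero slope suggests drift even when no single point is extreme enough to trip a z-score or MAD threshold.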

## Runtime Commands

### Deterministic replay validation

```bash
python3 scripts/anomaly_monitor.py replay-fixtures --fixture-file scripts/fixtures/anomaly_replay_cases.json
```

### Monitor worker manual run

```bash
python3 scripts/anomaly_monitor.py run --run-id demo-run --config-json '{"pollIntervalMs":1000}'
```
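
How `--config-json` might be merged over defaults is sketched below. Only the `pollIntervalMs` key appears in this handoff. The `load_config` helper and the clamp to the UI minimum of 1000 ms are assumptions about how the worker could treat it.

```python
import json

# Only `pollIntervalMs` is documented; other keys would follow the same pattern.
DEFAULTS = {"pollIntervalMs": 1000}


def load_config(config_json):
    """Merge a --config-json string over defaults and sanitize the poll interval."""
    overrides = json.loads(config_json) if config_json else {}
    if not isinstance(overrides, dict):
        raise ValueError("--config-json must decode to a JSON object")
    config = {**DEFAULTS, **overrides}
    # Assumed guard: never poll faster than the UI's minimum of 1000 ms.
    config["pollIntervalMs"] = max(1000, int(config["pollIntervalMs"]))
    return config
```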

### Event operations

```bash
python3 scripts/anomaly_monitor.py list-events --limit 50
python3 scripts/anomaly_monitor.py get-event --event-id <event_id>
python3 scripts/anomaly_monitor.py ack-event --event-id <event_id> --note "Reviewed by operator"
python3 scripts/anomaly_monitor.py cleanup --retention-days 14
```

## Known Environment Requirements

The Python environment must include packages from `requirements.txt`:

- `neo4j`
- `anthropic` (for LLM triage; deterministic fallback works without API key)
- `python-dotenv`
- `requests`

If `ANTHROPIC_API_KEY` is absent, triage automatically falls back to deterministic explanations.
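
The fallback behavior can be pictured with the sketch below. The event fields (`tag`, `z`), the `llm_triage` callable, and the result shape are hypothetical. Only the `ANTHROPIC_API_KEY` check and the deterministic fallback come from this handoff.

```python
import os


def deterministic_explanation(event):
    """Rule-based summary used when no LLM result is available."""
    return {
        "source": "deterministic",
        "summary": f"{event['tag']} deviated from history (z={event['z']:.1f})",
    }


def triage(event, llm_triage=None):
    """Use the LLM only when a key is configured and it returns usable output."""
    if os.environ.get("ANTHROPIC_API_KEY") and llm_triage is not None:
        try:
            result = llm_triage(event)
            if isinstance(result, dict) and "summary" in result:
                return {"source": "llm", **result}
        except Exception:
            pass  # any LLM failure falls through to the deterministic path
    return deterministic_explanation(event)
```

Routing every failure mode to the same deterministic path means an event always carries some explanation, whether or not the API is reachable.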

## Validation Status

- Syntax checks passed:
  - Python (`py_compile`) for all modified scripts.
  - JS syntax checks (`node --check`) for Electron files.
- Fixture replay passed:
  - `4/4` deterministic scenarios.

Live end-to-end validation against actual Ignition + Neo4j + Anthropic has not been run; it requires connected runtime services.

114 changes: 113 additions & 1 deletion electron-ui/index.html
@@ -3,7 +3,7 @@
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Content-Security-Policy" content="default-src 'self'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src https://fonts.gstatic.com; script-src 'self' 'unsafe-inline' https://cdnjs.cloudflare.com https://unpkg.com;">
<meta http-equiv="Content-Security-Policy" content="default-src 'self'; img-src 'self' data: https:; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src https://fonts.gstatic.com; script-src 'self' 'unsafe-inline' https://cdnjs.cloudflare.com https://unpkg.com;">
<title>Axilon</title>
<link rel="stylesheet" href="styles.css">
</head>
@@ -36,6 +36,13 @@
</svg>
<span class="nav-label">Assist</span>
</button>
<button class="nav-btn" data-tab="agents" title="Long-Running Agents">
<svg class="nav-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M4 5h7v7H4zM13 5h7v7h-7zM4 14h7v7H4z"/>
<path d="M16.5 14.5l1.5 1.5 3-3"/>
</svg>
<span class="nav-label">Agents</span>
</button>
<button class="nav-btn" data-tab="database" title="Database">
<svg class="nav-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<ellipse cx="12" cy="5" rx="9" ry="3"/>
@@ -532,6 +539,108 @@ <h2>Troubleshooting Assistant</h2>
</div>
</section>

<!-- Agents Tab -->
<section class="tab-content" id="tab-agents">
<header class="tab-header">
<h2>Long-Running Agents</h2>
<p>Continuously monitor live process data and triage anomalies with ontology context</p>
</header>

<div class="agents-topbar">
<div class="agents-run-controls">
<button class="btn btn-primary" id="btn-agents-start">Start Monitoring</button>
<button class="btn btn-secondary" id="btn-agents-stop" disabled>Stop</button>
<button class="btn btn-ghost" id="btn-agents-refresh">Refresh Events</button>
<button class="btn btn-ghost" id="btn-agents-cleanup">Cleanup Old</button>
</div>
<div class="agents-run-status">
<span class="status-chip" id="agents-status-chip">Idle</span>
<span class="agents-status-text" id="agents-status-text">No active run</span>
</div>
</div>

<div class="agents-config-row">
<label>Poll (ms)</label>
<input class="input input-sm" id="agents-config-poll-ms" type="number" min="1000" step="1000" value="1000">
<label>History (min)</label>
<input class="input input-sm" id="agents-config-history-min" type="number" min="10" step="10" value="360">
<label>Min Points</label>
<input class="input input-sm" id="agents-config-min-points" type="number" min="10" step="5" value="30">
<label class="agents-toggle-label">
<input type="checkbox" id="agents-config-auto-llm"> Auto LLM
</label>
<label>Max/Cycle</label>
<input class="input input-sm" id="agents-config-max-llm" type="number" min="1" step="1" value="5">
<label>Z</label>
<input class="input input-sm" id="agents-config-threshold-z" type="number" min="0.5" step="0.5" value="3">
<label>MAD</label>
<input class="input input-sm" id="agents-config-threshold-mad" type="number" min="0.5" step="0.5" value="3.5">
<label>Stale (sec)</label>
<input class="input input-sm" id="agents-config-staleness-sec" type="number" min="10" step="10" value="120">
</div>

<div class="agents-metrics-row">
<div class="metric-card"><span class="metric-label">Cycle (ms)</span><span class="metric-value" id="agents-metric-cycle">0</span></div>
<div class="metric-card"><span class="metric-label">Candidates</span><span class="metric-value" id="agents-metric-candidates">0</span></div>
<div class="metric-card"><span class="metric-label">Triaged</span><span class="metric-value" id="agents-metric-triaged">0</span></div>
<div class="metric-card"><span class="metric-label">Emitted</span><span class="metric-value" id="agents-metric-emitted">0</span></div>
<div class="metric-card"><span class="metric-label">Last heartbeat</span><span class="metric-value" id="agents-metric-heartbeat">n/a</span></div>
</div>

<div class="agents-health-section">
<div class="agents-health-header">
<h3>Subsystem Health</h3>
<div class="agents-health-actions">
<button class="btn btn-ghost btn-sm" id="btn-agents-clear-subsystem" style="display:none">Show All</button>
</div>
</div>
<div class="agents-health-grid" id="agents-health-grid">
<div class="agents-health-empty">Start monitoring to see subsystem health.</div>
</div>
</div>

<div class="agents-main">
<aside class="agents-feed-panel">
<div class="agents-feed-header">
<h3>Anomaly Feed</h3>
<div class="agents-feed-filters">
<select class="input input-sm" id="agents-filter-state">
<option value="">All states</option>
<option value="open">Open</option>
<option value="acknowledged">Acknowledged</option>
<option value="cleared">Cleared</option>
</select>
<select class="input input-sm" id="agents-filter-severity">
<option value="">All severity</option>
<option value="critical">Critical</option>
<option value="high">High</option>
<option value="medium">Medium</option>
<option value="low">Low</option>
</select>
<input class="input input-sm" id="agents-filter-search" placeholder="Search tag/equipment">
</div>
</div>
<div class="agents-event-list" id="agents-event-list">
<div class="agents-empty">No anomaly events yet.</div>
</div>
</aside>

<section class="agents-detail-panel">
<div class="agents-detail-header">
<h3>Event Details</h3>
<div class="agents-detail-actions">
<button class="btn btn-sm btn-primary" id="btn-agents-deep-analyze" disabled>Deep Analyze</button>
<button class="btn btn-sm btn-secondary" id="btn-agents-open-graph" disabled>Open in Graph</button>
<button class="btn btn-sm btn-ghost" id="btn-agents-ack" disabled>Acknowledge</button>
</div>
</div>
<div class="agents-detail-content" id="agents-event-detail">
<p class="text-muted">Select an anomaly event from the feed.</p>
</div>
</section>
</div>
</section>

<!-- Database Tab -->
<section class="tab-content" id="tab-database">
<header class="tab-header">
@@ -630,6 +739,7 @@ <h2>Ontology Graph</h2>
<option value="siemens-hmi">Siemens HMI</option>
<option value="mes">MES Layer</option>
<option value="troubleshooting">Troubleshooting</option>
<option value="anomaly">Anomalies</option>
<option value="flows">Flows</option>
</select>
</div>
@@ -1427,6 +1537,8 @@ <h3 id="graph-modal-title">Graph: Node</h3>
<script src="https://cdnjs.cloudflare.com/ajax/libs/cytoscape/3.28.1/cytoscape.min.js"></script>
<script src="https://unpkg.com/dagre@0.8.5/dist/dagre.min.js"></script>
<script src="https://unpkg.com/cytoscape-dagre@2.5.0/cytoscape-dagre.js"></script>
<script src="https://unpkg.com/layout-base@2.0.1/layout-base.js"></script>
<script src="https://unpkg.com/cose-base@2.2.0/cose-base.js"></script>
<script src="https://unpkg.com/cytoscape-fcose@2.2.0/cytoscape-fcose.js"></script>

<script src="graph-renderer.js"></script>