Security Architecture

This document describes Hyperagent's security model in detail.

Overview

Hyperagent implements defense-in-depth security through multiple layers. No single layer is sufficient — they work together to provide comprehensive protection.

Security Layers

Layer 1: Tool Gating

The Copilot SDK provides built-in tools (bash, edit, grep, read, write) that would allow arbitrary code execution. Hyperagent blocks most of them, allowing only safe tools like ask_user (for user interaction) and report_intent (protocol).

Implementation (src/agent/tool-gating.ts):

Intercepts all tool calls from the LLM
Maintains an allowlist of custom tools plus safe SDK tools
Rejects any tool not on the allowlist
Logs blocked attempts for debugging

Effect: The LLM cannot escape the sandbox by calling SDK tools. Even if prompted to "run bash", the tool call is rejected.

Layer 2: Hyperlight Micro-VMs

JavaScript executes in hardware-isolated micro-VMs powered by Hyperlight.

Isolation Properties:

No filesystem access (no fs, no __dirname)
No network access (no fetch, no net)
No process access (no child_process, no exec)
No environment access (no process.env)

Resource Limits:

Resource	Default	Configurable
CPU time	1000ms	`--cpu-timeout`
Wall clock	5000ms	`--wall-timeout`
Heap size	16MB	`--heap-size`
Scratch size	16MB	`--scratch-size`

Hypervisor Support:

KVM (Linux)
MSHV (Azure Linux)
WHP (Windows 11/ Windows Server 2025 Hyper-V)

Layer 3: Code Validation

LLM-generated code is validated before execution in an isolated Rust guest.

Validation Steps:

Parse with QuickJS (same parser as runtime)
Check syntax validity
Validate imports against available modules
Verify handler structure (default export function)

Implementation (src/agent/analysis-guest.ts):

Wraps the native Rust validation sandbox
Runs in its own Hyperlight micro-VM
Returns structured validation results

Why Validate?:

Catch syntax errors before execution
Ensure imports will resolve
Same parser = perfect fidelity
Validation itself is sandboxed

See VALIDATION.md for details.

Layer 4: Plugin Security

Plugins extend sandbox capabilities with host functions. Since plugins run on the host with full privileges, they require careful security.

Static Analysis:

Pattern matching via Rust (plugin_scan.rs)
Linear-time regex (ReDoS-safe)
Detects dangerous APIs: eval, exec, spawn, require

LLM Deep Audit:

Full source analysis by LLM
Risk classification (LOW → CRITICAL)
Description accuracy verification
Finding aggregation

Anti-Prompt-Injection:

Canary injection with unique UUIDs
LLM must report canaries as injections
Missing canaries = compromised audit
Hallucinated canaries = unreliable audit

Approval Persistence:

SHA-256 content hash stored with approval
Approval revoked if source changes
Forces re-audit on modification

See PLUGINS.md for full details.

Layer 5: Path Jailing

Filesystem plugins restrict access to configured directories.

Implementation (plugins/shared/path-jail.ts):

Resolves symlinks before checking
Validates path is within allowed directories
Blocks traversal attacks (../)
Prevents access outside jail

Configuration:

/plugin enable fs-read basePath=/home/user/data

Only files under /home/user/data are accessible.

Layer 6: SSRF Protection

The fetch plugin validates network requests to prevent Server-Side Request Forgery.

DNS Validation:

Resolves hostname before connecting
Blocks private IPs (10.x, 172.16-31.x, 192.168.x)
Blocks localhost and link-local
Blocks IPv6 private ranges

Post-Connect Validation:

Re-validates after connection
Catches DNS rebinding attacks
Validates the actual connected IP

Domain Allowlist:

/plugin enable fetch allowedDomains=api.github.com,example.com

Only requests to allowed domains succeed.

Threat Model

Threat 1: Malicious LLM Output

The LLM might be manipulated (via prompt injection) to generate harmful code.

Mitigations:

Tool gating blocks dangerous tools
Sandbox isolates code execution
Resource limits prevent DoS
Validation catches errors early

Threat 2: Malicious Plugin Author

A plugin might intentionally do harm.

Mitigations:

Static analysis flags dangerous patterns
LLM audit classifies risk
Canary verification detects manipulation
Human review before enable
Approval requires explicit action

Threat 3: Plugin Prompt Injection

Plugin source might contain strings that manipulate the LLM auditor.

Mitigations:

Canary injection with unique UUIDs
Auditor must report both canaries
Missing canaries = CRITICAL risk
Full source visible (no hiding)

Threat 4: Host Escape

Guest code might attempt to access host resources.

Mitigations:

Hardware isolation via hypervisor
No filesystem/network access
Plugins are audited before enable
Path jailing for file plugins
SSRF protection for network plugins

Threat 5: Supply Chain Attack

External packages might contain malicious code.

Mitigations:

Plugins must not import npm packages
Static analysis flags external imports
Only Node.js builtins allowed
Builtin modules are vendored

Trust Boundaries

┌──────────────────────────────────────────────────────────────┐
│                         UNTRUSTED                            │
│                                                              │
│  LLM Output       Guest JavaScript       Plugin Source       │
│  (any prompt)     (any code)             (any content)       │
└──────────────────────────────────────────────────────────────┘
                             │
           ┌─────────────────┴─────────────────┐
           │        SECURITY BOUNDARY          │
           │                                   │
           │  Tool Gating                      │
           │  Code Validation                  │
           │  Hardware Isolation               │
           │  Plugin Auditing                  │
           │  Path Jailing                     │
           │  SSRF Protection                  │
           └─────────────────┬─────────────────┘
                             │
┌──────────────────────────────────────────────────────────────┐
│                          TRUSTED                             │
│                                                              │
│  Agent Code        Copilot SDK         Node.js Runtime       │
│  (reviewed)        (GitHub)            (system)              │
└──────────────────────────────────────────────────────────────┘

Defense Summary

Attack Vector	Defense
Arbitrary command execution	Tool gating blocks SDK tools
Filesystem access	Sandbox isolation + path jailing
Network access	Sandbox isolation + SSRF protection
Resource exhaustion	CPU/memory limits
Malicious plugin	Static + LLM audit + approval
Prompt injection in plugin	Canary verification
Code injection	Validation before execution
Supply chain	No external packages allowed

Configuration Security

Environment Variables

Sensitive configuration should use environment variables:

GITHUB_TOKEN - GitHub authentication
Other secrets via process.env (not in code)

Plugin Configuration

Plugin configuration is session-scoped:

Not persisted across sessions
Must be re-entered each time
User controls what plugins can access

Approval Storage

Plugin approvals stored in ~/.hyperagent/approved-plugins.json:

Content hash invalidates on change
Machine-local (not shared)
Can be cleared manually

Limitations

Pre-Release Software

Hyperagent is pre-release software:

Not audited for production use
Security model is experimental
Run in a container for additional isolation
Docker image runs as non-root user (defence in depth against VM escape)

LLM Reliability

LLM auditing is not perfect:

May miss subtle issues
Canaries provide verification
Human review is important

Plugin Trust

Plugins run with host privileges:

Only enable plugins you trust
Review source before approval
Understand what you're enabling

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security Architecture

Overview

Security Layers

Layer 1: Tool Gating

Layer 2: Hyperlight Micro-VMs

Layer 3: Code Validation

Layer 4: Plugin Security

Layer 5: Path Jailing

Layer 6: SSRF Protection

Threat Model

Threat 1: Malicious LLM Output

Threat 2: Malicious Plugin Author

Threat 3: Plugin Prompt Injection

Threat 4: Host Escape

Threat 5: Supply Chain Attack

Trust Boundaries

Defense Summary

Configuration Security

Environment Variables

Plugin Configuration

Approval Storage

Limitations

Pre-Release Software

LLM Reliability

Plugin Trust

See Also

FilesExpand file tree

SECURITY.md

Latest commit

History

SECURITY.md

File metadata and controls

Security Architecture

Overview

Security Layers

Layer 1: Tool Gating

Layer 2: Hyperlight Micro-VMs

Layer 3: Code Validation

Layer 4: Plugin Security

Layer 5: Path Jailing

Layer 6: SSRF Protection

Threat Model

Threat 1: Malicious LLM Output

Threat 2: Malicious Plugin Author

Threat 3: Plugin Prompt Injection

Threat 4: Host Escape

Threat 5: Supply Chain Attack

Trust Boundaries

Defense Summary

Configuration Security

Environment Variables

Plugin Configuration

Approval Storage

Limitations

Pre-Release Software

LLM Reliability

Plugin Trust

See Also