diff --git a/docs/README.hooks.md b/docs/README.hooks.md index e23200323..819e50616 100644 --- a/docs/README.hooks.md +++ b/docs/README.hooks.md @@ -35,3 +35,4 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-hooks) for guidelines on how to | [Secrets Scanner](../hooks/secrets-scanner/README.md) | Scans files modified during a Copilot coding agent session for leaked secrets, credentials, and sensitive data | sessionEnd | `hooks.json`
`scan-secrets.sh` | | [Session Auto-Commit](../hooks/session-auto-commit/README.md) | Automatically commits and pushes changes when a Copilot coding agent session ends | sessionEnd | `auto-commit.sh`
`hooks.json` | | [Session Logger](../hooks/session-logger/README.md) | Logs all Copilot coding agent session activity for audit and analysis | sessionStart, sessionEnd, userPromptSubmitted | `hooks.json`
`log-prompt.sh`
`log-session-end.sh`
`log-session-start.sh` | +| [Tool Guardian](../hooks/tool-guardian/README.md) | Blocks dangerous tool operations (destructive file ops, force pushes, DB drops) before the Copilot coding agent executes them | preToolUse | `guard-tool.sh`
`hooks.json` | diff --git a/hooks/tool-guardian/README.md b/hooks/tool-guardian/README.md new file mode 100644 index 000000000..028b74739 --- /dev/null +++ b/hooks/tool-guardian/README.md @@ -0,0 +1,183 @@ +--- +name: 'Tool Guardian' +description: 'Blocks dangerous tool operations (destructive file ops, force pushes, DB drops) before the Copilot coding agent executes them' +tags: ['security', 'safety', 'preToolUse', 'guardrails'] +--- + +# Tool Guardian Hook + +Blocks dangerous tool operations before a GitHub Copilot coding agent executes them, acting as a safety net against destructive commands, force pushes, database drops, and other high-risk actions. + +## Overview + +AI coding agents can autonomously execute shell commands, file operations, and database queries. Without guardrails, a misinterpreted instruction could lead to irreversible damage. This hook intercepts every tool invocation at the `preToolUse` event and scans it against ~20 threat patterns across 6 categories: + +- **Destructive file ops**: `rm -rf /`, deleting `.env` or `.git` +- **Destructive git ops**: `git push --force` to main/master, `git reset --hard` +- **Database destruction**: `DROP TABLE`, `DROP DATABASE`, `TRUNCATE`, `DELETE FROM` without `WHERE` +- **Permission abuse**: `chmod 777`, recursive world-writable permissions +- **Network exfiltration**: `curl | bash`, `wget | sh`, uploading files via `curl --data @` +- **System danger**: `sudo`, `npm publish` + +## Features + +- **Two guard modes**: `block` (exit non-zero to prevent execution) or `warn` (log only) +- **Safer alternatives**: Every blocked pattern includes a suggestion for a safer command +- **Allowlist support**: Skip specific patterns via `TOOL_GUARD_ALLOWLIST` +- **Structured logging**: JSON Lines output for integration with monitoring tools +- **Fast execution**: 10-second timeout; no external network calls +- **Zero dependencies**: Uses only standard Unix tools (`grep`, `sed`); optional `jq` for input parsing + +## 
Installation + +1. Copy the hook folder to your repository: + + ```bash + cp -r hooks/tool-guardian .github/hooks/ + ``` + +2. Ensure the script is executable: + + ```bash + chmod +x .github/hooks/tool-guardian/guard-tool.sh + ``` + +3. Create the logs directory and add it to `.gitignore`: + + ```bash + mkdir -p logs/copilot/tool-guardian + echo "logs/" >> .gitignore + ``` + +4. Commit the hook configuration to your repository's default branch. + +## Configuration + +The hook is configured in `hooks.json` to run on the `preToolUse` event: + +```json +{ + "version": 1, + "hooks": { + "preToolUse": [ + { + "type": "command", + "bash": ".github/hooks/tool-guardian/guard-tool.sh", + "cwd": ".", + "env": { + "GUARD_MODE": "block" + }, + "timeoutSec": 10 + } + ] + } +} +``` + +### Environment Variables + +| Variable | Values | Default | Description | +|----------|--------|---------|-------------| +| `GUARD_MODE` | `warn`, `block` | `block` | `warn` logs threats only; `block` exits non-zero to prevent tool execution | +| `SKIP_TOOL_GUARD` | `true` | unset | Disable the guardian entirely | +| `TOOL_GUARD_LOG_DIR` | path | `logs/copilot/tool-guardian` | Directory where guard logs are written | +| `TOOL_GUARD_ALLOWLIST` | comma-separated | unset | Substrings that bypass all scanning when found in the tool name or input (e.g., `git push --force,npm publish`) | + +## How It Works + +1. Before the Copilot coding agent executes a tool, the hook receives the tool invocation as JSON on stdin +2. Extracts `toolName` and `toolInput` fields (via `jq` if available, regex fallback otherwise) +3. Checks the combined text against the allowlist; if any allowlist entry appears as a substring, logs `guard_skipped` and skips all scanning +4. Scans the combined text against ~20 case-insensitive regex threat patterns across 6 threat categories +5. Reports findings with category, severity, matched text, and a safer alternative +6. Writes a structured JSON log entry for audit purposes +7. In `block` mode, exits non-zero to prevent the tool from executing +8. 
In `warn` mode, logs the threat and allows execution to proceed + +## Threat Categories + +| Category | Severity | Key Patterns | Suggestion | +|----------|----------|-------------|------------| +| `destructive_file_ops` | critical | `rm -rf /`, `rm -rf ~`, `rm -rf .`, delete `.env`/`.git` | Use targeted paths or `mv` to back up | +| `destructive_git_ops` | critical/high | `git push --force` to main/master, `git reset --hard`, `git clean -fd` | Use `--force-with-lease`, `git stash`, dry-run | +| `database_destruction` | critical/high | `DROP TABLE`, `DROP DATABASE`, `TRUNCATE`, `DELETE FROM` without WHERE | Use migrations, backups, add WHERE clause | +| `permission_abuse` | high | `chmod 777`, `chmod -R 777` | Use `755` for dirs, `644` for files | +| `network_exfiltration` | critical/high | `curl \| bash`, `wget \| sh`, `curl --data @file` | Download first, review, then execute | +| `system_danger` | high | `sudo`, `npm publish` | Use least privilege; `--dry-run` first | + +## Examples + +### Safe command (exit 0) + +```bash +echo '{"toolName":"bash","toolInput":"git status"}' | bash hooks/tool-guardian/guard-tool.sh +``` + +### Blocked command (exit 1) + +```bash +echo '{"toolName":"bash","toolInput":"git push --force origin main"}' | \ + GUARD_MODE=block bash hooks/tool-guardian/guard-tool.sh +``` + +``` +🛡️ Tool Guardian: 1 threat(s) detected in 'bash' invocation + + CATEGORY SEVERITY MATCH SUGGESTION + -------- -------- ----- ---------- + destructive_git_ops critical git push --force origin main Use 'git push --force-with-lease' or push to a feature branch + +🚫 Operation blocked: resolve the threats above or adjust TOOL_GUARD_ALLOWLIST. + Set GUARD_MODE=warn to log without blocking. 
+``` + +### Warn mode (exit 0, threat logged) + +```bash +echo '{"toolName":"bash","toolInput":"rm -rf /"}' | \ + GUARD_MODE=warn bash hooks/tool-guardian/guard-tool.sh +``` + +### Allowlisted command (exit 0) + +```bash +echo '{"toolName":"bash","toolInput":"git push --force origin main"}' | \ + TOOL_GUARD_ALLOWLIST="git push --force" bash hooks/tool-guardian/guard-tool.sh +``` + +## Log Format + +Guard events are written to `logs/copilot/tool-guardian/guard.log` in JSON Lines format: + +```json +{"timestamp":"2026-03-16T10:30:00Z","event":"threats_detected","mode":"block","tool":"bash","threat_count":1,"threats":[{"category":"destructive_git_ops","severity":"critical","match":"git push --force origin main","suggestion":"Use 'git push --force-with-lease' or push to a feature branch"}]} +``` + +```json +{"timestamp":"2026-03-16T10:30:00Z","event":"guard_passed","mode":"block","tool":"bash"} +``` + +```json +{"timestamp":"2026-03-16T10:30:00Z","event":"guard_skipped","reason":"allowlisted","tool":"bash"} +``` + +## Customization + +- **Add custom patterns**: Edit the `PATTERNS` array in `guard-tool.sh` to add project-specific threat patterns +- **Adjust severity**: Change severity levels for patterns that need different treatment +- **Allowlist known commands**: Use `TOOL_GUARD_ALLOWLIST` for commands that are safe in your context +- **Change log location**: Set `TOOL_GUARD_LOG_DIR` to route logs to your preferred directory + +## Disabling + +To temporarily disable the guardian: + +- Set `SKIP_TOOL_GUARD=true` in the hook environment +- Or remove the `preToolUse` entry from `hooks.json` + +## Limitations + +- Pattern-based detection; does not perform semantic analysis of command intent +- May produce false positives for commands that match patterns in safe contexts (use the allowlist to suppress these) +- Scans the text representation of tool input; cannot detect obfuscated or encoded commands +- Requires tool invocations to be passed as JSON on stdin with 
`toolName` and `toolInput` fields diff --git a/hooks/tool-guardian/guard-tool.sh b/hooks/tool-guardian/guard-tool.sh new file mode 100755 index 000000000..f3639ba59 --- /dev/null +++ b/hooks/tool-guardian/guard-tool.sh @@ -0,0 +1,202 @@ +#!/bin/bash + +# Tool Guardian Hook +# Blocks dangerous tool operations (destructive file ops, force pushes, DB drops, +# etc.) before the Copilot coding agent executes them. +# +# Environment variables: +# GUARD_MODE - "warn" (log only) or "block" (exit non-zero on threats) (default: block) +# SKIP_TOOL_GUARD - "true" to disable entirely (default: unset) +# TOOL_GUARD_LOG_DIR - Directory for guard logs (default: logs/copilot/tool-guardian) +# TOOL_GUARD_ALLOWLIST - Comma-separated patterns to skip (default: unset) + +set -euo pipefail + +# --------------------------------------------------------------------------- +# Early exit if disabled +# --------------------------------------------------------------------------- +if [[ "${SKIP_TOOL_GUARD:-}" == "true" ]]; then + exit 0 +fi + +# --------------------------------------------------------------------------- +# Read tool invocation from stdin (JSON with toolName + toolInput) +# --------------------------------------------------------------------------- +INPUT=$(cat) + +MODE="${GUARD_MODE:-block}" +LOG_DIR="${TOOL_GUARD_LOG_DIR:-logs/copilot/tool-guardian}" +TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ") + +mkdir -p "$LOG_DIR" +LOG_FILE="$LOG_DIR/guard.log" + +# --------------------------------------------------------------------------- +# Extract tool name and input text +# --------------------------------------------------------------------------- +TOOL_NAME="" +TOOL_INPUT="" + +if command -v jq &>/dev/null; then + TOOL_NAME=$(printf '%s' "$INPUT" | jq -r '.toolName // empty' 2>/dev/null || echo "") + TOOL_INPUT=$(printf '%s' "$INPUT" | jq -r '.toolInput // empty' 2>/dev/null || echo "") +fi + +# Fallback: extract with grep/sed if jq unavailable or fields empty +if [[ -z "$TOOL_NAME" 
]]; then + TOOL_NAME=$(printf '%s' "$INPUT" | grep -oE '"toolName"\s*:\s*"[^"]*"' | head -1 | sed 's/.*"toolName"\s*:\s*"//;s/"//') +fi +if [[ -z "$TOOL_INPUT" ]]; then + TOOL_INPUT=$(printf '%s' "$INPUT" | grep -oE '"toolInput"\s*:\s*"[^"]*"' | head -1 | sed 's/.*"toolInput"\s*:\s*"//;s/"//') +fi + +# Combine for pattern matching +COMBINED="${TOOL_NAME} ${TOOL_INPUT}" + +# --------------------------------------------------------------------------- +# Parse allowlist +# --------------------------------------------------------------------------- +ALLOWLIST=() +if [[ -n "${TOOL_GUARD_ALLOWLIST:-}" ]]; then + IFS=',' read -ra ALLOWLIST <<< "$TOOL_GUARD_ALLOWLIST" +fi + +is_allowlisted() { + local text="$1" + for pattern in "${ALLOWLIST[@]}"; do + pattern=$(printf '%s' "$pattern" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//') + [[ -z "$pattern" ]] && continue + if [[ "$text" == *"$pattern"* ]]; then + return 0 + fi + done + return 1 +} + +# Check allowlist early — if the combined text matches, skip all scanning +if [[ ${#ALLOWLIST[@]} -gt 0 ]] && is_allowlisted "$COMBINED"; then + printf '{"timestamp":"%s","event":"guard_skipped","reason":"allowlisted","tool":"%s"}\n' \ + "$TIMESTAMP" "$TOOL_NAME" >> "$LOG_FILE" + exit 0 +fi + +# --------------------------------------------------------------------------- +# Threat patterns (6 categories, ~20 patterns) +# +# Each entry: "CATEGORY:::SEVERITY:::REGEX:::SUGGESTION" +# Uses ::: as delimiter to avoid conflicts with regex pipe characters +# --------------------------------------------------------------------------- +PATTERNS=( + # Destructive file operations + "destructive_file_ops:::critical:::rm -rf /:::Use targeted 'rm' on specific paths instead of root" + "destructive_file_ops:::critical:::rm -rf ~:::Use targeted 'rm' on specific paths instead of home directory" + "destructive_file_ops:::critical:::rm -rf \.:::Use targeted 'rm' on specific files instead of current directory" + "destructive_file_ops:::critical:::rm -rf 
\.\.:::Never remove parent directories recursively" + "destructive_file_ops:::critical:::(rm|del|unlink).*\.env:::Use 'mv' to back up .env files before removing" + "destructive_file_ops:::critical:::(rm|del|unlink).*\.git[^i]:::Never delete .git directory — use 'git' commands to manage repo state" + + # Destructive git operations + "destructive_git_ops:::critical:::git push --force.*(main|master):::Use 'git push --force-with-lease' or push to a feature branch" + "destructive_git_ops:::critical:::git push -f.*(main|master):::Use 'git push --force-with-lease' or push to a feature branch" + "destructive_git_ops:::high:::git reset --hard:::Use 'git stash' to preserve changes, or 'git reset --soft'" + "destructive_git_ops:::high:::git clean -fd:::Use 'git clean -n' (dry run) first to preview what will be deleted" + + # Database destruction + "database_destruction:::critical:::DROP TABLE:::Use 'ALTER TABLE' or create a migration with rollback support" + "database_destruction:::critical:::DROP DATABASE:::Create a backup first; consider revoking DROP privileges" + "database_destruction:::critical:::TRUNCATE:::Use 'DELETE FROM ... 
WHERE' with a condition for safer data removal" + "database_destruction:::high:::DELETE FROM [a-zA-Z_]+ *;:::Add a WHERE clause to 'DELETE FROM' to avoid deleting all rows" + + # Permission abuse + "permission_abuse:::high:::chmod 777:::Use 'chmod 755' for directories or 'chmod 644' for files" + "permission_abuse:::high:::chmod -R 777:::Use specific permissions ('chmod -R 755') and limit scope" + + # Network exfiltration + "network_exfiltration:::critical:::curl.*\|.*bash:::Download the script first, review it, then execute" + "network_exfiltration:::critical:::wget.*\|.*sh:::Download the script first, review it, then execute" + "network_exfiltration:::high:::curl.*--data.*@:::Review what data is being sent before using 'curl --data @file'" + + # System danger + "system_danger:::high:::sudo :::Avoid 'sudo' — run commands with the least privilege needed" + "system_danger:::high:::npm publish:::Use 'npm publish --dry-run' first to verify package contents" +) + +# --------------------------------------------------------------------------- +# Escape a string for safe JSON embedding +# --------------------------------------------------------------------------- +json_escape() { + # Escape backslashes first, then double quotes, then tab characters + local s=$1 + s=${s//\\/\\\\} + s=${s//\"/\\\"} + s=${s//$'\t'/\\t} + printf '%s' "$s" +} + +# --------------------------------------------------------------------------- +# Scan combined text against threat patterns +# --------------------------------------------------------------------------- +THREATS=() +THREAT_COUNT=0 + +for entry in "${PATTERNS[@]}"; do + category="${entry%%:::*}" + rest="${entry#*:::}" + severity="${rest%%:::*}" + rest="${rest#*:::}" + regex="${rest%%:::*}" + suggestion="${rest#*:::}" + + if printf '%s\n' "$COMBINED" | grep -qiE "$regex" 2>/dev/null; then + local_match=$(printf '%s\n' "$COMBINED" | grep -oiE "$regex" 2>/dev/null | head -1) + # Join fields with tabs so the IFS=$'\t' read in the output section can split them + THREATS+=("${category}"$'\t'"${severity}"$'\t'"${local_match}"$'\t'"${suggestion}") + THREAT_COUNT=$((THREAT_COUNT + 1)) + fi +done + +# 
--------------------------------------------------------------------------- +# Output and logging +# --------------------------------------------------------------------------- +if [[ $THREAT_COUNT -gt 0 ]]; then + echo "" + echo "🛡️ Tool Guardian: $THREAT_COUNT threat(s) detected in '$TOOL_NAME' invocation" + echo "" + printf " %-24s %-10s %-40s %s\n" "CATEGORY" "SEVERITY" "MATCH" "SUGGESTION" + printf " %-24s %-10s %-40s %s\n" "--------" "--------" "-----" "----------" + + # Build JSON findings array + FINDINGS_JSON="[" + FIRST=true + for threat in "${THREATS[@]}"; do + IFS=$'\t' read -r category severity match suggestion <<< "$threat" + + # Truncate match for display + display_match="$match" + if [[ ${#match} -gt 38 ]]; then + display_match="${match:0:35}..." + fi + printf " %-24s %-10s %-40s %s\n" "$category" "$severity" "$display_match" "$suggestion" + + if [[ "$FIRST" != "true" ]]; then + FINDINGS_JSON+="," + fi + FIRST=false + FINDINGS_JSON+="{\"category\":\"$(json_escape "$category")\",\"severity\":\"$(json_escape "$severity")\",\"match\":\"$(json_escape "$match")\",\"suggestion\":\"$(json_escape "$suggestion")\"}" + done + FINDINGS_JSON+="]" + + echo "" + + # Write structured log entry + printf '{"timestamp":"%s","event":"threats_detected","mode":"%s","tool":"%s","threat_count":%d,"threats":%s}\n' \ + "$TIMESTAMP" "$MODE" "$(json_escape "$TOOL_NAME")" "$THREAT_COUNT" "$FINDINGS_JSON" >> "$LOG_FILE" + + if [[ "$MODE" == "block" ]]; then + echo "🚫 Operation blocked: resolve the threats above or adjust TOOL_GUARD_ALLOWLIST." + echo " Set GUARD_MODE=warn to log without blocking." + exit 1 + else + echo "⚠️ Threats logged in warn mode. Set GUARD_MODE=block to prevent dangerous operations." 
+ fi +else + # Log clean result + printf '{"timestamp":"%s","event":"guard_passed","mode":"%s","tool":"%s"}\n' \ + "$TIMESTAMP" "$MODE" "$(json_escape "$TOOL_NAME")" >> "$LOG_FILE" +fi + +exit 0 diff --git a/hooks/tool-guardian/hooks.json b/hooks/tool-guardian/hooks.json new file mode 100644 index 000000000..bd0e54653 --- /dev/null +++ b/hooks/tool-guardian/hooks.json @@ -0,0 +1,16 @@ +{ + "version": 1, + "hooks": { + "preToolUse": [ + { + "type": "command", + "bash": ".github/hooks/tool-guardian/guard-tool.sh", + "cwd": ".", + "env": { + "GUARD_MODE": "block" + }, + "timeoutSec": 10 + } + ] + } +} diff --git a/logs/copilot/tool-guardian/guard.log b/logs/copilot/tool-guardian/guard.log new file mode 100644 index 000000000..197d0b898 --- /dev/null +++ b/logs/copilot/tool-guardian/guard.log @@ -0,0 +1,2 @@ +{"timestamp":"2026-03-17T08:25:48Z","event":"guard_passed","mode":"block","tool":"bash"} +{"timestamp":"2026-03-17T08:25:49Z","event":"threats_detected","mode":"block","tool":"bash","threat_count":1,"threats":[{"category":"destructive_file_ops","severity":"critical","match":"rm -rf /","suggestion":"Use targeted 'rm' on specific paths instead of root"}]}
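Reviewer sketch (not part of this diff): the `PATTERNS` entries in `guard-tool.sh` use `:::` as a field delimiter and are split with bash parameter expansion before the regex is handed to `grep -qiE`. The snippet below isolates that step so a candidate entry can be checked against sample inputs before it is added to the array. `parse_entry` and `matches_entry` are hypothetical helper names that do not exist in the script; the sample entry is copied from the diff.

```bash
#!/bin/bash
# Sketch only: parse_entry and matches_entry are illustrative helpers,
# not functions defined in guard-tool.sh.

# Split "CATEGORY:::SEVERITY:::REGEX:::SUGGESTION" into four global variables,
# using the same parameter expansions as the scan loop in the script.
parse_entry() {
  local entry="$1" rest
  category="${entry%%:::*}"   # text before the first :::
  rest="${entry#*:::}"        # drop the category field
  severity="${rest%%:::*}"
  rest="${rest#*:::}"         # drop the severity field
  regex="${rest%%:::*}"
  suggestion="${rest#*:::}"   # everything after the third :::
}

# Return 0 if the text matches the entry's regex (case-insensitive, extended).
matches_entry() {
  parse_entry "$1"
  printf '%s\n' "$2" | grep -qiE -- "$regex"
}

ENTRY="destructive_git_ops:::critical:::git push --force.*(main|master):::Use 'git push --force-with-lease' or push to a feature branch"

# The script scans "<toolName> <toolInput>", so the samples include the tool name.
for input in "bash git push --force origin main" "bash git status"; do
  if matches_entry "$ENTRY" "$input"; then
    printf 'match    [%s/%s] %s\n' "$category" "$severity" "$input"
  else
    printf 'no match %s\n' "$input"
  fi
done
```

Because the split consumes the first three `:::` delimiters, a regex that itself contains `:::` would be cut short, while a suggestion may contain it safely.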