The command safety agent is a specialized security layer that provides intelligent analysis of shell commands before they are executed. Unlike traditional pattern-based filtering, it uses LLM-powered analysis to understand command context and intent, providing more nuanced and comprehensive protection against dangerous operations.
The command safety agent is not a regular subagent that users can call directly. It's an internal security mechanism that:
- Is automatically invoked whenever `execute_command` is used
- Analyzes commands in full context (including working directory)
- Uses conservative security policies to "err on the side of caution"
- Blocks commands that could cause harm, data loss, or security issues
- Works alongside existing pattern-based safety checks
- Executor Integration: The safety checker is integrated into `ActionExecutor` when an LLM provider is available (see the sketch after this list)
- Pre-execution Analysis: Commands are analyzed before reaching the actual `execute_command` function
- Context-aware: Considers both the command string and working directory
- Fail-safe: If the safety check fails, commands are blocked by default
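A minimal sketch of how this pre-execution gate could look. The `execute` signature and the blocked-command message are taken from the test shown later in this document; the `_check_command_safety` and `_run_tool` helpers are illustrative assumptions, not clippy-code's actual internals:

```python
class ActionExecutor:
    def __init__(self, permission_manager, llm_provider=None):
        self.permission_manager = permission_manager
        self.llm_provider = llm_provider

    def execute(self, tool_name: str, args: dict):
        # Only shell commands pass through the safety agent; other
        # tools are unaffected.
        if tool_name == "execute_command" and self.llm_provider is not None:
            allowed, reason = self._check_command_safety(
                args["command"], args.get("working_dir", ".")
            )
            if not allowed:
                # Blocked before the command ever reaches execute_command.
                return False, f"Command blocked by safety agent: {reason}", None
        return self._run_tool(tool_name, args)
```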
The safety agent follows these principles:
- Ultra-conservative: Better to block a safe command than allow a dangerous one
- Context-aware: The same command may be safe in one directory but dangerous in another (for example, `rm -rf build/` is routine inside a project checkout but destructive when run from `$HOME`)
- LLM-powered: Uses language model intelligence to understand command semantics
- Comprehensive: Blocks categories of dangerous behavior that simple pattern matching misses
The safety agent blocks commands that:
- Delete files/directories (`rm`, `rmdir`, `shred`), especially recursive operations
- Format disks or filesystems (`mkfs`, `fdisk`, `format`)
- Overwrite critical files with redirects
- Modify files in `/etc/`, `/boot/`, `/sys/`, `/proc/`
- Change kernel modules
- Modify sensitive system file permissions (`chmod`, `chown`)
- Install software without explicit consent (`apt`, `yum`, `pip`, `npm`, `cargo`) or run other package manager operations
- Download and execute code (`curl | bash`, `wget | sh`)
- Perform network attacks or scanning (`nmap`, `netcat`)
- Access or compromise credentials/API keys
- Use `sudo` unless clearly necessary and safe
- Disrupt the system (fork bombs, killing system processes)
- Overwrite block devices (`dd` to `/dev/sda`)
- Modify `/dev/null` or other special files
When using clippy-code with an LLM provider, the safety agent is automatically enabled:
```python
# In ActionExecutor.__init__
llm_provider = LLMProvider(api_key="...", model="gpt-4")
executor = ActionExecutor(permission_manager, llm_provider=llm_provider)
```

The LLM provider can be updated after initialization:

```python
executor.set_llm_provider(new_provider)
```

This is automatically called when:
- The agent switches models using the `/model` command
- The agent loads saved conversations
If no LLM provider is available, the system falls back to basic pattern matching (see the sketch after this list):
- Existing dangerous pattern detection still works
- No LLM-powered analysis is performed
- Commands execute if they don't match known dangerous patterns
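A minimal sketch of what this fallback can look like; the `DANGEROUS_PATTERNS` list and `pattern_check` helper are illustrative assumptions, not clippy-code's actual rule set:

```python
import re

# Illustrative dangerous patterns; a real rule set is more extensive.
DANGEROUS_PATTERNS = [
    r"\brm\s+-[a-zA-Z]*[rf][a-zA-Z]*\b",     # recursive/forced deletes
    r"\bmkfs(\.\w+)?\b",                     # filesystem formatting
    r"\bdd\b.*\bof=/dev/",                   # writing to block devices
    r"\b(curl|wget)\b.*\|\s*(bash|sh)\b",    # download-and-execute
]

def pattern_check(command: str) -> bool:
    """Return True if the command matches no known dangerous pattern."""
    return not any(re.search(p, command) for p in DANGEROUS_PATTERNS)
```

With this helper, `pattern_check("ls -la")` returns `True` while `pattern_check("rm -rf /")` returns `False`.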
The safety agent uses a highly specialized system prompt that:
- Emphasizes extreme caution
- Provides clear examples of blocked vs allowed commands
- Requires exact "ALLOW:" or "BLOCK:" response format
- Includes working directory context in analysis
Example of the system prompt structure:
```
You are a specialized shell command security agent...
ERR ON THE SIDE OF CAUTION...

You must BLOCK commands that:
- Delete files/directories...

Respond with EXACTLY one line:
ALLOW: [brief reason if safe] or
BLOCK: [specific security concern]
```
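Because the agent must answer with exactly one `ALLOW:` or `BLOCK:` line, parsing the verdict stays simple. A minimal sketch, assuming the provider streams the response as text chunks (as the `get_streaming_response` mock in the tests below suggests) and treating anything malformed as a block:

```python
def parse_verdict(chunks) -> tuple[bool, str]:
    """Join streamed chunks and extract the ALLOW:/BLOCK: verdict."""
    response = "".join(chunks).strip()
    if response.startswith("ALLOW:"):
        return True, response[len("ALLOW:"):].strip()
    if response.startswith("BLOCK:"):
        return False, response[len("BLOCK:"):].strip()
    # A malformed response is treated as a block (fail-safe default).
    return False, f"Unrecognized safety response: {response!r}"
```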
The safety agent includes comprehensive tests covering:
- Safe commands being allowed
- Dangerous commands being blocked
- LLM failure handling (fail-safe blocking)
- Working directory context awareness
- Integration with executor
```python
from unittest.mock import Mock

def test_dangerous_command_blocked():
    # permission_manager is provided elsewhere in the test suite
    # (e.g., as a fixture).
    mock_provider = Mock()
    mock_provider.get_streaming_response.return_value = ["BLOCK: Too dangerous"]
    executor = ActionExecutor(permission_manager, llm_provider=mock_provider)
    success, message, result = executor.execute(
        "execute_command", {"command": "rm -rf .", "working_dir": "."}
    )
    assert success is False
    assert "blocked by safety agent" in message.lower()
```

The safety check adds minimal overhead:
- Typically < 1 second for LLM analysis
- Parallelizable with other security checks (see the sketch after this list)
- No impact on non-command tools
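Because the pattern check and the LLM analysis are independent, they can in principle run concurrently. This is a sketch of the idea only, reusing the hypothetical `pattern_check` helper from the fallback sketch above together with a hypothetical `llm_safety_check` function:

```python
import asyncio

async def check_command(command: str, working_dir: str) -> tuple[bool, str]:
    # Run both checks in worker threads and wait for both results.
    pattern_ok, (llm_ok, reason) = await asyncio.gather(
        asyncio.to_thread(pattern_check, command),
        asyncio.to_thread(llm_safety_check, command, working_dir),
    )
    if not pattern_ok:
        return False, "Matched a known dangerous pattern"
    return llm_ok, reason
```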
Error handling is fail-safe by design:
- If the LLM provider is unavailable, the system falls back to pattern matching
- Network failures or timeouts result in command blocking
- There is no risk of executing commands due to safety check failures (sketched below)
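A minimal sketch of that fail-safe wrapping, again assuming a hypothetical `llm_safety_check` helper: any exception from the provider results in a block, never an execution:

```python
def safe_check(command: str, working_dir: str) -> tuple[bool, str]:
    try:
        return llm_safety_check(command, working_dir)
    except Exception as exc:
        # Network errors, timeouts, and malformed responses all land
        # here; the safety layer never fails open.
        return False, f"Safety check failed, blocking by default: {exc}"
```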
The safety agent prioritizes security over convenience:
- False positives (blocking safe commands) are preferred over false negatives
- Users can override with YOLO mode if needed (at their own risk)
- Patterns and prompts are conservative by design
Potential improvements being considered:
- Git repository awareness (don't delete .git)
- Project file analysis (don't delete important source files)
- User permission context
- User feedback integration
- Adaptive risk assessment
- Personalized safety profiles
- Container security awareness
- Cloud service specific protections
- CI/CD pipeline safety
If a command is unexpectedly blocked:
- Check the error message: It includes the safety agent's reasoning
- Verify working directory: Same command may be safe in different contexts
- Review command construction: Try safer alternatives
- Use YOLO mode: For trusted environments (use with caution)
If safety checks are failing completely:
- Check LLM provider status: Ensure API keys are valid
- Network connectivity: Verify internet access for API calls
- Provider configuration: Check model availability and settings
- Fallback to pattern mode: System will work without LLM if needed
Known limitations:
- Not perfect: May produce false positives
- Context dependent: The same command may receive a different risk assessment in different directories
- LLM dependent: Requires a working LLM provider for enhanced protection
- English commands: Optimized for analyzing English-language commands
When improving the safety agent:
- Test thoroughly: Use comprehensive test cases
- Stay conservative: Don't reduce safety checks
- Document changes: Update prompts and examples
- Consider security: Always prioritize security over convenience