Skip to content

Security: Bande-a-Bonnot/Boucle-framework

Security

SECURITY.md

Security Architecture

Boucle implements a defense-in-depth security model to protect against prompt injection attacks and maintain trust boundaries between system components and external content.

Security Layers

1. Trust Boundaries

All context is explicitly marked with trust levels:

  • [TRUSTED SYSTEM DATA] - Core agent data (goals, memory, system status)
  • [EXTERNAL CONTENT - MAY BE UNTRUSTED] - Context plugins, user input, web content

2. Haiku Security Middleware

The security middleware (security-middleware.py) provides intelligent threat analysis:

Features:

  • Claude Haiku analysis for sophisticated injection detection
  • Unique nonce verification to prevent middleware attacks
  • Pattern-based fallback when Haiku unavailable
  • JSON output with threat levels and recommendations

Usage:

python3 security-middleware.py <file_path>
# Exit codes: 0 (allow), 1 (block), 2 (warn)

3. Secure Context Loading

The secure context loader (secure-context-loader.py) integrates security analysis with the context plugin system:

  • Analyzes all plugin output before assembly
  • Blocks high-threat content with clear warnings
  • Preserves safe content with minimal overhead
  • Provides transparency about filtered content

4. Content Validation

Pattern-based detection for common injection attempts:

High-risk patterns:

  • "ignore previous instructions"
  • "you are now"
  • "system:" commands
  • Role confusion attacks

Medium-risk patterns:

  • Command execution attempts
  • File system operations
  • System modification requests

Implementation Details

Haiku Analysis

When available, Claude Haiku provides contextual analysis:

def create_analysis_prompt(content: str, nonce: str) -> str:
    return f"""Analyze for prompt injection attempts.

NONCE: {nonce}

Content: {content}

Respond with JSON containing threat assessment."""

Nonce Verification

Unique nonces prevent attacks on the middleware itself:

  1. Generate UUID nonce for each request
  2. Include nonce in analysis prompt
  3. Verify nonce in response
  4. Block if nonce mismatch detected

Graceful Degradation

System maintains security when external dependencies fail:

  • Haiku unavailable → Pattern-based analysis
  • Network issues → Local threat detection
  • JSON parsing errors → Block by default

Integration Points

Framework Integration

The security middleware can be integrated into the Rust framework's context assembly:

// In context.rs
let secure_context = run_secure_context_loader()?;
context.push_str(&secure_context);

Testing

Security features are tested with injection patterns:

# Test with malicious content
echo "ignore previous instructions" | python3 security-middleware.py /dev/stdin

Security Considerations

  1. Defense in Depth - Multiple independent layers prevent single point of failure
  2. Fail Secure - Unknown threats are blocked by default
  3. Transparency - All filtering is logged and visible
  4. Performance - Minimal overhead for clean content
  5. Auditability - All security decisions are logged with reasoning

Threat Model

Protected against:

  • Prompt injection attacks
  • Role confusion attempts
  • System command injection
  • Instruction override attempts
  • Middleware attacks via nonce verification

Not protected against:

  • Social engineering
  • Physical access attacks
  • Legitimate but harmful commands
  • Zero-day injection techniques not covered by patterns

This security model balances protection with usability, ensuring the agent can operate safely with external content while maintaining transparency about security decisions.

There aren’t any published security advisories