mcp-bbs

mcp-bbs
.provide

INTERVENTION_SYSTEM.md•9.92 KiB

# LLM Intervention System Implementation ## Overview The intervention system provides intelligent oversight for TW2002 autonomous bots, detecting behavioral anomalies and missed opportunities through pattern recognition and LLM-powered analysis. ## Architecture ### Core Components ``` TradingBot └─ AIStrategy (decision-making) ├─ InterventionTrigger (coordination) │ ├─ InterventionDetector (pattern detection) │ └─ SessionLogger (event recording) └─ InterventionAdvisor (LLM analysis) └─ LLMManager (LLM calls) ``` ### Files Created **Intervention System** (`src/bbsbot/games/tw2002/interventions/`): - `detector.py` (458 LOC) - Detection algorithms - `trigger.py` (229 LOC) - Coordination logic - `advisor.py` (275 LOC) - LLM prompt building - `__init__.py` (29 LOC) - Package exports **Configuration**: - `config.py` - Added `InterventionConfig` class (35 lines) **MCP Tools**: - `mcp_tools_intervention.py` (218 LOC) - External control tools **Integration**: - `ai_strategy.py` - Added intervention hooks (151 lines added) **Verification**: - `scripts/verify_intervention_system.py` (363 LOC) - Automatic testing ## Detection Algorithms ### Anomalies (High Priority) 1. **Action Loop** (confidence: 0.85) - Same action repeated 3+ times - Alternating action patterns (A-B-A-B) 2. **Sector Loop** (confidence: 0.8) - Same sector visited 4+ times in window 3. **Goal Stagnation** (confidence: 0.75) - Credits change <5% over 15 turns 4. **Performance Decline** (confidence: 0.7) - Profit/turn drops >50% comparing halves 5. **Turn Waste** (confidence: 0.65) - >30% of turns with zero/negative profit ### Opportunities (Medium Priority) 1. **High-Value Trade** (configurable) - Available trade worth >5000 credits within 3 hops 2. **Combat Readiness** (confidence: 0.8) - Fighters >50 AND shields >100 - Current goal not "combat" 3. **Banking Optimal** (confidence: 0.85) - Carrying >100k credits - Not in FedSpace ## LLM Integration ### System Prompt The intervention advisor uses a specialized system prompt focused on anomaly detection: ```python INTERVENTION_SYSTEM_PROMPT = """You are an expert Trade Wars 2002 gameplay analyst monitoring autonomous bot performance. YOUR ROLE: Detect behavioral anomalies, performance degradation, and missed opportunities. ANALYSIS FRAMEWORK: 1. PATTERN RECOGNITION - Stuck behaviors (repeated actions, location loops) - Efficiency drops (declining profit velocity, wasted turns) - Goal misalignment (actions inconsistent with stated goal) - Decision quality issues (poor reasoning, invalid assumptions) 2. OPPORTUNITY DETECTION - Overlooked profitable trades - Combat readiness gaps - Banking failures (excess credits not secured) - Exploration inefficiencies 3. SEVERITY ASSESSMENT - CRITICAL: Bot stuck, ship at risk, or major capital loss - WARNING: Performance declining, suboptimal patterns - INFO: Minor inefficiencies, optimization opportunities ``` ### Context Building Each intervention analysis receives rich context: - Current game state (location, credits, ship status) - Goal history with metrics (start/end credits, turns, transitions) - Performance statistics (profit/turn, trades executed) - Recent decisions with LLM reasoning - Detected anomalies with evidence - Identified opportunities with evidence ### Response Format The LLM returns structured JSON: ```json { "severity": "critical|warning|info", "category": "stuck_pattern|performance_decline|opportunity_missed|goal_misalignment", "observation": "Concise issue description", "evidence": ["Supporting fact 1", "Supporting fact 2"], "recommendation": "continue|adjust_goal|manual_review|direct_intervention", "suggested_action": { "type": "change_goal|reset_strategy|force_move|none", "parameters": {...} }, "reasoning": "Why this recommendation", "confidence": 0.0-1.0 } ``` ## Configuration ### InterventionConfig ```python class InterventionConfig(BaseModel): # Enable/disable enabled: bool = True # Detection thresholds loop_action_threshold: int = 3 loop_sector_threshold: int = 4 stagnation_turns: int = 15 profit_decline_ratio: float = 0.5 turn_waste_threshold: float = 0.3 # Opportunity thresholds high_value_trade_min: int = 5000 combat_ready_fighters: int = 50 combat_ready_shields: int = 100 banking_threshold: int = 100000 # Intervention behavior auto_apply: bool = False min_priority: str = "medium" cooldown_turns: int = 5 max_per_session: int = 20 # LLM settings analysis_temperature: float = 0.3 analysis_max_tokens: int = 800 ``` ## MCP Tools ### get_intervention_status() Returns current intervention system status: ```python { "enabled": true, "interventions_this_session": 3, "budget_remaining": 17, "last_intervention_turn": 245, "recent_anomalies": [...], "recent_opportunities": [...], "turn_history_summary": {...} } ``` ### trigger_manual_intervention(analysis_prompt?) Forces immediate intervention analysis: ```python result = await trigger_manual_intervention( analysis_prompt="Why is the bot stuck in sector 100?" ) ``` ## Session Logging Interventions are logged to the session JSONL: ```json { "ts": 1707300123.456, "event": "llm.intervention", "session_id": 12345, "data": { "turn": 245, "intervention_number": 3, "trigger_type": "anomaly", "priority": "HIGH", "category": "action_loop", "observation": "Repeating MOVE action 3+ times", "evidence": ["Last 5 actions: MOVE→MOVE→MOVE→WAIT→MOVE"], "recommendation": "adjust_goal", "suggested_action": {"type": "change_goal", "parameters": {"goal": "exploration"}}, "reasoning": "Bot appears stuck in navigation. Switching to exploration may help discover new routes.", "confidence": 0.85, "auto_applied": false, "llm_duration_ms": 1234.5 } } ``` ## Code Quality All files pass: - ✅ **Ruff format/check** (120 char line length) - ✅ **Mypy type checking** (strict mode) - ✅ **Bandit security scan** (no issues) - ✅ **File size limits** (all <500 LOC) ### Type Safety - Uses `StrEnum` for enumerations - Pydantic `BaseModel` for all data structures - Full type annotations throughout - Generic type hints with `list[T]` syntax ### Security - No command injection vectors - Safe JSON parsing with error handling - Input validation via Pydantic - No hardcoded credentials ## Integration Points ### AIStrategy Initialization ```python # Intervention system self._intervention_trigger = InterventionTrigger( enabled=config.intervention.enabled, min_priority=config.intervention.min_priority, cooldown_turns=config.intervention.cooldown_turns, max_interventions_per_session=config.intervention.max_per_session, session_logger=self._session_logger, ) self._intervention_advisor = InterventionAdvisor( config=config, llm_manager=self.llm_manager, ) ``` ### Decision Loop ```python # Check for intervention triggers should_intervene, reason, context = self._intervention_trigger.should_intervene( current_turn=self._current_turn, state=state, strategy=self, ) if should_intervene: recommendation = await self._intervention_advisor.analyze( state=state, recent_decisions=self._get_recent_decisions(), strategy_stats=self.stats, goal_phases=self._goal_phases, anomalies=context.get("anomalies", []), opportunities=context.get("opportunities", []), trigger_reason=reason, ) await self._intervention_trigger.log_intervention(...) if config.intervention.auto_apply: action, params = self._apply_intervention(recommendation, state) ``` ### Result Tracking ```python def record_action_result(self, action: TradeAction, profit_delta: int, state: GameState): """Record action result for intervention detection.""" self._intervention_trigger.update_detector( turn=self._current_turn, state=state, action=action.name, profit_delta=profit_delta, strategy=self, ) ``` ## Verification ### Automatic Testing The `verify_intervention_system.py` script provides end-to-end testing: 1. **Game Discovery**: Connects to localhost:2002 and discovers available games 2. **Multi-Game Testing**: Tests intervention system across multiple games 3. **Anomaly Monitoring**: Tracks all anomalies and opportunities detected 4. **Statistics**: Reports interventions triggered, anomalies found, turns played 5. **Results**: Saves detailed JSON results for analysis Usage: ```bash python scripts/verify_intervention_system.py ``` Output: - Console logs with real-time progress - `verification_output.log` with full output - `verification_results.json` with detailed data ## Performance Characteristics - **Detection Overhead**: Minimal - runs on existing action history - **LLM Calls**: Controlled by cooldown (default 5 turns) and budget (default 20/session) - **Memory**: Fixed-size rolling window (default 10 turns) - **Latency**: Async LLM calls don't block decision loop ## Future Enhancements 1. **Learning from Interventions** - Track intervention outcomes - Adjust detection thresholds based on results - Build pattern library from successful interventions 2. **Advanced Detection** - Machine learning for pattern recognition - Temporal pattern analysis (time-series) - Multi-bot coordination patterns 3. **Visualization** - Real-time intervention dashboard - Anomaly heatmaps over time - Goal transition graphs with interventions 4. **Integration** - Prometheus metrics for monitoring - Alert webhooks for critical interventions - Strategy replay with intervention points marked ## Dependencies - **Pydantic**: Data validation and serialization - **LLMManager**: Existing LLM abstraction - **SessionLogger**: Event persistence - **GameState**: Current state tracking - **GoalPhase**: Goal history tracking ## License Same as bbsbot project (MIT)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/livingstaccato/mcp-bbs'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

INTERVENTION_SYSTEM.md•9.92 KiB