Skip to main content
Glama
agent-reasoning-visibility.md7.29 kB
# Deep Agent Reasoning Visibility with Streaming ## Understanding the Goal You want to see the actual LLM reasoning process (thinking tokens) for each agent, streamed in real-time to debug logs, similar to how you see UV's debug output. ## Proposed Implementation ### 1. Enable OpenAI Agents SDK Streaming & Tracing ```python from agents import Runner, RunConfig from agents.streaming import StreamingRunResult import logging # Configure logging for agent traces logging.basicConfig( level=logging.DEBUG, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' ) logger = logging.getLogger('reddit_research_agent') async def execute_reddit_research(query: str): # Enable tracing and streaming run_config = RunConfig( model="gpt-4", trace_metadata={"query": query}, workflow_name="reddit_research", # Enable detailed tracing trace_include_sensitive_data=True, ) # Use streaming runner for real-time output logger.debug(f"🎯 ORCHESTRATOR starting for query: {query}") # Stream orchestrator reasoning orchestrator_stream = await Runner.run_streamed( orchestrator, query, run_config=run_config ) # Process streaming events async for event in orchestrator_stream.stream_events(): if event.type == "reasoning": logger.debug(f"[ORCHESTRATOR THINKING] {event.content}") elif event.type == "tool_call": logger.debug(f"[ORCHESTRATOR ACTION] Calling: {event.tool_name}") search_plan = orchestrator_stream.final_output_as(SearchTaskPlan) ``` ### 2. Add Custom Context Wrapper for Reasoning Capture ```python class ReasoningCapture: """Capture and log agent reasoning in real-time""" def __init__(self, agent_name: str): self.agent_name = agent_name self.logger = logging.getLogger(f'agent.{agent_name}') async def wrap_agent_run(self, agent, input_data, context=None): self.logger.debug(f"[{self.agent_name}] Starting with input: {input_data[:100]}...") # Run with streaming to capture reasoning result = await Runner.run_streamed(agent, input_data, context=context) reasoning_tokens = [] async for event in result.stream_events(): if event.type in ["reasoning", "thinking"]: reasoning_tokens.append(event.content) self.logger.debug(f"[{self.agent_name} REASONING] {event.content}") elif event.type == "output": self.logger.debug(f"[{self.agent_name} OUTPUT] {event.content[:200]}...") # Log summary self.logger.info(f"[{self.agent_name}] Complete. Reasoning tokens: {len(reasoning_tokens)}") return result ``` ### 3. Environment Variable for Debug Mode ```python import os # Add debug mode toggle DEBUG_AGENTS = os.getenv("DEBUG_AGENTS", "false").lower() == "true" STREAM_REASONING = os.getenv("STREAM_REASONING", "false").lower() == "true" async def get_reddit_instance(debug=False): # Only show auth debug if DEBUG_AGENTS is enabled if debug and DEBUG_AGENTS: print(f"🔐 Reddit Auth Debug:...") ``` ### 4. Run Script with Debug Flags ```bash # In the script header, add environment variable support #!/usr/bin/env -S DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose --script # Or run with: DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose reddit_research_agent.py ``` ### 5. Structured Logging Output ```python # Configure different log levels for different components logging.getLogger('agent.orchestrator').setLevel(logging.DEBUG) logging.getLogger('agent.search_worker').setLevel(logging.INFO) logging.getLogger('agent.discovery_worker').setLevel(logging.INFO) logging.getLogger('agent.validation_worker').setLevel(logging.INFO) logging.getLogger('agent.synthesizer').setLevel(logging.DEBUG) logging.getLogger('asyncpraw').setLevel(logging.WARNING) # Reduce Reddit noise ``` ### 6. Custom Debug Output Format ```python class AgentDebugFormatter(logging.Formatter): """Custom formatter for agent debug output""" COLORS = { 'DEBUG': '\033[36m', # Cyan 'INFO': '\033[32m', # Green 'WARNING': '\033[33m', # Yellow 'ERROR': '\033[31m', # Red 'REASONING': '\033[35m', # Magenta } RESET = '\033[0m' def format(self, record): # Add colors for terminal output if hasattr(record, 'reasoning'): color = self.COLORS.get('REASONING', '') record.msg = f"{color}[THINKING] {record.msg}{self.RESET}" return super().format(record) # Apply formatter handler = logging.StreamHandler() handler.setFormatter(AgentDebugFormatter( '%(asctime)s | %(name)s | %(message)s' )) logging.root.addHandler(handler) ``` ## Expected Output with Deep Visibility: ``` $ DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose reddit_research_agent.py 2024-01-15 10:23:45 | agent.orchestrator | [ORCHESTRATOR THINKING] The user is asking about Trump and Putin in Alaska. I need to identify: 2024-01-15 10:23:45 | agent.orchestrator | [ORCHESTRATOR THINKING] 1. Core entities: Trump (person), Putin (person), Alaska (location) 2024-01-15 10:23:46 | agent.orchestrator | [ORCHESTRATOR THINKING] 2. These are political figures, so political subreddits would be relevant 2024-01-15 10:23:46 | agent.orchestrator | [ORCHESTRATOR THINKING] 3. For direct searches, I'll use single terms like "trump", "putin", "alaska" 2024-01-15 10:23:47 | agent.orchestrator | [ORCHESTRATOR OUTPUT] SearchTaskPlan(direct_searches=['trump', 'putin', 'alaska'], ...) 2024-01-15 10:23:48 | agent.search_worker | [SEARCH_WORKER THINKING] I received terms: trump, putin, alaska 2024-01-15 10:23:48 | agent.search_worker | [SEARCH_WORKER THINKING] These are potential subreddit names. I'll search each one. 2024-01-15 10:23:49 | agent.search_worker | [SEARCH_WORKER ACTION] Calling search_subreddits_tool(query='trump') 2024-01-15 10:23:50 | reddit.api | Searching for communities matching: 'trump' 2024-01-15 10:23:51 | reddit.api | Found 24 communities ``` ## Benefits: 1. **Real thinking tokens**: See actual LLM reasoning, not just formatted output 2. **Streaming visibility**: Watch agents think in real-time 3. **Debug control**: Toggle verbosity with environment variables 4. **Performance metrics**: Track reasoning token usage per agent 5. **Structured logs**: Filter by agent or log level 6. **UV integration**: Works alongside UV's --verbose flag ## Alternative: OpenAI Tracing Dashboard The OpenAI Agents SDK also supports sending traces to their dashboard: ```python # Traces will appear at https://platform.openai.com/traces run_config = RunConfig( workflow_name="reddit_research", trace_id=f"reddit_{timestamp}", trace_metadata={"query": query, "version": "1.0"} ) ``` This gives you a web UI to explore agent reasoning after execution. ## Implementation Priority 1. Start with environment variable debug flags (easiest) 2. Add structured logging with custom formatter 3. Implement streaming for orchestrator and synthesizer (most valuable) 4. Add streaming for worker agents if needed 5. Consider OpenAI dashboard for production monitoring

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/king-of-the-grackles/dialog-reddit-tools'

If you have feedback or need assistance with the MCP directory API, please join our Discord server