Master reasoning patterns that let agents think step-by-step and solve complex problems
The breakthrough insight of modern agents is simple: Let AI models think step-by-step. Instead of demanding immediate answers, agents break problems into reasoning steps, explore options, and reconsider decisions.
This module teaches you the reasoning patterns that make agents intelligent.
Chain-of-Thought (CoT) is the foundational reasoning pattern. Instead of jumping to an answer, the model articulates its thinking:
❌ Without Chain-of-Thought:
User: "If a store sells apples at $2 each, and I buy 5, paying with $20, how much change?"
Agent: $10
✅ With Chain-of-Thought:
User: "If a store sells apples at $2 each, and I buy 5, paying with $20, how much change?"
Agent Thinks:
1. Price per apple: $2
2. Number of apples: 5
3. Total cost: $2 × 5 = $10
4. Amount paid: $20
5. Change: $20 - $10 = $10
Answer: $10
The answer is the same, but the step-by-step reasoning makes the process transparent, verifiable, and easier to debug. Here is a Chain-of-Thought prompt in LangChain:
```python
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# CoT prompt template
cot_prompt = PromptTemplate(
    input_variables=["question"],
    template="""You are a reasoning assistant.

Question: {question}

Let's think through this step by step:
1. First, identify what we know
2. Break down the problem
3. Work through each step
4. Verify the answer

Your reasoning:"""
)

llm = OpenAI(temperature=0.3)
chain = LLMChain(llm=llm, prompt=cot_prompt)

result = chain.run(question="What is 15% of 200?")
print(result)
```
ReAct (Reasoning + Acting) combines thinking with external tool use. The agent alternates between reasoning about what to do next (Thought), invoking a tool (Action), and reading the result (Observation):
```
Task: "What is the current temperature in San Francisco and the weather forecast?"

Agent Thought: I need to find the current temperature and forecast. I should use a weather tool.
Action: get_weather(city="San Francisco")
Observation: {"temp": "72°F", "forecast": "Sunny tomorrow, rain in 3 days"}

Agent Thought: Got it! I have the information needed to answer.
Final Answer: "It's currently 72°F in San Francisco with sunny weather tomorrow."
```
```python
from langchain.agents import Tool, initialize_agent, AgentType
from langchain.llms import OpenAI
import json

def weather_tool(city: str) -> str:
    """Get weather for a city"""
    # Simulated weather data
    weather_db = {
        "San Francisco": {"temp": 72, "condition": "Sunny"},
        "New York": {"temp": 45, "condition": "Rainy"}
    }
    return json.dumps(weather_db.get(city, {"error": "City not found"}))

# Create tool
tools = [
    Tool(
        name="GetWeather",
        func=weather_tool,
        description="Get current weather for a city"
    )
]

# Initialize ReAct agent
agent = initialize_agent(
    tools,
    OpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True  # Shows the Thought → Action → Observation loop
)

# Agent will think, act, observe, and adapt
result = agent.run("What's the weather in San Francisco?")
```
Sometimes one reasoning path isn't enough. Tree-of-Thought lets agents explore multiple approaches:
Standard Thinking (Linear):
```
Start → Step 1 → Step 2 → Step 3 → Answer
```
Tree-of-Thought (Branching):
```
Start
├── Path A: Try approach 1
│   ├── Step 1A
│   └── Step 2A → Dead end → Backtrack
├── Path B: Try approach 2
│   ├── Step 1B
│   ├── Step 2B
│   └── Step 3B → Good result!
└── Path C: Try approach 3 → Quick rejection

Best: Path B → Answer
```
Tree-of-Thought is powerful for complex problems where exploration matters. The agent generates several candidate paths, evaluates each one, backtracks from dead ends, and commits to the most promising branch. Typical use cases:
- Math and logic problems with multiple solution approaches
- Writing: draft multiple versions, evaluate them, refine the best one
- Decision-making: explore options, compare trade-offs, choose the best
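The search itself can be sketched as a scored beam search over candidate thoughts. Everything below is illustrative: `propose_thoughts` and `score_thought` are hypothetical stand-ins for LLM calls that would generate and rate candidate reasoning steps.

```python
def propose_thoughts(state: str, k: int = 3) -> list[str]:
    """Propose k candidate next thoughts (stub; an LLM would generate these)."""
    return [f"{state} -> option {i}" for i in range(1, k + 1)]

def score_thought(thought: str) -> float:
    """Rate a partial reasoning path (stub; an LLM would evaluate it)."""
    return 1.0 / (1 + len(thought))  # placeholder heuristic

def tree_of_thought(problem: str, depth: int = 3, beam_width: int = 2) -> str:
    """Keep the best `beam_width` paths at each level, expand them, repeat."""
    frontier = [problem]
    for _ in range(depth):
        # Expand every surviving path into candidate continuations
        candidates = [t for state in frontier for t in propose_thoughts(state)]
        # Prune weak branches, keeping only the top-scoring paths
        candidates.sort(key=score_thought, reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]  # best surviving path
```

With real LLM-backed proposal and scoring functions, the same loop explores, prunes, and backtracks exactly as in the diagram above.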
Agents improve by reflecting on their own work:
1. Generate: Agent produces initial answer
2. Critique: Agent reviews its own work, identifies flaws
3. Refine: Agent improves based on critique
4. Verify: Check if refined answer is better
5. Iterate: Repeat until satisfactory
```python
def refine_answer(question: str, initial_answer: str) -> str:
    """Agent refines its answer through self-critique"""
    answer = initial_answer

    # Step 1: Critique
    critique_prompt = f"""
    Question: {question}
    Initial Answer: {answer}

    Critique this answer. What's wrong? How could it be better?
    """
    critique = llm(critique_prompt)

    # Step 2: Refine
    refine_prompt = f"""
    Question: {question}
    Initial Answer: {answer}
    Critique: {critique}

    Based on the critique, provide an improved answer.
    """
    refined_answer = llm(refine_prompt)
    return refined_answer

# Example
question = "Explain photosynthesis simply"
refined = refine_answer(question, "Plants make food from sun")
# Agent will critique, then improve its explanation
```
Complex tasks need hierarchical decomposition. Agents break big goals into subgoals:
Goal: "Write a blog post about AI agents"
ββ Subgoal 1: Research AI agent applications
ββ Search for use cases
ββ Compile findings
ββ Subgoal 2: Outline the blog post
ββ Define sections
ββ Create heading structure
ββ Subgoal 3: Write draft sections
ββ Introduction
ββ Main content
ββ Conclusion
ββ Subgoal 4: Edit and polish
ββ Grammar check
ββ Final review
Hierarchical planning helps agents track progress, respect dependencies between steps, and recover from failures at the subtask level instead of restarting the whole task.
```python
from dataclasses import dataclass
from typing import List, Optional
from enum import Enum

class TaskStatus(Enum):
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

@dataclass
class Subgoal:
    id: str
    description: str
    status: TaskStatus
    dependencies: List[str]  # IDs of subtasks that must complete first
    subtasks: List['Subgoal']
    result: Optional[str] = None

class HierarchicalPlanner:
    def __init__(self, llm):
        self.llm = llm
        self.plan = None

    def create_plan(self, goal: str) -> Subgoal:
        """Decompose goal into hierarchical subgoals"""
        prompt = f"""
        Break down this goal into a hierarchical plan:

        Goal: {goal}

        Create a tree of subgoals. For each subgoal:
        1. Give it a clear description
        2. List any dependencies (what must finish first)
        3. Break it into smaller subtasks if needed

        Format as:
        Subgoal 1: [description]
          Dependencies: []
          Subtasks:
          - 1.1: [description]
          - 1.2: [description]

        Your hierarchical plan:
        """
        plan_text = self.llm(prompt)
        plan = self._parse_plan(plan_text)
        self.plan = plan
        return plan

    def execute_plan(self, plan: Subgoal) -> str:
        """Execute plan by traversing the tree"""
        print(f"🎯 Executing: {plan.description}\n")

        # Check dependencies
        for dep_id in plan.dependencies:
            dep = self._find_subgoal(dep_id)
            if dep and dep.status != TaskStatus.COMPLETED:
                print(f"⏸️ Waiting for dependency: {dep.description}")
                return "Dependencies not met"

        # Execute subtasks first
        for subtask in plan.subtasks:
            self.execute_plan(subtask)
            if subtask.status == TaskStatus.FAILED:
                plan.status = TaskStatus.FAILED
                return "Subtask failed"

        # Execute this task
        plan.status = TaskStatus.IN_PROGRESS
        result = self._execute_task(plan.description)

        if "error" in result.lower():
            plan.status = TaskStatus.FAILED
        else:
            plan.status = TaskStatus.COMPLETED
            plan.result = result
            print(f"✅ Completed: {plan.description}\n")

        return result

    def _execute_task(self, task: str) -> str:
        """Execute a single task"""
        # In production, this would call appropriate tools
        return f"Executed: {task}"

    def _parse_plan(self, plan_text: str) -> Subgoal:
        """Parse LLM output into a Subgoal tree"""
        # Simplified parsing (production would be more robust)
        return Subgoal(
            id="root",
            description="Main goal",
            status=TaskStatus.NOT_STARTED,
            dependencies=[],
            subtasks=[]
        )

    def _find_subgoal(self, goal_id: str) -> Optional[Subgoal]:
        """Find subgoal by ID in the tree"""
        # Traverse the plan tree to find the subgoal
        pass

# Usage
planner = HierarchicalPlanner(llm=openai_llm)
plan = planner.create_plan("Research and write a technical blog post on AI agents")
result = planner.execute_plan(plan)
```
Beyond the core patterns, several advanced techniques enhance agent reasoning:
Show agents examples of good reasoning to guide their thinking:
```python
few_shot_prompt = """
Here are examples of good reasoning:

Example 1:
Q: How many days until Christmas from Oct 15?
Reasoning:
- October has 31 days
- Days left in October: 31 - 15 = 16 days
- November: 30 days
- December: 25 days until Christmas
- Total: 16 + 30 + 25 = 71 days
Answer: 71 days

Example 2:
Q: If I save $50/month, how much in 1 year?
Reasoning:
- Savings per month: $50
- Months in year: 12
- Total: $50 × 12 = $600
Answer: $600

Now solve this:
Q: {user_question}
Reasoning:
"""
```
Agents ask themselves probing questions to deepen reasoning:
```python
def socratic_reasoning(question: str) -> str:
    """Agent reasons through Socratic self-questioning"""
    prompts = [
        f"What is the core problem in: {question}",
        "What do I already know about this?",
        "What assumptions am I making?",
        "What would disprove my current thinking?",
        "What's a simpler version of this problem?",
        "How can I verify my answer?"
    ]

    reasoning_chain = []
    for prompt in prompts:
        response = llm(prompt)
        reasoning_chain.append(f"Q: {prompt}\nA: {response}\n")

    # Final answer based on the complete reasoning chain
    final_prompt = f"""
    Based on this reasoning chain:
    {''.join(reasoning_chain)}

    Original question: {question}

    Your final answer:
    """
    return llm(final_prompt)
```
Agents consider "what if" scenarios to test their reasoning.
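A minimal sketch of how this might look as a prompt builder. The function name `counterfactual_check` and its arguments are illustrative, not from any library:

```python
def counterfactual_check(problem: str, answer: str, assumptions: list[str]) -> str:
    """Build a prompt asking the model to re-test its answer with key assumptions flipped."""
    scenarios = "\n".join(
        f"- What if it were NOT true that {a}? Would the answer change?"
        for a in assumptions
    )
    return (
        f"Problem: {problem}\n"
        f"Current answer: {answer}\n\n"
        "Test this answer against counterfactual scenarios:\n"
        f"{scenarios}\n\n"
        "If any scenario changes the answer, explain how and revise it."
    )
```

If flipping an assumption changes the answer, the agent has learned which assumptions its conclusion actually depends on.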
Agents draw parallels to similar problems they've solved:
```python
analogical_prompt = f"""
Problem: {current_problem}

This problem is similar to: [identify analogous problem]
In that problem, we solved it by: [describe solution approach]

Applying the same logic here:
Step 1: [adapt approach]
Step 2: [continue adaptation]
...
Solution: [final answer]
"""
```
Let's build a complete, production-quality ReAct agent with error handling, logging, and retry logic:
```python
import openai
from typing import List, Dict, Callable
import json
import logging
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class ProductionReActAgent:
    def __init__(self, api_key: str, tools: Dict[str, Callable]):
        self.client = openai.OpenAI(api_key=api_key)
        self.tools = tools
        self.max_iterations = 15
        self.max_retries = 3
        self.conversation_history = []

    def run(self, task: str) -> Dict:
        """Execute task using the ReAct pattern"""
        logger.info(f"Starting task: {task}")

        observation = f"Task: {task}"
        iteration = 0

        while iteration < self.max_iterations:
            iteration += 1
            logger.info(f"\n--- Iteration {iteration} ---")

            # THOUGHT: Reason about what to do
            thought = self._think(observation, task)
            logger.info(f"Thought: {thought}")

            # Check if task complete
            if "FINAL ANSWER:" in thought:
                answer = thought.split("FINAL ANSWER:")[1].strip()
                return {
                    "status": "success",
                    "answer": answer,
                    "iterations": iteration,
                    "history": self.conversation_history
                }

            # ACTION: Parse and execute action
            action, params = self._parse_action(thought)
            logger.info(f"Action: {action}({params})")

            # Execute with retries
            observation = self._execute_with_retry(action, params)
            logger.info(f"Observation: {observation[:200]}...")

            # Store in history
            self.conversation_history.append({
                "iteration": iteration,
                "thought": thought,
                "action": action,
                "observation": observation,
                "timestamp": datetime.now().isoformat()
            })

        return {
            "status": "incomplete",
            "reason": "max_iterations_reached",
            "iterations": iteration,
            "history": self.conversation_history
        }

    def _think(self, observation: str, task: str) -> str:
        """ReAct thinking step"""
        context = self._build_context(observation, task)

        system_prompt = """You are a ReAct agent. For each turn:
1. THOUGHT: Reason about the observation and what to do next
2. ACTION: Choose a tool and parameters
3. Wait for OBSERVATION

When you have the final answer, respond with:
FINAL ANSWER: [your answer]

Available tools:
""" + self._format_tool_descriptions()

        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": context}
            ],
            temperature=0.1
        )
        return response.choices[0].message.content

    def _build_context(self, observation: str, task: str) -> str:
        """Build context from history and current observation"""
        context = f"Task: {task}\n\n"

        # Add recent history
        if self.conversation_history:
            context += "Recent history:\n"
            for entry in self.conversation_history[-3:]:
                context += f"Thought: {entry['thought'][:100]}...\n"
                context += f"Action: {entry['action']}\n"
                context += f"Observation: {entry['observation'][:100]}...\n\n"

        context += f"Current Observation: {observation}\n\n"
        context += "Your thought and next action:"
        return context

    def _parse_action(self, thought: str) -> tuple:
        """Parse action from thought"""
        # Simple parsing (production would use regex or structured output)
        if "ACTION:" in thought:
            action_line = thought.split("ACTION:")[1].split("\n")[0].strip()

            # Extract tool name and parameters
            if "(" in action_line:
                tool_name = action_line.split("(")[0].strip()
                params_str = action_line.split("(")[1].split(")")[0]

                # Parse parameters
                try:
                    params = json.loads("{" + params_str + "}")
                except json.JSONDecodeError:
                    params = {"query": params_str}

                return (tool_name, params)

        return ("continue", {})

    def _execute_with_retry(self, action: str, params: Dict,
                            retries: int = 0) -> str:
        """Execute action with retry logic"""
        try:
            if action not in self.tools:
                return f"Error: Unknown tool '{action}'. Available: {list(self.tools.keys())}"

            result = self.tools[action](**params)
            return str(result)

        except Exception as e:
            logger.error(f"Tool execution error: {str(e)}")
            if retries < self.max_retries:
                logger.info(f"Retrying ({retries + 1}/{self.max_retries})...")
                return self._execute_with_retry(action, params, retries + 1)
            return f"Error after {self.max_retries} retries: {str(e)}"

    def _format_tool_descriptions(self) -> str:
        """Format tool descriptions for the prompt"""
        descriptions = []
        for name, func in self.tools.items():
            doc = func.__doc__ or "No description"
            descriptions.append(f"- {name}: {doc}")
        return "\n".join(descriptions)

# Example tools
def search_web(query: str) -> str:
    """Search the web for information"""
    return f"Search results for: {query}"

def calculate(expression: str) -> float:
    """Evaluate a mathematical expression"""
    return eval(expression)  # Use a safe expression parser in production!

def send_email(to: str, subject: str, body: str) -> str:
    """Send an email"""
    return f"Email sent to {to}"

# Usage
agent = ProductionReActAgent(
    api_key="your-key",
    tools={
        "search": search_web,
        "calculate": calculate,
        "email": send_email
    }
)

result = agent.run("What's 15% of the population of Tokyo? Email the answer to research@example.com")
print(json.dumps(result, indent=2))
```
Advanced agents use memory to improve reasoning over time:
Store past reasoning episodes and retrieve similar ones:
```python
import json
from datetime import datetime
from typing import List, Dict

import chromadb
from sentence_transformers import SentenceTransformer

class EpisodicMemory:
    def __init__(self):
        self.encoder = SentenceTransformer('all-MiniLM-L6-v2')
        self.client = chromadb.Client()
        self.collection = self.client.get_or_create_collection("reasoning_episodes")

    def store_episode(self, task: str, reasoning: str, outcome: str):
        """Store a reasoning episode"""
        episode = {
            "task": task,
            "reasoning": reasoning,
            "outcome": outcome,
            "timestamp": datetime.now().isoformat()
        }

        # Embed and store
        embedding = self.encoder.encode(task)
        self.collection.add(
            embeddings=[embedding.tolist()],
            documents=[json.dumps(episode)],
            ids=[f"episode_{datetime.now().timestamp()}"]
        )

    def recall_similar(self, current_task: str, n: int = 3) -> List[Dict]:
        """Retrieve similar past episodes"""
        query_embedding = self.encoder.encode(current_task)
        results = self.collection.query(
            query_embeddings=[query_embedding.tolist()],
            n_results=n
        )
        episodes = [json.loads(doc) for doc in results['documents'][0]]
        return episodes

    def reason_with_memory(self, task: str) -> str:
        """Use past episodes to inform current reasoning"""
        similar_episodes = self.recall_similar(task)

        memory_context = "Similar problems I've solved:\n"
        for i, episode in enumerate(similar_episodes, 1):
            memory_context += f"\n{i}. Task: {episode['task']}\n"
            memory_context += f"   Approach: {episode['reasoning'][:200]}...\n"
            memory_context += f"   Outcome: {episode['outcome']}\n"

        prompt = f"""
        {memory_context}

        Current task: {task}

        Based on past experience, how should I approach this?
        """
        return llm(prompt)

# Usage
memory = EpisodicMemory()

# Store past reasoning
memory.store_episode(
    task="Calculate compound interest",
    reasoning="Used formula: A = P(1+r)^t, broke down each variable...",
    outcome="Success - correct calculation"
)

# Later, use memory to help with a similar task
approach = memory.reason_with_memory("Calculate loan repayment amount")
# The agent sees similar past tasks and adapts its approach
```
Agents that remember failed approaches avoid repeating mistakes:
```python
from datetime import datetime

class MistakeMemory:
    def __init__(self):
        self.mistakes = []

    def record_failure(self, action: str, context: str, error: str):
        """Record a failed action"""
        self.mistakes.append({
            "action": action,
            "context": context,
            "error": error,
            "timestamp": datetime.now()
        })

    def check_before_action(self, proposed_action: str, context: str) -> tuple:
        """Check if this action failed before in a similar context"""
        for mistake in self.mistakes:
            if mistake['action'] == proposed_action:
                # Similar context? Warn the agent
                similarity = self._compute_similarity(context, mistake['context'])
                if similarity > 0.7:
                    return (False, f"Warning: This action failed before with error: {mistake['error']}")
        return (True, "No known issues")

    def _compute_similarity(self, text1: str, text2: str) -> float:
        """Compute similarity between contexts"""
        # Use embedding similarity in production
        return 0.5
```
Advanced agents can reason about their own reasoning process:
```python
from typing import Dict

class MetaReasoningAgent:
    def __init__(self, llm):
        self.llm = llm

    def reason_with_meta_awareness(self, problem: str) -> Dict:
        """Reason while monitoring reasoning quality"""
        # Step 1: Initial reasoning
        initial_reasoning = self.llm(f"Solve: {problem}")

        # Step 2: Meta-reasoning - evaluate the reasoning
        meta_prompt = f"""
        Problem: {problem}
        My reasoning: {initial_reasoning}

        Meta-questions:
        1. Is this reasoning strategy appropriate for this problem?
        2. What's my confidence level (0-100)?
        3. What could I be missing?
        4. Should I use a different approach?
        5. What are alternative solutions?

        Your meta-analysis:
        """
        meta_analysis = self.llm(meta_prompt)

        # Step 3: Decide if the reasoning is good enough
        if "low confidence" in meta_analysis.lower() or "different approach" in meta_analysis.lower():
            # Try an alternative approach
            alternative_prompt = f"""
            Problem: {problem}
            First attempt: {initial_reasoning}
            Issues identified: {meta_analysis}

            Try a completely different approach:
            """
            alternative_reasoning = self.llm(alternative_prompt)
            return {
                "answer": alternative_reasoning,
                "confidence": "improved",
                "meta_analysis": meta_analysis,
                "attempts": 2
            }

        return {
            "answer": initial_reasoning,
            "confidence": "high",
            "meta_analysis": meta_analysis,
            "attempts": 1
        }
```
Real-world reasoning often involves incomplete information. Agents must handle uncertainty:
Agents should express confidence in their reasoning:
```python
def reason_with_confidence(question: str) -> Dict:
    """Reason and provide a confidence score"""
    prompt = f"""
    Question: {question}

    Provide your reasoning and a confidence score (0-100).

    Format:
    REASONING: [your step-by-step thinking]
    ANSWER: [your answer]
    CONFIDENCE: [0-100]
    UNCERTAINTY SOURCES: [what could make this wrong?]
    """

    response = llm(prompt)

    # Parse response (extract_section is a helper that pulls the text after each label)
    reasoning = extract_section(response, "REASONING")
    answer = extract_section(response, "ANSWER")
    confidence = int(extract_section(response, "CONFIDENCE"))
    uncertainties = extract_section(response, "UNCERTAINTY SOURCES")

    return {
        "answer": answer,
        "reasoning": reasoning,
        "confidence": confidence,
        "uncertainties": uncertainties,
        "should_human_review": confidence < 70
    }
```
For decisions with uncertain outcomes, agents can reason probabilistically.
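A simple sketch of probabilistic reasoning: weight each option's payoff by its estimated probability of success and pick the best expected value. The option names and numbers below are made up for illustration; in practice the probabilities would come from the model's own confidence estimates.

```python
def expected_value(outcomes: dict[str, tuple[float, float]]) -> str:
    """Pick the option with the highest probability-weighted payoff.

    outcomes maps option name -> (probability of success, payoff).
    """
    ev = {name: p * payoff for name, (p, payoff) in outcomes.items()}
    return max(ev, key=ev.get)

# e.g. choosing between a safe and a risky plan:
best = expected_value({
    "safe_plan":  (0.95, 100),   # EV = 95
    "risky_plan": (0.40, 300),   # EV = 120
})
```

Even this toy version shows why probabilistic reasoning matters: the riskier plan wins on expected value despite its lower chance of success.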
Create a research agent that uses multiple reasoning patterns.
Different problems benefit from different reasoning patterns:
| Pattern | Best For | Example |
|---|---|---|
| Chain-of-Thought | Linear reasoning, math, logic | Calculate tax on a purchase |
| ReAct | Tasks requiring tools/APIs | Look up current stock prices, then analyze |
| Tree-of-Thought | Complex exploration, creative tasks | Design multiple solutions, pick best |
| Self-Refinement | Quality improvement, iterative work | Writing, code generation, analysis |
| Hierarchical | Large, multi-step projects | Complete a full research report |
Even with good reasoning patterns, agents can fall into traps:
| Pitfall | What Happens | Solution |
|---|---|---|
| Reasoning Loops | Agent repeats same reasoning, gets stuck | Detect repeated states, force new approach after N iterations |
| Over-confidence | Agent commits to wrong answer confidently | Force confidence scoring, verify high-confidence claims |
| Premature Commitment | Agent stops reasoning too early | Require minimum reasoning steps, use verification phase |
| Reasoning Shortcuts | Agent skips important steps | Enforce step-by-step format, reject incomplete reasoning |
| Tool Over-reliance | Agent uses tools unnecessarily | Teach when to reason vs. when to use tools |
```python
class LoopDetector:
    def __init__(self, window=5, threshold=0.8):
        self.recent_states = []
        self.window = window
        self.threshold = threshold

    def check_loop(self, current_state: str) -> bool:
        """Detect if the agent is in a reasoning loop"""
        # Add current state
        self.recent_states.append(current_state)

        # Keep only the recent window
        if len(self.recent_states) > self.window:
            self.recent_states.pop(0)

        # Check for repetition
        if len(self.recent_states) >= 3:
            # Compare current to previous states
            similarities = []
            for prev_state in self.recent_states[:-1]:
                sim = self._similarity(current_state, prev_state)
                similarities.append(sim)

            # If highly similar to recent states, likely a loop
            if max(similarities) > self.threshold:
                return True

        return False

    def _similarity(self, s1: str, s2: str) -> float:
        """Compute Jaccard similarity between reasoning states"""
        # Simple implementation - use embeddings in production
        s1_words = set(s1.lower().split())
        s2_words = set(s2.lower().split())
        intersection = s1_words & s2_words
        union = s1_words | s2_words
        return len(intersection) / len(union) if union else 0

# Usage in agent
loop_detector = LoopDetector()

for iteration in range(max_iterations):
    reasoning = agent.think(observation)

    if loop_detector.check_loop(reasoning):
        logger.warning("Loop detected! Forcing new approach")
        reasoning = agent.think_differently(observation)

    # Continue with reasoning...
```
Production agents need to balance reasoning quality with speed and cost:
Use simple reasoning for easy problems, deep reasoning for hard ones:
```python
class AdaptiveReasoningAgent:
    def reason(self, problem: str) -> str:
        # Quick assessment of problem difficulty
        difficulty = self._assess_difficulty(problem)

        if difficulty < 3:
            # Simple problem - direct answer
            return self.simple_reasoning(problem)
        elif difficulty < 7:
            # Medium - Chain-of-Thought
            return self.cot_reasoning(problem)
        else:
            # Hard - Tree-of-Thought with exploration
            return self.tot_reasoning(problem)

    def _assess_difficulty(self, problem: str) -> int:
        """Rate problem difficulty 1-10"""
        assessment_prompt = f"""
        Rate problem difficulty (1-10):

        Problem: {problem}

        Consider:
        - Number of steps required
        - Domain knowledge needed
        - Ambiguity level
        - Need for external information

        Difficulty (1-10):
        """
        score = llm(assessment_prompt)
        return int(score.strip())
```
Cache successful reasoning approaches for similar problems:
```python
import hashlib
from datetime import datetime
from typing import Optional

class ReasoningCache:
    def __init__(self):
        self.cache = {}

    def get_cached_reasoning(self, problem: str) -> Optional[str]:
        """Check if a similar problem was solved before"""
        problem_hash = self._hash_problem(problem)

        if problem_hash in self.cache:
            cached = self.cache[problem_hash]
            # Verify cache freshness
            if (datetime.now() - cached['timestamp']).days < 7:
                return cached['reasoning_template']
        return None

    def cache_reasoning(self, problem: str, reasoning: str):
        """Cache a successful reasoning approach"""
        problem_hash = self._hash_problem(problem)
        self.cache[problem_hash] = {
            'reasoning_template': reasoning,
            'timestamp': datetime.now()
        }

    def _hash_problem(self, problem: str) -> str:
        """Create a hash of the problem for the cache key"""
        # Normalize the problem text
        normalized = problem.lower().strip()
        return hashlib.md5(normalized.encode()).hexdigest()
```
For complex problems, explore multiple reasoning paths in parallel:
```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

class ParallelReasoningAgent:
    def __init__(self, llm):
        self.llm = llm
        self.executor = ThreadPoolExecutor(max_workers=3)

    async def reason_parallel(self, problem: str) -> Dict:
        """Try multiple reasoning approaches simultaneously"""
        approaches = [
            self._approach_analytical,
            self._approach_analogical,
            self._approach_creative
        ]

        # Run all approaches in parallel
        tasks = [
            asyncio.create_task(self._reason_async(approach, problem))
            for approach in approaches
        ]
        results = await asyncio.gather(*tasks)

        # Evaluate and pick the best result
        best = self._select_best(results)
        return {
            "answer": best['answer'],
            "approach": best['approach'],
            "all_results": results
        }

    async def _reason_async(self, approach, problem):
        """Run a reasoning approach asynchronously"""
        loop = asyncio.get_event_loop()
        return await loop.run_in_executor(
            self.executor,
            approach,
            problem
        )

    def _select_best(self, results: List[Dict]) -> Dict:
        """Select the best reasoning result"""
        # Score each result
        scored = []
        for result in results:
            score = self._score_reasoning(result)
            scored.append((score, result))

        # Return the highest scoring
        return max(scored, key=lambda x: x[0])[1]
```
When agents reason incorrectly, you need tools to diagnose the problem:
```python
import json
from datetime import datetime
from typing import Dict, Optional

class ReasoningDebugger:
    def __init__(self):
        self.trace = []

    def log_reasoning_step(self, step: Dict):
        """Log each reasoning step"""
        self.trace.append({
            **step,
            'timestamp': datetime.now(),
            'stack_depth': len(self.trace)
        })

    def visualize_trace(self):
        """Create a visual representation of the reasoning"""
        print("\n" + "=" * 60)
        print("REASONING TRACE")
        print("=" * 60 + "\n")

        for i, step in enumerate(self.trace, 1):
            indent = "  " * step['stack_depth']
            print(f"{indent}{i}. {step['type']}")
            print(f"{indent}   Input: {step['input'][:50]}...")
            print(f"{indent}   Output: {step['output'][:50]}...")
            print(f"{indent}   Duration: {step.get('duration', 0):.2f}s")
            print()

    def export_trace(self, filename: str):
        """Export the trace for analysis"""
        with open(filename, 'w') as f:
            json.dump(self.trace, f, indent=2, default=str)

    def find_error_point(self) -> Optional[int]:
        """Identify where the reasoning went wrong"""
        for i, step in enumerate(self.trace):
            if 'error' in step.get('output', '').lower():
                return i
        return None

# Usage
debugger = ReasoningDebugger()

# In agent loop
debugger.log_reasoning_step({
    'type': 'thought',
    'input': observation,
    'output': reasoning,
    'duration': 0.5
})

# After completion
debugger.visualize_trace()
error_step = debugger.find_error_point()
if error_step is not None:  # step 0 is a valid error location
    print(f"Error detected at step {error_step}")
```
Automatically verify reasoning correctness:
```python
from typing import Dict

class ReasoningValidator:
    def validate(self, reasoning: str, answer: str) -> Dict:
        """Validate reasoning quality"""
        checks = {
            'has_steps': self._check_steps(reasoning),
            'logical_flow': self._check_logic(reasoning),
            'evidence_based': self._check_evidence(reasoning),
            'conclusion_follows': self._check_conclusion(reasoning, answer),
            'no_contradictions': self._check_consistency(reasoning)
        }

        score = sum(checks.values()) / len(checks)
        return {
            'valid': score > 0.7,
            'score': score,
            'checks': checks,
            'issues': [k for k, v in checks.items() if not v]
        }

    def _check_steps(self, reasoning: str) -> bool:
        """Verify the reasoning has clear steps"""
        # Look for numbered steps or logical progression
        indicators = ['step', '1.', '2.', 'first', 'then', 'next']
        return any(ind in reasoning.lower() for ind in indicators)

    def _check_logic(self, reasoning: str) -> bool:
        """Check for logical flow"""
        # Use an LLM to verify the logic
        prompt = f"""
        Analyze this reasoning for logical flow:
        {reasoning}

        Is the logic sound? (Yes/No):
        """
        response = llm(prompt).strip().lower()
        return 'yes' in response

    # The remaining checks can follow the same LLM-judge pattern as _check_logic
    def _check_evidence(self, reasoning: str) -> bool:
        return True  # stub

    def _check_conclusion(self, reasoning: str, answer: str) -> bool:
        return True  # stub

    def _check_consistency(self, reasoning: str) -> bool:
        return True  # stub
```
Essential practices for production reasoning agents:
- Store every reasoning step. When things go wrong, traces are invaluable for debugging.
- Don't let agents reason forever. Set hard limits: max iterations, max time, max tokens.
- For high-stakes decisions, use multiple reasoning methods and compare the results.
- Force agents to express confidence, and use low confidence as a trigger for human review.
- Test reasoning with ambiguous inputs, missing data, and contradictory information.
- Not every decision needs GPT-4. Use GPT-3.5 or Claude Instant for simple reasoning to save cost.
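One way to implement the "multiple methods, compare results" practice is self-consistency voting: sample several independent reasoning runs and take the majority answer. This is a sketch; `run_reasoner` is a placeholder for your model call (with nonzero temperature so runs differ):

```python
from collections import Counter

def self_consistent_answer(question: str, run_reasoner, n: int = 5) -> dict:
    """Run the reasoner n times and majority-vote on the final answers."""
    answers = [run_reasoner(question) for _ in range(n)]
    counts = Counter(answers)
    best, votes = counts.most_common(1)[0]
    return {
        "answer": best,
        "agreement": votes / n,           # low agreement is itself a signal
        "needs_review": votes / n < 0.6,  # route to a human when runs disagree
    }
```

The agreement ratio doubles as a cheap confidence score: if five runs produce five different answers, that disagreement is exactly the low-confidence trigger the practice above calls for.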
Q1: What is the main benefit of Chain-of-Thought (CoT) reasoning?
Q2: What does the "ReAct" pattern stand for?
Q3: In the ReAct loop, what comes after "Thought"?
Q4: What is the key advantage of Tree-of-Thought (ToT) over Chain-of-Thought?
Q5: When should you use hierarchical planning in agents?