🎯 Project Overview
Code reviews are critical but time-consuming. What if you had a team of AI experts, each specializing in security, performance, or code style, working together to review your code automatically? In this project, you'll build exactly that using multi-agent collaboration!
The Agent Team
🔒 Security Agent
Specializes in finding vulnerabilities: SQL injection, XSS, hardcoded secrets, insecure dependencies
⚡ Performance Agent
Identifies bottlenecks: inefficient algorithms, memory leaks, blocking operations, database N+1 queries
🎨 Style Agent
Enforces best practices: naming conventions, code structure, documentation, maintainability
📋 Manager Agent
Coordinates the team, synthesizes findings, assigns severity, generates final report
💡 Why Multi-Agent? Instead of one generalist agent, specialized agents provide deeper, more accurate reviews. They collaborate like a real development team!
System Architecture
┌──────────────────────────────────────┐
│     Code Submission (GitHub PR)      │
└──────────────────┬───────────────────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Manager Agent    │ ← Coordinates team
        └──────────┬──────────┘
                   │
           ┌───────┼───────┐
           │       │       │
           ▼       ▼       ▼
        ┌─────┐ ┌─────┐ ┌─────┐
        │ 🔒  │ │ ⚡  │ │ 🎨  │  ← Specialist agents
        │ Sec │ │Perf │ │Style│    (parallel review)
        └──┬──┘ └──┬──┘ └──┬──┘
           │       │       │
           └───────┼───────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Manager Agent    │ ← Synthesize findings
        │   (Final Report)    │
        └─────────────────────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Review Report    │
        │ • Security: 2 issues│
        │ • Performance: 1    │
        │ • Style: 5 issues   │
        │ • Overall: APPROVED │
        └─────────────────────┘
🛠️ Setup & Dependencies
1 Install CrewAI Framework
# Install CrewAI and dependencies
pip install crewai crewai-tools
pip install langchain langchain-openai
pip install python-dotenv
# For GitHub integration
pip install PyGithub gitpython
# For syntax analysis
pip install pylint pyflakes radon
2 Configure API Keys
Create .env file:
# OpenAI API key (required)
OPENAI_API_KEY=your_openai_key_here
# GitHub token (optional - for PR integration)
GITHUB_TOKEN=your_github_token_here
💡 Get API Keys:
- OpenAI: platform.openai.com
- GitHub: Settings → Developer settings → Personal access tokens
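Before wiring up the agents, it helps to fail fast when keys are missing rather than getting a cryptic error mid-review. Here is a small stdlib-only sketch; `check_env` is our own helper name, not part of CrewAI or python-dotenv:

```python
import os


def check_env(required=("OPENAI_API_KEY",), optional=("GITHUB_TOKEN",)):
    """Fail fast if required keys are missing; warn about optional ones."""
    missing = [k for k in required if not os.getenv(k)]
    if missing:
        raise SystemExit(f"Missing required keys: {', '.join(missing)}")
    for k in optional:
        if not os.getenv(k):
            print(f"Note: {k} not set; related features will be disabled.")
```

Call it once after `load_dotenv()` has populated the environment, before constructing the crew.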
3 Project Structure
code-review-system/
├── .env                      # API keys
├── agents/
│   ├── security_agent.py     # Security specialist
│   ├── performance_agent.py  # Performance specialist
│   ├── style_agent.py        # Style specialist
│   └── manager_agent.py      # Coordinator
├── tools/
│   ├── code_analyzer.py      # Static analysis tools
│   └── github_tools.py       # GitHub integration
├── crew.py                   # Multi-agent crew setup
├── main.py                   # Main entry point
└── test_files/               # Sample code to review
💻 Building the Multi-Agent System
Step 1: Define Specialist Agents
"""
Security specialist agent
"""
from crewai import Agent
from langchain_openai import ChatOpenAI
class SecurityAgent:
"""Agent specialized in code security"""
def __init__(self, llm):
self.agent = Agent(
role="Security Expert",
goal="Identify security vulnerabilities and risks in code",
backstory="""You are an expert security engineer with 15 years of experience.
You've found critical vulnerabilities in major projects and are passionate about
secure coding practices. You specialize in:
- SQL injection & XSS detection
- Authentication & authorization flaws
- Hardcoded secrets & credentials
- Insecure dependencies
- OWASP Top 10 vulnerabilities""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
# Sample security checks
SECURITY_CHECKS = """
When reviewing code, check for:
1. **Injection Flaws:**
- SQL injection (unsafe string concatenation in queries)
- Command injection (os.system with user input)
- LDAP/XPath injection
2. **Authentication Issues:**
- Weak password requirements
- Missing authentication checks
- Insecure session management
3. **Sensitive Data:**
- Hardcoded API keys, passwords, secrets
- Unencrypted sensitive data
- Logging sensitive information
4. **Dependencies:**
- Outdated libraries with known CVEs
- Imports from untrusted sources
5. **Access Control:**
- Missing authorization checks
- Insecure direct object references
For each issue found, provide:
- Severity: CRITICAL/HIGH/MEDIUM/LOW
- Location: File and line number
- Description: What's wrong
- Recommendation: How to fix
"""
"""
Performance specialist agent
"""
from crewai import Agent
class PerformanceAgent:
"""Agent specialized in performance optimization"""
def __init__(self, llm):
self.agent = Agent(
role="Performance Engineer",
goal="Identify performance bottlenecks and optimization opportunities",
backstory="""You are a performance optimization expert who has scaled
systems to handle millions of requests. You understand algorithms,
data structures, and system architecture deeply. You specialize in:
- Time complexity analysis (O(n), O(nΒ²), etc.)
- Memory leak detection
- Database query optimization (N+1 queries)
- Async/await usage
- Caching opportunities
- Resource management""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
PERFORMANCE_CHECKS = """
When reviewing code, check for:
1. **Algorithm Efficiency:**
- Nested loops (potential O(nΒ²) or worse)
- Inefficient data structures (lists vs sets)
- Redundant computations
2. **Database Issues:**
- N+1 query problems
- Missing indexes
- SELECT * instead of specific columns
- Loading too much data at once
3. **Memory:**
- Memory leaks (unclosed files, connections)
- Large objects in memory
- Inefficient string concatenation
4. **Async Operations:**
- Blocking operations in async code
- Missing await keywords
- Sequential operations that could be parallel
5. **Caching:**
- Repeated expensive operations
- Missing caching layer
- Cache invalidation issues
For each issue, provide:
- Severity: HIGH/MEDIUM/LOW
- Impact: Estimated performance impact
- Location: File and line number
- Recommendation: Specific optimization
"""
"""
Code style specialist agent
"""
from crewai import Agent
class StyleAgent:
"""Agent specialized in code quality and style"""
def __init__(self, llm):
self.agent = Agent(
role="Code Quality Expert",
goal="Ensure code follows best practices and maintainability standards",
backstory="""You are a software architect who values clean, readable code.
You've mentored hundreds of developers on writing maintainable code.
You specialize in:
- Naming conventions (PEP 8 for Python)
- Code structure and organization
- Documentation and comments
- DRY principle (Don't Repeat Yourself)
- SOLID principles
- Error handling patterns""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
STYLE_CHECKS = """
When reviewing code, check for:
1. **Naming:**
- Variables: descriptive snake_case
- Functions: verb-based, snake_case
- Classes: PascalCase
- Constants: UPPER_SNAKE_CASE
- Avoid single-letter names (except i, j in loops)
2. **Structure:**
- Function length (< 50 lines ideal)
- Class complexity (single responsibility)
- Module organization
- Proper imports (grouped: std, 3rd party, local)
3. **Documentation:**
- Missing docstrings
- Unclear comments
- TODO/FIXME without context
- No type hints
4. **Code Smells:**
- Duplicated code
- Magic numbers (use constants)
- Long parameter lists (> 4 parameters)
- Deeply nested conditionals (> 3 levels)
5. **Error Handling:**
- Bare except clauses
- Swallowed exceptions
- Missing error messages
For each issue, provide:
- Severity: HIGH/MEDIUM/LOW
- Location: File and line number
- Current code: What needs improvement
- Recommendation: Better approach
"""
Step 2: Create Manager Agent
"""
Manager agent that coordinates the review team
"""
from crewai import Agent
class ManagerAgent:
"""Manager agent that coordinates specialist agents"""
def __init__(self, llm):
self.agent = Agent(
role="Code Review Manager",
goal="Coordinate the review team and synthesize findings into actionable report",
backstory="""You are a technical lead who has managed code reviews
for major software projects. You understand how to balance security,
performance, and maintainability. You excel at:
- Prioritizing issues by severity and impact
- Synthesizing multiple perspectives
- Communicating clearly with developers
- Making approval/rejection decisions
- Providing actionable feedback""",
verbose=True,
allow_delegation=True, # Can delegate to specialist agents
llm=llm
)
def get_agent(self):
return self.agent
Step 3: Define Review Tasks
"""
Multi-agent code review crew
"""
import os
from dotenv import load_dotenv
from crewai import Crew, Task
from langchain_openai import ChatOpenAI
from agents.security_agent import SecurityAgent, SECURITY_CHECKS
from agents.performance_agent import PerformanceAgent, PERFORMANCE_CHECKS
from agents.style_agent import StyleAgent, STYLE_CHECKS
from agents.manager_agent import ManagerAgent
load_dotenv()
class CodeReviewCrew:
"""Multi-agent code review system"""
def __init__(self):
# Initialize LLM
self.llm = ChatOpenAI(
model="gpt-4-turbo-preview",
temperature=0.3,
openai_api_key=os.getenv("OPENAI_API_KEY")
)
# Initialize agents
self.security_agent = SecurityAgent(self.llm).get_agent()
self.performance_agent = PerformanceAgent(self.llm).get_agent()
self.style_agent = StyleAgent(self.llm).get_agent()
self.manager_agent = ManagerAgent(self.llm).get_agent()
    def review_code(self, code: str, filename: str = "code.py") -> dict:
        """Review code using the multi-agent crew"""
        # Task 1: Security Review
        security_task = Task(
            description=f"""Review this code for security vulnerabilities:

Filename: {filename}

Code:
```python
{code}
```

{SECURITY_CHECKS}

Provide detailed findings with severity levels.""",
            agent=self.security_agent,
            expected_output="Detailed security analysis with vulnerabilities found, severity, and recommendations"
        )

        # Task 2: Performance Review
        performance_task = Task(
            description=f"""Review this code for performance issues:

Filename: {filename}

Code:
```python
{code}
```

{PERFORMANCE_CHECKS}

Identify bottlenecks and optimization opportunities.""",
            agent=self.performance_agent,
            expected_output="Performance analysis with bottlenecks, complexity analysis, and optimization suggestions"
        )

        # Task 3: Style Review
        style_task = Task(
            description=f"""Review this code for style and maintainability:

Filename: {filename}

Code:
```python
{code}
```

{STYLE_CHECKS}

Check adherence to Python best practices and PEP 8.""",
            agent=self.style_agent,
            expected_output="Code quality analysis with style issues, refactoring suggestions, and best practices"
        )

        # Task 4: Final Report (Manager synthesizes all findings)
        synthesis_task = Task(
            description=f"""You are reviewing code file: {filename}

Your specialist team has completed their reviews. Their findings are provided as context. Synthesize them into a comprehensive code review report; delegate back to the security, performance, or style agents only if you need clarification.

After collecting all reviews:
1. Summarize key findings by category
2. Assign overall priority to each issue
3. Count issues by severity (CRITICAL/HIGH/MEDIUM/LOW)
4. Make a final decision: APPROVED / APPROVED_WITH_CHANGES / REJECTED
5. Provide the top 3 action items for the developer

Format as a professional code review report.""",
            agent=self.manager_agent,
            expected_output="Comprehensive code review report with final decision and prioritized action items",
            context=[security_task, performance_task, style_task]  # Specialist outputs flow in as context
        )
        # Create the crew
        crew = Crew(
            agents=[
                self.security_agent,
                self.performance_agent,
                self.style_agent,
                self.manager_agent
            ],
            tasks=[
                security_task,
                performance_task,
                style_task,
                synthesis_task
            ],
            verbose=True
        )

        # Run the review
        print(f"\n🔍 Starting code review for {filename}...")
        print("=" * 60)

        result = crew.kickoff()

        print("\n" + "=" * 60)
        print("✅ Code review complete!")

        return {
            "filename": filename,
            "report": result,
            "success": True
        }
# Test with sample code
if __name__ == "__main__":
    # Sample vulnerable code
    test_code = '''
import sqlite3
import os

def get_user(username):
    # Vulnerable to SQL injection!
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE username = '{username}'"
    cursor.execute(query)
    result = cursor.fetchall()
    return result

def process_data(data_list):
    # Performance issue: O(n²) complexity
    result = []
    for i in data_list:
        for j in data_list:
            if i != j:
                result.append((i, j))
    return result

def x(a, b, c, d, e, f):
    # Style issues: poor naming, too many params
    return a + b + c + d + e + f

API_KEY = "sk-1234567890abcdef"  # Security: hardcoded secret!
'''

    # Run review
    crew = CodeReviewCrew()
    result = crew.review_code(test_code, "vulnerable_code.py")

    print("\n" + "=" * 60)
    print("FINAL REPORT")
    print("=" * 60)
    print(result["report"])
✅ Checkpoint: Test the Crew
Run the code review system:
python crew.py
You should see:
- Security agent finding SQL injection and hardcoded API key
- Performance agent identifying O(nΒ²) complexity
- Style agent noting poor naming and parameter issues
- Manager synthesizing findings into final report
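If you want a quick automated check that a run produced what this checkpoint describes, a small keyword scan over the final report works. This is a rough sketch: `sanity_check` and its keyword lists are our own illustrative choices, not part of CrewAI, and real agent output phrasing will vary.

```python
def sanity_check(report_text: str) -> list:
    """Return the expected finding categories missing from the report."""
    expected = {
        "security": ["sql injection", "hardcoded"],
        "performance": ["o(n", "nested"],   # loose match on complexity mentions
        "style": ["naming", "parameter"],
    }
    text = report_text.lower()
    missing = []
    for category, keywords in expected.items():
        if not any(k in text for k in keywords):
            missing.append(category)
    return missing
```

Run it on `str(result["report"])`; an empty list means all three specialist categories surfaced in the manager's synthesis.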
🚀 Advanced Features
GitHub Integration
"""
GitHub integration for automatic PR reviews
"""
from github import Github
import os
class GitHubReviewer:
"""Integrate code review with GitHub PRs"""
def __init__(self):
self.github = Github(os.getenv("GITHUB_TOKEN"))
def review_pull_request(self, repo_name: str, pr_number: int):
"""Review a GitHub pull request"""
# Get repository and PR
repo = self.github.get_repo(repo_name)
pr = repo.get_pull(pr_number)
# Get changed files
files = pr.get_files()
# Review each file
from crew import CodeReviewCrew
crew = CodeReviewCrew()
all_reviews = []
for file in files:
if file.filename.endswith('.py'):
# Get file content
content = repo.get_contents(file.filename, ref=pr.head.sha)
code = content.decoded_content.decode('utf-8')
# Review
review = crew.review_code(code, file.filename)
all_reviews.append(review)
# Post combined review as PR comment
combined_report = self._combine_reviews(all_reviews)
pr.create_issue_comment(combined_report)
return all_reviews
def _combine_reviews(self, reviews):
"""Combine multiple file reviews"""
report = "## π€ Automated Code Review\\n\\n"
for review in reviews:
report += f"### {review['filename']}\\n"
report += review['report'] + "\\n\\n"
return report
# Usage
reviewer = GitHubReviewer()
reviewer.review_pull_request("yourusername/yourrepo", 42)
Static Analysis Tools Integration
"""
Integrate static analysis tools
"""
import pylint.lint
import radon.complexity as radon_cc
from pyflakes import api as pyflakes_api
import io
import sys
def run_pylint(code: str) -> str:
"""Run pylint on code"""
# Write code to temp file
with open('temp.py', 'w') as f:
f.write(code)
# Run pylint
stdout = io.StringIO()
sys.stdout = stdout
pylint.lint.Run(['temp.py'], exit=False)
sys.stdout = sys.__stdout__
return stdout.getvalue()
def calculate_complexity(code: str) -> list:
"""Calculate cyclomatic complexity"""
return radon_cc.cc_visit(code)
def run_pyflakes(code: str) -> str:
"""Run pyflakes for syntax errors"""
stdout = io.StringIO()
sys.stdout = stdout
pyflakes_api.check(code, 'code.py')
sys.stdout = sys.__stdout__
return stdout.getvalue()
# Add to agent tools
class EnhancedSecurityAgent(SecurityAgent):
def analyze_with_tools(self, code: str):
"""Combine AI with static analysis"""
# Get static analysis results
pylint_results = run_pylint(code)
pyflakes_results = run_pyflakes(code)
complexity = calculate_complexity(code)
# Provide to agent for enhanced analysis
enhanced_prompt = f"""
Code to review:
{code}
Static Analysis Results:
Pylint: {pylint_results}
Pyflakes: {pyflakes_results}
Complexity: {complexity}
Combine these automated findings with your security expertise.
"""
return enhanced_prompt
Custom Review Rules
"""
Add custom review rules
"""
import yaml
class CustomRules:
"""Load and apply custom review rules"""
def __init__(self, rules_file: str = "review_rules.yaml"):
with open(rules_file) as f:
self.rules = yaml.safe_load(f)
def get_rules_prompt(self) -> str:
"""Generate prompt from custom rules"""
prompt = "Additional company-specific rules:\\n\\n"
for category, rules in self.rules.items():
prompt += f"{category}:\\n"
for rule in rules:
prompt += f"- {rule}\\n"
prompt += "\\n"
return prompt
# review_rules.yaml
"""
security:
- Never use pickle for untrusted data
- All API endpoints must have authentication
- Secrets must use AWS Secrets Manager
performance:
- Database queries must have EXPLAIN ANALYZE comment
- API calls must have timeout < 5s
- Batch operations for > 100 items
style:
- All functions must have type hints
- Docstrings must follow Google style
- Max line length: 100 characters
"""
💪 Challenges & Extensions
🔥 Challenge 1: Add Test Coverage Agent
Create an agent that checks test coverage and suggests missing tests.
class TestAgent:
    """Agent specialized in test coverage"""

    def __init__(self, llm):
        self.agent = Agent(
            role="Test Engineer",
            goal="Ensure comprehensive test coverage",
            backstory="Expert in unit testing, integration testing, TDD...",
            llm=llm
        )

    # TODO: Implement checks for:
    # - Missing test files
    # - Untested functions
    # - Edge cases not covered
    # - Test quality (assertions, mocking)
🔥 Challenge 2: Learning from Past Reviews
Store reviews in a database and use past feedback to improve future reviews.
import chromadb


class ReviewMemory:
    """Store and retrieve past reviews"""

    def __init__(self):
        self.client = chromadb.Client()
        self.collection = self.client.create_collection("reviews")

    def store_review(self, code, review, metadata):
        """Store review for future reference"""
        self.collection.add(
            documents=[code],
            # Keep the review text alongside the code so it can be
            # surfaced again when similar code comes up
            metadatas=[{**metadata, "review": review}],
            ids=[metadata['id']]
        )

    def get_similar_reviews(self, code, n=3):
        """Find similar past reviews"""
        results = self.collection.query(
            query_texts=[code],
            n_results=n
        )
        return results
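To prototype the retrieval idea without a vector database, a rough stdlib-only stand-in using `difflib` gives the same shape of API. `most_similar` is a hypothetical helper of ours; character-level similarity is much weaker than embeddings, so treat this purely as a scaffold for experimenting.

```python
import difflib


def most_similar(code: str, past: dict, n: int = 3) -> list:
    """Rank past review ids by rough textual similarity to `code`.

    `past` maps review id -> the code that was reviewed.
    """
    scored = [
        (difflib.SequenceMatcher(None, code, old_code).ratio(), review_id)
        for review_id, old_code in past.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [review_id for _, review_id in scored[:n]]
```

Once the ids come back, the stored review text for each can be prepended to the new task description as "here's how we reviewed similar code before".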
🔥 Challenge 3: Multi-Language Support
Extend the system to review JavaScript, Go, Java, etc.
class LanguageRouter:
    """Route to language-specific agents"""

    # Crews you would implement per language; only the Python crew
    # exists so far -- the others are this challenge's exercise
    LANGUAGE_AGENTS = {
        'python': PythonReviewCrew,
        'javascript': JavaScriptReviewCrew,
        'go': GoReviewCrew
    }

    def detect_language(self, filename: str) -> str:
        """Detect programming language from the file extension"""
        ext = filename.split('.')[-1]
        mapping = {
            'py': 'python',
            'js': 'javascript',
            'go': 'go'
        }
        return mapping.get(ext, 'unknown')

    def review(self, code: str, filename: str):
        """Route to the appropriate crew"""
        lang = self.detect_language(filename)
        crew_class = self.LANGUAGE_AGENTS.get(lang)
        if crew_class:
            crew = crew_class()
            return crew.review_code(code, filename)
        else:
            return {"error": f"Unsupported language: {lang}"}
🚀 Production Deployment
1. Set Up CI/CD Integration
# .github/workflows/code-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run AI Code Review
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: python crew.py --pr ${{ github.event.pull_request.number }}
2. Add Cost Controls
from langchain.callbacks import get_openai_callback


def review_with_budget(crew, code, max_cost=1.0):
    """Review with a cost limit"""
    with get_openai_callback() as cb:
        result = crew.review_code(code)

    if cb.total_cost > max_cost:
        print(f"⚠️ Review exceeded budget: ${cb.total_cost}")

    result['cost'] = cb.total_cost
    return result
3. Rate Limiting
from ratelimit import limits, sleep_and_retry


@sleep_and_retry
@limits(calls=10, period=60)  # 10 reviews per minute
def rate_limited_review(crew, code):
    """Rate-limited review"""
    return crew.review_code(code)
⚠️ Production Considerations:
- Costs: Each review uses ~5,000-10,000 tokens ($0.10-$0.30)
- Time: A full review takes 30-60 seconds
- Rate Limits: OpenAI has TPM (tokens per minute) limits
- False Positives: AI can make mistakes, so human review is still important
- Privacy: Don't send proprietary code to external APIs without approval
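To get a feel for the cost figure above before kicking off a review, a back-of-the-envelope estimator helps. Everything here is a rough assumption to tune: the ~4-characters-per-token ratio, the 2,000-token prompt overhead per task, and the blended price per 1K tokens; `estimate_cost` is our own helper name.

```python
def estimate_cost(code: str, prompt_overhead: int = 2000,
                  n_tasks: int = 4, usd_per_1k: float = 0.02) -> float:
    """Very rough pre-flight cost estimate for one review.

    Assumes ~4 characters per token and that each of the four tasks
    (security, performance, style, synthesis) sends the code plus a
    fixed prompt overhead.
    """
    code_tokens = len(code) / 4
    total_tokens = n_tasks * (code_tokens + prompt_overhead)
    return round(total_tokens / 1000 * usd_per_1k, 4)
```

Pair it with `review_with_budget` above: skip the review entirely if the estimate already exceeds the budget, rather than discovering the overrun after the tokens are spent.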
🎯 Key Takeaways
- Specialization: Multiple specialized agents > one generalist
- Collaboration: Agents work together through delegation and context sharing
- Task Design: Clear, specific tasks with expected outputs improve results
- Manager Pattern: Coordinator agent synthesizes findings effectively
- Tool Integration: Combine AI with static analysis for best results
- Iterative: Start simple, add agents and features progressively
📚 Next Steps
- Project 1: AI Research Assistant - Build autonomous research agents
- Project 3: Business Process Automation - Automate workflows
- Multi-Agent Systems - Learn coordination patterns
- Production Agent Systems - Scale your agents
🎉 Congratulations!
You've built a multi-agent code review system with specialized AI agents collaborating like a real development team!
← Back to AI Agents Course