🎯 Project Overview
Code reviews are critical but time-consuming. What if you had a team of AI experts, each specializing in security, performance, or code style, working together to review your code automatically? In this project, you'll build exactly that using multi-agent collaboration!
The Agent Team
🔒 Security Agent
Specializes in finding vulnerabilities: SQL injection, XSS, hardcoded secrets, insecure dependencies
⚡ Performance Agent
Identifies bottlenecks: inefficient algorithms, memory leaks, blocking operations, database N+1 queries
🎨 Style Agent
Enforces best practices: naming conventions, code structure, documentation, maintainability
📋 Manager Agent
Coordinates the team, synthesizes findings, assigns severity, generates final report
💡 Why Multi-Agent? Instead of one generalist agent, specialized agents provide deeper, more accurate reviews. They collaborate like a real development team!
System Architecture
┌──────────────────────────────────────┐
│     Code Submission (GitHub PR)      │
└──────────────────┬───────────────────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Manager Agent    │ ← Coordinates team
        └──────────┬──────────┘
                   │
           ┌───────┼───────┐
           │       │       │
           ▼       ▼       ▼
        ┌─────┐ ┌─────┐ ┌─────┐
        │ 🔒  │ │ ⚡  │ │ 🎨  │  ← Specialist agents
        │ Sec │ │Perf │ │Style│    (parallel review)
        └──┬──┘ └──┬──┘ └──┬──┘
           │       │       │
           └───────┼───────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Manager Agent    │ ← Synthesize findings
        │   (Final Report)    │
        └─────────────────────┘
                   │
                   ▼
        ┌─────────────────────┐
        │    Review Report    │
        │ • Security: 2 issues│
        │ • Performance: 1    │
        │ • Style: 5 issues   │
        │ • Overall: APPROVED │
        └─────────────────────┘
🛠️ Setup & Dependencies
1 Install CrewAI Framework
# Install CrewAI and dependencies
pip install crewai crewai-tools
pip install langchain langchain-openai
pip install python-dotenv
# For GitHub integration
pip install PyGithub gitpython
# For syntax analysis
pip install pylint pyflakes radon
2 Configure API Keys
Create .env file:
# OpenAI API key (required)
OPENAI_API_KEY=your_openai_key_here
# GitHub token (optional - for PR integration)
GITHUB_TOKEN=your_github_token_here
💡 Get API Keys:
- OpenAI: platform.openai.com
- GitHub: Settings → Developer settings → Personal access tokens
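Before wiring up the agents, it helps to fail fast when keys are missing rather than getting a cryptic error mid-review. Here is a small stdlib-only sketch; `check_env` is our own helper name, not part of CrewAI or python-dotenv:

```python
import os


def check_env(required=("OPENAI_API_KEY",), optional=("GITHUB_TOKEN",)):
    """Fail fast if required keys are missing; warn about optional ones."""
    missing = [k for k in required if not os.getenv(k)]
    if missing:
        raise SystemExit(f"Missing required keys: {', '.join(missing)}")
    for k in optional:
        if not os.getenv(k):
            print(f"Note: {k} not set; related features will be disabled.")
```

Call it once after `load_dotenv()` has populated the environment, before constructing the crew.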
3 Project Structure
code-review-system/
├── .env                      # API keys
├── agents/
│   ├── security_agent.py     # Security specialist
│   ├── performance_agent.py  # Performance specialist
│   ├── style_agent.py        # Style specialist
│   └── manager_agent.py      # Coordinator
├── tools/
│   ├── code_analyzer.py      # Static analysis tools
│   └── github_tools.py       # GitHub integration
├── crew.py                   # Multi-agent crew setup
├── main.py                   # Main entry point
└── test_files/               # Sample code to review
💻 Building the Multi-Agent System
Step 1: Define Specialist Agents
"""
Security specialist agent
"""
from crewai import Agent
from langchain_openai import ChatOpenAI
class SecurityAgent:
"""Agent specialized in code security"""
def __init__(self, llm):
self.agent = Agent(
role="Security Expert",
goal="Identify security vulnerabilities and risks in code",
backstory="""You are an expert security engineer with 15 years of experience.
You've found critical vulnerabilities in major projects and are passionate about
secure coding practices. You specialize in:
- SQL injection & XSS detection
- Authentication & authorization flaws
- Hardcoded secrets & credentials
- Insecure dependencies
- OWASP Top 10 vulnerabilities""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
# Sample security checks
SECURITY_CHECKS = """
When reviewing code, check for:
1. **Injection Flaws:**
- SQL injection (unsafe string concatenation in queries)
- Command injection (os.system with user input)
- LDAP/XPath injection
2. **Authentication Issues:**
- Weak password requirements
- Missing authentication checks
- Insecure session management
3. **Sensitive Data:**
- Hardcoded API keys, passwords, secrets
- Unencrypted sensitive data
- Logging sensitive information
4. **Dependencies:**
- Outdated libraries with known CVEs
- Imports from untrusted sources
5. **Access Control:**
- Missing authorization checks
- Insecure direct object references
For each issue found, provide:
- Severity: CRITICAL/HIGH/MEDIUM/LOW
- Location: File and line number
- Description: What's wrong
- Recommendation: How to fix
"""
"""
Performance specialist agent
"""
from crewai import Agent
class PerformanceAgent:
"""Agent specialized in performance optimization"""
def __init__(self, llm):
self.agent = Agent(
role="Performance Engineer",
goal="Identify performance bottlenecks and optimization opportunities",
backstory="""You are a performance optimization expert who has scaled
systems to handle millions of requests. You understand algorithms,
data structures, and system architecture deeply. You specialize in:
- Time complexity analysis (O(n), O(nΒ²), etc.)
- Memory leak detection
- Database query optimization (N+1 queries)
- Async/await usage
- Caching opportunities
- Resource management""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
PERFORMANCE_CHECKS = """
When reviewing code, check for:
1. **Algorithm Efficiency:**
- Nested loops (potential O(nΒ²) or worse)
- Inefficient data structures (lists vs sets)
- Redundant computations
2. **Database Issues:**
- N+1 query problems
- Missing indexes
- SELECT * instead of specific columns
- Loading too much data at once
3. **Memory:**
- Memory leaks (unclosed files, connections)
- Large objects in memory
- Inefficient string concatenation
4. **Async Operations:**
- Blocking operations in async code
- Missing await keywords
- Sequential operations that could be parallel
5. **Caching:**
- Repeated expensive operations
- Missing caching layer
- Cache invalidation issues
For each issue, provide:
- Severity: HIGH/MEDIUM/LOW
- Impact: Estimated performance impact
- Location: File and line number
- Recommendation: Specific optimization
"""
"""
Code style specialist agent
"""
from crewai import Agent
class StyleAgent:
"""Agent specialized in code quality and style"""
def __init__(self, llm):
self.agent = Agent(
role="Code Quality Expert",
goal="Ensure code follows best practices and maintainability standards",
backstory="""You are a software architect who values clean, readable code.
You've mentored hundreds of developers on writing maintainable code.
You specialize in:
- Naming conventions (PEP 8 for Python)
- Code structure and organization
- Documentation and comments
- DRY principle (Don't Repeat Yourself)
- SOLID principles
- Error handling patterns""",
verbose=True,
allow_delegation=False,
llm=llm
)
def get_agent(self):
return self.agent
STYLE_CHECKS = """
When reviewing code, check for:
1. **Naming:**
- Variables: descriptive snake_case
- Functions: verb-based, snake_case
- Classes: PascalCase
- Constants: UPPER_SNAKE_CASE
- Avoid single-letter names (except i, j in loops)
2. **Structure:**
- Function length (< 50 lines ideal)
- Class complexity (single responsibility)
- Module organization
- Proper imports (grouped: std, 3rd party, local)
3. **Documentation:**
- Missing docstrings
- Unclear comments
- TODO/FIXME without context
- No type hints
4. **Code Smells:**
- Duplicated code
- Magic numbers (use constants)
- Long parameter lists (> 4 parameters)
- Deeply nested conditionals (> 3 levels)
5. **Error Handling:**
- Bare except clauses
- Swallowed exceptions
- Missing error messages
For each issue, provide:
- Severity: HIGH/MEDIUM/LOW
- Location: File and line number
- Current code: What needs improvement
- Recommendation: Better approach
"""
Step 2: Create Manager Agent
"""
Manager agent that coordinates the review team
"""
from crewai import Agent
class ManagerAgent:
"""Manager agent that coordinates specialist agents"""
def __init__(self, llm):
self.agent = Agent(
role="Code Review Manager",
goal="Coordinate the review team and synthesize findings into actionable report",
backstory="""You are a technical lead who has managed code reviews
for major software projects. You understand how to balance security,
performance, and maintainability. You excel at:
- Prioritizing issues by severity and impact
- Synthesizing multiple perspectives
- Communicating clearly with developers
- Making approval/rejection decisions
- Providing actionable feedback""",
verbose=True,
allow_delegation=True, # Can delegate to specialist agents
llm=llm
)
def get_agent(self):
return self.agent
Step 3: Define Review Tasks
"""
Multi-agent code review crew
"""
import os
from dotenv import load_dotenv
from crewai import Crew, Task
from langchain_openai import ChatOpenAI
from agents.security_agent import SecurityAgent, SECURITY_CHECKS
from agents.performance_agent import PerformanceAgent, PERFORMANCE_CHECKS
from agents.style_agent import StyleAgent, STYLE_CHECKS
from agents.manager_agent import ManagerAgent
load_dotenv()
class CodeReviewCrew:
"""Multi-agent code review system"""
def __init__(self):
# Initialize LLM
self.llm = ChatOpenAI(
model="gpt-4-turbo-preview",
temperature=0.3,
openai_api_key=os.getenv("OPENAI_API_KEY")
)
# Initialize agents
self.security_agent = SecurityAgent(self.llm).get_agent()
self.performance_agent = PerformanceAgent(self.llm).get_agent()
self.style_agent = StyleAgent(self.llm).get_agent()
self.manager_agent = ManagerAgent(self.llm).get_agent()
    def review_code(self, code: str, filename: str = "code.py") -> dict:
        """Review code using the multi-agent crew"""
        # Task 1: Security Review
        security_task = Task(
            description=f"""Review this code for security vulnerabilities:

Filename: {filename}

Code:
```python
{code}
```

{SECURITY_CHECKS}

Provide detailed findings with severity levels.""",
            agent=self.security_agent,
            expected_output="Detailed security analysis with vulnerabilities found, severity, and recommendations"
        )

        # Task 2: Performance Review
        performance_task = Task(
            description=f"""Review this code for performance issues:

Filename: {filename}

Code:
```python
{code}
```

{PERFORMANCE_CHECKS}

Identify bottlenecks and optimization opportunities.""",
            agent=self.performance_agent,
            expected_output="Performance analysis with bottlenecks, complexity analysis, and optimization suggestions"
        )

        # Task 3: Style Review
        style_task = Task(
            description=f"""Review this code for style and maintainability:

Filename: {filename}

Code:
```python
{code}
```

{STYLE_CHECKS}

Check adherence to Python best practices and PEP 8.""",
            agent=self.style_agent,
            expected_output="Code quality analysis with style issues, refactoring suggestions, and best practices"
        )

        # Task 4: Final Report (Manager synthesizes all findings)
        synthesis_task = Task(
            description=f"""You are reviewing code file: {filename}

Your specialist team has completed their reviews. Their findings are provided as context. Synthesize them into a comprehensive code review report; delegate back to the security, performance, or style agents only if you need clarification.

After collecting all reviews:
1. Summarize key findings by category
2. Assign overall priority to each issue
3. Count issues by severity (CRITICAL/HIGH/MEDIUM/LOW)
4. Make a final decision: APPROVED / APPROVED_WITH_CHANGES / REJECTED
5. Provide the top 3 action items for the developer

Format as a professional code review report.""",
            agent=self.manager_agent,
            expected_output="Comprehensive code review report with final decision and prioritized action items",
            context=[security_task, performance_task, style_task]  # Specialist outputs flow in as context
        )
        # Create the crew
        crew = Crew(
            agents=[
                self.security_agent,
                self.performance_agent,
                self.style_agent,
                self.manager_agent
            ],
            tasks=[
                security_task,
                performance_task,
                style_task,
                synthesis_task
            ],
            verbose=True
        )

        # Run the review
        print(f"\n🔍 Starting code review for {filename}...")
        print("=" * 60)

        result = crew.kickoff()

        print("\n" + "=" * 60)
        print("✅ Code review complete!")

        return {
            "filename": filename,
            "report": result,
            "success": True
        }
# Test with sample code
if __name__ == "__main__":
    # Sample vulnerable code
    test_code = '''
import sqlite3
import os

def get_user(username):
    # Vulnerable to SQL injection!
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE username = '{username}'"
    cursor.execute(query)
    result = cursor.fetchall()
    return result

def process_data(data_list):
    # Performance issue: O(n²) complexity
    result = []
    for i in data_list:
        for j in data_list:
            if i != j:
                result.append((i, j))
    return result

def x(a, b, c, d, e, f):
    # Style issues: poor naming, too many params
    return a + b + c + d + e + f

API_KEY = "sk-1234567890abcdef"  # Security: hardcoded secret!
'''

    # Run review
    crew = CodeReviewCrew()
    result = crew.review_code(test_code, "vulnerable_code.py")

    print("\n" + "=" * 60)
    print("FINAL REPORT")
    print("=" * 60)
    print(result["report"])
✅ Checkpoint: Test the Crew
Run the code review system:
python crew.py
You should see:
- Security agent finding SQL injection and hardcoded API key
- Performance agent identifying O(nΒ²) complexity
- Style agent noting poor naming and parameter issues
- Manager synthesizing findings into final report
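If you want a quick automated check that a run produced what this checkpoint describes, a small keyword scan over the final report works. This is a rough sketch: `sanity_check` and its keyword lists are our own illustrative choices, not part of CrewAI, and real agent output phrasing will vary.

```python
def sanity_check(report_text: str) -> list:
    """Return the expected finding categories missing from the report."""
    expected = {
        "security": ["sql injection", "hardcoded"],
        "performance": ["o(n", "nested"],   # loose match on complexity mentions
        "style": ["naming", "parameter"],
    }
    text = report_text.lower()
    missing = []
    for category, keywords in expected.items():
        if not any(k in text for k in keywords):
            missing.append(category)
    return missing
```

Run it on `str(result["report"])`; an empty list means all three specialist categories surfaced in the manager's synthesis.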
🚀 Advanced Features
GitHub Integration
"""
GitHub integration for automatic PR reviews
"""
from github import Github
import os
class GitHubReviewer:
"""Integrate code review with GitHub PRs"""
def __init__(self):
self.github = Github(os.getenv("GITHUB_TOKEN"))
def review_pull_request(self, repo_name: str, pr_number: int):
"""Review a GitHub pull request"""
# Get repository and PR
repo = self.github.get_repo(repo_name)
pr = repo.get_pull(pr_number)
# Get changed files
files = pr.get_files()
# Review each file
from crew import CodeReviewCrew
crew = CodeReviewCrew()
all_reviews = []
for file in files:
if file.filename.endswith('.py'):
# Get file content
content = repo.get_contents(file.filename, ref=pr.head.sha)
code = content.decoded_content.decode('utf-8')
# Review
review = crew.review_code(code, file.filename)
all_reviews.append(review)
# Post combined review as PR comment
combined_report = self._combine_reviews(all_reviews)
pr.create_issue_comment(combined_report)
return all_reviews
def _combine_reviews(self, reviews):
"""Combine multiple file reviews"""
report = "## π€ Automated Code Review\\n\\n"
for review in reviews:
report += f"### {review['filename']}\\n"
report += review['report'] + "\\n\\n"
return report
# Usage
reviewer = GitHubReviewer()
reviewer.review_pull_request("yourusername/yourrepo", 42)
Static Analysis Tools Integration
"""
Integrate static analysis tools
"""
import pylint.lint
import radon.complexity as radon_cc
from pyflakes import api as pyflakes_api
import io
import sys
def run_pylint(code: str) -> str:
"""Run pylint on code"""
# Write code to temp file
with open('temp.py', 'w') as f:
f.write(code)
# Run pylint
stdout = io.StringIO()
sys.stdout = stdout
pylint.lint.Run(['temp.py'], exit=False)
sys.stdout = sys.__stdout__
return stdout.getvalue()
def calculate_complexity(code: str) -> list:
"""Calculate cyclomatic complexity"""
return radon_cc.cc_visit(code)
def run_pyflakes(code: str) -> str:
"""Run pyflakes for syntax errors"""
stdout = io.StringIO()
sys.stdout = stdout
pyflakes_api.check(code, 'code.py')
sys.stdout = sys.__stdout__
return stdout.getvalue()
# Add to agent tools
class EnhancedSecurityAgent(SecurityAgent):
def analyze_with_tools(self, code: str):
"""Combine AI with static analysis"""
# Get static analysis results
pylint_results = run_pylint(code)
pyflakes_results = run_pyflakes(code)
complexity = calculate_complexity(code)
# Provide to agent for enhanced analysis
enhanced_prompt = f"""
Code to review:
{code}
Static Analysis Results:
Pylint: {pylint_results}
Pyflakes: {pyflakes_results}
Complexity: {complexity}
Combine these automated findings with your security expertise.
"""
return enhanced_prompt
Custom Review Rules
"""
Add custom review rules
"""
import yaml
class CustomRules:
"""Load and apply custom review rules"""
def __init__(self, rules_file: str = "review_rules.yaml"):
with open(rules_file) as f:
self.rules = yaml.safe_load(f)
def get_rules_prompt(self) -> str:
"""Generate prompt from custom rules"""
prompt = "Additional company-specific rules:\\n\\n"
for category, rules in self.rules.items():
prompt += f"{category}:\\n"
for rule in rules:
prompt += f"- {rule}\\n"
prompt += "\\n"
return prompt
# review_rules.yaml
"""
security:
- Never use pickle for untrusted data
- All API endpoints must have authentication
- Secrets must use AWS Secrets Manager
performance:
- Database queries must have EXPLAIN ANALYZE comment
- API calls must have timeout < 5s
- Batch operations for > 100 items
style:
- All functions must have type hints
- Docstrings must follow Google style
- Max line length: 100 characters
"""
💪 Challenges & Extensions
🔥 Challenge 1: Add Test Coverage Agent
Create an agent that checks test coverage and suggests missing tests.
class TestAgent:
    """Agent specialized in test coverage"""

    def __init__(self, llm):
        self.agent = Agent(
            role="Test Engineer",
            goal="Ensure comprehensive test coverage",
            backstory="Expert in unit testing, integration testing, TDD...",
            llm=llm
        )

    # TODO: Implement checks for:
    # - Missing test files
    # - Untested functions
    # - Edge cases not covered
    # - Test quality (assertions, mocking)
🔥 Challenge 2: Learning from Past Reviews
Store reviews in a database and use past feedback to improve future reviews.
import chromadb


class ReviewMemory:
    """Store and retrieve past reviews"""

    def __init__(self):
        self.client = chromadb.Client()
        self.collection = self.client.create_collection("reviews")

    def store_review(self, code, review, metadata):
        """Store review for future reference"""
        self.collection.add(
            documents=[code],
            # Keep the review text alongside the code so it can be
            # surfaced again when similar code comes up
            metadatas=[{**metadata, "review": review}],
            ids=[metadata['id']]
        )

    def get_similar_reviews(self, code, n=3):
        """Find similar past reviews"""
        results = self.collection.query(
            query_texts=[code],
            n_results=n
        )
        return results
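To prototype the retrieval idea without a vector database, a rough stdlib-only stand-in using `difflib` gives the same shape of API. `most_similar` is a hypothetical helper of ours; character-level similarity is much weaker than embeddings, so treat this purely as a scaffold for experimenting.

```python
import difflib


def most_similar(code: str, past: dict, n: int = 3) -> list:
    """Rank past review ids by rough textual similarity to `code`.

    `past` maps review id -> the code that was reviewed.
    """
    scored = [
        (difflib.SequenceMatcher(None, code, old_code).ratio(), review_id)
        for review_id, old_code in past.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [review_id for _, review_id in scored[:n]]
```

Once the ids come back, the stored review text for each can be prepended to the new task description as "here's how we reviewed similar code before".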
🔥 Challenge 3: Multi-Language Support
Extend the system to review JavaScript, Go, Java, etc.
class LanguageRouter:
    """Route to language-specific agents"""

    # Crews you would implement per language; only the Python crew
    # exists so far -- the others are this challenge's exercise
    LANGUAGE_AGENTS = {
        'python': PythonReviewCrew,
        'javascript': JavaScriptReviewCrew,
        'go': GoReviewCrew
    }

    def detect_language(self, filename: str) -> str:
        """Detect programming language from the file extension"""
        ext = filename.split('.')[-1]
        mapping = {
            'py': 'python',
            'js': 'javascript',
            'go': 'go'
        }
        return mapping.get(ext, 'unknown')

    def review(self, code: str, filename: str):
        """Route to the appropriate crew"""
        lang = self.detect_language(filename)
        crew_class = self.LANGUAGE_AGENTS.get(lang)
        if crew_class:
            crew = crew_class()
            return crew.review_code(code, filename)
        else:
            return {"error": f"Unsupported language: {lang}"}
🚀 Production Deployment
1. Set Up CI/CD Integration
# .github/workflows/code-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run AI Code Review
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: python crew.py --pr ${{ github.event.pull_request.number }}
2. Add Cost Controls
from langchain.callbacks import get_openai_callback


def review_with_budget(crew, code, max_cost=1.0):
    """Review with a cost limit"""
    with get_openai_callback() as cb:
        result = crew.review_code(code)

    if cb.total_cost > max_cost:
        print(f"⚠️ Review exceeded budget: ${cb.total_cost}")

    result['cost'] = cb.total_cost
    return result
3. Rate Limiting
from ratelimit import limits, sleep_and_retry


@sleep_and_retry
@limits(calls=10, period=60)  # 10 reviews per minute
def rate_limited_review(crew, code):
    """Rate-limited review"""
    return crew.review_code(code)
⚠️ Production Considerations:
- Costs: Each review uses ~5,000-10,000 tokens ($0.10-$0.30)
- Time: A full review takes 30-60 seconds
- Rate Limits: OpenAI has TPM (tokens per minute) limits
- False Positives: AI can make mistakes, so human review is still important
- Privacy: Don't send proprietary code to external APIs without approval
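To get a feel for the cost figure above before kicking off a review, a back-of-the-envelope estimator helps. Everything here is a rough assumption to tune: the ~4-characters-per-token ratio, the 2,000-token prompt overhead per task, and the blended price per 1K tokens; `estimate_cost` is our own helper name.

```python
def estimate_cost(code: str, prompt_overhead: int = 2000,
                  n_tasks: int = 4, usd_per_1k: float = 0.02) -> float:
    """Very rough pre-flight cost estimate for one review.

    Assumes ~4 characters per token and that each of the four tasks
    (security, performance, style, synthesis) sends the code plus a
    fixed prompt overhead.
    """
    code_tokens = len(code) / 4
    total_tokens = n_tasks * (code_tokens + prompt_overhead)
    return round(total_tokens / 1000 * usd_per_1k, 4)
```

Pair it with `review_with_budget` above: skip the review entirely if the estimate already exceeds the budget, rather than discovering the overrun after the tokens are spent.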
🎯 Key Takeaways
- Specialization: Multiple specialized agents > one generalist
- Collaboration: Agents work together through delegation and context sharing
- Task Design: Clear, specific tasks with expected outputs improve results
- Manager Pattern: Coordinator agent synthesizes findings effectively
- Tool Integration: Combine AI with static analysis for best results
- Iterative: Start simple, add agents and features progressively
📚 Next Steps
- Project 1: AI Research Assistant - Build autonomous research agents
- Project 3: Business Process Automation - Automate workflows
- Multi-Agent Systems - Learn coordination patterns
- Production Agent Systems - Scale your agents
🎉 Congratulations!
You've built a multi-agent code review system with specialized AI agents collaborating like a real development team!
← Back to AI Agents Course