Code Execution Agent

Overview

The Code Execution Agent is an AI-powered system that generates, tests, and iteratively improves Python code from natural language instructions. It follows a test-driven workflow: tests are generated before implementations, and implementations that fail their tests are regenerated automatically.

Architecture

Core Components

AICodeGenerator Class (backend/agents/code_execution_agent.py)

Main entry point for the code generation system
Manages the complete workflow from instruction to working implementation
Implements caching for successful solutions

Workflow Pipeline

The agent follows a 5-step process:

1. Function Specification Generation - Analyzes instructions and defines required functions
2. Test Case Generation - Creates comprehensive test cases for each function
3. Implementation Generation - Generates initial code implementations
4. Test Execution - Runs tests and validates implementations
5. Iterative Improvement - Recursively improves failed implementations

Key Features

Self-Evolving Code Generation

Automatic function specification: Determines needed functions and their interfaces
Dependency resolution: Sorts functions by dependencies for correct implementation order
Test-driven development: Generates tests before implementations
Iterative improvement: Automatically fixes failing implementations

Intelligent Testing System

Comprehensive test coverage: Normal cases, edge cases, and error cases
Safe execution environment: Isolated test execution with cleanup
Output comparison: Handles complex data types (lists, dictionaries, objects)
Execution safety: Timeout protection and error handling
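
The output comparison described above could be sketched as a recursive equality check. This is a minimal illustration, not the agent's actual comparison logic: the name outputs_match, the floating-point tolerance, and the choice to treat lists and tuples as interchangeable are all assumptions.

```python
import math
from typing import Any


def outputs_match(actual: Any, expected: Any, rel_tol: float = 1e-9) -> bool:
    """Recursively compare two values (hypothetical helper, not the agent's API)."""
    # bool is a subclass of int, so compare booleans strictly (True != 1).
    if isinstance(actual, bool) or isinstance(expected, bool):
        return type(actual) is type(expected) and actual == expected
    # Numbers: exact for ints, tolerant of rounding error when a float is involved.
    numeric = (int, float)
    if isinstance(actual, numeric) and isinstance(expected, numeric):
        if isinstance(actual, float) or isinstance(expected, float):
            return math.isclose(actual, expected, rel_tol=rel_tol)
        return actual == expected
    # Dictionaries: same keys, and every value matches recursively.
    if isinstance(actual, dict) and isinstance(expected, dict):
        return actual.keys() == expected.keys() and all(
            outputs_match(actual[key], expected[key], rel_tol) for key in expected
        )
    # Lists and tuples: same length, element-wise match, order-sensitive.
    # Assumption: a tuple result may match a list expectation and vice versa.
    sequence = (list, tuple)
    if isinstance(actual, sequence) and isinstance(expected, sequence):
        return len(actual) == len(expected) and all(
            outputs_match(a, e, rel_tol) for a, e in zip(actual, expected)
        )
    # Plain objects: same class and matching instance attributes.
    if hasattr(actual, "__dict__") and hasattr(expected, "__dict__"):
        return type(actual) is type(expected) and outputs_match(
            vars(actual), vars(expected), rel_tol
        )
    # Everything else (strings, None, sets, ...) falls back to ==.
    return actual == expected
```

The sketch does not guard against self-referencing objects, which would recurse without end; a production comparison would need to track visited objects.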

Caching and Performance

Solution caching: Stores successful implementations for reuse
Cache key generation: MD5-based hashing of instructions
Performance optimization: Avoids redundant computation
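
As a rough sketch of how MD5-based cache keys and on-disk solution caching can fit together: the function names, the whitespace normalization, and the one-JSON-file-per-instruction layout below are assumptions made for illustration and may differ from the agent's own cache format.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict, Optional


def cache_key(instruction: str) -> str:
    """Return the MD5 hex digest of the whitespace-normalized instruction."""
    # Assumption: collapsing whitespace lets reformatted prompts share a key.
    normalized = " ".join(instruction.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def save_solution(cache_dir: Path, instruction: str, solution: Dict[str, Any]) -> Path:
    """Write a solution to <cache_dir>/<key>.json and return the file path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(instruction)}.json"
    path.write_text(json.dumps(solution), encoding="utf-8")
    return path


def load_solution(cache_dir: Path, instruction: str) -> Optional[Dict[str, Any]]:
    """Return the cached solution for an instruction, or None on a cache miss."""
    path = cache_dir / f"{cache_key(instruction)}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

MD5 serves here only as a compact lookup key, not as a security measure. Because the key depends on the instruction text alone, two differently worded requests for the same task produce separate cache entries.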

Usage Examples

Basic Usage

from backend.agents.code_execution_agent import AICodeGenerator

# Initialize with API key
generator = AICodeGenerator("your_api_key_here")

# Generate solution from natural language
instruction = """
Create a function that finds all prime numbers up to a given limit 
using the Sieve of Eratosthenes algorithm.
"""

solution = generator.generate_solution(instruction)

# Check results
print(f"All tests passed: {solution['all_tests_passed']}")
print(f"Functions generated: {len(solution['implementations'])}")

Solution Structure

The generate_solution() method returns a dictionary with the following structure:

{
    "instruction": "Original instruction text",
    "function_specs": [
        {
            "name": "function_name",
            "purpose": "What this function does",
            "inputs": [{"name": "param", "type": "str"}],
            "outputs": {"type": "list", "description": "Return value"},
            "dependencies": ["other_function"],
            "complexity": "simple|medium|complex"
        }
    ],
    "implementations": {
        "function_name": "def function_name(param):\n    # implementation"
    },
    "tests": {
        "function_name": [
            {
                "name": "test_case_description",
                "inputs": {"param": "test_value"},
                "expected_output": "expected_result",
                "setup": "optional_setup_code"
            }
        ]
    },
    "test_results": [
        {
            "function": "function_name",
            "test": "test_case_name",
            "passed": True,
            "actual_output": "result"
        }
    ],
    "all_tests_passed": True
}
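
A caller will often want to know which tests failed. The helper below is a small sketch that relies only on the "test_results" fields shown above; failed_tests is a hypothetical name, not part of the agent.

```python
from typing import Any, Dict, List


def failed_tests(solution: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group the names of failing test cases by function name."""
    failures: Dict[str, List[str]] = {}
    for result in solution.get("test_results", []):
        if not result.get("passed", False):
            failures.setdefault(result["function"], []).append(result["test"])
    return failures


# Hand-written example data in the documented shape (not real agent output).
example = {
    "test_results": [
        {"function": "sieve", "test": "finds_small_primes", "passed": True},
        {"function": "sieve", "test": "handles_zero", "passed": False},
    ],
    "all_tests_passed": False,
}
print(failed_tests(example))  # {'sieve': ['handles_zero']}
```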

Implementation Details

Function Specification Process

The agent uses AI to analyze instructions and generate structured specifications:

def _generate_function_specs(self, instruction: str) -> List[Dict[str, Any]]:
    # Uses AI to determine required functions
    # Returns specifications with inputs, outputs, dependencies
    ...  # body omitted

Test Generation Strategy

Each function gets comprehensive test coverage:

def _generate_tests(self, function_specs, instruction) -> Dict[str, List[Dict]]:
    # Creates normal cases, edge cases, error cases
    # Includes setup code when needed
    ...  # body omitted
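
To illustrate how a generated test case in the documented format (name, inputs, expected_output, optional setup) might be applied to an implementation, here is a minimal sketch. The function run_test_case and the extra "error" field are assumptions. The sketch also runs code in-process with exec, which is unsafe for untrusted code and is used here only to keep the example short.

```python
from typing import Any, Dict


def run_test_case(source: str, function_name: str, test: Dict[str, Any]) -> Dict[str, Any]:
    """Execute one test case against implementation source code."""
    namespace: Dict[str, Any] = {}
    result: Dict[str, Any] = {
        "function": function_name,
        "test": test["name"],
        "passed": False,
        "actual_output": None,
    }
    try:
        exec(source, namespace)  # define the function under test
        if test.get("setup"):
            exec(test["setup"], namespace)  # optional setup code
        # Inputs are passed as keyword arguments, matching the "inputs" mapping.
        actual = namespace[function_name](**test["inputs"])
        result["actual_output"] = actual
        result["passed"] = actual == test["expected_output"]
    except Exception as error:
        # Hypothetical extra field so the failure cause is visible to callers.
        result["error"] = f"{type(error).__name__}: {error}"
    return result
```

A call such as run_test_case("def add(a, b):\n    return a + b", "add", {"name": "adds", "inputs": {"a": 1, "b": 2}, "expected_output": 3}) returns a result whose "passed" field is True.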

Dependency Resolution

Functions are implemented in correct order:

def _sort_by_dependencies(self, function_specs) -> List[Dict]:
    # Topological sort to handle dependencies
    # Detects circular dependencies
    ...  # body omitted
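
One common way to implement this kind of ordering is Kahn's algorithm. The standalone sketch below is not the agent's actual method body. It assumes each spec has a "name" and an optional "dependencies" list, ignores dependencies on names outside the batch, and ignores self-references (recursive functions).

```python
from collections import deque
from typing import Any, Dict, List


def sort_by_dependencies(function_specs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return specs ordered so that every function follows its dependencies."""
    by_name = {spec["name"]: spec for spec in function_specs}
    dependents: Dict[str, List[str]] = {name: [] for name in by_name}
    remaining: Dict[str, int] = {}
    for name, spec in by_name.items():
        # Keep only dependencies on other functions in this batch.
        deps = {d for d in spec.get("dependencies", []) if d in by_name and d != name}
        remaining[name] = len(deps)
        for dep in deps:
            dependents[dep].append(name)

    # Start with functions that have no unresolved dependencies.
    ready = deque(name for name, count in remaining.items() if count == 0)
    ordered: List[Dict[str, Any]] = []
    while ready:
        name = ready.popleft()
        ordered.append(by_name[name])
        for child in dependents[name]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    # Anything never released is part of a cycle or depends on one.
    if len(ordered) != len(by_name):
        stuck = sorted(name for name, count in remaining.items() if count > 0)
        raise ValueError(f"Circular dependency detected; unresolved: {', '.join(stuck)}")
    return ordered
```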

Iterative Improvement

Failed implementations are automatically improved:

def _improve_implementations(self, instruction, specs, implementations, tests, results):
    # Analyzes failure patterns
    # Generates improved implementations
    # Recursively improves until success or limit reached
    ...  # body omitted
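
The control flow of such an improvement step can be sketched as a bounded loop. Note that this uses iteration rather than the recursion mentioned above, and that improve_until_passing, the run_tests and regenerate callables, and the max_rounds default are all assumptions. In the real agent, regeneration would call the AI service.

```python
from typing import Any, Callable, Dict, List, Tuple

Implementations = Dict[str, str]
Results = List[Dict[str, Any]]


def improve_until_passing(
    implementations: Implementations,
    run_tests: Callable[[Implementations], Results],
    regenerate: Callable[[str, str, Results], str],
    max_rounds: int = 3,
) -> Tuple[Implementations, Results]:
    """Regenerate failing functions until all tests pass or the limit is hit."""
    current = dict(implementations)  # do not mutate the caller's mapping
    results = run_tests(current)
    for _ in range(max_rounds):
        failing = sorted({r["function"] for r in results if not r["passed"]})
        if not failing:
            break
        for name in failing:
            # Pass only this function's failures to the regeneration step.
            failures = [r for r in results if r["function"] == name and not r["passed"]]
            current[name] = regenerate(name, current[name], failures)
        results = run_tests(current)
    return current, results
```

Capping the number of rounds keeps cost predictable: each round may trigger one regeneration request per failing function, and the caller can inspect the returned results to see what still fails.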

Error Handling

Execution Safety

Timeout protection: Test commands are terminated when they exceed a time limit
Safe imports: Controlled module loading
Cleanup: Temporary files are automatically removed
Exception handling: Comprehensive error catching and reporting
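
The combination of timeout protection, temporary-file cleanup, and error capture listed above might look like the following sketch, which runs a snippet in a separate Python process. The name run_snippet, the five-second default, and the shape of the returned dictionary are assumptions. A child process also limits crashes and hangs but is not a security sandbox on its own.

```python
import os
import subprocess
import sys
import tempfile
from typing import Any, Dict


def run_snippet(code: str, timeout_seconds: float = 5.0) -> Dict[str, Any]:
    """Run code in a child Python process and capture its output."""
    # delete=False so the child process can open the file on every platform;
    # the finally block below removes it.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(code)
        path = handle.name
    try:
        completed = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
        return {
            "ok": completed.returncode == 0,
            "stdout": completed.stdout,
            "stderr": completed.stderr,
            "timed_out": False,
        }
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "", "timed_out": True}
    finally:
        os.remove(path)  # clean up the temporary file in every case
```

For example, run_snippet("while True: pass", timeout_seconds=1) returns with "timed_out" set to True instead of hanging the caller.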

Failure Recovery

Alternative approaches: Tries different implementation strategies
Gradual improvement: Makes incremental fixes to failing code
Detailed diagnostics: Provides specific error information for debugging

Configuration

Initialization Parameters

AICodeGenerator(
    api_key="your_api_key",      # AI service API key
    cache_dir=".code_cache"      # Directory for caching solutions
)

Environment Requirements

Python 3.7+
Write access for temporary files
Network access for AI API calls
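Required packages: requests (third-party; install with pip). hashlib and json are part of the Python standard library and need no installation.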

API Integration

The agent is designed to work with various AI services. The current implementation includes:

Generic API format: Configurable endpoint and headers
Request/response handling: JSON-based communication
Error handling: Graceful degradation on API failures
Rate limiting: Built-in request management

API Configuration

# Configure headers for AI service
self.headers = {
    "Content-Type": "application/json",
    "x-api-key": api_key
}

# API call with error handling
response = requests.post(
    "https://api.example.com/v1/completion",
    headers=self.headers,
    json={"prompt": prompt, "max_tokens": 2000},
    timeout=30
)
response.raise_for_status()  # Raise requests.HTTPError on 4xx/5xx responses
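
Graceful degradation on API failures is often built from a small retry wrapper with exponential backoff. The sketch below is an assumption about how that could be done, not the agent's code. The request is injected as a callable so the example needs no network access, and call_with_retries and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple, Type


def call_with_retries(
    send: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call send(), retrying on the given exceptions; return None if all fail."""
    for attempt in range(max_attempts):
        try:
            return send()
        except retry_on:
            if attempt == max_attempts - 1:
                return None  # degrade gracefully instead of raising
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    return None  # reached only when max_attempts is 0 or negative
```

With the requests library, send would typically wrap the requests.post call and response.raise_for_status(). The default retry_on=(OSError,) covers that case because requests.RequestException derives from IOError, which is an alias of OSError in Python 3.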

Performance Considerations

Caching Strategy

Cache hits: Previously solved instructions are returned from the cache without rerunning the generation pipeline
Cache misses: Full generation pipeline with caching of result
Cache invalidation: Manual clearing when needed

Optimization Techniques

Batch processing: Handles multiple functions efficiently
Smart dependency ordering: Minimizes redundant work
Progressive complexity: Simple functions first, complex ones using dependencies

Integration with Backend

The Code Execution Agent integrates with the main backend system:

File Location: backend/agents/code_execution_agent.py

Integration Points:

Backend tool system can invoke the agent
Results can be formatted for chat responses
Caching works across different user sessions

Security Considerations

Code Execution Safety

Isolated execution: Code runs in controlled environment
Import restrictions: Limited to safe modules
Timeout enforcement: Prevents infinite loops
File system protection: Controlled file access

Input Validation

Instruction sanitization: Validates natural language inputs
Code validation: Checks generated code before execution
Test case validation: Ensures test cases are safe to run
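
One way to check generated code before running it is a static pass over its syntax tree. The sketch below flags imports outside an allowlist. The allowlist contents and the function name are hypothetical, and a static check like this does not catch dynamic imports (for example via __import__ or importlib), so it complements isolation rather than replacing it.

```python
import ast
from typing import List, Set

# Hypothetical allowlist; the agent's real list of safe modules is not documented here.
ALLOWED_MODULES: Set[str] = {"math", "itertools", "collections", "functools", "re"}


def disallowed_imports(source: str, allowed: Set[str] = ALLOWED_MODULES) -> List[str]:
    """Return the sorted top-level modules imported by source but not allowed."""
    tree = ast.parse(source)  # raises SyntaxError if the code does not parse
    found: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os.path" counts as the top-level package "os".
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0 or node.module is None:
                found.add("<relative import>")  # relative imports are rejected
            else:
                found.add(node.module.split(".")[0])
    return sorted(found - allowed)
```

An empty list means every import is on the allowlist; a SyntaxError from ast.parse is itself a useful signal that the generated code should not be run.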

Future Enhancements

Planned Features

Multi-language support: Beyond Python to JavaScript, Java, etc.
Advanced caching: Semantic similarity-based cache lookup
Performance profiling: Code efficiency analysis
Security scanning: Automated vulnerability detection

Integration Opportunities

IDE integration: Real-time code assistance
Continuous integration: Automated test generation for CI/CD
Documentation generation: Automatic code documentation
Code review assistance: Suggestions for code improvements

Troubleshooting

Common Issues

API Connection Failures

# Check API key and endpoint configuration
# Verify network connectivity
# Review request/response format

Test Execution Errors

# Check Python environment
# Verify import statements
# Review temporary file permissions

Cache Issues

# Clear cache directory
# Check file permissions
# Verify disk space

Debug Mode

Enable detailed logging by modifying the agent:

# Add debug prints to trace execution
# Monitor API requests and responses
# Check intermediate results at each step
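
As an alternative to scattering print statements, the standard logging module can trace each pipeline step. The decorator below is a sketch only: the logger name "code_execution_agent" and the traced helper are assumptions, and applying it would mean editing the agent's methods by hand.

```python
import functools
import logging
import time

# Hypothetical logger name; the agent may not define a logger of its own.
logger = logging.getLogger("code_execution_agent")


def traced(step_name):
    """Decorator that logs the start, end, and duration of a pipeline step."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            logger.debug("start %s", step_name)
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, so failures are timed too.
                logger.debug("end %s (%.3fs)", step_name, time.perf_counter() - start)
        return wrapper
    return decorator
```

To see the messages, call logging.basicConfig(level=logging.DEBUG) once at startup and decorate a step, for example with @traced("test execution").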

Related Documentation

VAC Service Architecture - Overall backend architecture
Backend API Guide - API integration patterns
Tool Orchestrator System - Tool integration framework