Code Execution Agent
Overview
The Code Execution Agent is an AI-powered system that generates, tests, and iteratively improves Python code based on natural language instructions. It implements a complete test-driven development workflow with self-healing capabilities.
Architecture
Core Components
AICodeGenerator Class (backend/agents/code_execution_agent.py)
- Main entry point for the code generation system
- Manages the complete workflow from instruction to working implementation
- Implements caching for successful solutions
Workflow Pipeline
The agent follows a 5-step process:
1. Function Specification Generation: Analyzes instructions and defines required functions
2. Test Case Generation: Creates comprehensive test cases for each function
3. Implementation Generation: Generates initial code implementations
4. Test Execution: Runs tests and validates implementations
5. Iterative Improvement: Recursively improves failed implementations
Key Features
Self-Evolving Code Generation
- Automatic function specification: Determines needed functions and their interfaces
- Dependency resolution: Sorts functions by dependencies for correct implementation order
- Test-driven development: Generates tests before implementations
- Iterative improvement: Automatically fixes failing implementations
Intelligent Testing System
- Comprehensive test coverage: Normal cases, edge cases, and error cases
- Safe execution environment: Isolated test execution with cleanup
- Output comparison: Handles complex data types (lists, dictionaries, objects)
- Execution safety: Timeout protection and error handling
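The output comparison described above could look roughly like the sketch below. It is an illustration only, not the agent's actual comparison logic: the name `outputs_match`, the float tolerance, and the choice to treat lists and tuples as interchangeable are all assumptions.

```python
import math
from typing import Any


def outputs_match(actual: Any, expected: Any, rel_tol: float = 1e-9) -> bool:
    """Recursively compare two values (hypothetical helper, not the agent's API)."""
    # bool is a subclass of int, so handle it first to avoid True == 1 matching.
    if isinstance(actual, bool) or isinstance(expected, bool):
        return actual is expected
    # Compare numbers with a tolerance as soon as either side is a float.
    if isinstance(actual, float) or isinstance(expected, float):
        if isinstance(actual, (int, float)) and isinstance(expected, (int, float)):
            return math.isclose(actual, expected, rel_tol=rel_tol)
        return False
    if isinstance(actual, dict) and isinstance(expected, dict):
        return actual.keys() == expected.keys() and all(
            outputs_match(actual[key], expected[key], rel_tol) for key in actual
        )
    # Assumption: lists and tuples are interchangeable, since expected values
    # that pass through JSON come back as lists.
    if isinstance(actual, (list, tuple)) and isinstance(expected, (list, tuple)):
        return len(actual) == len(expected) and all(
            outputs_match(a, e, rel_tol) for a, e in zip(actual, expected)
        )
    return actual == expected
```

Comparing floats with a tolerance avoids spurious failures such as `0.1 + 0.2` versus `0.3`, at the cost of accepting values that differ only by rounding noise.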
Caching and Performance
- Solution caching: Stores successful implementations for reuse
- Cache key generation: MD5-based hashing of instructions
- Performance optimization: Avoids redundant computation
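A minimal sketch of MD5-keyed caching follows. The function names, the whitespace normalization, and the one-JSON-file-per-key layout are assumptions made for illustration; the source only states that cache keys are MD5 hashes of instructions.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict, Optional


def cache_key(instruction: str) -> str:
    """Return the MD5 hex digest of the instruction (hypothetical helper)."""
    # Assumption: collapse whitespace so reformatted instructions share a key.
    normalized = " ".join(instruction.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def save_solution(cache_dir: Path, instruction: str, solution: Dict[str, Any]) -> Path:
    """Write the solution as JSON under <cache_dir>/<key>.json (assumed layout)."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(instruction)}.json"
    path.write_text(json.dumps(solution), encoding="utf-8")
    return path


def load_solution(cache_dir: Path, instruction: str) -> Optional[Dict[str, Any]]:
    """Return the cached solution, or None on a cache miss."""
    path = cache_dir / f"{cache_key(instruction)}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

MD5 serves here only as a compact lookup key. It is not suitable where collision resistance matters for security.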
Usage Examples
Basic Usage
from backend.agents.code_execution_agent import AICodeGenerator
# Initialize with API key
generator = AICodeGenerator("your_api_key_here")
# Generate solution from natural language
instruction = """
Create a function that finds all prime numbers up to a given limit
using the Sieve of Eratosthenes algorithm.
"""
solution = generator.generate_solution(instruction)
# Check results
print(f"All tests passed: {solution['all_tests_passed']}")
print(f"Functions generated: {len(solution['implementations'])}")
Solution Structure
The generate_solution() method returns a dictionary with the following structure:
{
    "instruction": "Original instruction text",
    "function_specs": [
        {
            "name": "function_name",
            "purpose": "What this function does",
            "inputs": [{"name": "param", "type": "str"}],
            "outputs": {"type": "list", "description": "Return value"},
            "dependencies": ["other_function"],
            "complexity": "simple|medium|complex"
        }
    ],
    "implementations": {
        "function_name": "def function_name(param):\n    # implementation"
    },
    "tests": {
        "function_name": [
            {
                "name": "test_case_description",
                "inputs": {"param": "test_value"},
                "expected_output": "expected_result",
                "setup": "optional_setup_code"
            }
        ]
    },
    "test_results": [
        {
            "function": "function_name",
            "test": "test_case_name",
            "passed": True,
            "actual_output": "result"
        }
    ],
    "all_tests_passed": True
}
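Given a result shaped like the structure above, a caller might summarize failures as in this sketch. The helpers `failed_tests` and `summarize` are hypothetical and not part of the agent; they rely only on the `test_results` keys shown in the structure.

```python
from typing import Any, Dict, List


def failed_tests(solution: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the entries of solution["test_results"] that did not pass."""
    return [r for r in solution.get("test_results", []) if not r.get("passed")]


def summarize(solution: Dict[str, Any]) -> str:
    """Build a short plain-text report of passed and failed tests."""
    results = solution.get("test_results", [])
    failures = failed_tests(solution)
    lines = [f"{len(results) - len(failures)}/{len(results)} tests passed"]
    for r in failures:
        lines.append(f"FAILED {r['function']}::{r['test']} -> {r.get('actual_output')!r}")
    return "\n".join(lines)
```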
Implementation Details
Function Specification Process
The agent uses AI to analyze instructions and generate structured specifications:
def _generate_function_specs(self, instruction: str) -> List[Dict[str, Any]]:
    # Uses AI to determine required functions
    # Returns specifications with inputs, outputs, dependencies
    ...
Test Generation Strategy
Each function gets comprehensive test coverage:
def _generate_tests(self, function_specs, instruction) -> Dict[str, List[Dict]]:
    # Creates normal cases, edge cases, error cases
    # Includes setup code when needed
    ...
Dependency Resolution
Functions are implemented in correct order:
def _sort_by_dependencies(self, function_specs) -> List[Dict]:
    # Topological sort to handle dependencies
    # Detects circular dependencies
    ...
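One way to implement such an ordering is Kahn's algorithm, sketched below as a standalone function. This is an assumption about the approach, not the agent's actual code. It also assumes that dependencies naming functions outside the spec list, and self-references, can be ignored.

```python
from collections import deque
from typing import Any, Dict, List


def sort_by_dependencies(function_specs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order specs so every function comes after the functions it depends on."""
    by_name = {spec["name"]: spec for spec in function_specs}
    # Unmet dependencies per function; unknown names and self-references are
    # dropped (assumption: they do not constrain the ordering).
    pending = {
        name: {d for d in spec.get("dependencies", []) if d in by_name and d != name}
        for name, spec in by_name.items()
    }
    dependents: Dict[str, List[str]] = {name: [] for name in by_name}
    for name, deps in pending.items():
        for dep in deps:
            dependents[dep].append(name)

    ready = deque(name for name, deps in pending.items() if not deps)
    ordered: List[Dict[str, Any]] = []
    while ready:
        name = ready.popleft()
        ordered.append(by_name[name])
        for child in dependents[name]:
            pending[child].discard(name)
            if not pending[child]:
                ready.append(child)

    # Anything still waiting on a dependency is part of a cycle.
    if len(ordered) != len(by_name):
        cycle = sorted(name for name, deps in pending.items() if deps)
        raise ValueError(f"Circular dependency among: {', '.join(cycle)}")
    return ordered
```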
Iterative Improvement
Failed implementations are automatically improved:
def _improve_implementations(self, instruction, specs, implementations, tests, results):
    # Analyzes failure patterns
    # Generates improved implementations
    # Recursively improves until success or limit reached
    ...
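The "improve until success or limit reached" control flow can be sketched as follows. This version uses a loop rather than recursion, and the `run_tests` and `improve` callables and the `max_rounds` default are hypothetical stand-ins for the agent's test runner and AI-backed rewriting step.

```python
from typing import Any, Callable, Dict, List, Tuple

Implementations = Dict[str, str]
Results = List[Dict[str, Any]]


def improve_until_passing(
    implementations: Implementations,
    run_tests: Callable[[Implementations], Results],
    improve: Callable[[Implementations, Results], Implementations],
    max_rounds: int = 3,  # assumed limit; the real value is not documented
) -> Tuple[Implementations, Results]:
    """Re-run tests and request fixes until all pass or the round limit is hit."""
    results = run_tests(implementations)
    rounds = 0
    while rounds < max_rounds and not all(r["passed"] for r in results):
        failures = [r for r in results if not r["passed"]]
        # Only the failing results are handed to the improvement step.
        implementations = improve(implementations, failures)
        results = run_tests(implementations)
        rounds += 1
    return implementations, results
```

Bounding the number of rounds keeps a function that never converges from consuming API calls indefinitely.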
Error Handling
Execution Safety
- Timeout protection: Commands time out after a fixed period instead of running indefinitely
- Safe imports: Controlled module loading
- Cleanup: Temporary files are automatically removed
- Exception handling: Comprehensive error catching and reporting
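The timeout and cleanup behavior listed above could be approximated with the standard library as sketched here. The helper `run_snippet`, its return shape, and the 5-second default are assumptions, and a child process on its own is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Any, Dict


def run_snippet(code: str, timeout_seconds: float = 5.0) -> Dict[str, Any]:
    """Run code in a child interpreter inside a throwaway directory."""
    # The temporary directory and the script are deleted when the block exits.
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            completed = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_seconds,  # the child is killed when this expires
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "",
                    "error": f"timed out after {timeout_seconds}s"}
        return {"ok": completed.returncode == 0, "stdout": completed.stdout,
                "stderr": completed.stderr, "error": None}
```

Running the snippet in a separate process means an infinite loop or a crash in generated code cannot take down the caller.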
Failure Recovery
- Alternative approaches: Tries different implementation strategies
- Gradual improvement: Makes incremental fixes to failing code
- Detailed diagnostics: Provides specific error information for debugging
Configuration
Initialization Parameters
AICodeGenerator(
    api_key="your_api_key",  # AI service API key
    cache_dir=".code_cache"  # Directory for caching solutions
)
Environment Requirements
- Python 3.7+
- Write access for temporary files
- Network access for AI API calls
- Required packages: requests (third-party; install with pip). hashlib and json are also used but ship with the Python standard library.
API Integration
The agent is designed to work with various AI services. The current implementation includes:
- Generic API format: Configurable endpoint and headers
- Request/response handling: JSON-based communication
- Error handling: Graceful degradation on API failures
- Rate limiting: Built-in request management
API Configuration
# Configure headers for AI service
self.headers = {
    "Content-Type": "application/json",
    "x-api-key": api_key
}
# API call with error handling
response = requests.post(
    "https://api.example.com/v1/completion",
    headers=self.headers,
    json={"prompt": prompt, "max_tokens": 2000},
    timeout=30
)
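Graceful degradation on API failures might be handled with a small retry wrapper like the sketch below. It is not the agent's documented behavior: the name `call_with_retries`, the attempt count, and the exponential backoff are assumptions. The request itself is passed in as a callable so the wrapper stays independent of any HTTP library.

```python
import time
from typing import Callable, Optional, Tuple, Type


def call_with_retries(
    send: Callable[[], str],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call send(), retrying with exponential backoff; return None if all fail."""
    for attempt in range(attempts):
        try:
            return send()
        except retry_on:
            if attempt == attempts - 1:
                return None  # degrade gracefully instead of raising
            # Wait base_delay, then 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    return None
```

In practice, `send` would wrap the `requests.post(...)` call and return the response text. The default `retry_on=(OSError,)` is intended to cover network-level errors, including those raised by requests, whose base exception class derives from `IOError`.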
Performance Considerations
Caching Strategy
- Cache hits: Instant return for previously solved problems
- Cache misses: Full generation pipeline with caching of result
- Cache invalidation: Manual clearing when needed
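Manual cache clearing could be as simple as the following sketch. It assumes cached solutions are stored as `.json` files directly inside the cache directory, which the source does not confirm; `clear_cache` is a hypothetical helper.

```python
from pathlib import Path


def clear_cache(cache_dir: str = ".code_cache") -> int:
    """Delete cached solution files and return how many were removed."""
    directory = Path(cache_dir)
    if not directory.is_dir():
        return 0
    removed = 0
    # Assumption: one JSON file per cached solution.
    for entry in directory.glob("*.json"):
        entry.unlink()
        removed += 1
    return removed
```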
Optimization Techniques
- Batch processing: Handles multiple functions efficiently
- Smart dependency ordering: Minimizes redundant work
- Progressive complexity: Simple functions first, complex ones using dependencies
Integration with Backend
The Code Execution Agent integrates with the main backend system:
File Location: backend/agents/code_execution_agent.py
Integration Points:
- Backend tool system can invoke the agent
- Results can be formatted for chat responses
- Caching works across different user sessions
Security Considerations
Code Execution Safety
- Isolated execution: Code runs in controlled environment
- Import restrictions: Limited to safe modules
- Timeout enforcement: Prevents infinite loops
- File system protection: Controlled file access
Input Validation
- Instruction sanitization: Validates natural language inputs
- Code validation: Checks generated code before execution
- Test case validation: Ensures test cases are safe to run
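A static check for import restrictions can be sketched with the standard `ast` module, as below. The allow-list and the helper name are illustrative assumptions, and this check does not catch dynamic imports such as `__import__` or `importlib`.

```python
import ast
from typing import FrozenSet, List

# Hypothetical allow-list; the agent's real list of safe modules is not documented.
ALLOWED_MODULES: FrozenSet[str] = frozenset(
    {"math", "itertools", "functools", "collections", "re", "json"}
)


def find_disallowed_imports(
    source: str, allowed: FrozenSet[str] = ALLOWED_MODULES
) -> List[str]:
    """Return the sorted top-level modules imported by source but not allowed.

    Raises SyntaxError if the source does not parse.
    """
    blocked = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x" style imports.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in allowed:
                blocked.add(top_level or "<relative import>")
    return sorted(blocked)
```

Generated code would be rejected before execution whenever the returned list is non-empty.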
Future Enhancements
Planned Features
- Multi-language support: Beyond Python to JavaScript, Java, etc.
- Advanced caching: Semantic similarity-based cache lookup
- Performance profiling: Code efficiency analysis
- Security scanning: Automated vulnerability detection
Integration Opportunities
- IDE integration: Real-time code assistance
- Continuous integration: Automated test generation for CI/CD
- Documentation generation: Automatic code documentation
- Code review assistance: Suggestions for code improvements
Troubleshooting
Common Issues
API Connection Failures
# Check API key and endpoint configuration
# Verify network connectivity
# Review request/response format
Test Execution Errors
# Check Python environment
# Verify import statements
# Review temporary file permissions
Cache Issues
# Clear cache directory
# Check file permissions
# Verify disk space
Debug Mode
Enable detailed logging by modifying the agent:
# Add debug prints to trace execution
# Monitor API requests and responses
# Check intermediate results at each step
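As an alternative to ad hoc debug prints, the standard `logging` module can trace each step. The sketch below is an assumption about how this might be wired in; the logger name `code_execution_agent` and both helper functions are hypothetical.

```python
import logging

# Hypothetical logger name; the agent does not document one.
logger = logging.getLogger("code_execution_agent")


def enable_debug_logging() -> None:
    """Send DEBUG-level messages with timestamps to stderr."""
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    logger.setLevel(logging.DEBUG)


def log_step(step: str, detail: object) -> None:
    """Record an intermediate result, e.g. generated specs or test results."""
    logger.debug("%s: %r", step, detail)
```

Calling `enable_debug_logging()` once at startup and `log_step("function_specs", specs)` after each pipeline stage would expose intermediate results without editing the output code paths.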
Related Documentation
- VAC Service Architecture - Overall backend architecture
- Backend API Guide - API integration patterns
- Tool Orchestrator System - Tool integration framework