Context-Aware Timeout Configuration
The backend selects its timeouts according to the type of request being processed. Interactive requests fail fast so the UI stays responsive, while background work gets enough time to finish without compromising system reliability.
Overview
The timeout system provides two distinct configurations:
- UI Context: Fast timeouts optimized for responsive user interactions
- Email/API Context: Extended timeouts for background processing that doesn’t require immediate response
Architecture
Two-Layer Timeout System
The system implements timeouts at two levels:
- HTTP/Transport Layer (models/genai_client.py)
  - Individual HTTP request timeouts
  - GenAI client operation timeouts
  - Network connection timeouts
- Retry/Strategy Layer (models/gemini.py)
  - Overall retry behavior and exponential backoff
  - Maximum attempt limits
  - Wait time between retries
Configuration Details
UI Context (Fast Response)
Optimized for interactive user experiences requiring immediate feedback:
UI_TIMEOUTS = {
    "stream_timeout": 120,      # 2 minutes
    "tool_heartbeat": 120,      # 2 minutes between heartbeats
    "tool_hard_timeout": 600,   # 10 minutes total
    "http_read": 30.0,          # 30 seconds
    "http_write": 60.0,         # 1 minute
    "genai_timeout": 60000,     # 60 seconds, expressed in milliseconds
    "retry_attempts": 7,        # 7 total attempts
    "retry_min": 1,             # 1 second minimum wait
    "retry_max": 32,            # 32 seconds maximum wait
}
Email/API Context (Extended Processing)
Optimized for background operations that can take longer to complete:
EMAIL_API_TIMEOUTS = {
    "stream_timeout": 1800,       # 30 minutes
    "tool_heartbeat": 300,        # 5 minutes between heartbeats
    "tool_hard_timeout": 3000,    # 50 minutes total (under Cloud Run 60min limit)
    "http_read": 300.0,           # 5 minutes
    "http_write": 300.0,          # 5 minutes
    "genai_timeout": 300000,      # 5 minutes, expressed in milliseconds
    "retry_attempts": 5,          # 5 total attempts (fewer but longer waits)
    "retry_min": 5,               # 5 seconds minimum wait
    "retry_max": 300,             # 5 minutes maximum wait
    "quarto_export": 600,         # 10 minutes for document generation
    "file_download": 300,         # 5 minutes for file downloads
    "attachment_download": 180,   # 3 minutes for email attachments
    "mailgun_api": 120,           # 2 minutes for email sending
}
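To get a feel for how the retry settings in the two dictionaries above translate into wall-clock time, the following sketch totals the waits between attempts. It assumes a plain doubling schedule that starts at `retry_min` and is capped at `retry_max`; the actual schedule depends on how the retry decorator in models/gemini.py is configured, which this document does not show. The function name `backoff_waits` is hypothetical.

```python
def backoff_waits(attempts: int, wait_min: float, wait_max: float) -> list:
    """Waits between attempts, ASSUMING doubling from wait_min, capped at wait_max."""
    # N attempts means N - 1 waits: there is no wait after the final attempt.
    return [min(wait_min * 2 ** i, wait_max) for i in range(attempts - 1)]


# Values copied from UI_TIMEOUTS and EMAIL_API_TIMEOUTS.
ui_waits = backoff_waits(attempts=7, wait_min=1, wait_max=32)
email_waits = backoff_waits(attempts=5, wait_min=5, wait_max=300)

print("UI waits:", ui_waits, "total:", sum(ui_waits))
print("Email waits:", email_waits, "total:", sum(email_waits))
```

Under this assumption the waits alone add up to 63 seconds for the UI context and 75 seconds for the email context, and the 300-second cap is not reached within five attempts. These totals exclude the time each attempt spends waiting on its own request timeout.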
Usage
Automatic Context Detection
The system automatically detects the appropriate context:
- Email Processing: Detected via the EMAIL_PROCESSING_MODE environment variable
- UI Requests: Default context for all other operations
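The detection rule above can be sketched as a small function. This is an illustration rather than the code in timeout_config.py: the document does not say which values of `EMAIL_PROCESSING_MODE` enable email mode, so the set of truthy strings below is an assumption, as is the optional `environ` parameter added to make the function easy to test.

```python
import os


def detect_request_context(environ=None) -> str:
    """Return "email" when EMAIL_PROCESSING_MODE looks enabled, otherwise "ui"."""
    env = os.environ if environ is None else environ
    # Assumption: these case-insensitive values switch email mode on.
    flag = env.get("EMAIL_PROCESSING_MODE", "").strip().lower()
    return "email" if flag in {"1", "true", "yes", "on"} else "ui"


print(detect_request_context({"EMAIL_PROCESSING_MODE": "true"}))
print(detect_request_context({}))
```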
Manual Context Selection
For advanced use cases, you can explicitly specify context:
from timeout_config import TimeoutConfig
from models.genai_client import genai_client
from models.gemini import call_gemini_async

# Get timeouts for specific context
ui_timeouts = TimeoutConfig.get_timeouts("ui")
email_timeouts = TimeoutConfig.get_timeouts("email")

# Use context-aware client
client = genai_client(context="email")  # Extended timeouts

# Use context-aware model calls
response = await call_gemini_async(
    contents=messages,
    gen_config=config,
    context="email",  # Extended retry strategy
)
Tool Orchestrator Integration
Tool orchestration automatically uses context-appropriate timeouts:
await create_context(
    new_question=question,
    contents=contents,
    trace=trace,
    first_responder_response=response,
    callback=callback,
    first_response_tools=tools,
    toolConfigs=tool_configs,
    currentUser=user,
    context="email",  # Extended tool execution timeouts
)
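The `context="email"` argument raises the heartbeat and hard timeout limits applied to tool execution. The document does not show how AsyncTaskRunner enforces them, so the following is only a minimal sketch of the general idea: a task is cancelled if it stays silent for longer than the heartbeat interval, or if its total runtime exceeds the hard timeout. `Heartbeat`, `run_with_limits`, and `steady_tool` are hypothetical names.

```python
import asyncio
import time


class Heartbeat:
    """Records the last time a task reported progress."""

    def __init__(self) -> None:
        self.last = time.monotonic()

    def beat(self) -> None:
        self.last = time.monotonic()


async def run_with_limits(work, heartbeat_s: float, hard_timeout_s: float,
                          poll_s: float = 0.01):
    """Run work(hb); fail if it goes silent for heartbeat_s or exceeds hard_timeout_s."""
    hb = Heartbeat()
    task = asyncio.ensure_future(work(hb))
    deadline = time.monotonic() + hard_timeout_s
    try:
        while not task.done():
            now = time.monotonic()
            if now >= deadline:
                raise TimeoutError("hard timeout exceeded")
            if now - hb.last > heartbeat_s:
                raise TimeoutError("heartbeat missed")
            # Wait until the task finishes or the next poll is due.
            await asyncio.wait({task}, timeout=poll_s)
        return task.result()
    finally:
        if not task.done():
            task.cancel()
            # Let the cancellation settle before returning to the caller.
            await asyncio.gather(task, return_exceptions=True)


async def steady_tool(hb: Heartbeat) -> str:
    # Hypothetical tool that reports progress after each short step.
    for _ in range(3):
        await asyncio.sleep(0.01)
        hb.beat()
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_limits(steady_tool, heartbeat_s=0.5, hard_timeout_s=2.0)))
```

With the email values from this document, the equivalent call would pass `heartbeat_s=300` and `hard_timeout_s=3000`.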
Implementation Components
Core Configuration (timeout_config.py)
- TimeoutConfig: Central configuration class
  - get_timeouts(context): Get timeout dictionary for context
  - get_retry_config(context): Get retry configuration for tenacity
  - get_http_timeouts(context): Get HTTP client timeout configuration
  - detect_request_context(): Auto-detect context from environment
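To make the shape of this interface concrete, here is a reduced sketch of a class exposing the same method names. It is not the contents of timeout_config.py: the class name `TimeoutConfigSketch`, the return shapes of `get_retry_config` and `get_http_timeouts`, and the fallback to UI values for unknown contexts are all assumptions, and only a subset of the keys is included.

```python
class TimeoutConfigSketch:
    """Illustrative stand-in; the real class lives in timeout_config.py."""

    # Subset of the values listed under Configuration Details.
    _TIMEOUTS = {
        "ui": {"http_read": 30.0, "http_write": 60.0,
               "retry_attempts": 7, "retry_min": 1, "retry_max": 32},
        "email": {"http_read": 300.0, "http_write": 300.0,
                  "retry_attempts": 5, "retry_min": 5, "retry_max": 300},
    }

    @classmethod
    def get_timeouts(cls, context: str = "ui") -> dict:
        # Assumption: unknown contexts fall back to the UI values.
        # A copy is returned so callers cannot mutate the shared table.
        return dict(cls._TIMEOUTS.get(context, cls._TIMEOUTS["ui"]))

    @classmethod
    def get_retry_config(cls, context: str = "ui") -> dict:
        t = cls.get_timeouts(context)
        return {"attempts": t["retry_attempts"],
                "wait_min": t["retry_min"],
                "wait_max": t["retry_max"]}

    @classmethod
    def get_http_timeouts(cls, context: str = "ui") -> dict:
        t = cls.get_timeouts(context)
        return {"read": t["http_read"], "write": t["http_write"]}


print(TimeoutConfigSketch.get_retry_config("email"))
print(TimeoutConfigSketch.get_http_timeouts("ui"))
```

A retry configuration shaped like this could be passed to tenacity's `stop_after_attempt` and `wait_exponential`, although the exact wiring used in models/gemini.py is not shown in this document.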
Updated Components
- models/genai_client.py
  - Context-aware HTTP transport timeouts
  - Context-aware GenAI client timeouts
  - Automatic logging of timeout configurations
- models/gemini.py
  - Dynamic retry decorator creation based on context
  - Context-specific exponential backoff strategies
  - Separate implementations for streaming and non-streaming calls
- tools/tool_orchestrator.py
  - Context-aware AsyncTaskRunner configuration
  - Extended heartbeat and hard timeout limits
- email_integration.py
  - Context-aware HTTP timeouts for all operations
  - Extended timeouts for Quarto exports and file downloads
  - Automatic email context application
Timeout Comparison
| Operation | UI Context | Email Context | Difference |
|---|---|---|---|
| Stream timeout | 2 minutes | 30 minutes | 15x longer |
| HTTP read | 30 seconds | 5 minutes | 10x longer |
| GenAI timeout | 60 seconds | 5 minutes | 5x longer |
| Tool heartbeat | 2 minutes | 5 minutes | 2.5x longer |
| Hard timeout | 10 minutes | 50 minutes | 5x longer |
| Quarto export | N/A | 10 minutes | New capability |
Benefits
Performance Optimization
- UI Requests: Maintain responsive user experience with fast timeouts
- Email/API Requests: Allow complex operations to complete successfully
- Resource Efficiency: Appropriate timeout lengths prevent resource waste
Reliability Improvements
- Reduced Failures: Extended timeouts prevent premature timeout failures
- Better Retry Strategies: Context-appropriate exponential backoff
- Cloud Run Compliance: The 50-minute tool hard timeout stays under the 60-minute Cloud Run request limit
Maintainability
- Centralized Configuration: All timeouts managed in one place
- Context Awareness: Automatic application of appropriate timeouts
- Backward Compatibility: Existing code works without changes
Monitoring and Debugging
Logging
The system provides detailed logging for timeout configuration:
INFO: EmailProcessor initialized with context 'email' and timeouts: {...}
INFO: Using HTTP timeouts for context 'email': {...}
INFO: Using GenAI timeout for context 'email': 300000ms
INFO: Using tool orchestrator timeouts for context 'email': heartbeat=300s, hard_timeout=3000s
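A log line in the format above could be produced with the standard `logging` module. This is a sketch only: the logger name and the `log_tool_timeouts` helper are hypothetical, and the real components may build their messages differently.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger("timeout_config")  # hypothetical logger name


def log_tool_timeouts(context: str, timeouts: dict) -> str:
    """Log the tool orchestrator limits for a context and return the message."""
    message = (
        f"Using tool orchestrator timeouts for context '{context}': "
        f"heartbeat={timeouts['tool_heartbeat']}s, "
        f"hard_timeout={timeouts['tool_hard_timeout']}s"
    )
    logger.info(message)
    return message


log_tool_timeouts("email", {"tool_heartbeat": 300, "tool_hard_timeout": 3000})
```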
Testing
Verify timeout configuration:
from timeout_config import TimeoutConfig
# Test configurations
ui_config = TimeoutConfig.get_timeouts("ui")
email_config = TimeoutConfig.get_timeouts("email")
# Verify expected values
assert ui_config["stream_timeout"] == 120
assert email_config["stream_timeout"] == 1800
Best Practices
When to Use Each Context
- UI Context: User-facing operations requiring immediate response
- Email Context: Background processing, complex document generation, large file operations
- API Context: External API integrations, batch processing (shares the extended Email/API configuration)
Configuration Guidelines
- Always use automatic context detection when possible
- Log timeout configurations for debugging purposes
- Monitor operation durations to optimize timeout values
- Test timeout scenarios in development and staging
Cloud Run Considerations
- Maximum limit: 60 minutes per request
- Recommended limit: 50 minutes to allow for cleanup
- Memory usage: Longer operations may require more memory
- Cost optimization: Balance timeout length with resource usage
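The limits above lend themselves to a simple guard that can run in a test or at startup. The sketch below checks that a hard timeout leaves the 10 minutes of cleanup headroom implied by the 60-minute and 50-minute figures. The constant names and the `check_cloud_run_budget` helper are hypothetical.

```python
CLOUD_RUN_MAX_SECONDS = 60 * 60     # 60-minute request limit cited above
CLEANUP_HEADROOM_SECONDS = 10 * 60  # gap between the 60- and 50-minute figures


def check_cloud_run_budget(timeouts: dict) -> int:
    """Return the seconds left for cleanup; raise if the hard timeout leaves too few."""
    headroom = CLOUD_RUN_MAX_SECONDS - timeouts["tool_hard_timeout"]
    if headroom < CLEANUP_HEADROOM_SECONDS:
        raise ValueError(
            f"tool_hard_timeout leaves only {headroom}s before the Cloud Run limit"
        )
    return headroom


# The email value from this document (3000 s) leaves exactly 600 s of headroom.
print(check_cloud_run_budget({"tool_hard_timeout": 3000}))
```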
Migration Guide
Existing Code
No changes required for existing code - the system is backward compatible.
New Features
When implementing new long-running operations:
- Determine appropriate context (UI vs Email/API)
- Use context-aware timeouts in HTTP calls
- Apply context parameter to model calls
- Test with realistic data sizes and network conditions
Troubleshooting
Common issues and solutions:
- Timeout too short: Check context detection and configuration
- Retry loops: Verify retry configuration for context
- Resource exhaustion: Monitor memory usage with extended timeouts
- Cloud Run limits: Ensure operations complete within 60-minute limit