Context-Aware Timeout Configuration

The backend selects its timeout values according to the type of request being processed. Interactive requests fail fast so the UI stays responsive, while background work gets enough time to finish.

Overview

The timeout system provides two distinct configurations:

UI Context: Fast timeouts optimized for responsive user interactions
Email/API Context: Extended timeouts for background processing that doesn’t require immediate response

Architecture

Two-Layer Timeout System

The system implements timeouts at two levels:

HTTP/Transport Layer (models/genai_client.py)
- Individual HTTP request timeouts
- GenAI client operation timeouts
- Network connection timeouts
Retry/Strategy Layer (models/gemini.py)
- Overall retry behavior and exponential backoff
- Maximum attempt limits
- Wait time between retries

Configuration Details

UI Context (Fast Response)

Optimized for interactive user experiences requiring immediate feedback:

UI_TIMEOUTS = {
    "stream_timeout": 120,        # 2 minutes
    "tool_heartbeat": 120,        # 2 minutes between heartbeats
    "tool_hard_timeout": 600,     # 10 minutes total
    "http_read": 30.0,           # 30 seconds
    "http_write": 60.0,          # 1 minute
    "genai_timeout": 60000,      # 60 seconds (milliseconds)
    "retry_attempts": 7,         # 7 total attempts
    "retry_min": 1,              # 1 second minimum wait
    "retry_max": 32,             # 32 seconds maximum wait
}

Email/API Context (Extended Processing)

Optimized for background operations that can take longer to complete:

EMAIL_API_TIMEOUTS = {
    "stream_timeout": 1800,       # 30 minutes
    "tool_heartbeat": 300,        # 5 minutes between heartbeats
    "tool_hard_timeout": 3000,    # 50 minutes total (under Cloud Run 60min limit)
    "http_read": 300.0,          # 5 minutes
    "http_write": 300.0,         # 5 minutes
    "genai_timeout": 300000,     # 5 minutes (milliseconds)
    "retry_attempts": 5,         # 5 total attempts (fewer but longer waits)
    "retry_min": 5,              # 5 seconds minimum wait
    "retry_max": 300,            # 5 minutes maximum wait
    "quarto_export": 600,        # 10 minutes for document generation
    "file_download": 300,        # 5 minutes for file downloads
    "attachment_download": 180,   # 3 minutes for email attachments
    "mailgun_api": 120,          # 2 minutes for email sending
}
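
The two layers interact: each attempt can run up to the GenAI timeout, and the retry layer adds waits between attempts. The following is a rough sketch of that worst-case budget, not the actual retry code. It assumes the wait doubles from retry_min and is capped at retry_max; the decorators in models/gemini.py may use a different schedule (tenacity's wait_exponential, for example, clamps values differently). The helper names backoff_waits and worst_case_seconds are hypothetical.

```python
# Subsets of the dictionaries above; only the keys used here are copied.
UI = {"genai_timeout": 60000, "retry_attempts": 7,
      "retry_min": 1, "retry_max": 32, "tool_hard_timeout": 600}
EMAIL = {"genai_timeout": 300000, "retry_attempts": 5,
         "retry_min": 5, "retry_max": 300, "tool_hard_timeout": 3000}


def backoff_waits(attempts, wait_min, wait_max):
    """Waits in seconds between attempts.

    Assumption: the wait doubles from wait_min and is capped at wait_max.
    """
    return [min(wait_min * 2 ** i, wait_max) for i in range(attempts - 1)]


def worst_case_seconds(cfg):
    """Upper bound if every attempt runs to the full GenAI timeout."""
    per_attempt = cfg["genai_timeout"] / 1000  # milliseconds -> seconds
    waits = backoff_waits(cfg["retry_attempts"], cfg["retry_min"], cfg["retry_max"])
    return cfg["retry_attempts"] * per_attempt + sum(waits)


for name, cfg in (("ui", UI), ("email", EMAIL)):
    print(name, worst_case_seconds(cfg), "s of", cfg["tool_hard_timeout"], "s hard timeout")
```

Under these assumptions, both worst cases (483 seconds for UI, 1575 seconds for email) stay below the corresponding tool_hard_timeout.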

Usage

Automatic Context Detection

The system automatically detects the appropriate context:

Email Processing: Detected via EMAIL_PROCESSING_MODE environment variable
UI Requests: Default context for all other operations
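
The detection rule can be pictured as a single environment lookup. This is a minimal sketch, not the code in timeout_config.py: the set of values treated as "on" and the optional environ parameter (added here only to make the function easy to test) are assumptions.

```python
import os


def detect_request_context(environ=None):
    """Return "email" when EMAIL_PROCESSING_MODE is switched on, else "ui"."""
    env = os.environ if environ is None else environ
    flag = env.get("EMAIL_PROCESSING_MODE", "")
    # Assumption: these strings count as "on"; the real check may differ.
    if flag.strip().lower() in {"1", "true", "yes", "on"}:
        return "email"
    return "ui"  # default context for all other operations
```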

Manual Context Selection

For advanced use cases, you can explicitly specify context:

from timeout_config import TimeoutConfig
from models.genai_client import genai_client
from models.gemini import call_gemini_async

# Get timeouts for a specific context
ui_timeouts = TimeoutConfig.get_timeouts("ui")
email_timeouts = TimeoutConfig.get_timeouts("email")

# Use a context-aware client
client = genai_client(context="email")  # Extended timeouts

# Use context-aware model calls; `await` must run inside a coroutine
async def ask_with_email_timeouts(messages, config):
    return await call_gemini_async(
        contents=messages,
        gen_config=config,
        context="email",  # Extended retry strategy
    )

Tool Orchestrator Integration

Tool orchestration automatically uses context-appropriate timeouts:

await create_context(
    new_question=question,
    contents=contents,
    trace=trace,
    first_responder_response=response,
    callback=callback,
    first_response_tools=tools,
    toolConfigs=tool_configs,
    currentUser=user,
    context="email"  # Extended tool execution timeouts
)

Implementation Components

Core Configuration (`timeout_config.py`)

TimeoutConfig: Central configuration class
get_timeouts(context): Get timeout dictionary for context
get_retry_config(context): Get retry configuration for tenacity
get_http_timeouts(context): Get HTTP client timeout configuration
detect_request_context(): Auto-detect context from environment
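
To make the shape of these accessors concrete, here is an illustrative stand-in for TimeoutConfig, reduced to a few keys. It is a sketch under assumptions: the returned dictionary layouts, the "api" alias, and raising ValueError for an unknown context are guesses rather than the behavior of timeout_config.py.

```python
class TimeoutConfig:
    """Illustrative stand-in for the class in timeout_config.py."""

    _TIMEOUTS = {
        "ui": {"http_read": 30.0, "http_write": 60.0,
               "retry_attempts": 7, "retry_min": 1, "retry_max": 32},
        "email": {"http_read": 300.0, "http_write": 300.0,
                  "retry_attempts": 5, "retry_min": 5, "retry_max": 300},
    }
    # Assumption: "api" shares the email values, as the
    # "Email/API Context" heading suggests.
    _ALIASES = {"api": "email"}

    @classmethod
    def get_timeouts(cls, context="ui"):
        key = cls._ALIASES.get(context, context)
        if key not in cls._TIMEOUTS:
            # Assumption: the real class may fall back to "ui" instead.
            raise ValueError(f"unknown timeout context: {context!r}")
        return dict(cls._TIMEOUTS[key])  # copy so callers cannot mutate shared state

    @classmethod
    def get_retry_config(cls, context="ui"):
        t = cls.get_timeouts(context)
        return {"attempts": t["retry_attempts"],
                "wait_min": t["retry_min"], "wait_max": t["retry_max"]}

    @classmethod
    def get_http_timeouts(cls, context="ui"):
        t = cls.get_timeouts(context)
        return {"read": t["http_read"], "write": t["http_write"]}
```

Returning a copy from get_timeouts is a deliberate choice in this sketch: a caller that tweaks one value cannot silently change the timeouts seen by the rest of the process.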

Updated Components

models/genai_client.py
- Context-aware HTTP transport timeouts
- Context-aware GenAI client timeouts
- Automatic logging of timeout configurations
models/gemini.py
- Dynamic retry decorator creation based on context
- Context-specific exponential backoff strategies
- Separate implementations for streaming and non-streaming calls
tools/tool_orchestrator.py
- Context-aware AsyncTaskRunner configuration
- Extended heartbeat and hard timeout limits
email_integration.py
- Context-aware HTTP timeouts for all operations
- Extended timeouts for Quarto exports and file downloads
- Automatic email context application

Timeout Comparison

Operation        UI Context    Email Context    Improvement
Stream timeout   2 minutes     30 minutes       15x longer
HTTP read        30 seconds    5 minutes        10x longer
GenAI timeout    60 seconds    5 minutes        5x longer
Tool heartbeat   2 minutes     5 minutes        2.5x longer
Hard timeout     10 minutes    50 minutes       5x longer
Quarto export    N/A           10 minutes       New capability
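
The multipliers in this comparison follow directly from the configured values. As a quick check, the snippet below recomputes them from copies of the relevant keys; the variable names are illustrative.

```python
# Copies of the shared keys from the two configurations.
UI = {"stream_timeout": 120, "http_read": 30.0, "genai_timeout": 60000,
      "tool_heartbeat": 120, "tool_hard_timeout": 600}
EMAIL = {"stream_timeout": 1800, "http_read": 300.0, "genai_timeout": 300000,
         "tool_heartbeat": 300, "tool_hard_timeout": 3000}

# Units cancel out, so milliseconds and seconds can be compared key by key.
ratios = {key: EMAIL[key] / UI[key] for key in UI}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:g}x longer")
```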

Benefits

Performance Optimization

UI Requests: Maintain responsive user experience with fast timeouts
Email/API Requests: Allow complex operations to complete successfully
Resource Efficiency: Timeouts sized to each context keep stalled UI requests from holding resources longer than necessary

Reliability Improvements

Reduced Failures: Extended timeouts keep long-running email and API operations from being cut off before they finish
Better Retry Strategies: Context-appropriate exponential backoff
Cloud Run Compliance: The 50-minute hard timeout stays under Cloud Run's 60-minute request limit

Maintainability

Centralized Configuration: All timeouts managed in one place
Context Awareness: Automatic application of appropriate timeouts
Backward Compatibility: Existing code works without changes

Monitoring and Debugging

Logging

The system provides detailed logging for timeout configuration:

INFO: EmailProcessor initialized with context 'email' and timeouts: {...}
INFO: Using HTTP timeouts for context 'email': {...}
INFO: Using GenAI timeout for context 'email': 300000ms
INFO: Using tool orchestrator timeouts for context 'email': heartbeat=300s, hard_timeout=3000s
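
A line in this format could be produced with the standard logging module, as sketched below. The logger name and the helper log_orchestrator_timeouts are hypothetical; only the message text is taken from the sample output above.

```python
import logging

# Matches the "INFO: message" layout of the sample output.
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger("timeout_config")  # hypothetical logger name


def log_orchestrator_timeouts(context, timeouts):
    """Log the tool orchestrator timeouts and return the message for inspection."""
    message = (
        f"Using tool orchestrator timeouts for context '{context}': "
        f"heartbeat={timeouts['tool_heartbeat']}s, "
        f"hard_timeout={timeouts['tool_hard_timeout']}s"
    )
    logger.info(message)
    return message


log_orchestrator_timeouts("email", {"tool_heartbeat": 300, "tool_hard_timeout": 3000})
```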

Testing

Verify timeout configuration:

from timeout_config import TimeoutConfig

# Test configurations
ui_config = TimeoutConfig.get_timeouts("ui")
email_config = TimeoutConfig.get_timeouts("email")

# Verify expected values
assert ui_config["stream_timeout"] == 120
assert email_config["stream_timeout"] == 1800

Best Practices

When to Use Each Context

UI Context: User-facing operations requiring immediate response
Email Context: Background processing, complex document generation, large file operations
API Context: External API integrations and batch processing; uses the same extended timeouts as the Email context

Configuration Guidelines

Always use automatic context detection when possible
Log timeout configurations for debugging purposes
Monitor operation durations to optimize timeout values
Test timeout scenarios in development and staging

Cloud Run Considerations

Maximum limit: 60 minutes per request
Recommended limit: 50 minutes to allow for cleanup
Memory usage: Longer operations may require more memory
Cost optimization: Balance timeout length with resource usage
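
The relationship between the hard timeout and the Cloud Run limit can be enforced with a small guard, for example in a startup check or a unit test. This is a sketch: the function name fits_cloud_run and the 10-minute cleanup margin (the gap between the 50-minute recommendation and the 60-minute limit) are assumptions.

```python
CLOUD_RUN_MAX_SECONDS = 60 * 60   # 60-minute request limit stated above
CLEANUP_MARGIN_SECONDS = 10 * 60  # assumed margin: 60 minutes minus the 50-minute recommendation


def fits_cloud_run(hard_timeout_seconds, margin=CLEANUP_MARGIN_SECONDS):
    """True if the hard timeout plus the cleanup margin fits in one request."""
    return hard_timeout_seconds + margin <= CLOUD_RUN_MAX_SECONDS


print(fits_cloud_run(3000))  # email/API tool_hard_timeout
print(fits_cloud_run(600))   # UI tool_hard_timeout
```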

Migration Guide

Existing Code

Existing code requires no changes; the system is backward compatible.

New Features

When implementing new long-running operations:

Determine appropriate context (UI vs Email/API)
Use context-aware timeouts in HTTP calls
Apply context parameter to model calls
Test with realistic data sizes and network conditions
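
As one way to apply these steps, a new long-running operation can be wrapped in a context-dependent timeout with asyncio.wait_for. The sketch below is illustrative only: run_with_context_timeout and fake_download are hypothetical names, and only the http_read values are copied from the configuration.

```python
import asyncio

# Assumption: only the http_read values are copied from the configuration.
HTTP_READ_SECONDS = {"ui": 30.0, "email": 300.0}


async def run_with_context_timeout(coro, context="ui", timeouts=HTTP_READ_SECONDS):
    """Await coro; raises asyncio.TimeoutError past the context's read timeout."""
    return await asyncio.wait_for(coro, timeout=timeouts[context])


async def fake_download(delay):
    """Hypothetical stand-in for a real HTTP call."""
    await asyncio.sleep(delay)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_context_timeout(fake_download(0.01), context="email")))
```

Passing the timeout table as a parameter makes it easy to inject very short values in tests, which covers the timeout path without waiting for the real limits.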

Troubleshooting

Common issues and solutions:

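Timeout too short: Confirm that the expected context was detected (for email, that EMAIL_PROCESSING_MODE is set) and compare the logged timeout values with the configuration
Retry loops: Verify the retry attempts and wait times configured for the context
Resource exhaustion: Monitor memory usage with extended timeouts
Cloud Run limits: Ensure operations complete within the 60-minute request limit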