Context-Aware Timeout Configuration

The backend adjusts its timeouts according to the type of request being processed. Interactive requests fail fast so the UI stays responsive, while background work is given enough time to finish.

Overview

The timeout system provides two distinct configurations:

  • UI Context: Fast timeouts optimized for responsive user interactions
  • Email/API Context: Extended timeouts for background processing that doesn’t require immediate response

Architecture

Two-Layer Timeout System

The system implements timeouts at two levels:

  1. HTTP/Transport Layer (models/genai_client.py)
    • Individual HTTP request timeouts
    • GenAI client operation timeouts
    • Network connection timeouts
  2. Retry/Strategy Layer (models/gemini.py)
    • Overall retry behavior and exponential backoff
    • Maximum attempt limits
    • Wait time between retries
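
How the two layers might nest can be sketched as follows. This is a minimal illustration using only asyncio, not the code in models/genai_client.py or models/gemini.py. The name call_with_retries, its parameters, and the doubling backoff policy are all assumptions made for the example.

```python
import asyncio


async def call_with_retries(operation, per_attempt_timeout, attempts, wait_min, wait_max):
    """Hypothetical helper: a retry loop wrapped around a per-attempt timeout."""
    for attempt in range(1, attempts + 1):
        try:
            # Transport layer: bound each individual attempt.
            return await asyncio.wait_for(operation(), timeout=per_attempt_timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == attempts:
                raise  # Attempt limit reached; surface the last error.
            # Strategy layer: exponential backoff capped at wait_max (assumed policy).
            delay = min(wait_max, wait_min * 2 ** (attempt - 1))
            await asyncio.sleep(delay)
```

The inner timeout decides how long a single attempt may run; the outer loop decides how many attempts are made and how long to wait between them.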

Configuration Details

UI Context (Fast Response)

Optimized for interactive user experiences requiring immediate feedback:

UI_TIMEOUTS = {
    # All values are in seconds unless noted otherwise.
    "stream_timeout": 120,        # 2 minutes
    "tool_heartbeat": 120,        # 2 minutes between heartbeats
    "tool_hard_timeout": 600,     # 10 minutes total
    "http_read": 30.0,            # 30 seconds
    "http_write": 60.0,           # 1 minute
    "genai_timeout": 60000,       # 60 seconds, expressed in milliseconds
    "retry_attempts": 7,          # 7 total attempts
    "retry_min": 1,               # 1 second minimum wait
    "retry_max": 32,              # 32 seconds maximum wait
}

Email/API Context (Extended Processing)

Optimized for background operations that can take longer to complete:

EMAIL_API_TIMEOUTS = {
    # All values are in seconds unless noted otherwise.
    "stream_timeout": 1800,       # 30 minutes
    "tool_heartbeat": 300,        # 5 minutes between heartbeats
    "tool_hard_timeout": 3000,    # 50 minutes total (under Cloud Run 60min limit)
    "http_read": 300.0,           # 5 minutes
    "http_write": 300.0,          # 5 minutes
    "genai_timeout": 300000,      # 5 minutes, expressed in milliseconds
    "retry_attempts": 5,          # 5 total attempts (fewer but longer waits)
    "retry_min": 5,               # 5 seconds minimum wait
    "retry_max": 300,             # 5 minutes maximum wait
    "quarto_export": 600,         # 10 minutes for document generation
    "file_download": 300,         # 5 minutes for file downloads
    "attachment_download": 180,   # 3 minutes for email attachments
    "mailgun_api": 120,           # 2 minutes for email sending
}
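
To see what the retry values imply in practice, the arithmetic below estimates the waits between attempts and a rough upper bound on total time. It assumes a plain doubling backoff starting at retry_min and capped at retry_max, and that every attempt runs for the full genai_timeout. The real schedule depends on how the retry decorator is configured (multiplier, jitter), so treat these numbers as an estimate. The helper names are hypothetical.

```python
def backoff_waits(attempts, wait_min, wait_max):
    """Waits between attempts, assuming doubling from wait_min capped at wait_max."""
    return [min(wait_max, wait_min * 2 ** i) for i in range(attempts - 1)]


def worst_case_seconds(cfg):
    """Upper bound if every attempt uses its full genai_timeout (given in ms)."""
    waits = backoff_waits(cfg["retry_attempts"], cfg["retry_min"], cfg["retry_max"])
    return cfg["retry_attempts"] * cfg["genai_timeout"] / 1000 + sum(waits)


# Only the retry-related keys from the two configurations above.
ui = {"genai_timeout": 60000, "retry_attempts": 7, "retry_min": 1, "retry_max": 32}
email = {"genai_timeout": 300000, "retry_attempts": 5, "retry_min": 5, "retry_max": 300}

print(backoff_waits(7, 1, 32))    # [1, 2, 4, 8, 16, 32]
print(backoff_waits(5, 5, 300))   # [5, 10, 20, 40]
print(worst_case_seconds(ui))     # 483.0
print(worst_case_seconds(email))  # 1575.0
```

Under these assumptions both bounds sit below the corresponding tool_hard_timeout (600 and 3000 seconds).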

Usage

Automatic Context Detection

The system automatically detects the appropriate context:

  • Email Processing: Detected via EMAIL_PROCESSING_MODE environment variable
  • UI Requests: Default context for all other operations
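
The detection rule above could look roughly like this. It is a sketch, not the body of detect_request_context() in timeout_config.py: the set of values treated as "enabled", the returned strings "email" and "ui", and the optional environ parameter (added here to make the function easy to test) are all assumptions.

```python
import os


def detect_request_context(environ=None):
    """Return "email" when EMAIL_PROCESSING_MODE is enabled, otherwise "ui"."""
    env = os.environ if environ is None else environ
    flag = env.get("EMAIL_PROCESSING_MODE", "").strip().lower()
    # Which values count as enabled is an assumption of this sketch.
    return "email" if flag in {"1", "true", "yes", "on"} else "ui"


print(detect_request_context({"EMAIL_PROCESSING_MODE": "true"}))  # email
print(detect_request_context({}))                                  # ui
```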

Manual Context Selection

For advanced use cases, you can explicitly specify context:

from timeout_config import TimeoutConfig
from models.genai_client import genai_client
from models.gemini import call_gemini_async

# Get timeouts for a specific context
ui_timeouts = TimeoutConfig.get_timeouts("ui")
email_timeouts = TimeoutConfig.get_timeouts("email")

# Use a context-aware client
client = genai_client(context="email")  # Extended timeouts

# Use context-aware model calls. `await` is only valid inside a coroutine,
# so the call is wrapped here; generate_for_email is an illustrative name.
async def generate_for_email(messages, config):
    return await call_gemini_async(
        contents=messages,
        gen_config=config,
        context="email",  # Extended retry strategy
    )

Tool Orchestrator Integration

Tool orchestration automatically uses context-appropriate timeouts:

await create_context(
    new_question=question,
    contents=contents,
    trace=trace,
    first_responder_response=response,
    callback=callback,
    first_response_tools=tools,
    toolConfigs=tool_configs,
    currentUser=user,
    context="email"  # Extended tool execution timeouts
)

Implementation Components

Core Configuration (timeout_config.py)

  • TimeoutConfig: Central configuration class
  • get_timeouts(context): Get timeout dictionary for context
  • get_retry_config(context): Get retry configuration for tenacity
  • get_http_timeouts(context): Get HTTP client timeout configuration
  • detect_request_context(): Auto-detect context from environment
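
For orientation, a trimmed-down class with the same method names might be organized as below. Only the method names and the timeout values come from this document. The internal _TIMEOUTS table, the shape of the returned dictionaries, and the fallback to the UI values for unknown contexts are assumptions; the actual timeout_config.py may differ.

```python
class TimeoutConfig:
    """Illustrative sketch of a central timeout registry (subset of keys only)."""

    _TIMEOUTS = {
        "ui": {"http_read": 30.0, "http_write": 60.0,
               "retry_attempts": 7, "retry_min": 1, "retry_max": 32},
        "email": {"http_read": 300.0, "http_write": 300.0,
                  "retry_attempts": 5, "retry_min": 5, "retry_max": 300},
    }

    @classmethod
    def get_timeouts(cls, context="ui"):
        # Assumption: unknown contexts fall back to the UI values.
        return dict(cls._TIMEOUTS.get(context, cls._TIMEOUTS["ui"]))

    @classmethod
    def get_retry_config(cls, context="ui"):
        t = cls.get_timeouts(context)
        # Assumed return shape: plain values a retry decorator could consume.
        return {"attempts": t["retry_attempts"],
                "wait_min": t["retry_min"],
                "wait_max": t["retry_max"]}

    @classmethod
    def get_http_timeouts(cls, context="ui"):
        t = cls.get_timeouts(context)
        return {"read": t["http_read"], "write": t["http_write"]}


print(TimeoutConfig.get_retry_config("email"))
# {'attempts': 5, 'wait_min': 5, 'wait_max': 300}
```

Returning a copy from get_timeouts keeps callers from mutating the shared table by accident.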

Updated Components

  1. models/genai_client.py
    • Context-aware HTTP transport timeouts
    • Context-aware GenAI client timeouts
    • Automatic logging of timeout configurations
  2. models/gemini.py
    • Dynamic retry decorator creation based on context
    • Context-specific exponential backoff strategies
    • Separate implementations for streaming and non-streaming calls
  3. tools/tool_orchestrator.py
    • Context-aware AsyncTaskRunner configuration
    • Extended heartbeat and hard timeout limits
  4. email_integration.py
    • Context-aware HTTP timeouts for all operations
    • Extended timeouts for Quarto exports and file downloads
    • Automatic email context application

Timeout Comparison

Operation      | UI Context | Email Context | Improvement
---------------|------------|---------------|---------------
Stream timeout | 2 minutes  | 30 minutes    | 15x longer
HTTP read      | 30 seconds | 5 minutes     | 10x longer
GenAI timeout  | 60 seconds | 5 minutes     | 5x longer
Tool heartbeat | 2 minutes  | 5 minutes     | 2.5x longer
Hard timeout   | 10 minutes | 50 minutes    | 5x longer
Quarto export  | N/A        | 10 minutes    | New capability

Benefits

Performance Optimization

  • UI Requests: Maintain responsive user experience with fast timeouts
  • Email/API Requests: Allow complex operations to complete successfully
  • Resource Efficiency: Timeouts sized to each workload avoid holding resources for requests that have stalled

Reliability Improvements

  • Reduced Failures: Extended timeouts keep long-running email and API operations from being cut off before they finish
  • Better Retry Strategies: Context-appropriate exponential backoff
  • Cloud Run Compliance: The 50-minute hard timeout stays under Cloud Run's 60-minute request limit

Maintainability

  • Centralized Configuration: All timeouts managed in one place
  • Context Awareness: Automatic application of appropriate timeouts
  • Backward Compatibility: Existing code works without changes

Monitoring and Debugging

Logging

The system provides detailed logging for timeout configuration:

INFO: EmailProcessor initialized with context 'email' and timeouts: {...}
INFO: Using HTTP timeouts for context 'email': {...}
INFO: Using GenAI timeout for context 'email': 300000ms
INFO: Using tool orchestrator timeouts for context 'email': heartbeat=300s, hard_timeout=3000s

Testing

Verify timeout configuration:

from timeout_config import TimeoutConfig

# Test configurations
ui_config = TimeoutConfig.get_timeouts("ui")
email_config = TimeoutConfig.get_timeouts("email")

# Verify expected values
assert ui_config["stream_timeout"] == 120
assert email_config["stream_timeout"] == 1800
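
Beyond checking individual values, a test could also check that a configuration is internally consistent. The validator below is a hypothetical addition, not part of timeout_config.py, and the three rules it enforces are assumptions about what a sensible configuration looks like.

```python
def validate_timeouts(cfg):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if cfg["tool_heartbeat"] > cfg["tool_hard_timeout"]:
        problems.append("tool_heartbeat exceeds tool_hard_timeout")
    if cfg["retry_min"] > cfg["retry_max"]:
        problems.append("retry_min exceeds retry_max")
    if cfg["retry_attempts"] < 1:
        problems.append("retry_attempts must be at least 1")
    return problems


# Relevant keys copied from the UI configuration.
ui = {"tool_heartbeat": 120, "tool_hard_timeout": 600,
      "retry_attempts": 7, "retry_min": 1, "retry_max": 32}
assert validate_timeouts(ui) == []
```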

Best Practices

When to Use Each Context

  • UI Context: User-facing operations requiring immediate response
  • Email Context: Background processing, complex document generation, large file operations
  • API Context: External API integrations, batch processing (uses the same extended timeouts as the Email context)

Configuration Guidelines

  • Prefer automatic context detection; pass a context explicitly only when detection cannot identify the request type
  • Log timeout configurations for debugging purposes
  • Monitor operation durations to optimize timeout values
  • Test timeout scenarios in development and staging
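
One lightweight way to monitor operation durations against their budgets is a timing context manager. This is a sketch using only the standard library: track_duration, the logger name, and the 80% warning threshold are assumptions, not existing project code.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("timeout_monitor")  # hypothetical logger name


@contextmanager
def track_duration(name, budget_seconds, log=logger, warn_ratio=0.8):
    """Log how much of its timeout budget an operation actually used."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        ratio = elapsed / budget_seconds
        # Warn when an operation runs close to its timeout (threshold is assumed).
        level = logging.WARNING if ratio >= warn_ratio else logging.INFO
        log.log(level, "%s took %.2fs (%.0f%% of %ss budget)",
                name, elapsed, ratio * 100, budget_seconds)
```

Wrapping an operation as `with track_duration("quarto_export", 600): ...` would then record how close each export comes to its 10-minute limit, which gives data for tuning the values.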

Cloud Run Considerations

  • Maximum limit: 60 minutes per request
  • Recommended limit: 50 minutes to allow for cleanup
  • Memory usage: Longer operations may require more memory
  • Cost optimization: Balance timeout length with resource usage
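
The cleanup margin can be checked with a few lines of arithmetic. The helper name cleanup_margin and the constant name are hypothetical; the 60-minute limit and 3000-second hard timeout are the values stated in this document.

```python
CLOUD_RUN_MAX_SECONDS = 60 * 60  # 60-minute request limit stated above


def cleanup_margin(hard_timeout_seconds, platform_limit=CLOUD_RUN_MAX_SECONDS):
    """Seconds left for cleanup after the hard timeout fires."""
    margin = platform_limit - hard_timeout_seconds
    if margin <= 0:
        raise ValueError("hard timeout must be below the platform request limit")
    return margin


print(cleanup_margin(3000))  # 600, i.e. 10 minutes of headroom
```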

Migration Guide

Existing Code

No changes are required for existing code. Calls that do not specify a context fall back to the default UI context, so the system is backward compatible.

New Features

When implementing new long-running operations:

  1. Determine appropriate context (UI vs Email/API)
  2. Use context-aware timeouts in HTTP calls
  3. Apply context parameter to model calls
  4. Test with realistic data sizes and network conditions

Troubleshooting

Common issues and solutions:

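  • Timeout too short: Confirm the request was detected as the intended context (for email processing, that EMAIL_PROCESSING_MODE is set) and compare the logged timeout values with the expected configuration
  • Retry loops: Check retry_attempts, retry_min, and retry_max for the active context
  • Resource exhaustion: Monitor memory usage for operations running under extended timeouts
  • Cloud Run limits: Keep tool_hard_timeout below the 60-minute request limit so operations finish before Cloud Run terminates the request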