API Timeout Parameters

This document describes the timeout parameters available in the backend API and how they interact with the context-aware timeout configuration system.

Overview

The backend API supports both explicit timeout parameters and automatic context-aware timeout configuration. Callers can override timeouts per request, while interactive UI requests and long-running background requests (email, API) each get defaults suited to their expected duration.

Assistant Stream API (`/vac/assistant/<assistantId>`)

Request Parameters

The assistant stream endpoint accepts the following timeout-related parameters:

{
  "user_input": "Your question here",
  "stream_timeout": 120,  // Optional: Override default timeout (seconds)
  "stream_wait_time": 1,  // Optional: Wait time between stream chunks (seconds)
  "chat_history": [],
  "documents": [],
  "save_to_history": false,
  "read_from_history": false
}

Timeout Parameter Behavior

`stream_timeout` Parameter

Type: Integer (seconds)
Default: Context-dependent
- UI Context: 120 seconds (2 minutes)
- Email Context: 1800 seconds (30 minutes)
- API Context: 1800 seconds (30 minutes)
Range: 1 to 3600 seconds (1 hour maximum)
Usage: Controls overall processing timeout

# Example: Explicit timeout override
data = {
    "user_input": "Complex analysis task",
    "stream_timeout": 900,  # 15 minutes for this specific request
}

`stream_wait_time` Parameter

Type: Integer (seconds)
Default: 1 second
Range: 1 to 30 seconds
Usage: Controls delay between streaming response chunks
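
The ranges above can be checked on the client before a request is sent. The following is a minimal sketch, not the server's own validation: `validate_timeout_params` is a hypothetical helper name, and whether the backend rejects or clamps out-of-range values is not stated in this document.

```python
from typing import Any, Dict

# Ranges copied from the parameter descriptions above (seconds).
STREAM_TIMEOUT_RANGE = (1, 3600)
STREAM_WAIT_TIME_RANGE = (1, 30)


def validate_timeout_params(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload after checking the optional timeout fields.

    Hypothetical client-side helper; the server's behavior may differ.
    """
    checked = dict(payload)
    for name, (low, high) in (
        ("stream_timeout", STREAM_TIMEOUT_RANGE),
        ("stream_wait_time", STREAM_WAIT_TIME_RANGE),
    ):
        if name not in checked:
            continue  # Omitted fields fall back to the server-side defaults.
        value = checked[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer number of seconds")
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high} seconds")
    return checked
```

Raising an error instead of silently clamping keeps a mistyped value (for example milliseconds instead of seconds) from being sent unnoticed.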

Context-Aware Defaults

Automatic Context Detection

The API automatically applies appropriate timeouts based on request context:

// Email requests automatically get extended timeouts
{
    "X-Message-Source": "email"  // Header indicates email context
}

Default Timeout Values

| Context | Stream Timeout | Tool Timeout | HTTP Timeout | Retry Strategy |
|---|---|---|---|---|
| UI | 120s (2 min) | 600s (10 min) | 30s | Fast recovery (7 attempts, 1-32s waits) |
| Email | 1800s (30 min) | 3000s (50 min) | 300s | Patient retry (5 attempts, 5-300s waits) |
| API | 1800s (30 min) | 3000s (50 min) | 300s | Patient retry (5 attempts, 5-300s waits) |
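
The precedence between explicit values and context defaults can be sketched as a small lookup. This is an illustration only: `DEFAULT_STREAM_TIMEOUTS` and `resolve_stream_timeout` are hypothetical names, and the real configuration lives in the backend's timeout configuration module.

```python
from typing import Optional

# Stream timeouts in seconds, copied from the table above.
DEFAULT_STREAM_TIMEOUTS = {"ui": 120, "email": 1800, "api": 1800}


def resolve_stream_timeout(context: str, explicit: Optional[int] = None) -> int:
    """Pick the stream timeout: an explicit request value wins over the default.

    Hypothetical helper; raises KeyError for a context not in the table.
    """
    if explicit is not None:
        return explicit
    return DEFAULT_STREAM_TIMEOUTS[context]
```

For instance, `resolve_stream_timeout("ui")` yields the 120 second UI default, while `resolve_stream_timeout("ui", 300)` keeps the caller's 300 seconds.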

Process Assistant Request Function

Function Signature

from typing import Any, Dict, List, Optional

async def process_assistant_request(
    assistant_id: str,
    user_input: str,
    current_user: Dict[str, Any],
    chat_history: Optional[List[Dict]] = None,
    documents: Optional[List[Dict]] = None,
    selected_items: Optional[List] = None,
    save_to_history: bool = False,
    read_from_history: bool = False,
    history_limit: int = 50,
    stream_only: bool = False,
    stream_wait_time: int = 1,
    stream_timeout: int = 120,  # Context-aware default
    vac_name: str = "aitana3",
    trace_id: Optional[str] = None,
    config_overrides: Optional[Dict[str, Any]] = None,
    emissary_config_overrides: Optional[Dict[str, Any]] = None,
    callback=None,
) -> Dict[str, Any]:
    ...  # Body omitted; only the signature is documented here

Timeout Parameter Details

`stream_timeout`

Purpose: Maximum time for the entire assistant processing operation
Context Behavior:
- Automatically set to 120s for UI requests
- Automatically set to 1800s for email and API requests
- Can be explicitly overridden in request data
Cloud Run Limit: Maximum 3600s (60 minutes)

`stream_wait_time`

Purpose: Delay between streaming response chunks
Default: 1 second
Use Cases:
- Slower connections: Increase to 2-3 seconds
- Real-time requirements: Keep at 1 second
- Debugging: Increase to observe streaming behavior

Tool Orchestrator Timeouts

AsyncTaskRunner Configuration

Tools executed via the orchestrator use context-aware timeouts:

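from tenacity import retry_if_exception_type, stop_after_attempt, wait_random_exponential

# AsyncTaskRunner and the `timeouts` dict come from the surrounding application code
runner = AsyncTaskRunner(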
    retry_enabled=True,
    timeout=timeouts["tool_heartbeat"],     # Context-aware heartbeat
    heartbeat_extends_timeout=True,
    hard_timeout=timeouts["tool_hard_timeout"],  # Context-aware hard limit
    retry_kwargs={
        'wait': wait_random_exponential(multiplier=1, max=20),
        'stop': stop_after_attempt(2),
        'retry': retry_if_exception_type(Exception),
    }
)
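
The interaction between `timeout`, `heartbeat_extends_timeout` and `hard_timeout` can be illustrated with plain `asyncio`. The sketch below is one plausible reading of those option names, not the actual `AsyncTaskRunner` implementation: each heartbeat resets a short inactivity deadline, while the hard timeout caps total run time regardless of heartbeats. `run_with_heartbeat` and `beat` are hypothetical names.

```python
import asyncio
import time
from typing import Any, Callable, Coroutine


async def run_with_heartbeat(
    task_fn: Callable[[Callable[[], None]], Coroutine[Any, Any, Any]],
    heartbeat_timeout: float,
    hard_timeout: float,
) -> Any:
    """Run task_fn(beat); time out if it goes quiet or exceeds the hard limit."""
    start = time.monotonic()
    last_beat = start

    def beat() -> None:
        # Called by the task to signal progress and extend the soft deadline.
        nonlocal last_beat
        last_beat = time.monotonic()

    task = asyncio.create_task(task_fn(beat))
    while True:
        # The earlier of the two deadlines decides how long we may still wait.
        deadline = min(last_beat + heartbeat_timeout, start + hard_timeout)
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass
            raise asyncio.TimeoutError("task exceeded heartbeat or hard timeout")
        done, _pending = await asyncio.wait({task}, timeout=remaining)
        if done:
            return task.result()
```

Under this reading, a tool that keeps reporting progress can run well past the heartbeat value, but never past the hard limit.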

Tool-Specific Timeouts

Individual tools may have additional timeout configurations:

- File Browser: Uses HTTP timeouts for cloud storage operations
- Google Search: Uses HTTP timeouts for search API calls
- Document Search: Uses HTTP timeouts for search service calls
- Code Execution: Uses execution timeouts for safety

Model Client Timeouts

GenAI Client Configuration

The GenAI client uses context-aware HTTP timeouts:

import httpx
from google.genai import types

# `timeouts` is the context-aware timeout dict from the application code
custom_transport = httpx.AsyncClient(
    timeout=httpx.Timeout(
        connect=timeouts["http_connect"],    # Context-aware
        read=timeouts["http_read"],          # Context-aware
        write=timeouts["http_write"],        # Context-aware
        pool=timeouts["http_pool"]           # Context-aware
    )
)

http_options = types.HttpOptions(
    timeout=timeouts["genai_timeout"]        # Context-aware, in milliseconds
)

Model Call Timeouts

Model calls support context parameters:

# Non-streaming
response = await call_gemini_async(
    contents=messages,
    gen_config=config,
    context="email"  # Extended timeouts
)

# Streaming
async for chunk in call_gemini_stream_async(
    contents=messages,
    gen_config=config,
    context="email"  # Extended timeouts
):
    yield chunk

Email Integration Timeouts

Email-Specific Operations

Email processing includes additional timeout configurations:

EMAIL_TIMEOUTS = {
    "quarto_export": 600,        # 10 minutes for document generation
    "file_download": 300,        # 5 minutes for file downloads
    "attachment_download": 180,   # 3 minutes for email attachments
    "mailgun_api": 120,          # 2 minutes for email sending
}
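
A per-operation budget like this is typically enforced by wrapping each awaitable in `asyncio.wait_for`. The sketch below assumes that pattern; `run_email_operation` is a hypothetical helper, and the optional `timeouts` argument exists only so the table can be swapped out in tests.

```python
import asyncio
from typing import Any, Awaitable, Mapping, Optional

# Values copied from the EMAIL_TIMEOUTS table above (seconds).
EMAIL_TIMEOUTS = {
    "quarto_export": 600,
    "file_download": 300,
    "attachment_download": 180,
    "mailgun_api": 120,
}


async def run_email_operation(
    name: str,
    operation: Awaitable[Any],
    timeouts: Optional[Mapping[str, float]] = None,
) -> Any:
    """Await operation, raising asyncio.TimeoutError after its named budget."""
    budget = (EMAIL_TIMEOUTS if timeouts is None else timeouts)[name]
    return await asyncio.wait_for(operation, timeout=budget)
```

A call such as `await run_email_operation("mailgun_api", send_coroutine)` would then give the send step up to 120 seconds, where `send_coroutine` stands in for whatever awaitable performs the work.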

Usage in Email Processing

# Automatic timeout application
result = await process_assistant_request(
    assistant_id=assistant_id,
    user_input=message,
    current_user=user,
    stream_timeout=self.timeouts["stream_timeout"],  # 30 minutes for email
    documents=attachments,
    save_to_history=True
)

Best Practices

API Client Implementation

When implementing API clients, consider timeout requirements:

// Frontend example
const response = await fetch('/vac/assistant/my-assistant', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Message-Source': 'ui'  // Indicates UI context
  },
  body: JSON.stringify({
    user_input: message,
    stream_timeout: 300,  // 5 minutes for complex query
    save_to_history: true
  })
});

Error Handling

Handle timeout-related errors appropriately:

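import asyncio

# Illustrative wrapper; the function name is not part of the backend API
async def run_complex_request(current_user, complex_query):
    try:
        return await process_assistant_request(
            assistant_id="complex-assistant",
            user_input=complex_query,
            current_user=current_user,
            stream_timeout=1800,  # 30 minutes
        )
    except asyncio.TimeoutError:
        # Handle the timeout gracefully and signal that a retry may help
        return {"error": "Operation timed out", "retry_suggested": True}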

Monitoring and Debugging

Monitor timeout configurations in logs:

# Logs will show:
# INFO: Using timeouts for context 'email': stream_timeout=1800s
# INFO: Tool orchestrator timeouts: heartbeat=300s, hard_timeout=3000s
# INFO: GenAI client timeout: 300000ms

Migration Guide

Existing API Calls

No changes are required. Existing API calls use context-appropriate defaults.

New Features Requiring Extended Timeouts

1. Identify context: Determine if operation is UI or background
2. Set appropriate timeout: Use context parameter or explicit timeout
3. Test with realistic data: Verify timeouts work with actual data sizes
4. Monitor performance: Track operation durations in production
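
For the monitoring step, one lightweight option is to record how long each operation takes and compare the slowest run against its timeout. This is a sketch with hypothetical names (`durations`, `track_duration`, `timeout_headroom`); a production setup would more likely export these values to an existing metrics system.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# In-memory store of observed durations per operation name (seconds).
durations: Dict[str, List[float]] = {}


@contextmanager
def track_duration(operation: str) -> Iterator[None]:
    """Record the wall-clock duration of the wrapped block, even on error."""
    start = time.monotonic()
    try:
        yield
    finally:
        durations.setdefault(operation, []).append(time.monotonic() - start)


def timeout_headroom(operation: str, timeout_s: float) -> float:
    """Fraction of the timeout left unused by the slowest recorded run."""
    return 1.0 - max(durations[operation]) / timeout_s
```

A headroom close to zero, or negative, suggests the configured timeout is too tight for real workloads.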

Troubleshooting Timeouts

Common issues and solutions:

| Issue | Symptom | Solution |
|---|---|---|
| Request times out too quickly | Operations fail prematurely | Increase `stream_timeout` or verify context |
| UI feels slow | Responses take too long | Verify UI context is being used |
| Email processing fails | Email operations time out | Check email context detection |
| Tool execution incomplete | Complex tools fail | Verify tool orchestrator timeouts |

Configuration Reference

Environment Variables

No additional environment variables are required. Timeouts are configured automatically based on context.

Runtime Configuration

Email processing sets the following environment variable, which is used for context detection:

# Set during email processing
EMAIL_PROCESSING_MODE=true
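
Context detection from the `X-Message-Source` header and the `EMAIL_PROCESSING_MODE` variable might look like the sketch below. The precedence (environment variable first, then header) and the fallback to `"ui"` are assumptions made for illustration; `detect_context` is a hypothetical name, and the backend's actual logic is not shown in this document.

```python
import os
from typing import Mapping, Optional


def detect_context(
    headers: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
    default: str = "ui",  # Assumed fallback, not confirmed by this document
) -> str:
    """Return the request context name, e.g. "ui", "email" or "api"."""
    env = os.environ if environ is None else environ
    # Assumption: the environment variable takes precedence over the header.
    if env.get("EMAIL_PROCESSING_MODE", "").lower() == "true":
        return "email"
    # HTTP header names are case-insensitive, so compare in lowercase.
    for key, value in headers.items():
        if key.lower() == "x-message-source":
            return value.strip().lower()
    return default
```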

Override Configuration

For testing or special use cases:

from timeout_config import TimeoutConfig

# Get specific timeouts
custom_timeouts = TimeoutConfig.get_timeouts("email")
custom_timeouts["stream_timeout"] = 900  # 15 minutes

# Use in processing
result = await process_assistant_request(
    assistant_id=assistant_id,
    user_input=message,
    current_user=user,
    stream_timeout=custom_timeouts["stream_timeout"],
)