Backend FastAPI Migration

Overview

The Aitana backend is transitioning from Flask to FastAPI for improved performance, better async support, and native MCP integration. This document covers the migration architecture, current status, and implementation details.

Migration Status

Current Architecture

The backend currently supports three parallel implementations:

  1. Flask App (app.py) - Original implementation, stable in production
  2. FastAPI App (app_fastapi.py) - New implementation with MCP support
  3. HTTPS Flask App (app_https.py) - Flask with SSL for local MCP testing

Deployment Strategy

  • Production: Still uses Flask (app.py)
  • Development: Can use either Flask or FastAPI
  • Testing: FastAPI for MCP integration testing

FastAPI Implementation

Core Application (app_fastapi.py)

The FastAPI implementation uses Sunholo’s VACRoutesFastAPI for standardized routing:

from sunholo.agents.fastapi import VACRoutesFastAPI

app, vac_routes = VACRoutesFastAPI.create_app_with_mcp(
    title="Aitana Backend API",
    stream_interpreter=vac_stream_with_assistant_support,
    enable_a2a_agent=True,
    additional_routes=additional_routes,
    add_langfuse_eval=True
)

Key Features

1. Native Async Support

Endpoints are declared with async def, so a handler can await I/O without blocking the event loop:

async def assistant_stream_handler(assistant_id: str, request: Request):
    return await assistant_stream(assistant_id, request)

2. MCP Integration

Built-in MCP server with tool registration:

# MCP tools are automatically registered
vac_routes.add_mcp_tool(
    tool_name="ai-search",
    tool_function=aitana_ai_search,
    params_model=AISearchParams
)

3. Streaming Support

Enhanced streaming with Server-Sent Events (SSE):

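from fastapi import Request
from fastapi.responses import StreamingResponse

@app.post("/vac/assistant/{assistant_id}/sse")
async def assistant_stream_sse(assistant_id: str, request: Request):
    # stream_generator() stands for the project's async generator of SSE frames
    return StreamingResponse(
        stream_generator(),
        media_type="text/event-stream"
    )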

4. Middleware Stack

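Middleware is registered on the app for CORS, request ID tracking, and performance monitoring:

from fastapi.middleware.cors import CORSMiddleware

# CORS (wildcard origins suit local development; restrict them in production)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"]
)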

# Request ID tracking
app.add_middleware(RequestIDMiddleware)

# Performance monitoring
app.add_middleware(PrometheusMiddleware)
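
The source of RequestIDMiddleware is not shown in this document. As an illustration of what such a middleware might do, here is a minimal pure-ASGI sketch using only the standard library. The class name SketchRequestIDMiddleware, the x-request-id header name, and the demo_app and call_once helpers are all assumptions made for this example, not the project's actual implementation.

```python
import asyncio
import uuid


class SketchRequestIDMiddleware:
    """Hypothetical pure-ASGI middleware: tags each HTTP response with a request ID."""

    def __init__(self, app, header_name: str = "x-request-id"):
        self.app = app
        # ASGI header names are lowercase byte strings.
        self.header_name = header_name.lower().encode("latin-1")

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass websocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        # Reuse an ID supplied by the caller, otherwise mint a new one.
        incoming = dict(scope.get("headers", []))
        request_id = incoming.get(self.header_name) or uuid.uuid4().hex.encode("ascii")

        async def send_with_id(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((self.header_name, request_id))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_id)


async def demo_app(scope, receive, send):
    """Tiny ASGI app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def call_once(app, headers=()):
    """Drive one fake HTTP request through an ASGI app and collect what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": list(headers)}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(
    call_once(SketchRequestIDMiddleware(demo_app), [(b"x-request-id", b"abc123")])
)
print(messages[0]["headers"])
```

Because the class takes the wrapped app as its first constructor argument, it could be registered with app.add_middleware() in the same way as the middleware above.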

API Endpoints

Core VAC Endpoints

| Endpoint | Method | Description | Status |
|---|---|---|---|
| /vac/assistant/{assistant_id} | POST | Main assistant interaction | ✅ Migrated |
| /vac/assistant/{assistant_id}/sse | POST | SSE streaming | ✅ New in FastAPI |
| /vac/assistants | GET | List all assistants | ✅ Migrated |
| /vac/assistants/{assistant_id}/tools | GET | Get assistant tools | ✅ Migrated |
| /vac/streaming/{model} | POST | Direct model streaming | ✅ Migrated |

Direct Tool Endpoints

| Endpoint | Method | Description | Status |
|---|---|---|---|
| /direct/tools/ai-search | POST | AI search tool | ✅ Migrated |
| /direct/tools/google-search | POST | Google search | ✅ Migrated |
| /direct/tools/extract-files | POST | File extraction | ✅ Migrated |
| /direct/tools/list-gcs-bucket | POST | List GCS contents | ✅ New |
| /direct/tools/structured-extraction | POST | Extract structured data | ✅ Migrated |
| /direct/tools/url-processing | POST | Process URLs | ✅ Migrated |
| /direct/tools/user-history | POST | Search history | ✅ Migrated |

Model Endpoints

| Endpoint | Method | Description | Status |
|---|---|---|---|
| /direct/models/gemini | POST | Gemini model | ✅ Migrated |
| /direct/models/anthropic-smart | POST | Claude model | ✅ Migrated |
| /direct/models/smart-stream | POST | Unified streaming | ✅ Migrated |

Email Integration Endpoints

| Endpoint | Method | Description | Status |
|---|---|---|---|
| /webhooks/email/receive | POST | Email webhook | ✅ Migrated |
| /webhooks/email/status | GET | Webhook status | ✅ Migrated |
| /email/subscriptions | POST | Create subscription | ✅ Migrated |
| /email/subscriptions | DELETE | Remove subscription | ✅ Migrated |
| /email/subscriptions | GET | List subscriptions | ✅ Migrated |

MCP Endpoints

| Endpoint | Method | Description | Status |
|---|---|---|---|
| /mcp | - | MCP over HTTP | ✅ New |
| /aitana-mcp | - | Alternative MCP endpoint | ✅ New |

Request/Response Models

Pydantic Models

FastAPI uses Pydantic for request/response validation:

from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field

class AssistantRequest(BaseModel):
    question: str = Field(..., description="User's question")
    # default_factory gives each request its own empty list or dict
    tools: List[str] = Field(default_factory=list, description="Tools to use")
    tool_configs: Dict[str, Any] = Field(default_factory=dict, description="Tool configurations")
    chat_history: List[Dict[str, Any]] = Field(default_factory=list, description="Previous messages")
    user_email: Optional[str] = Field(None, description="User's email")
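
To make the validation behavior concrete, the following sketch repeats the model so it can run on its own, then builds one valid and one invalid request. The sample question and tool values are made up for illustration.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError


class AssistantRequest(BaseModel):
    question: str = Field(..., description="User's question")
    tools: List[str] = Field(default_factory=list, description="Tools to use")
    tool_configs: Dict[str, Any] = Field(default_factory=dict, description="Tool configurations")
    chat_history: List[Dict[str, Any]] = Field(default_factory=list, description="Previous messages")
    user_email: Optional[str] = Field(None, description="User's email")


# A valid payload: omitted fields fall back to their defaults.
ok = AssistantRequest(question="What changed in Q3?", tools=["ai-search"])
print(ok.tools, ok.chat_history, ok.user_email)

# An invalid payload: 'question' is required, so construction fails.
try:
    AssistantRequest(tools=["ai-search"])
except ValidationError as err:
    # Each error entry records which field failed and why.
    print(err.errors()[0]["loc"])
```

When the model is used as a handler parameter, FastAPI performs the same validation on the request body and answers invalid payloads with a 422 response instead of calling the handler.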

Response Streaming

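An async generator can feed a StreamingResponse directly; each chunk is serialized as one SSE data frame:

import json

async def stream_response(request: AssistantRequest):
    # process_stream is the project's async chunk producer
    async for chunk in process_stream(request):
        # An SSE frame is a "data: <payload>" line followed by a blank line
        yield f"data: {json.dumps(chunk)}\n\n"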
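
The wire format produced by such a generator can be checked without a running server. The sketch below uses only the standard library; fake_process_stream, sse_frames, and parse_sse are hypothetical names, and the two-chunk payload is invented for the example.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Dict, List


async def fake_process_stream() -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in for process_stream(): yields two chunks."""
    for text in ("Hello", "world"):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {"delta": text}


async def sse_frames(chunks: AsyncIterator[Dict[str, Any]]) -> AsyncIterator[str]:
    """Encode each chunk as one SSE frame: a 'data:' line plus a blank line."""
    async for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"


def parse_sse(body: str) -> List[Dict[str, Any]]:
    """Decode a buffered SSE body back into the JSON payloads it carries."""
    events = []
    for frame in body.split("\n\n"):
        if frame.startswith("data: "):
            events.append(json.loads(frame[len("data: "):]))
    return events


async def collect() -> str:
    """Join all frames, as a client would see them after the stream ends."""
    return "".join([frame async for frame in sse_frames(fake_process_stream())])


body = asyncio.run(collect())
print(parse_sse(body))
```

A real client would parse frames incrementally as they arrive; buffering the whole body here simply keeps the round trip easy to inspect.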

Migration Patterns

Converting Flask Endpoints

Flask Pattern

@app.route('/vac/assistant/<assistant_id>', methods=['POST'])
def assistant_endpoint(assistant_id):
    data = request.json
    return jsonify(process(data))

FastAPI Pattern

@app.post('/vac/assistant/{assistant_id}')
async def assistant_endpoint(
    assistant_id: str,
    request: AssistantRequest
):
    return await process(request)

Handling Request Data

Flask

data = request.json
user_email = data.get('user_email')

FastAPI

# Automatic validation with Pydantic
user_email = request.user_email  # Type-safe access

Error Handling

Flask

try:
    result = process()
except Exception as e:
    return jsonify({'error': str(e)}), 500

FastAPI

from fastapi import HTTPException

try:
    result = await process()
except Exception as e:
    raise HTTPException(status_code=500, detail=str(e))
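
Returning str(e) with a 500 for every exception can expose internal details and hides the difference between client and server errors. One possible refinement, sketched here with the standard library only, maps exception types to status codes first. The function name to_status_and_detail, the logger name, and the particular mapping are assumptions for illustration.

```python
import logging
from typing import Tuple

logger = logging.getLogger("aitana.errors")  # hypothetical logger name


def to_status_and_detail(exc: Exception) -> Tuple[int, str]:
    """Map an exception to an HTTP status code and a client-safe message."""
    if isinstance(exc, ValueError):
        # The caller sent something unusable: report it as a client error.
        return 400, f"Invalid request: {exc}"
    if isinstance(exc, PermissionError):
        return 403, "Not allowed"
    if isinstance(exc, TimeoutError):
        return 504, "Upstream service timed out"
    # Anything else is unexpected: log the details, return a generic message.
    logger.error("Unhandled error", exc_info=exc)
    return 500, "Internal server error"


print(to_status_and_detail(ValueError("question is empty")))
print(to_status_and_detail(RuntimeError("db password rejected")))
```

Inside a handler, the returned pair would be passed on as HTTPException(status_code=status, detail=detail), so the response shape stays the same as in the pattern above.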

Performance Improvements

Async I/O

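Because handlers are coroutines, independent I/O-bound calls can run concurrently instead of one after another:

import asyncio

# Run the three searches concurrently; results come back in argument order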
results = await asyncio.gather(
    ai_search(query1),
    ai_search(query2),
    google_search(query)
)
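
The effect of asyncio.gather can be seen with simulated calls. In this sketch, fake_search is a hypothetical stand-in for the search tools and the delays are arbitrary; the total wall time should be close to the slowest call rather than the sum of all three.

```python
import asyncio
import time


async def fake_search(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound tool call such as ai_search()."""
    await asyncio.sleep(delay)  # simulates waiting on a network response
    return f"{name}: done"


async def run_parallel() -> tuple:
    start = time.perf_counter()
    # gather() schedules all three coroutines at once and returns results
    # in the order the coroutines were passed in, not in completion order.
    results = await asyncio.gather(
        fake_search("ai_search_1", 0.2),
        fake_search("ai_search_2", 0.1),
        fake_search("google_search", 0.3),
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(run_parallel())
print(results, f"{elapsed:.2f}s")
```

By default the first exception raised by any call propagates out of gather(); passing return_exceptions=True returns exceptions in the result list instead, which can be useful when one failed search should not discard the others.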

Connection Pooling

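Async database clients such as the databases package keep a connection pool that is opened once at startup and closed at shutdown:

from databases import Database

# DATABASE_URL is assumed to come from the project's configuration
database = Database(DATABASE_URL)

# Note: recent FastAPI releases deprecate on_event in favor of lifespan handlers
@app.on_event("startup")
async def startup():
    await database.connect()  # open the pool once per process

@app.on_event("shutdown")
async def shutdown():
    await database.disconnect()  # release connections cleanly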

Request Validation

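Automatic validation removes hand-written checks from handlers; a request that fails validation is rejected with a 422 response before the handler runs: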

# FastAPI validates before handler execution
@app.post('/api/endpoint')
async def endpoint(request: ValidatedModel):
    # Request is already validated
    return process(request)

Testing

FastAPI Test Client

from fastapi.testclient import TestClient
from app_fastapi import app

client = TestClient(app)

def test_assistant():
    response = client.post(
        "/vac/assistant/test-id",
        json={"question": "test"}
    )
    assert response.status_code == 200

Async Testing

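import pytest
from httpx import ASGITransport, AsyncClient

from app_fastapi import app

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_async_endpoint():
    # Newer httpx releases removed the app= shortcut; use ASGITransport instead
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        response = await client.post("/vac/assistant/test-id", json={"question": "test"})
        assert response.status_code == 200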

Deployment Considerations

Running FastAPI

Development:

cd backend
uv run uvicorn app_fastapi:app --reload --port 1956

Production:

uvicorn app_fastapi:app --host 0.0.0.0 --port 1956 --workers 4

Docker Configuration

FROM python:3.10-slim

WORKDIR /app
COPY . .

RUN pip install -r requirements.txt

CMD ["uvicorn", "app_fastapi:app", "--host", "0.0.0.0", "--port", "1956"]

Environment Variables

FastAPI uses the same environment variables as Flask:

GOOGLE_CLOUD_PROJECT=your-project
ANTHROPIC_API_KEY=your-key
LANGFUSE_SECRET_KEY=your-key
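
Since both apps depend on the same variables, a startup check can surface a missing one before the first request fails. This is a minimal sketch; the helper names missing_env and check_env are hypothetical, and the required list contains only the three variables named above.

```python
import os
from typing import Iterable, List, Mapping

# Names taken from the list above; extend as the deployment requires.
REQUIRED_ENV = ("GOOGLE_CLOUD_PROJECT", "ANTHROPIC_API_KEY", "LANGFUSE_SECRET_KEY")


def missing_env(required: Iterable[str] = REQUIRED_ENV,
                environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def check_env() -> None:
    """Raise early, with a clear message, instead of failing mid-request."""
    missing = missing_env()
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")


# Example with an explicit mapping, so the result does not depend on the host.
print(missing_env(environ={"GOOGLE_CLOUD_PROJECT": "your-project"}))
```

Calling check_env() at import time or in a startup handler would work equally for the Flask and FastAPI entry points.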

Migration Checklist

Phase 1: Parallel Running ✅

  • Create FastAPI app structure
  • Migrate core endpoints
  • Add MCP support
  • Test alongside Flask

Phase 2: Feature Parity (In Progress)

  • All VAC endpoints migrated
  • Email integration complete
  • Direct tools migrated
  • WhatsApp integration
  • Admin endpoints

Phase 3: Production Switch

  • Performance testing
  • Load testing
  • Update deployment configs
  • Switch production traffic
  • Monitor and rollback plan

Common Issues and Solutions

Issue 1: Request Body Access

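Problem: Flask exposes the parsed body as the request.json property; FastAPI's Request has no such property, and request.json() is a coroutine that must be awaited

Solution: Declare a Pydantic model as the handler parameter, or call await request.json() when untyped access is needed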

Issue 2: Synchronous Functions

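Problem: Calling a blocking synchronous function directly inside an async handler stalls the event loop for every other request

Solution: Run sync functions in a worker thread with asyncio.to_thread() (Python 3.9+):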

result = await asyncio.to_thread(sync_function, args)
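
The following standard-library sketch shows the event loop staying responsive while a blocking call runs in a worker thread. The blocking_lookup and heartbeat functions are hypothetical examples, not project code.

```python
import asyncio
import time


def blocking_lookup(seconds: float) -> str:
    """Hypothetical synchronous function, e.g. a helper carried over from Flask."""
    time.sleep(seconds)  # blocks the thread it runs on
    return "lookup finished"


async def heartbeat(ticks: list) -> None:
    """Appends ticks to show that the event loop keeps running."""
    for i in range(3):
        await asyncio.sleep(0.05)
        ticks.append(i)


async def main() -> tuple:
    ticks: list = []
    # The blocking call runs in a worker thread while heartbeat() keeps ticking.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_lookup, 0.2),
        heartbeat(ticks),
    )
    return result, ticks


result, ticks = asyncio.run(main())
print(result, ticks)
```

Calling blocking_lookup(0.2) directly inside main() would instead pause the heartbeat until the lookup returned.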

Issue 3: Streaming Responses

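Problem: Flask streams by returning a Response that wraps a generator, and that pattern does not carry over unchanged

Solution: Wrap the sync or async generator in FastAPI’s StreamingResponse: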

from fastapi.responses import StreamingResponse

return StreamingResponse(
    generator(),
    media_type="text/event-stream"
)

Benefits of Migration

  1. Performance: An expected 2-3x faster request handling, to be confirmed by the Phase 3 performance and load tests
  2. Type Safety: Automatic validation and documentation
  3. Modern Python: Native async/await support
  4. Better Testing: Built-in test client
  5. Auto Documentation: Interactive API docs at /docs
  6. MCP Native: Built-in MCP server support
  7. WebSocket Support: Real-time bidirectional communication