Backend MCP Integration Architecture

Purpose: Technical implementation details for backend developers.

User Documentation:

MCP Claude Integration Guide - User setup guide

MCP Complete Architecture - Full overview

MCP Quick Reference - Quick commands

Overview

The Aitana backend integrates the Model Context Protocol (MCP) through three server implementations (stdio, HTTPS, and FastAPI), a client detection system, and a registry of MCP tools. This document covers the implementation details of each.

Table of Contents

MCP Server Implementations
Client Detection System
MCP Tools Registry
FastAPI Integration
stdio Compatibility Layer
Thinking Content Capture
Testing and Debugging

MCP Server Implementations

The backend provides three distinct MCP server implementations:

1. stdio Server (`mcp_stdio.py` and `mcp_stdio_compat.py`)

Purpose: Direct process communication with Claude Desktop and Claude Code

Key Files:

mcp_stdio.py - Simple stdio server implementation
mcp_stdio_compat.py - Compatibility wrapper for Claude Code parameter serialization issues

Features:

Handles JSON string parameters from Claude Code (compatibility mode)
Direct stdio communication without HTTP overhead
Client detection support through environment variables
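To make "direct stdio communication" concrete: the client launches the server as a subprocess and exchanges newline-delimited JSON-RPC 2.0 messages over stdin and stdout. The following is a minimal sketch of that loop, not the code in mcp_stdio.py; the names handle_message and serve and the handler table are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, Optional, TextIO

# Hypothetical handler type: takes the request params, returns the result payload.
Handler = Callable[[Dict[str, Any]], Any]


def handle_message(raw: str, handlers: Dict[str, Handler]) -> Optional[Dict[str, Any]]:
    """Dispatch one JSON-RPC message; return the response, or None for notifications."""
    request = json.loads(raw)
    if "id" not in request:
        return None  # notifications carry no id and get no response
    method = request.get("method")
    handler = handlers.get(method)
    if handler is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32601, "message": f"Method not found: {method}"},
        }
    result = handler(request.get("params") or {})
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def serve(stdin: TextIO, stdout: TextIO, handlers: Dict[str, Handler]) -> None:
    """Read one JSON message per line and write one JSON response per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle_message(line, handlers)
        if response is not None:
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()  # the client is waiting on the pipe, so flush each reply
```

Passing the streams in as arguments keeps the loop testable with in-memory buffers; a real entry point would call it with sys.stdin and sys.stdout.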

2. HTTP/HTTPS Server (`app_https.py`)

Purpose: Secure local connections for MCP over HTTP

Features:

Auto-generates SSL certificates for localhost
Runs on port 1956 with HTTPS
Endpoint: https://localhost:1956/mcp/mcp
Alternative to stdio for environments requiring HTTP transport

3. FastAPI MCP Integration (`app_fastapi.py`)

Purpose: Production-ready FastAPI server with MCP support

Features:

Uses Sunholo VACRoutesFastAPI.create_app_with_mcp
Integrated Langfuse tracing
Support for additional custom routes
Automatic MCP tool registration

Client Detection System

MCPClientDetector (`mcp_client_detector.py`)

MCPClientDetector identifies the source of each MCP request and classifies it as one of the following client types:

from enum import Enum

class MCPClientType(Enum):
    CLAUDE_DESKTOP = "claude-desktop"
    CLAUDE_CODE = "claude-code"
    AITANA_CLI = "aitana-cli"
    HTTP_API = "http-api"
    CURL = "curl"
    UNKNOWN = "unknown"

Detection Methods:

Environment variables (MCP_CLIENT_TYPE, CLAUDE_DESKTOP_VERSION, CLAUDE_CODE_VERSION)
HTTP headers (User-Agent, X-MCP-Client)
Process parent inspection
Command-line arguments
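The first two detection methods can be sketched as below. This is an illustration, not the logic in mcp_client_detector.py: the function name detect_client, the precedence order (explicit MCP_CLIENT_TYPE, then version variables, then headers), and the User-Agent matching rules are all assumptions. Process parent inspection and command-line arguments are left out.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class MCPClientType(Enum):
    CLAUDE_DESKTOP = "claude-desktop"
    CLAUDE_CODE = "claude-code"
    AITANA_CLI = "aitana-cli"
    HTTP_API = "http-api"
    CURL = "curl"
    UNKNOWN = "unknown"


def _parse(value: str) -> Optional[MCPClientType]:
    """Map a declared client name to an enum member, or None if unrecognised."""
    try:
        return MCPClientType(value.strip().lower())
    except ValueError:
        return None


def detect_client(
    env: Optional[Mapping[str, str]] = None,
    headers: Optional[Mapping[str, str]] = None,
) -> MCPClientType:
    """Classify a request from environment variables and HTTP headers (assumed order)."""
    env = os.environ if env is None else env
    # HTTP header names are case-insensitive, so normalise them once
    headers = {k.lower(): v for k, v in (headers or {}).items()}

    # 1. An explicit override wins
    if env.get("MCP_CLIENT_TYPE"):
        return _parse(env["MCP_CLIENT_TYPE"]) or MCPClientType.UNKNOWN
    # 2. Version variables set by the launcher scripts
    if env.get("CLAUDE_DESKTOP_VERSION"):
        return MCPClientType.CLAUDE_DESKTOP
    if env.get("CLAUDE_CODE_VERSION"):
        return MCPClientType.CLAUDE_CODE
    # 3. A client-declared header, then the User-Agent
    declared = _parse(headers.get("x-mcp-client", ""))
    if declared is not None:
        return declared
    user_agent = headers.get("user-agent", "").lower()
    if user_agent.startswith("curl/"):
        return MCPClientType.CURL
    if user_agent:
        return MCPClientType.HTTP_API
    return MCPClientType.UNKNOWN
```

Accepting env and headers as parameters, instead of reading os.environ directly, makes the detector easy to exercise in tests such as test_mcp_client_detection.py.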

Integration with Langfuse:

def add_client_detection_to_trace(
    trace: StatefulTraceClient,
    client_type: MCPClientType,
    metadata: Dict[str, Any]
) -> None:
    """Add client detection metadata to Langfuse trace"""

Launcher Scripts

Three launcher scripts are provided; two identify the client and one is generic:

mcp_server_claude_desktop.sh
- Sets CLAUDE_DESKTOP_VERSION
- Uses simpler mcp_stdio.py (no compat needed)
mcp_server_claude_code.sh
- Sets CLAUDE_CODE_VERSION
- Uses mcp_stdio_compat.py for parameter compatibility
mcp_server.sh
- Generic launcher without client identification

MCP Tools Registry

Core Tool Registry (`mcp_tools.py`)

mcp_tools.py implements the MCP tools, grouped into three categories:

Assistant Tools (NEW)

assistant-call: Call any assistant with a prompt and get responses
assistant-inspect: Inspect assistant configuration and available tools
assistant-list: List all available assistants

Processing Tools

ai-search: AI-powered search with datastore and filter support
extract-files: Batch file extraction from GCS
list-gcs-bucket: Browse GCS bucket contents
google-search: Web search integration
structured-extraction: Extract structured data from text
url-processing: Process and extract from URLs
user-history: Search user chat history

Model Tools

gemini: Direct Gemini model access
anthropic-smart: Claude model access
smart-stream: Unified streaming model interface

New Assistant Tools

The assistant tools enable programmatic interaction with Aitana assistants through MCP:

assistant-list

Lists all available assistants with filtering options:

result = await aitana_assistant_list(
    include_templates=False,  # Include template assistants
    include_instances=True,    # Include assistant instances
    limit=100                  # Maximum results
)
# Returns: List of assistants with ID, name, description, tools

assistant-inspect

Inspects a specific assistant’s configuration:

result = await aitana_assistant_inspect(
    assistant_id="research-assistant-v1"
)
# Returns: Assistant configuration, available tools, permissions

assistant-call

Calls an assistant with a prompt:

result = await aitana_assistant_call(
    assistant_id="research-assistant-v1",
    prompt="What are the latest developments in renewable energy?",
    tools=["ai-search", "google-search"],  # Optional: override tools
    save_to_history=False                   # Optional: save to Firestore
)
# Returns: Assistant response with metadata

Key Features:

Full assistant interaction capabilities
Tool override support for specialized workflows
Chat history context support
Firestore integration for conversation persistence
MCP client detection for debugging

List GCS Bucket Tool (`tools/list_gcs_bucket.py`)

Purpose: Browse and list Google Cloud Storage bucket contents

Features:

Returns gs:// URIs for use with extract_files
File filtering by extension and pattern
Folder navigation support
Pagination with continuation tokens
Metadata inclusion (size, modified time)

Usage Example:

result = await list_gcs_bucket(
    bucket_path="aitana-documents-bucket",
    prefix="Competitors/",
    file_extensions=[".pdf", ".txt"],
    max_files=100,
    include_metadata=True
)

FastAPI Integration

Enhanced MCP Support (`app_fastapi.py`)

The FastAPI application is created with MCP support through Sunholo's VACRoutesFastAPI.create_app_with_mcp:

app, vac_routes = VACRoutesFastAPI.create_app_with_mcp(
    title="Aitana Backend API",
    stream_interpreter=vac_stream_with_assistant_support,
    enable_a2a_agent=True,
    additional_routes=additional_routes,
    add_langfuse_eval=True
)

MCP Tool Registration:

# Register custom MCP tools
vac_routes.add_mcp_tool(
    tool_name="list-gcs-bucket",
    tool_description="List contents of GCS bucket",
    tool_function=list_gcs_bucket_wrapper,
    params_model=ListGCSBucketParams
)

stdio Compatibility Layer

Parameter Serialization Issue

Claude Code sends parameters as JSON strings instead of dictionaries, requiring a compatibility layer.

Problem:

// Claude Code sends:
{"params": "{\"query\": \"test\"}"}

// Expected:
{"params": {"query": "test"}}

Solution (mcp_stdio_compat.py):

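import json
from typing import Any, Dict, Union

from pydantic import BaseModel

class FlexibleParams(BaseModel):
    @classmethod
    def parse_if_string(cls, v: Union[Dict[str, Any], str]) -> Dict[str, Any]:
        """Parse a JSON string into a dict; pass dicts through unchanged."""
        if isinstance(v, str):
            return json.loads(v)
        return v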
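The same normalization can be shown without Pydantic, applied to the two message shapes from the problem statement. This is a standalone sketch; normalize_params is a hypothetical helper name, and rejecting non-dict params with a TypeError is an assumption about the desired behavior.

```python
import json
from typing import Any, Dict


def normalize_params(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the message whose "params" value is always a dict."""
    params = message.get("params", {})
    if isinstance(params, str):
        # Claude Code style: params arrive as a JSON-encoded string
        params = json.loads(params)
    if not isinstance(params, dict):
        raise TypeError(f"params must decode to an object, got {type(params).__name__}")
    return {**message, "params": params}


# Both shapes end up identical after normalization
from_claude_code = normalize_params({"params": "{\"query\": \"test\"}"})
from_other_clients = normalize_params({"params": {"query": "test"}})
```

Normalizing once at the transport boundary means the individual tool functions never need to know which client sent the request.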

Thinking Content Capture

ThinkingContentCapturingCallback (`thinking_content_capture.py`)

New feature to capture and preserve AI thinking content:

Purpose: Capture Claude’s thinking process tags while streaming

Implementation:

class ThinkingContentCapturingCallback(BufferStreamingStdOutCallbackHandlerAsync):
    def __init__(self, original_callback):
        # Wrapped callback that every token is forwarded to
        self.original_callback = original_callback
        self.captured_content = ""  # Accumulate ALL content including thinking

    async def async_on_llm_new_token(self, token: str, **kwargs):
        # Capture the token, then pass it through unchanged
        self.captured_content += token
        return await self.original_callback.async_on_llm_new_token(token, **kwargs)

Usage:

Preserves thinking tags in saved messages
Enables debugging of AI reasoning process
Maintains streaming performance
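The pass-through pattern can be tried in isolation, without the Sunholo base class. In this sketch, RecordingCallback is a hypothetical stand-in for the wrapped streaming callback and CapturingCallback mirrors only the capture-and-forward behavior described above.

```python
import asyncio
from typing import Any, List


class RecordingCallback:
    """Hypothetical stand-in for the real streaming callback being wrapped."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    async def async_on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        self.tokens.append(token)


class CapturingCallback:
    """Accumulates every token, then forwards it to the wrapped callback."""

    def __init__(self, original_callback: RecordingCallback) -> None:
        self.original_callback = original_callback
        self.captured_content = ""

    async def async_on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        self.captured_content += token
        return await self.original_callback.async_on_llm_new_token(token, **kwargs)


async def stream(tokens: List[str], callback: CapturingCallback) -> None:
    """Feed tokens to the callback one at a time, as a streaming model would."""
    for token in tokens:
        await callback.async_on_llm_new_token(token)


inner = RecordingCallback()
capturing = CapturingCallback(inner)
asyncio.run(stream(["Let me ", "think. ", "Answer: 4"], capturing))
```

After the run, capturing.captured_content holds the full text for saving, while inner has received the same tokens in the same order, so the streamed output is unaffected.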

Testing and Debugging

Test Scripts

test_mcp_tools.py - Comprehensive MCP tool testing
test_mcp_updates.py - Test recent MCP updates
test_mcp_client_detection.py - Client detection testing

Debug Environment Variables

# Enable MCP debug logging
export MCP_DEBUG=true

# Force client type
export MCP_CLIENT_TYPE=claude-code

# Enable Langfuse tracing
export LANGFUSE_DEBUG=true
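On the Python side, a flag such as MCP_DEBUG is typically read once at startup and mapped to a log level. The sketch below shows one way to do that; the set of accepted truthy values and the logger name "mcp" are assumptions, not taken from the backend code.

```python
import logging
import os
from typing import Mapping, Optional

# Assumed spellings that count as "enabled"
_TRUTHY = {"1", "true", "yes", "on"}


def debug_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when MCP_DEBUG is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("MCP_DEBUG", "").strip().lower() in _TRUTHY


def configure_logging(env: Optional[Mapping[str, str]] = None) -> int:
    """Set the (hypothetical) "mcp" logger level from MCP_DEBUG and return it."""
    level = logging.DEBUG if debug_enabled(env) else logging.INFO
    logging.getLogger("mcp").setLevel(level)
    return level
```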

Testing MCP Tools Directly

# Test AI search
curl -X POST http://localhost:1956/direct/tools/ai-search \
  -H "Content-Type: application/json" \
  -d '{"question": "test query", "datastore_id": "aitana3"}'

# Test list GCS bucket
curl -X POST http://localhost:1956/direct/tools/list-gcs-bucket \
  -H "Content-Type: application/json" \
  -d '{"bucket_path": "aitana-documents-bucket", "max_files": 10}'

Configuration

Environment Variables

# Core MCP settings
GOOGLE_CLOUD_PROJECT=your-project
GOOGLE_CLOUD_LOCATION=global
ANTHROPIC_API_KEY=your-key

# Client detection
MCP_CLIENT_TYPE=claude-desktop
CLAUDE_DESKTOP_VERSION=1.0

# Debug settings
MCP_DEBUG=true
LANGFUSE_DEBUG=true

Claude Desktop Configuration

{
  "mcpServers": {
    "aitana": {
      "command": "/path/to/backend/mcp_server_claude_desktop.sh"
    }
  }
}

Claude Code Configuration

{
  "mcpServers": {
    "aitana": {
      "command": "/path/to/backend/mcp_server_claude_code.sh"
    }
  }
}

Performance Considerations

Thinking Content Impact

The check_and_display_thinking() function adds 1-4 seconds of latency per call, so restrict it to debug paths:

# Avoid in time-sensitive paths
if DEBUG_MODE:
    await check_and_display_thinking("Processing...", callback)

Tool Concurrency

Multiple searches can be run in parallel:

class MultipleAISearchParams(BaseModel):
    searches: List[AISearchParams]  # Run 2-3 variations for better coverage
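Running the variations in parallel usually comes down to asyncio.gather. The sketch below uses a stub in place of the real ai-search call; ai_search, run_searches, and the shape of the result dictionary are assumptions for illustration.

```python
import asyncio
from typing import Any, Dict, List


async def ai_search(question: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a single ai-search call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"question": question, "results": []}


async def run_searches(questions: List[str]) -> List[Dict[str, Any]]:
    """Run all searches concurrently; results come back in input order."""
    return await asyncio.gather(*(ai_search(q) for q in questions))


variations = [
    "renewable energy developments",
    "latest solar and wind technology",
    "clean energy news",
]
results = asyncio.run(run_searches(variations))
```

Because gather preserves input order, each result can be matched back to the query variation that produced it. Total latency is close to that of the slowest search, not the sum of all of them.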

Security

SSL Certificate Generation

HTTPS mode auto-generates a self-signed certificate for localhost:

# generate_ssl_cert.py
create_self_signed_cert(
    cert_file="certs/localhost.crt",
    key_file="certs/localhost.key"
)

Permission Validation

All MCP tools respect the backend permission system:

User email validation
Tool access control
Tag-based permissions
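How the three checks might combine is sketched below. This is not the backend's permission code: the User and ToolPolicy types, the domain-based email check, and the rule that a user needs every required tag are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class User:
    """Hypothetical caller identity."""
    email: str
    tags: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-tool access policy."""
    allowed_domains: FrozenSet[str]
    required_tags: FrozenSet[str] = frozenset()


def can_use_tool(user: User, policy: ToolPolicy) -> bool:
    """Allow the call only if the email is well formed, its domain is allowed,
    and the user holds every tag the tool requires."""
    local, sep, domain = user.email.rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable email address
    if domain.lower() not in policy.allowed_domains:
        return False
    return policy.required_tags <= user.tags  # subset test on tags


policy = ToolPolicy(
    allowed_domains=frozenset({"example.com"}),
    required_tags=frozenset({"research"}),
)
```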

Future Enhancements

Planned Features

Agent Tool Integration:
- document-search agent
- code-execution agent
- assistant-calling agent
Enhanced Client Detection:
- Version-specific behavior
- Client capability detection
- Performance optimization per client
Tool Orchestration:
- Parallel tool execution
- Tool chaining
- Conditional tool selection

Related Documentation

User Guides

MCP Claude Integration Guide - Claude Desktop/Code setup
MCP Complete Architecture - Full technical overview
MCP Quick Reference - Quick setup and commands
MCP Sunholo Patterns - FastAPI integration patterns

Technical Documentation

Backend Tool System - Tool implementation details
Backend API Guide - API endpoints and usage
How to Add Tools - Complete tool addition guide
