Backend MCP Integration Architecture

Purpose: Technical implementation details for backend developers.

User Documentation:

Overview

The Aitana backend provides comprehensive Model Context Protocol (MCP) integration through multiple server implementations and client detection capabilities. This document covers the technical implementation details for backend developers.

Table of Contents

MCP Server Implementations

The backend provides three distinct MCP server implementations:

1. stdio Server (mcp_stdio.py and mcp_stdio_compat.py)

Purpose: Direct process communication with Claude Desktop and Claude Code

Key Files:

  • mcp_stdio.py - Simple stdio server implementation
  • mcp_stdio_compat.py - Compatibility wrapper for Claude Code parameter serialization issues

Features:

  • Handles JSON string parameters from Claude Code (compatibility mode)
  • Direct stdio communication without HTTP overhead
  • Client detection support through environment variables

2. HTTP/HTTPS Server (app_https.py)

Purpose: Secure local connections for MCP over HTTP

Features:

  • Auto-generates SSL certificates for localhost
  • Runs on port 1956 with HTTPS
  • Endpoint: https://localhost:1956/mcp/mcp
  • Alternative to stdio for environments requiring HTTP transport

3. FastAPI MCP Integration (app_fastapi.py)

Purpose: Production-ready FastAPI server with MCP support

Features:

  • Uses Sunholo VACRoutesFastAPI.create_app_with_mcp
  • Integrated Langfuse tracing
  • Support for additional custom routes
  • Automatic MCP tool registration

Client Detection System

MCPClientDetector (mcp_client_detector.py)

Sophisticated client detection to identify the source of MCP requests:

class MCPClientType(Enum):
    CLAUDE_DESKTOP = "claude-desktop"
    CLAUDE_CODE = "claude-code"
    AITANA_CLI = "aitana-cli"
    HTTP_API = "http-api"
    CURL = "curl"
    UNKNOWN = "unknown"

Detection Methods:

  • Environment variables (MCP_CLIENT_TYPE, CLAUDE_DESKTOP_VERSION, CLAUDE_CODE_VERSION)
  • HTTP headers (User-Agent, X-MCP-Client)
  • Process parent inspection
  • Command-line arguments

Integration with Langfuse:

def add_client_detection_to_trace(
    trace: StatefulTraceClient,
    client_type: MCPClientType,
    metadata: Dict[str, Any]
) -> None:
    """Add client detection metadata to Langfuse trace"""

Launcher Scripts

Three specialized launcher scripts with client identification:

  1. mcp_server_claude_desktop.sh
    • Sets CLAUDE_DESKTOP_VERSION
    • Uses simpler mcp_stdio.py (no compat needed)
  2. mcp_server_claude_code.sh
    • Sets CLAUDE_CODE_VERSION
    • Uses mcp_stdio_compat.py for parameter compatibility
  3. mcp_server.sh
    • Generic launcher without client identification

MCP Tools Registry

Core Tool Registry (mcp_tools.py)

Comprehensive MCP tool implementation with three categories:

Assistant Tools (NEW)

  • assistant-call: Call any assistant with a prompt and get responses
  • assistant-inspect: Inspect assistant configuration and available tools
  • assistant-list: List all available assistants

Processing Tools

  • ai-search: AI-powered search with datastore and filter support
  • extract-files: Batch file extraction from GCS
  • list-gcs-bucket: Browse GCS bucket contents
  • google-search: Web search integration
  • structured-extraction: Extract structured data from text
  • url-processing: Process and extract from URLs
  • user-history: Search user chat history

Model Tools

  • gemini: Direct Gemini model access
  • anthropic-smart: Claude model access
  • smart-stream: Unified streaming model interface

New Assistant Tools

The assistant tools enable programmatic interaction with Aitana assistants through MCP:

assistant-list

Lists all available assistants with filtering options:

result = await aitana_assistant_list(
    include_templates=False,  # Include template assistants
    include_instances=True,    # Include assistant instances
    limit=100                  # Maximum results
)
# Returns: List of assistants with ID, name, description, tools

assistant-inspect

Inspects a specific assistant’s configuration:

result = await aitana_assistant_inspect(
    assistant_id="research-assistant-v1"
)
# Returns: Assistant configuration, available tools, permissions

assistant-call

Calls an assistant with a prompt:

result = await aitana_assistant_call(
    assistant_id="research-assistant-v1",
    prompt="What are the latest developments in renewable energy?",
    tools=["ai-search", "google-search"],  # Optional: override tools
    save_to_history=False                   # Optional: save to Firestore
)
# Returns: Assistant response with metadata

Key Features:

  • Full assistant interaction capabilities
  • Tool override support for specialized workflows
  • Chat history context support
  • Firestore integration for conversation persistence
  • MCP client detection for debugging

List GCS Bucket Tool (tools/list_gcs_bucket.py)

Purpose: Browse and list Google Cloud Storage bucket contents

Features:

  • Returns gs:// URIs for use with extract_files
  • File filtering by extension and pattern
  • Folder navigation support
  • Pagination with continuation tokens
  • Metadata inclusion (size, modified time)

Usage Example:

result = await list_gcs_bucket(
    bucket_path="aitana-documents-bucket",
    prefix="Competitors/",
    file_extensions=[".pdf", ".txt"],
    max_files=100,
    include_metadata=True
)

FastAPI Integration

Enhanced MCP Support (app_fastapi.py)

The FastAPI application now includes enhanced MCP integration:

app, vac_routes = VACRoutesFastAPI.create_app_with_mcp(
    title="Aitana Backend API",
    stream_interpreter=vac_stream_with_assistant_support,
    enable_a2a_agent=True,
    additional_routes=additional_routes,
    add_langfuse_eval=True
)

MCP Tool Registration:

# Register custom MCP tools
vac_routes.add_mcp_tool(
    tool_name="list-gcs-bucket",
    tool_description="List contents of GCS bucket",
    tool_function=list_gcs_bucket_wrapper,
    params_model=ListGCSBucketParams
)

stdio Compatibility Layer

Parameter Serialization Issue

Claude Code sends parameters as JSON strings instead of dictionaries, requiring a compatibility layer.

Problem:

// Claude Code sends:
{"params": "{\"query\": \"test\"}"}

// Expected:
{"params": {"query": "test"}}

Solution (mcp_stdio_compat.py):

class FlexibleParams(BaseModel):
    @classmethod
    def parse_if_string(cls, v: Union[Dict, str]) -> Dict:
        """Parse JSON string to dict if needed"""
        if isinstance(v, str):
            return json.loads(v)
        return v

Thinking Content Capture

ThinkingContentCapturingCallback (thinking_content_capture.py)

New feature to capture and preserve AI thinking content:

Purpose: Capture Claude’s thinking process tags while streaming

Implementation:

class ThinkingContentCapturingCallback(BufferStreamingStdOutCallbackHandlerAsync):
    def __init__(self, original_callback):
        self.captured_content = ""  # Accumulate ALL content including thinking
    
    async def async_on_llm_new_token(self, token: str, **kwargs):
        # Capture token and pass through
        self.captured_content += token
        return await self.original_callback.async_on_llm_new_token(token, **kwargs)

Usage:

  • Preserves tags in saved messages
  • Enables debugging of AI reasoning process
  • Maintains streaming performance

Testing and Debugging

Test Scripts

  1. test_mcp_tools.py - Comprehensive MCP tool testing
  2. test_mcp_updates.py - Test recent MCP updates
  3. test_mcp_client_detection.py - Client detection testing

Debug Environment Variables

# Enable MCP debug logging
export MCP_DEBUG=true

# Force client type
export MCP_CLIENT_TYPE=claude-code

# Enable Langfuse tracing
export LANGFUSE_DEBUG=true

Testing MCP Tools Directly

# Test AI search
curl -X POST http://localhost:1956/direct/tools/ai-search \
  -H "Content-Type: application/json" \
  -d '{"question": "test query", "datastore_id": "aitana3"}'

# Test list GCS bucket
curl -X POST http://localhost:1956/direct/tools/list-gcs-bucket \
  -H "Content-Type: application/json" \
  -d '{"bucket_path": "aitana-documents-bucket", "max_files": 10}'

Configuration

Environment Variables

# Core MCP settings
GOOGLE_CLOUD_PROJECT=your-project
GOOGLE_CLOUD_LOCATION=global
ANTHROPIC_API_KEY=your-key

# Client detection
MCP_CLIENT_TYPE=claude-desktop
CLAUDE_DESKTOP_VERSION=1.0

# Debug settings
MCP_DEBUG=true
LANGFUSE_DEBUG=true

Claude Desktop Configuration

{
  "mcpServers": {
    "aitana": {
      "command": "/path/to/backend/mcp_server_claude_desktop.sh"
    }
  }
}

Claude Code Configuration

{
  "mcpServers": {
    "aitana": {
      "command": "/path/to/backend/mcp_server_claude_code.sh"
    }
  }
}

Performance Considerations

Thinking Content Impact

The check_and_display_thinking() function adds 1-4 seconds latency:

# Avoid in time-sensitive paths
if DEBUG_MODE:
    await check_and_display_thinking("Processing...", callback)

Tool Concurrency

Multiple searches can be run in parallel:

class MultipleAISearchParams(BaseModel):
    searches: List[AISearchParams]  # Run 2-3 variations for better coverage

Security

SSL Certificate Generation

Auto-generates certificates for HTTPS mode:

# generate_ssl_cert.py
create_self_signed_cert(
    cert_file="certs/localhost.crt",
    key_file="certs/localhost.key"
)

Permission Validation

All MCP tools respect backend permission system:

  • User email validation
  • Tool access control
  • Tag-based permissions

Future Enhancements

Planned Features

  1. Agent Tool Integration:
    • document-search agent
    • code-execution agent
    • assistant-calling agent
  2. Enhanced Client Detection:
    • Version-specific behavior
    • Client capability detection
    • Performance optimization per client
  3. Tool Orchestration:
    • Parallel tool execution
    • Tool chaining
    • Conditional tool selection

User Guides

Technical Documentation