Backend MCP Integration Architecture
Purpose: Technical implementation details for backend developers.
User Documentation:
- MCP Claude Integration Guide - User setup guide
- MCP Complete Architecture - Full overview
- MCP Quick Reference - Quick commands
Overview
The Aitana backend provides comprehensive Model Context Protocol (MCP) integration through multiple server implementations and client detection capabilities. This document covers the technical implementation details for backend developers.
Table of Contents
- MCP Server Implementations
- Client Detection System
- MCP Tools Registry
- FastAPI Integration
- stdio Compatibility Layer
- Thinking Content Capture
- Testing and Debugging
MCP Server Implementations
The backend provides three distinct MCP server implementations:
1. stdio Server (mcp_stdio.py and mcp_stdio_compat.py)
Purpose: Direct process communication with Claude Desktop and Claude Code
Key Files:
mcp_stdio.py- Simple stdio server implementationmcp_stdio_compat.py- Compatibility wrapper for Claude Code parameter serialization issues
Features:
- Handles JSON string parameters from Claude Code (compatibility mode)
- Direct stdio communication without HTTP overhead
- Client detection support through environment variables
2. HTTP/HTTPS Server (app_https.py)
Purpose: Secure local connections for MCP over HTTP
Features:
- Auto-generates SSL certificates for localhost
- Runs on port 1956 with HTTPS
- Endpoint:
https://localhost:1956/mcp/mcp - Alternative to stdio for environments requiring HTTP transport
3. FastAPI MCP Integration (app_fastapi.py)
Purpose: Production-ready FastAPI server with MCP support
Features:
- Uses Sunholo VACRoutesFastAPI.create_app_with_mcp
- Integrated Langfuse tracing
- Support for additional custom routes
- Automatic MCP tool registration
Client Detection System
MCPClientDetector (mcp_client_detector.py)
Sophisticated client detection to identify the source of MCP requests:
class MCPClientType(Enum):
CLAUDE_DESKTOP = "claude-desktop"
CLAUDE_CODE = "claude-code"
AITANA_CLI = "aitana-cli"
HTTP_API = "http-api"
CURL = "curl"
UNKNOWN = "unknown"
Detection Methods:
- Environment variables (MCP_CLIENT_TYPE, CLAUDE_DESKTOP_VERSION, CLAUDE_CODE_VERSION)
- HTTP headers (User-Agent, X-MCP-Client)
- Process parent inspection
- Command-line arguments
Integration with Langfuse:
def add_client_detection_to_trace(
trace: StatefulTraceClient,
client_type: MCPClientType,
metadata: Dict[str, Any]
) -> None:
"""Add client detection metadata to Langfuse trace"""
Launcher Scripts
Three specialized launcher scripts with client identification:
- mcp_server_claude_desktop.sh
- Sets CLAUDE_DESKTOP_VERSION
- Uses simpler mcp_stdio.py (no compat needed)
- mcp_server_claude_code.sh
- Sets CLAUDE_CODE_VERSION
- Uses mcp_stdio_compat.py for parameter compatibility
- mcp_server.sh
- Generic launcher without client identification
MCP Tools Registry
Core Tool Registry (mcp_tools.py)
Comprehensive MCP tool implementation with three categories:
Assistant Tools (NEW)
- assistant-call: Call any assistant with a prompt and get responses
- assistant-inspect: Inspect assistant configuration and available tools
- assistant-list: List all available assistants
Processing Tools
- ai-search: AI-powered search with datastore and filter support
- extract-files: Batch file extraction from GCS
- list-gcs-bucket: Browse GCS bucket contents
- google-search: Web search integration
- structured-extraction: Extract structured data from text
- url-processing: Process and extract from URLs
- user-history: Search user chat history
Model Tools
- gemini: Direct Gemini model access
- anthropic-smart: Claude model access
- smart-stream: Unified streaming model interface
New Assistant Tools
The assistant tools enable programmatic interaction with Aitana assistants through MCP:
assistant-list
Lists all available assistants with filtering options:
result = await aitana_assistant_list(
include_templates=False, # Include template assistants
include_instances=True, # Include assistant instances
limit=100 # Maximum results
)
# Returns: List of assistants with ID, name, description, tools
assistant-inspect
Inspects a specific assistant’s configuration:
result = await aitana_assistant_inspect(
assistant_id="research-assistant-v1"
)
# Returns: Assistant configuration, available tools, permissions
assistant-call
Calls an assistant with a prompt:
result = await aitana_assistant_call(
assistant_id="research-assistant-v1",
prompt="What are the latest developments in renewable energy?",
tools=["ai-search", "google-search"], # Optional: override tools
save_to_history=False # Optional: save to Firestore
)
# Returns: Assistant response with metadata
Key Features:
- Full assistant interaction capabilities
- Tool override support for specialized workflows
- Chat history context support
- Firestore integration for conversation persistence
- MCP client detection for debugging
List GCS Bucket Tool (tools/list_gcs_bucket.py)
Purpose: Browse and list Google Cloud Storage bucket contents
Features:
- Returns gs:// URIs for use with extract_files
- File filtering by extension and pattern
- Folder navigation support
- Pagination with continuation tokens
- Metadata inclusion (size, modified time)
Usage Example:
result = await list_gcs_bucket(
bucket_path="aitana-documents-bucket",
prefix="Competitors/",
file_extensions=[".pdf", ".txt"],
max_files=100,
include_metadata=True
)
FastAPI Integration
Enhanced MCP Support (app_fastapi.py)
The FastAPI application now includes enhanced MCP integration:
app, vac_routes = VACRoutesFastAPI.create_app_with_mcp(
title="Aitana Backend API",
stream_interpreter=vac_stream_with_assistant_support,
enable_a2a_agent=True,
additional_routes=additional_routes,
add_langfuse_eval=True
)
MCP Tool Registration:
# Register custom MCP tools
vac_routes.add_mcp_tool(
tool_name="list-gcs-bucket",
tool_description="List contents of GCS bucket",
tool_function=list_gcs_bucket_wrapper,
params_model=ListGCSBucketParams
)
stdio Compatibility Layer
Parameter Serialization Issue
Claude Code sends parameters as JSON strings instead of dictionaries, requiring a compatibility layer.
Problem:
// Claude Code sends:
{"params": "{\"query\": \"test\"}"}
// Expected:
{"params": {"query": "test"}}
Solution (mcp_stdio_compat.py):
class FlexibleParams(BaseModel):
@classmethod
def parse_if_string(cls, v: Union[Dict, str]) -> Dict:
"""Parse JSON string to dict if needed"""
if isinstance(v, str):
return json.loads(v)
return v
Thinking Content Capture
ThinkingContentCapturingCallback (thinking_content_capture.py)
New feature to capture and preserve AI thinking content:
Purpose: Capture Claude’s thinking process tags while streaming
Implementation:
class ThinkingContentCapturingCallback(BufferStreamingStdOutCallbackHandlerAsync):
def __init__(self, original_callback):
self.captured_content = "" # Accumulate ALL content including thinking
async def async_on_llm_new_token(self, token: str, **kwargs):
# Capture token and pass through
self.captured_content += token
return await self.original_callback.async_on_llm_new_token(token, **kwargs)
Usage:
- Preserves
tags in saved messages - Enables debugging of AI reasoning process
- Maintains streaming performance
Testing and Debugging
Test Scripts
- test_mcp_tools.py - Comprehensive MCP tool testing
- test_mcp_updates.py - Test recent MCP updates
- test_mcp_client_detection.py - Client detection testing
Debug Environment Variables
# Enable MCP debug logging
export MCP_DEBUG=true
# Force client type
export MCP_CLIENT_TYPE=claude-code
# Enable Langfuse tracing
export LANGFUSE_DEBUG=true
Testing MCP Tools Directly
# Test AI search
curl -X POST http://localhost:1956/direct/tools/ai-search \
-H "Content-Type: application/json" \
-d '{"question": "test query", "datastore_id": "aitana3"}'
# Test list GCS bucket
curl -X POST http://localhost:1956/direct/tools/list-gcs-bucket \
-H "Content-Type: application/json" \
-d '{"bucket_path": "aitana-documents-bucket", "max_files": 10}'
Configuration
Environment Variables
# Core MCP settings
GOOGLE_CLOUD_PROJECT=your-project
GOOGLE_CLOUD_LOCATION=global
ANTHROPIC_API_KEY=your-key
# Client detection
MCP_CLIENT_TYPE=claude-desktop
CLAUDE_DESKTOP_VERSION=1.0
# Debug settings
MCP_DEBUG=true
LANGFUSE_DEBUG=true
Claude Desktop Configuration
{
"mcpServers": {
"aitana": {
"command": "/path/to/backend/mcp_server_claude_desktop.sh"
}
}
}
Claude Code Configuration
{
"mcpServers": {
"aitana": {
"command": "/path/to/backend/mcp_server_claude_code.sh"
}
}
}
Performance Considerations
Thinking Content Impact
The check_and_display_thinking() function adds 1-4 seconds latency:
# Avoid in time-sensitive paths
if DEBUG_MODE:
await check_and_display_thinking("Processing...", callback)
Tool Concurrency
Multiple searches can be run in parallel:
class MultipleAISearchParams(BaseModel):
searches: List[AISearchParams] # Run 2-3 variations for better coverage
Security
SSL Certificate Generation
Auto-generates certificates for HTTPS mode:
# generate_ssl_cert.py
create_self_signed_cert(
cert_file="certs/localhost.crt",
key_file="certs/localhost.key"
)
Permission Validation
All MCP tools respect backend permission system:
- User email validation
- Tool access control
- Tag-based permissions
Future Enhancements
Planned Features
- Agent Tool Integration:
- document-search agent
- code-execution agent
- assistant-calling agent
- Enhanced Client Detection:
- Version-specific behavior
- Client capability detection
- Performance optimization per client
- Tool Orchestration:
- Parallel tool execution
- Tool chaining
- Conditional tool selection
Related Documentation
User Guides
- MCP Claude Integration Guide - Claude Desktop/Code setup
- MCP Complete Architecture - Full technical overview
- MCP Quick Reference - Quick setup and commands
- MCP Sunholo Patterns - FastAPI integration patterns
Technical Documentation
- Backend Tool System - Tool implementation details
- Backend API Guide - API endpoints and usage
- How to Add Tools - Complete tool addition guide