Backend Email Integration API

Technical documentation for the backend email processing system (backend/email_integration.py).

Overview

The backend email integration processes inbound email end to end: it validates webhooks, checks user permissions, generates AI responses, and supports document exports and attachment processing.

Key Components:

  • Email webhook processing and validation
  • Rate limiting and security measures
  • Document export via Quarto integration
  • File attachment handling with Firebase Storage
  • HTML email formatting with design system consistency
  • Integration with VAC Service Architecture and Tool Context

Core Classes

EmailProcessor

Main orchestration class that handles the complete email processing pipeline.

class EmailProcessor:
    def __init__(self):
        self.db = firestore.Client()
        self.rate_limiter = EmailRateLimiter()
        self.validator = EmailWebhookValidator()
        self.attachment_handler = EmailAttachmentHandler()
        self.quarto_exporter = QuartoExporter()

Key Methods

extract_assistant_id(recipient_email: str) -> Optional[str]

Extracts assistant ID from structured email addresses.

Supported Formats:

  • assistant-{assistantId}@domain.com (production)
  • dev-assistant-{assistantId}@domain.com (development)
  • test-assistant-{assistantId}@domain.com (testing)
  • assistant-list@domain.com (special case for assistant directory)
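
The formats above can be parsed with a single regular expression. The following is a minimal sketch rather than the actual implementation: the pattern and the module-level function are assumptions, and assistant-list is not special-cased here (it comes back as the ID "list", leaving directory handling to the caller).

```python
import re
from typing import Optional

# Hypothetical pattern: optional "dev-" or "test-" prefix, then "assistant-{assistantId}".
_ASSISTANT_ADDRESS = re.compile(r"^(?:dev-|test-)?assistant-(?P<assistant_id>.+)$")


def extract_assistant_id(recipient_email: str) -> Optional[str]:
    """Return the assistant ID encoded in the local part, or None if there is none."""
    local_part, separator, _domain = recipient_email.strip().partition("@")
    if not separator:
        return None  # not an email address at all
    match = _ASSISTANT_ADDRESS.match(local_part)
    return match.group("assistant_id") if match else None
```

Because the ID group is greedy, hyphenated IDs such as gpt-helper are returned whole.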

Example:

processor = EmailProcessor()

# Production format
assistant_id = processor.extract_assistant_id("assistant-gpt-helper@email.aitana.chat")
# Returns: "gpt-helper"

# Development format  
assistant_id = processor.extract_assistant_id("dev-assistant-123@email.aitana.chat")
# Returns: "123"

clean_email_content(email_body: str) -> str

Removes email headers, signatures, and quoted content to extract the actual user message.

Filters out:

  • Email headers (From:, To:, Subject:, Date:)
  • Common signatures (Sent from my iPhone, --, ___)
  • Quoted content (lines starting with >)
  • Unsubscribe links and boilerplate text
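
A line-based filter is one way to apply the rules above. This is a sketch under stated assumptions: the two patterns are illustrative, everything after a signature marker is discarded, and unsubscribe/boilerplate removal is left out. The real filter lists are not shown in this document.

```python
import re

# Illustrative patterns only; the actual filter lists may differ.
HEADER_LINE = re.compile(r"^(From|To|Subject|Date):", re.IGNORECASE)
SIGNATURE_START = re.compile(r"^(--\s*|_{3,}|Sent from my .+)$", re.IGNORECASE)


def clean_email_content(email_body: str) -> str:
    """Keep the user's message; drop headers, quoted lines, and the signature."""
    kept = []
    for line in email_body.splitlines():
        stripped = line.strip()
        if SIGNATURE_START.match(stripped):
            break  # assumption: nothing useful follows a signature marker
        if HEADER_LINE.match(stripped) or stripped.startswith(">"):
            continue  # header line or quoted reply content
        kept.append(line.rstrip())
    return "\n".join(kept).strip()
```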

Example:

raw_email = """Hello assistant,

Can you help me with this project?

Thanks!
John

From: john@example.com
Sent from my iPhone
> Previous email content
"""

cleaned = processor.clean_email_content(raw_email)
# Returns: "Hello assistant,\n\nCan you help me with this project?\n\nThanks!\nJohn"

process_email_message(sender_email, assistant_id, email_content, subject, recipient_email=None, form_data=None) -> Dict[str, Any]

Main processing method that handles the complete email-to-AI-response pipeline.

Integration Points:

  • Permission System: Uses tool_permissions.permitted_tools() to validate user access
  • VAC Service: Calls vac_stream() for AI response generation
  • Chat History: Integrates with assistant_config.get_chat_history_from_firestore()
  • Langfuse Tracing: Tracks processing with trace IDs

Process Flow:

  1. Validate assistant configuration exists
  2. Check user permissions for tools and access
  3. Process attachments via EmailAttachmentHandler
  4. Format message with email context
  5. Call VAC service for AI response
  6. Process export requests (PDF, DOCX, etc.)
  7. Save interaction to chat history and Firestore
  8. Send formatted email response

Example Response:

{
  "status": "processed",
  "assistant_id": "gpt-helper",
  "sender": "user@example.com", 
  "response": "AI generated response",
  "email_sent": true,
  "message": "Email processed and response sent"
}

EmailRateLimiter

Prevents spam and abuse with per-user rate limiting stored in Firestore.

class EmailRateLimiter:
    def __init__(self):
        self.db = firestore.Client()
        self.rate_limit_collection = "email_rate_limits"

check_rate_limit(user_email: str) -> Tuple[bool, Optional[str]]

Enforces a limit of one email per minute per user, tracked with timezone-aware timestamps.

Firestore Storage:

{
  "collection": "email_rate_limits",
  "document": "user@example.com",
  "data": {
    "last_email_time": "2025-06-14T12:00:00Z",
    "total_emails": 5
  }
}

Example:

rate_limiter = EmailRateLimiter()
allowed, error = await rate_limiter.check_rate_limit("user@example.com")

if not allowed:
    # Returns: (False, "Rate limit exceeded. Please wait 45 seconds.")
    return error_response(error)
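
The limit check itself can be illustrated without Firestore. The class below is a hypothetical in-memory stand-in (synchronous, with an injectable clock for testing), not the EmailRateLimiter implementation. It keeps the same two fields shown in the Firestore document above.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

RATE_LIMIT_WINDOW = timedelta(minutes=1)  # one email per minute per user


class InMemoryRateLimiter:
    """Hypothetical stand-in that stores records in a dict instead of Firestore."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def check_rate_limit(
        self, user_email: str, now: Optional[datetime] = None
    ) -> Tuple[bool, Optional[str]]:
        now = now or datetime.now(timezone.utc)  # timezone-aware timestamp
        record = self.records.get(user_email)
        if record is not None:
            elapsed = now - record["last_email_time"]
            if elapsed < RATE_LIMIT_WINDOW:
                wait = math.ceil((RATE_LIMIT_WINDOW - elapsed).total_seconds())
                return False, f"Rate limit exceeded. Please wait {wait} seconds."
        total = record["total_emails"] + 1 if record else 1
        self.records[user_email] = {"last_email_time": now, "total_emails": total}
        return True, None
```

A rejected request does not update last_email_time in this sketch, so repeated attempts do not extend the wait.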

EmailWebhookValidator

Handles security validation for incoming webhooks.

validate_mailgun_webhook(timestamp: str, token: str, signature: str) -> bool

Validates webhook authenticity using HMAC-SHA256 signature verification.

Process:

  1. Concatenates timestamp and token
  2. Generates HMAC signature using MAILGUN_WEBHOOK_SECRET
  3. Compares with provided signature using secure comparison
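
The three steps above map directly onto the standard library. In this sketch the signing key is passed as a parameter for testability, which is an assumption; the documented method takes only three arguments and presumably reads MAILGUN_WEBHOOK_SECRET from the environment.

```python
import hashlib
import hmac


def validate_mailgun_webhook(
    timestamp: str, token: str, signature: str, signing_key: str
) -> bool:
    """Return True when the signature matches HMAC-SHA256(signing_key, timestamp + token)."""
    expected = hmac.new(
        key=signing_key.encode("utf-8"),
        msg=f"{timestamp}{token}".encode("utf-8"),
        digestmod=hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, avoiding timing side channels.
    return hmac.compare_digest(expected, signature)
```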

Example:

validator = EmailWebhookValidator()
is_valid = validator.validate_mailgun_webhook(
    timestamp="1234567890",
    token="webhook_token", 
    signature="expected_hmac_signature"
)

validate_email_address(email: str) -> bool

Validates email address format against the structural rules below.

Validation Rules:

  • Must contain @ symbol
  • Local part cannot be empty
  • Domain must contain at least one dot
  • Must have exactly two parts (local@domain)
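
The rules above amount to a few string checks. This is a minimal sketch of those four rules only; the actual method may apply further parsing.

```python
def validate_email_address(email: str) -> bool:
    """Apply the four structural rules listed above."""
    parts = email.split("@")
    if len(parts) != 2:  # covers both "no @ symbol" and "more than one @"
        return False
    local_part, domain = parts
    return bool(local_part) and "." in domain
```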

QuartoExporter

Handles document export functionality for email attachments.

extract_export_flags(subject: str) -> List[str]

Parses email subjects for export format requests.

Supported Formats:

  • (export:pdf) → PDF document
  • (export:docx) → Word document
  • (export:html) → HTML document
  • (export:pptx) → PowerPoint presentation

Example:

exporter = QuartoExporter()
formats = exporter.extract_export_flags("Analysis report (export:pdf) (export:docx)")
# Returns: ["pdf", "docx"]
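
Flag extraction can be sketched with a regular expression. Case-insensitive matching, de-duplication, and silently ignoring unsupported formats are assumptions made for this example, not documented behavior.

```python
import re
from typing import List

SUPPORTED_EXPORT_FORMATS = ("pdf", "docx", "html", "pptx")
EXPORT_FLAG = re.compile(r"\(export:(\w+)\)", re.IGNORECASE)


def extract_export_flags(subject: str) -> List[str]:
    """Return supported export formats in the order they appear in the subject."""
    formats: List[str] = []
    for raw in EXPORT_FLAG.findall(subject):
        fmt = raw.lower()
        if fmt in SUPPORTED_EXPORT_FORMATS and fmt not in formats:
            formats.append(fmt)
    return formats
```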

generate_export(content: str, assistant_name: str, export_format: str, subject: str) -> Optional[Dict[str, str]]

Generates documents via Quarto API integration.

API Integration:

  • Endpoint: ${NEXT_PUBLIC_BACKEND_URL}/api/quarto
  • Timeout: 120 seconds for generation
  • Returns download URL and filename

Document Structure:

{
  "message": {
    "content": "AI response content",
    "timestamp": 1703174400000,
    "sender": "assistant",
    "userName": "Assistant Name"
  },
  "format": "pdf",
  "reportData": {
    "title": "Email Response: Original Subject",
    "author": "Assistant Name", 
    "date": "2025-06-14"
  }
}
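
The request body above could be assembled as follows. build_export_payload is a hypothetical helper name, and the millisecond timestamp unit is inferred from the 13-digit example value. The HTTP call to the Quarto endpoint is deliberately left out.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_export_payload(
    content: str,
    assistant_name: str,
    export_format: str,
    subject: str,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build the document structure shown above (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return {
        "message": {
            "content": content,
            "timestamp": int(now.timestamp() * 1000),  # assumed: epoch milliseconds
            "sender": "assistant",
            "userName": assistant_name,
        },
        "format": export_format,
        "reportData": {
            "title": f"Email Response: {subject}",
            "author": assistant_name,
            "date": now.strftime("%Y-%m-%d"),
        },
    }
```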

EmailAttachmentHandler

Processes email attachments and integrates with Firebase Storage.

Storage Architecture

Attachments follow the UI upload structure for consistency:

Storage Path:

users/{userId}/shares/{assistantId}/documents/{uniqueFilename}

Process:

  1. Find user ID by sender email address
  2. Download attachment from Mailgun with API key authentication
  3. Validate file type against supported MIME types
  4. Check file size (30MB limit)
  5. Upload to Firebase Storage with metadata
  6. Make blob publicly readable for assistant access
  7. Return document object compatible with UI uploads

Supported File Types:

supported_types = {
    'application/pdf',
    'text/javascript', 'text/x-python', 'text/plain', 'text/html',
    'text/css', 'text/md', 'text/csv', 'text/xml', 'text/rtf',
    'application/json',
    'image/png', 'image/jpeg', 'image/gif', 'image/webp',
    'audio/wav', 'audio/mp3', 'audio/mpeg',
    'video/mp4', 'video/mpeg', 'video/mov', 'video/avi', 'video/wmv', 'video/flv'
}
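
Steps 3 and 4 and the storage path can be sketched as plain functions. The helper names, the abbreviated type set, the interpretation of 30MB as 30 * 1024 * 1024 bytes, and the UUID-prefixed filename are all assumptions for illustration; the Firebase upload itself is not shown.

```python
import uuid
from typing import Optional

MAX_ATTACHMENT_BYTES = 30 * 1024 * 1024  # assumed meaning of the 30MB limit

# Abbreviated for the example; see the full supported_types set above.
SUPPORTED_TYPES = {"application/pdf", "text/plain", "image/png"}


def validate_attachment(content_type: str, size_bytes: int) -> Optional[str]:
    """Return None when the attachment is acceptable, otherwise a reason string."""
    mime_type = content_type.split(";")[0].strip().lower()  # drop "; charset=..." parameters
    if mime_type not in SUPPORTED_TYPES:
        return f"Unsupported file type: {mime_type}"
    if size_bytes > MAX_ATTACHMENT_BYTES:
        return "File exceeds the 30MB limit"
    return None


def build_storage_path(user_id: str, assistant_id: str, filename: str) -> str:
    """Build the storage path; the UUID prefix is a hypothetical uniqueness scheme."""
    unique_filename = f"{uuid.uuid4().hex}_{filename}"
    return f"users/{user_id}/shares/{assistant_id}/documents/{unique_filename}"
```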

Document Integration: Attachments are passed to the VAC service as the documents parameter, enabling AI analysis of uploaded files.

EmailFormatter

Handles rich HTML email formatting with design system consistency.

convert_markdown_to_html(content: str) -> str

Converts markdown content to HTML with custom component handling.

Markdown Extensions:

  • fenced_code - Fenced code block support (in Python-Markdown, syntax highlighting additionally requires the codehilite extension)
  • tables - Full table support with styling
  • nl2br - Converts newlines to <br> tags

Custom Component Processing:

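# Email-compatible components
# Assumption: src and alt are named groups captured by the component regex.
email_compatible = {
    'img': lambda match: f'<img src="{match.group("src")}" alt="{match.group("alt")}" style="max-width: 100%; height: auto;">',
    'image': lambda match: f'<img src="{match.group("src")}" alt="{match.group("alt")}" style="max-width: 100%; height: auto;">'
}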

# Interactive components (graceful degradation)
online_only = {
    'plot', 'networkgraph', 'googlesearch', 'toolconfirmation',
    'dynamicui', 'dynamic-ui', 'artifact', 'preview', 'assistantresponse'
}
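
One way to degrade the interactive components is to swap each one for a short placeholder. This sketch assumes the components appear as HTML-like tags (self-closing or paired) and invents the placeholder wording; neither detail is confirmed by this document.

```python
import re

# Component names copied from the online_only set above.
ONLINE_ONLY = (
    "plot", "networkgraph", "googlesearch", "toolconfirmation",
    "dynamicui", "dynamic-ui", "artifact", "preview", "assistantresponse",
)

# Assumed syntax: <name .../> or <name ...>...</name>.
_ONLINE_ONLY_TAG = re.compile(
    r"<(?P<tag>" + "|".join(re.escape(name) for name in ONLINE_ONLY)
    + r")\b[^>]*?(?:/>|>.*?</(?P=tag)>)",
    re.IGNORECASE | re.DOTALL,
)


def degrade_online_only(html: str) -> str:
    """Replace interactive components with a plain note that works in email clients."""

    def placeholder(match: "re.Match[str]") -> str:
        name = match.group("tag").lower()
        return f"<em>[Interactive {name} content is available in the web interface]</em>"

    return _ONLINE_ONLY_TAG.sub(placeholder, html)
```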

create_email_signature(assistant_name: str, assistant_id: str, avatar_url: str = None) -> tuple[str, str]

Generates professional email signatures matching the design system.

HTML Features:

  • 48px circular avatar image
  • Assistant name with proper typography
  • “AI Assistant” role subtitle
  • Call-to-action button with brand styling
  • Reply instructions and export tips

Design System Colors:

/* Brand colors from design system */
--accent-color: hsl(20, 100%, 70%);
--text-color: hsl(222.2, 84%, 4.9%);
--border-color: hsl(214.3, 31.8%, 91.4%);
--muted-color: hsl(215.4, 16.3%, 46.9%);

Typography:

  • Primary: ‘Euclid Circular A’ with fallbacks
  • Headers: ‘Crimson Pro’ for consistency
  • Code: Menlo, Monaco, Consolas for monospace

API Endpoints

Webhook Processing

email_webhook_receive()

Main webhook endpoint for receiving emails from Mailgun.

Expected Endpoint: /api/email/webhook

Request Processing:

  1. Validation Priority:
    • Frontend pre-validation via X-Validated-User and X-Assistant-ID headers
    • Fallback to direct Mailgun signature validation
    • Rate limiting check
    • Email format validation
  2. Processing Pipeline:
    • Extract and clean email content
    • Process through EmailProcessor.process_email_message()
    • Handle attachments and export requests
    • Send formatted response email

Response Codes:

  • 200 - Email processed successfully
  • 400 - Invalid email format or empty content
  • 401 - Invalid webhook signature
  • 429 - Rate limit exceeded
  • 500 - Internal server error
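
The mapping from validation outcome to response code can be sketched as a single function. The function name, parameters, and message strings are hypothetical, and the frontend pre-validation path is omitted. The check order follows the validation priority listed above.

```python
from typing import Optional, Tuple


def webhook_status_code(
    signature_valid: bool,
    rate_limit_error: Optional[str],
    sender_valid: bool,
    content: str,
) -> Tuple[int, str]:
    """Map validation results to an HTTP status code and message (illustrative)."""
    if not signature_valid:
        return 401, "Invalid webhook signature"
    if rate_limit_error is not None:
        return 429, rate_limit_error
    if not sender_valid:
        return 400, "Invalid email format"
    if not content.strip():
        return 400, "Empty email content"
    return 200, "Email processed successfully"
```

Unhandled exceptions raised during processing would map to 500 in the surrounding request handler.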

email_webhook_status()

Health check endpoint for email system monitoring.

Expected Endpoint: /api/email/status

Health Checks:

  • Required environment variables (MAILGUN_API_KEY)
  • Webhook secret configuration status
  • System timestamp
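
A health check covering the three items above might look like the following. The response field names are hypothetical, and the env parameter exists only so the function can be exercised without touching real environment variables.

```python
import os
from datetime import datetime, timezone
from typing import Any, Dict, Mapping, Optional

REQUIRED_ENV_VARS = ("MAILGUN_API_KEY",)


def email_status(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Report configuration health; field names here are illustrative."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    return {
        "status": "healthy" if not missing else "unhealthy",
        "missing_env_vars": missing,
        "webhook_secret_configured": bool(env.get("MAILGUN_WEBHOOK_SECRET")),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```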

Integration with Core Systems

VAC Service Integration

Email processing integrates with the VAC Service Architecture:

# Email context passed to VAC service
emissary_config = {
    "botId": assistant_id,
    "botName": assistant_config.get("name"),
    "botAvatar": assistant_config.get("avatar"),
    "senderName": sender_email.split("@")[0],
    "tools": assistant_config.get("tools", []),
    "toolConfigs": assistant_config.get("toolConfigs", {}),
    "selectedItems": [],
    "currentUser": current_user,
    "confirm_past_pause": True
}

result = await vac_stream(
    question=message_content,
    vector_name="aitana3",
    chat_history=chat_history,
    callback=callback,
    instructions=assistant_config.get("initialInstructions", ""),
    emissaryConfig=emissary_config,
    documents=attachments,  # Email attachments integrated
    stream_only=False,
    stream_wait_time=1,
    stream_timeout=120
)

Tool System Integration

Email processing uses the Tool Context system for permission validation and tool configuration:

# Permission validation
user_permissions, allowed_configs = permitted_tools(
    current_user, 
    requested_tools, 
    tool_configs
)

# Tool configuration passed to VAC service
tool_context = {
    "tools": user_permissions,
    "toolConfigs": allowed_configs,
    "documents": attachments  # Attachments available to tools
}

Chat History Integration

Email interactions are saved to the assistant chat history and tagged with an email source:

# Save email as user message
user_message_data = {
    "sender": "user",
    "content": user_message,
    "userName": f"{current_user['displayName']} (from email)",
    "userEmail": sender_email,
    "source": "email",  # Tagged as email source
    "traceId": trace_id
}

# Save AI response
bot_message_data = {
    "sender": "assistant", 
    "content": ai_response,
    "source": "email",
    "traceId": trace_id
}

Security Considerations

Input Validation

Email Content:

  • HTML entity encoding for user input
  • Removal of potentially malicious content
  • Size limits on email content and attachments

File Uploads:

  • MIME type validation against allowlist
  • File size limits (30MB)
  • Virus scanning should be considered for production

Rate Limiting

Current Implementation:

  • 1 email per minute per user
  • Firestore-based tracking
  • Timezone-aware timestamps

Production Considerations:

  • Consider per-assistant rate limits
  • Implement exponential backoff for repeat violations
  • Monitor for abuse patterns

Permission System

Access Control:

  • Same permission system as web interface
  • Email sender must be a registered user
  • Tool access based on user roles and tags
  • Assistant access control respected

Error Handling

Graceful Degradation

Export Failures:

  • Individual export failures don’t prevent email sending
  • Clear error logging for debugging
  • Users receive response even if exports fail
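
The export behavior above comes down to isolating each format in its own try/except. In this sketch, generate_exports_safely and the logger name are hypothetical; generate stands in for a call such as QuartoExporter.generate_export with the other arguments already bound.

```python
import logging
from typing import Callable, Dict, List, Optional

logger = logging.getLogger("email_integration")  # hypothetical logger name


def generate_exports_safely(
    formats: List[str],
    generate: Callable[[str], Optional[Dict[str, str]]],
) -> List[Dict[str, str]]:
    """Attempt every export; log failures and keep whatever succeeded."""
    exports: List[Dict[str, str]] = []
    for export_format in formats:
        try:
            result = generate(export_format)
        except Exception:
            # One failed export must not prevent the email from being sent.
            logger.exception("Export to %s failed", export_format)
            continue
        if result is None:
            logger.error("Export to %s returned no document", export_format)
            continue
        exports.append(result)
    return exports
```

The caller can then send the response email with whatever the returned list contains, including an empty list.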

Attachment Processing:

  • Failed attachments don’t block email processing
  • Clear error messages in logs
  • Continue with available attachments

AI Processing:

  • Fallback error messages for AI failures
  • Timeout handling for long responses
  • Graceful handling of tool failures

Error Logging

Key Metrics to Monitor:

  • Email processing success rate
  • Export generation success rate
  • Attachment upload success rate
  • Rate limiting trigger frequency
  • Permission denial frequency

Testing

See Email Integration Testing Guide for comprehensive testing strategies and examples.