Backend Email Integration API

Technical documentation for the backend email processing system (backend/email_integration.py).

Overview

The backend email integration processes inbound email end to end: it validates webhooks, checks user permissions, generates AI responses, and handles document exports and attachment processing.

Key Components:

Email webhook processing and validation
Rate limiting and security measures
Document export via Quarto integration
File attachment handling with Firebase Storage
HTML email formatting with design system consistency
Integration with VAC Service Architecture and Tool Context

Core Classes

EmailProcessor

Main orchestration class that handles the complete email processing pipeline.

class EmailProcessor:
    def __init__(self):
        self.db = firestore.Client()
        self.rate_limiter = EmailRateLimiter()
        self.validator = EmailWebhookValidator()
        self.attachment_handler = EmailAttachmentHandler()
        self.quarto_exporter = QuartoExporter()

Key Methods

`extract_assistant_id(recipient_email: str) -> Optional[str]`

Extracts the assistant ID from structured email addresses.

Supported Formats:

assistant-{assistantId}@domain.com (production)
dev-assistant-{assistantId}@domain.com (development)
test-assistant-{assistantId}@domain.com (testing)
assistant-list@domain.com (special case for assistant directory)

Example:

processor = EmailProcessor()

# Production format
assistant_id = processor.extract_assistant_id("assistant-gpt-helper@email.aitana.chat")
# Returns: "gpt-helper"

# Development format  
assistant_id = processor.extract_assistant_id("dev-assistant-123@email.aitana.chat")
# Returns: "123"
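
The address formats above could be parsed with a single regular expression. The following is a minimal sketch, not the implementation in backend/email_integration.py: the pattern and the standalone function are assumptions for illustration. In particular, this sketch returns the literal string "list" for assistant-list@domain.com and leaves the directory special case to the caller.

```python
import re
from typing import Optional

# Hypothetical pattern: an optional "dev-" or "test-" prefix,
# followed by "assistant-{assistantId}@".
_ASSISTANT_ADDRESS = re.compile(
    r"^(?:(?:dev|test)-)?assistant-(?P<assistant_id>[^@\s]+)@",
    re.IGNORECASE,
)


def extract_assistant_id(recipient_email: str) -> Optional[str]:
    """Return the assistant ID embedded in the address, or None if absent."""
    match = _ASSISTANT_ADDRESS.match(recipient_email.strip())
    return match.group("assistant_id") if match else None
```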

`clean_email_content(email_body: str) -> str`

Removes email headers, signatures, and quoted content to extract the actual user message.

Filters out:

Email headers (From:, To:, Subject:, Date:)
Common signatures (Sent from my iPhone, --, ___)
Quoted content (lines starting with >)
Unsubscribe links and boilerplate text

Example:

raw_email = """Hello assistant,

Can you help me with this project?

Thanks!
John

From: john@example.com
Sent from my iPhone
> Previous email content
"""

cleaned = processor.clean_email_content(raw_email)
# Returns: "Hello assistant,\n\nCan you help me with this project?\n\nThanks!\nJohn"
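
One way to approximate this filtering is a line-by-line pass that drops header lines, signature markers, and quoted lines. This is a sketch under stated assumptions: the marker lists are hypothetical and deliberately short, and unsubscribe or boilerplate detection is not covered.

```python
import re

# Hypothetical filters, chosen to mirror the categories listed above.
_HEADER_LINE = re.compile(r"^(From|To|Subject|Date):", re.IGNORECASE)
_SIGNATURE_PREFIXES = ("sent from my", "--", "___")


def clean_email_content(email_body: str) -> str:
    """Keep only lines that look like the user's own message."""
    kept = []
    for line in email_body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):  # quoted content
            continue
        if _HEADER_LINE.match(stripped):  # email headers
            continue
        if stripped.lower().startswith(_SIGNATURE_PREFIXES):  # signatures
            continue
        kept.append(line.rstrip())
    return "\n".join(kept).strip()
```

A real implementation would likely stop at the first signature marker rather than filter line by line, since everything after a signature is usually not part of the message.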

`process_email_message(sender_email, assistant_id, email_content, subject, recipient_email=None, form_data=None) -> Dict[str, Any]`

Main processing method that handles the complete email-to-AI-response pipeline.

Integration Points:

Permission System: Uses tool_permissions.permitted_tools() to validate user access
VAC Service: Calls vac_stream() for AI response generation
Chat History: Integrates with assistant_config.get_chat_history_from_firestore()
Langfuse Tracing: Tracks processing with trace IDs

Process Flow:

Validate assistant configuration exists
Check user permissions for tools and access
Process attachments via EmailAttachmentHandler
Format message with email context
Call VAC service for AI response
Process export requests (PDF, DOCX, etc.)
Save interaction to chat history and Firestore
Send formatted email response

Example Response:

{
  "status": "processed",
  "assistant_id": "gpt-helper",
  "sender": "user@example.com", 
  "response": "AI generated response",
  "email_sent": true,
  "message": "Email processed and response sent"
}

EmailRateLimiter

Prevents spam and abuse with per-user rate limiting stored in Firestore.

class EmailRateLimiter:
    def __init__(self):
        self.db = firestore.Client()
        self.rate_limit_collection = "email_rate_limits"

`check_rate_limit(user_email: str) -> Tuple[bool, Optional[str]]`

Enforces a limit of one email per minute per user, tracked with timezone-aware timestamps.

Firestore Storage:

{
  "collection": "email_rate_limits",
  "document": "user@example.com",
  "data": {
    "last_email_time": "2025-06-14T12:00:00Z",
    "total_emails": 5
  }
}

Example:

rate_limiter = EmailRateLimiter()
allowed, error = await rate_limiter.check_rate_limit("user@example.com")

if not allowed:
    # Returns: (False, "Rate limit exceeded. Please wait 45 seconds.")
    return error_response(error)
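
The core of the one-minute window check can be illustrated without Firestore. The sketch below is an assumption about the logic, not the actual method: it takes the stored last_email_time and the current time as arguments, and the helper name check_window is hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

RATE_LIMIT_WINDOW = timedelta(minutes=1)  # 1 email per minute per user


def check_window(
    last_email_time: Optional[datetime], now: datetime
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, error_message) for a user's next email."""
    if last_email_time is None:  # first email from this user
        return True, None
    elapsed = now - last_email_time
    if elapsed >= RATE_LIMIT_WINDOW:
        return True, None
    wait = math.ceil((RATE_LIMIT_WINDOW - elapsed).total_seconds())
    return False, f"Rate limit exceeded. Please wait {wait} seconds."


# Usage: both timestamps are timezone-aware (UTC), matching the stored format.
last = datetime(2025, 6, 14, 12, 0, 0, tzinfo=timezone.utc)
allowed, error = check_window(last, last + timedelta(seconds=15))
```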

EmailWebhookValidator

Handles security validation for incoming webhooks.

`validate_mailgun_webhook(timestamp: str, token: str, signature: str) -> bool`

Validates webhook authenticity using HMAC-SHA256 signature verification.

Process:

Combines timestamp and token
Generates HMAC signature using MAILGUN_WEBHOOK_SECRET
Compares with provided signature using secure comparison

Example:

validator = EmailWebhookValidator()
is_valid = validator.validate_mailgun_webhook(
    timestamp="1234567890",
    token="webhook_token", 
    signature="expected_hmac_signature"
)
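
The three steps above can be sketched with the standard library. This version takes the signing key as an explicit argument so it is self-contained; the documented method has no such parameter and reads MAILGUN_WEBHOOK_SECRET instead, so treat the signature here as an assumption.

```python
import hashlib
import hmac


def validate_mailgun_webhook(
    timestamp: str, token: str, signature: str, signing_key: str
) -> bool:
    """Recompute the HMAC-SHA256 of timestamp + token and compare securely."""
    expected = hmac.new(
        key=signing_key.encode("utf-8"),
        msg=f"{timestamp}{token}".encode("utf-8"),
        digestmod=hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```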

`validate_email_address(email: str) -> bool`

Validates the email address format against the rules below.

Validation Rules:

Must contain @ symbol
Local part cannot be empty
Domain must contain at least one dot
Must have exactly two parts (local@domain)
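
The four rules translate directly into a few string checks. This is a minimal sketch of those rules only; the actual method may apply additional checks that are not documented here.

```python
def validate_email_address(email: str) -> bool:
    """Apply the documented rules: one '@', non-empty local part, dotted domain."""
    parts = email.split("@")
    if len(parts) != 2:  # no '@' at all, or more than one
        return False
    local, domain = parts
    if not local:
        return False
    return "." in domain
```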

QuartoExporter

Handles document export functionality for email attachments.

`extract_export_flags(subject: str) -> List[str]`

Parses email subjects for export format requests.

Supported Formats:

(export:pdf) → PDF document
(export:docx) → Word document
(export:html) → HTML document
(export:pptx) → PowerPoint presentation

Example:

exporter = QuartoExporter()
formats = exporter.extract_export_flags("Analysis report (export:pdf) (export:docx)")
# Returns: ["pdf", "docx"]
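
A regular expression over the subject line is one plausible way to find these flags. The sketch below is an assumption, not the actual parser: it matches case-insensitively and removes duplicates, neither of which the source confirms.

```python
import re
from typing import List

# Only the four documented formats are recognised.
_EXPORT_FLAG = re.compile(r"\(export:(pdf|docx|html|pptx)\)", re.IGNORECASE)


def extract_export_flags(subject: str) -> List[str]:
    """Return requested export formats in order of first appearance."""
    formats: List[str] = []
    for fmt in _EXPORT_FLAG.findall(subject):
        fmt = fmt.lower()
        if fmt not in formats:
            formats.append(fmt)
    return formats
```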

`generate_export(content: str, assistant_name: str, export_format: str, subject: str) -> Optional[Dict[str, str]]`

Generates documents via Quarto API integration.

API Integration:

Endpoint: ${NEXT_PUBLIC_BACKEND_URL}/api/quarto
Timeout: 120 seconds for generation
Returns download URL and filename

Document Structure:

{
  "message": {
    "content": "AI response content",
    "timestamp": 1703174400000,
    "sender": "assistant",
    "userName": "Assistant Name"
  },
  "format": "pdf",
  "reportData": {
    "title": "Email Response: Original Subject",
    "author": "Assistant Name", 
    "date": "2025-06-14"
  }
}
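
For illustration, the document structure above could be assembled by a small helper like the one below. The helper name build_export_payload and the optional now parameter are hypothetical; only the field names and the "Email Response: ..." title format come from the structure shown above.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_export_payload(
    content: str,
    assistant_name: str,
    export_format: str,
    subject: str,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build the request body for the Quarto export endpoint."""
    now = now or datetime.now(timezone.utc)
    return {
        "message": {
            "content": content,
            "timestamp": int(now.timestamp() * 1000),  # milliseconds since epoch
            "sender": "assistant",
            "userName": assistant_name,
        },
        "format": export_format,
        "reportData": {
            "title": f"Email Response: {subject}",
            "author": assistant_name,
            "date": now.strftime("%Y-%m-%d"),
        },
    }
```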

EmailAttachmentHandler

Processes email attachments and integrates with Firebase Storage.

Storage Architecture

Attachments follow the UI upload structure for consistency:

Storage Path:

users/{userId}/shares/{assistantId}/documents/{uniqueFilename}

Process:

Find user ID by sender email address
Download attachment from Mailgun with API key authentication
Validate file type against supported MIME types
Check file size (30MB limit)
Upload to Firebase Storage with metadata
Make blob publicly readable for assistant access
Return document object compatible with UI uploads

Supported File Types:

supported_types = {
    'application/pdf',
    'text/javascript', 'text/x-python', 'text/plain', 'text/html',
    'text/css', 'text/md', 'text/csv', 'text/xml', 'text/rtf',
    'application/json',
    'image/png', 'image/jpeg', 'image/gif', 'image/webp',
    'audio/wav', 'audio/mp3', 'audio/mpeg',
    'video/mp4', 'video/mpeg', 'video/mov', 'video/avi', 'video/wmv', 'video/flv'
}
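
The type check, size check, and storage path from the process above could look roughly like this. The helper names are hypothetical, the 30MB limit is interpreted as 30 * 1024 * 1024 bytes (an assumption), and the UUID-prefixed filename is just one way to produce a unique name.

```python
import uuid
from typing import Optional, Tuple

MAX_ATTACHMENT_BYTES = 30 * 1024 * 1024  # assumption: 30MB read as 30 MiB

SUPPORTED_TYPES = {
    "application/pdf",
    "text/javascript", "text/x-python", "text/plain", "text/html",
    "text/css", "text/md", "text/csv", "text/xml", "text/rtf",
    "application/json",
    "image/png", "image/jpeg", "image/gif", "image/webp",
    "audio/wav", "audio/mp3", "audio/mpeg",
    "video/mp4", "video/mpeg", "video/mov", "video/avi", "video/wmv", "video/flv",
}


def validate_attachment(content_type: str, size_bytes: int) -> Tuple[bool, Optional[str]]:
    """Check an attachment against the MIME allowlist and the size limit."""
    mime = content_type.split(";")[0].strip().lower()  # drop "; charset=..." etc.
    if mime not in SUPPORTED_TYPES:
        return False, f"Unsupported file type: {mime}"
    if size_bytes > MAX_ATTACHMENT_BYTES:
        return False, "File exceeds the 30MB limit"
    return True, None


def build_storage_path(user_id: str, assistant_id: str, filename: str) -> str:
    """Build users/{userId}/shares/{assistantId}/documents/{uniqueFilename}."""
    unique_filename = f"{uuid.uuid4().hex}_{filename}"
    return f"users/{user_id}/shares/{assistant_id}/documents/{unique_filename}"
```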

Document Integration: Attachments are passed to the VAC service as the documents parameter, enabling AI analysis of uploaded files.

EmailFormatter

Handles rich HTML email formatting with design system consistency.

`convert_markdown_to_html(content: str) -> str`

Converts markdown content to HTML with custom component handling.

Markdown Extensions:

fenced_code - Syntax highlighting for code blocks
tables - Full table support with styling
nl2br - Converts newlines to <br> tags

Custom Component Processing:

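# Email-compatible components
# Note: the named groups "src" and "alt" are assumed to be captured by the
# component-matching regex; the original excerpt left them undefined.
email_compatible = {
    'img': lambda match: f'<img src="{match.group("src")}" alt="{match.group("alt")}" style="max-width: 100%; height: auto;">',
    'image': lambda match: f'<img src="{match.group("src")}" alt="{match.group("alt")}" style="max-width: 100%; height: auto;">'
}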

# Interactive components (graceful degradation)
online_only = {
    'plot', 'networkgraph', 'googlesearch', 'toolconfirmation',
    'dynamicui', 'dynamic-ui', 'artifact', 'preview', 'assistantresponse'
}
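
Graceful degradation for the online-only components might be done by replacing each tag with a short text placeholder. Everything in this sketch is an assumption: the tag-matching regex, the helper name degrade_online_only, and the placeholder wording are illustrative, and the actual formatter may render these components differently.

```python
import re

ONLINE_ONLY = {
    "plot", "networkgraph", "googlesearch", "toolconfirmation",
    "dynamicui", "dynamic-ui", "artifact", "preview", "assistantresponse",
}

# Longest names first so that "dynamic-ui" is tried before shorter alternatives.
_TAG_NAMES = "|".join(
    re.escape(name) for name in sorted(ONLINE_ONLY, key=len, reverse=True)
)
# Matches self-closing tags (<plot ... />) and paired tags (<plot>...</plot>).
_ONLINE_ONLY_TAG = re.compile(
    rf"<(?P<tag>{_TAG_NAMES})\b[^>]*?(?:/>|>.*?</(?P=tag)\s*>)",
    re.IGNORECASE | re.DOTALL,
)


def degrade_online_only(content: str) -> str:
    """Replace interactive components with a plain-text placeholder."""
    def _placeholder(match: "re.Match[str]") -> str:
        tag = match.group("tag").lower()
        return f"[Interactive {tag} component: view online]"

    return _ONLINE_ONLY_TAG.sub(_placeholder, content)
```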

`create_email_signature(assistant_name: str, assistant_id: str, avatar_url: str = None) -> tuple[str, str]`

Generates professional email signatures matching the design system.

HTML Features:

48px circular avatar image
Assistant name with proper typography
“AI Assistant” role subtitle
Call-to-action button with brand styling
Reply instructions and export tips

Design System Colors:

/* Brand colors from design system */
--accent-color: hsl(20, 100%, 70%);
--text-color: hsl(222.2, 84%, 4.9%);
--border-color: hsl(214.3, 31.8%, 91.4%);
--muted-color: hsl(215.4, 16.3%, 46.9%);

Typography:

Primary: ‘Euclid Circular A’ with fallbacks
Headers: ‘Crimson Pro’ for consistency
Code: Menlo, Monaco, Consolas for monospace

API Endpoints

Webhook Processing

`email_webhook_receive()`

Main webhook endpoint for receiving emails from Mailgun.

Expected Endpoint: /api/email/webhook

Request Processing:

Validation Priority:

- Frontend pre-validation via X-Validated-User and X-Assistant-ID headers
- Fallback to direct Mailgun signature validation
- Rate limiting check
- Email format validation

Processing Pipeline:

- Extract and clean email content
- Process through EmailProcessor.process_email_message()
- Handle attachments and export requests
- Send formatted response email

Response Codes:

200 - Email processed successfully
400 - Invalid email format or empty content
401 - Invalid webhook signature
429 - Rate limit exceeded
500 - Internal server error
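
The validation priority and the response codes can be tied together in one decision function. This is a sketch under assumptions: the function name and its boolean inputs are hypothetical stand-ins for the real validator, rate limiter, and email checks, and the exact ordering inside the endpoint may differ.

```python
from typing import Mapping, Tuple


def validate_incoming_email(
    headers: Mapping[str, str],
    signature_valid: bool,
    rate_limit_ok: bool,
    sender_valid: bool,
    content: str,
) -> Tuple[int, str]:
    """Return the HTTP status and message for an incoming webhook request."""
    # Frontend pre-validation takes priority over the Mailgun signature check.
    pre_validated = bool(
        headers.get("X-Validated-User") and headers.get("X-Assistant-ID")
    )
    if not pre_validated and not signature_valid:
        return 401, "Invalid webhook signature"
    if not rate_limit_ok:
        return 429, "Rate limit exceeded"
    if not sender_valid or not content.strip():
        return 400, "Invalid email format or empty content"
    return 200, "Email accepted for processing"
```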

`email_webhook_status()`

Health check endpoint for email system monitoring.

Expected Endpoint: /api/email/status

Health Checks:

Required environment variables (MAILGUN_API_KEY)
Webhook secret configuration status
System timestamp
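
A status payload covering these three checks might be assembled as follows. The response field names and the helper build_status are hypothetical; only the environment variable names MAILGUN_API_KEY and MAILGUN_WEBHOOK_SECRET appear in this document.

```python
import os
from datetime import datetime, timezone
from typing import Any, Dict, Mapping, Optional

REQUIRED_ENV_VARS = ("MAILGUN_API_KEY",)


def build_status(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Summarise email system health from environment configuration."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    return {
        "status": "unhealthy" if missing else "healthy",  # hypothetical field
        "missing_env_vars": missing,
        "webhook_secret_configured": bool(env.get("MAILGUN_WEBHOOK_SECRET")),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```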

Integration with Core Systems

VAC Service Integration

Email processing integrates with the VAC Service Architecture:

# Email context passed to VAC service
emissary_config = {
    "botId": assistant_id,
    "botName": assistant_config.get("name"),
    "botAvatar": assistant_config.get("avatar"),
    "senderName": sender_email.split("@")[0],
    "tools": assistant_config.get("tools", []),
    "toolConfigs": assistant_config.get("toolConfigs", {}),
    "selectedItems": [],
    "currentUser": current_user,
    "confirm_past_pause": True
}

result = await vac_stream(
    question=message_content,
    vector_name="aitana3",
    chat_history=chat_history,
    callback=callback,
    instructions=assistant_config.get("initialInstructions", ""),
    emissaryConfig=emissary_config,
    documents=attachments,  # Email attachments integrated
    stream_only=False,
    stream_wait_time=1,
    stream_timeout=120
)

Tool System Integration

Leverages the Tool Context system:

# Permission validation
user_permissions, allowed_configs = permitted_tools(
    current_user, 
    requested_tools, 
    tool_configs
)

# Tool configuration passed to VAC service
tool_context = {
    "tools": user_permissions,
    "toolConfigs": allowed_configs,
    "documents": attachments  # Attachments available to tools
}

Chat History Integration

Email interactions are saved to the assistant's chat history and tagged with an email source:

# Save email as user message
user_message_data = {
    "sender": "user",
    "content": user_message,
    "userName": f"{current_user['displayName']} (from email)",
    "userEmail": sender_email,
    "source": "email",  # Tagged as email source
    "traceId": trace_id
}

# Save AI response
bot_message_data = {
    "sender": "assistant", 
    "content": ai_response,
    "source": "email",
    "traceId": trace_id
}

Security Considerations

Input Validation

Email Content:

HTML entity encoding for user input
Removal of potentially malicious content
Size limits on email content and attachments
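
As an illustration of the first and third points, entity encoding and a size cap can both be applied with the standard library. The limit value and the helper name are hypothetical; the source does not state the actual content size limit.

```python
import html

MAX_CONTENT_CHARS = 100_000  # hypothetical limit, not taken from the source


def sanitize_email_content(content: str, max_chars: int = MAX_CONTENT_CHARS) -> str:
    """Truncate oversized content, then HTML-entity-encode it."""
    truncated = content[:max_chars]
    # quote=True also encodes double and single quotes.
    return html.escape(truncated, quote=True)
```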

File Uploads:

MIME type validation against allowlist
File size limits (30MB)
Virus scanning should be considered for production

Rate Limiting

Current Implementation:

1 email per minute per user
Firestore-based tracking
Timezone-aware timestamps

Production Considerations:

Consider per-assistant rate limits
Implement exponential backoff for repeat violations
Monitor for abuse patterns
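
Exponential backoff for repeat violations is not part of the current implementation. If it were added, the wait time could double with each violation up to a cap. The base, cap, and function name below are illustrative assumptions.

```python
def backoff_seconds(violations: int, base: int = 60, cap: int = 3600) -> int:
    """Wait time that doubles per prior violation, capped at `cap` seconds."""
    return min(cap, base * (2 ** max(0, violations)))
```

With these defaults a user with no prior violations waits the standard 60 seconds, and the wait reaches the one-hour cap after six violations.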

Permission System

Access Control:

Same permission system as web interface
Email sender must be registered user
Tool access based on user roles and tags
Assistant access control respected

Error Handling

Graceful Degradation

Export Failures:

Individual export failures don’t prevent email sending
Clear error logging for debugging
Users receive response even if exports fail

Attachment Processing:

Failed attachments don’t block email processing
Clear error messages in logs
Continue with available attachments

AI Processing:

Fallback error messages for AI failures
Timeout handling for long responses
Graceful handling of tool failures
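
The export case shows the general pattern: each export runs in its own try block, failures are logged, and the remaining results are still returned so the email can be sent. This is a sketch of that pattern; run_exports and the generate callable are hypothetical stand-ins for the real export call.

```python
import logging
from typing import Callable, Dict, List, Optional

logger = logging.getLogger("email_integration")


def run_exports(
    formats: List[str],
    generate: Callable[[str], Optional[Dict[str, str]]],
) -> List[Dict[str, str]]:
    """Attempt every export; a failure in one format never blocks the others."""
    results: List[Dict[str, str]] = []
    for fmt in formats:
        try:
            result = generate(fmt)
        except Exception:
            # Log with traceback and move on to the next format.
            logger.exception("Export failed for format %s", fmt)
            continue
        if result is None:
            logger.error("Export returned no result for format %s", fmt)
            continue
        results.append(result)
    return results
```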

Error Logging

Key Metrics to Monitor:

Email processing success rate
Export generation success rate
Attachment upload success rate
Rate limiting trigger frequency
Permission denial frequency

Testing

See Email Integration Testing Guide for comprehensive testing strategies and examples.

Related Documentation

Email Integration - User-facing email integration guide
VAC Service Architecture - Core AI processing pipeline
Tool Context - Tool system integration
Backend Utility Functions - Core utility functions used by email system
Langfuse Tracing - Observability and debugging
