Email Integration Testing Guide
Comprehensive testing strategies for the email integration system, covering unit tests, integration tests, and real-world scenarios.
Overview
This guide covers testing approaches for the email integration system based on the comprehensive test suite in backend/tests/test_email_integration.py and backend/tests/test_my_utils.py. The testing strategy ensures reliability across webhook processing, rate limiting, file handling, and AI integration.
Key Testing Areas:
- Email webhook processing and validation
- Rate limiting and security measures
- Content processing and formatting
- File attachment handling
- Integration with VAC Service and Tool Context
- Error handling and edge cases
Test Architecture
Test Structure
backend/tests/
├── test_email_integration.py # Email system tests
├── test_my_utils.py # Utility function tests
├── test_tools_model_tools.py # Tool creation tests
└── test_tools_tool_prompts.py # Tool prompt tests
Testing Frameworks
Backend Testing:
- pytest - Primary testing framework
- pytest-asyncio - Async test support
- unittest.mock - Mocking and patching
- pytest fixtures - Test setup and teardown
Key Imports:
import pytest
from unittest.mock import Mock, patch, AsyncMock
from datetime import datetime, timedelta
import json
Email Integration Tests
EmailRateLimiter Testing
Test Class: TestEmailRateLimiter
Rate Limiting Behavior
First Email Allowed:
@pytest.mark.asyncio
async def test_rate_limit_allows_first_email(self, rate_limiter):
"""Test that first email from user is allowed."""
# Mock Firestore document that doesn't exist
mock_doc = Mock()
mock_doc.exists = False
mock_doc_ref = Mock()
mock_doc_ref.get.return_value = mock_doc
mock_doc_ref.set = Mock()
rate_limiter.db.collection.return_value.document.return_value = mock_doc_ref
allowed, error = await rate_limiter.check_rate_limit("user@example.com")
assert allowed is True
assert error is None
mock_doc_ref.set.assert_called_once()
Rapid Email Blocking:
@pytest.mark.asyncio
async def test_rate_limit_blocks_rapid_emails(self, rate_limiter):
"""Test that rapid emails are blocked."""
# Mock Firestore document with recent timestamp
mock_doc = Mock()
mock_doc.exists = True
mock_doc.to_dict.return_value = {
'last_email_time': datetime.utcnow() - timedelta(seconds=30)
}
allowed, error = await rate_limiter.check_rate_limit("user@example.com")
assert allowed is False
assert "Rate limit exceeded" in error
Timeout Recovery:
@pytest.mark.asyncio
async def test_rate_limit_allows_after_timeout(self, rate_limiter):
"""Test that emails are allowed after rate limit timeout."""
mock_doc = Mock()
mock_doc.exists = True
mock_doc.to_dict.return_value = {
'last_email_time': datetime.utcnow() - timedelta(minutes=2) # Old timestamp
}
allowed, error = await rate_limiter.check_rate_limit("user@example.com")
assert allowed is True
assert error is None
EmailWebhookValidator Testing
Test Class: TestEmailWebhookValidator
Email Address Validation
Valid Email Patterns:
def test_validate_email_address_valid(self, validator):
"""Test validation of valid email addresses."""
valid_emails = [
"user@example.com",
"test.user+tag@domain.co.uk",
"name@subdomain.example.org"
]
for email in valid_emails:
assert validator.validate_email_address(email) is True
Invalid Email Patterns:
def test_validate_email_address_invalid(self, validator):
"""Test validation of invalid email addresses."""
invalid_emails = [
"not-an-email",
"@example.com", # Missing local part
"user@", # Missing domain
"user@domain", # Domain without TLD
"" # Empty string
]
for email in invalid_emails:
assert validator.validate_email_address(email) is False
Webhook Signature Validation
HMAC Validation with Mocking:
@patch('email_integration.hmac.compare_digest')
@patch('email_integration.hmac.new')
def test_validate_mailgun_webhook_with_secret(self, mock_hmac_new, mock_compare, validator):
"""Test webhook validation with proper HMAC checking."""
mock_hmac_new.return_value.hexdigest.return_value = "expected_signature"
mock_compare.return_value = True
result = validator.validate_mailgun_webhook("123", "token", "signature")
assert result is True
mock_compare.assert_called_once_with("signature", "expected_signature")
EmailProcessor Testing
Test Class: TestEmailProcessor
Assistant ID Extraction
Valid Extraction Patterns:
def test_extract_assistant_id_valid(self, processor):
"""Test extraction of assistant ID from valid email addresses."""
test_cases = [
("assistant-123@example.com", "123"),
("assistant-test-id@domain.org", "test-id"),
("assistant-uuid-123-456@mail.com", "uuid-123-456"),
("assistant-list@example.com", "list"), # Special case
("dev-assistant-list@example.com", "list") # Special case with environment prefix
]
for email, expected_id in test_cases:
result = processor.extract_assistant_id(email)
assert result == expected_id
Invalid Pattern Handling:
def test_extract_assistant_id_invalid(self, processor):
"""Test extraction of assistant ID from invalid email addresses."""
invalid_emails = [
"user@example.com", # Not assistant email
"assistant@example.com", # Missing ID
"not-assistant-123@example.com", # Wrong prefix
"" # Empty string
]
for email in invalid_emails:
result = processor.extract_assistant_id(email)
assert result is None
Content Cleaning
Email Content Processing:
def test_clean_email_content(self, processor):
"""Test email content cleaning functionality."""
raw_email = """Hello,
This is my question about the project.
Thanks!
John
From: john@example.com
Sent from my iPhone
--
This is a signature
> This is quoted content
> More quoted content"""
cleaned = processor.clean_email_content(raw_email)
# Should remove signatures, headers, and quoted content
assert "From:" not in cleaned
assert "Sent from" not in cleaned
assert "--" not in cleaned
assert "> This is quoted" not in cleaned
assert "This is my question about the project." in cleaned
Thinking Tag Removal
Comprehensive Thinking Tag Testing:
def test_clean_assistant_response(self, processor):
"""Test assistant response cleaning removes thinking tags."""
# Basic thinking tag removal
response_with_thinking = """<thinking>
This is internal reasoning that users shouldn't see in email.
Let me think about this...
</thinking>
Here is my actual response to the user.
This should be visible in the email."""
cleaned = processor.clean_assistant_response(response_with_thinking)
assert "<thinking>" not in cleaned
assert "internal reasoning" not in cleaned
assert "Here is my actual response" in cleaned
assert "This should be visible" in cleaned
# Case insensitive removal
case_mixed = """<THINKING>
Mixed case thinking
</thinking>
User visible content."""
cleaned_mixed = processor.clean_assistant_response(case_mixed)
assert "Mixed case thinking" not in cleaned_mixed
assert "User visible content" in cleaned_mixed
# Multiple thinking blocks
multiple_thinking = """First part.
<thinking>First internal thought</thinking>
Middle part.
<thinking>
Second internal thought
spanning multiple lines
</thinking>
Final part."""
cleaned_multiple = processor.clean_assistant_response(multiple_thinking)
assert "First internal thought" not in cleaned_multiple
assert "Second internal thought" not in cleaned_multiple
assert "First part." in cleaned_multiple
assert "Middle part." in cleaned_multiple
assert "Final part." in cleaned_multiple
Integration Testing
Successful Email Processing:
@pytest.mark.asyncio
async def test_process_email_message_success(self, processor):
"""Test successful email processing."""
mock_config = {"name": "Test Assistant", "tools": ["search"], "toolConfigs": {}}
mock_ai_response = "This is the AI response"
with patch('email_integration.get_assistant_config', return_value=mock_config), \
patch('email_integration.permitted_tools', return_value=(["search"], {})), \
patch('email_integration.vac_stream', return_value={"metadata": {"answer": mock_ai_response}}) as mock_vac_stream, \
patch.object(processor, '_store_email_interaction') as mock_store, \
patch.object(processor, '_save_email_to_chat_history') as mock_save_chat, \
patch.object(processor, '_send_email_response', return_value=True) as mock_send:
result = await processor.process_email_message(
"user@example.com", "test-assistant", "Test message", "Test Subject"
)
assert result["status"] == "processed"
assert result["response"] == mock_ai_response
assert result["email_sent"] is True
mock_vac_stream.assert_called_once()
mock_store.assert_called_once()
mock_send.assert_called_once()
Webhook Endpoint Testing
Invalid Signature Handling
@pytest.mark.asyncio
async def test_email_webhook_receive_invalid_signature():
"""Test webhook receives with invalid signatures."""
from flask import Flask
app = Flask(__name__)
with app.test_request_context('/api/email/webhook', method='POST', data={
'timestamp': '123',
'token': 'token',
'signature': 'invalid'
}), patch('email_integration.email_processor') as mock_processor:
mock_processor.validator.validate_mailgun_webhook.return_value = False
response, status_code = await email_webhook_receive()
assert status_code == 401
assert "Invalid signature" in response.get_json()["error"]
Rate Limiting Testing
@pytest.mark.asyncio
async def test_email_webhook_receive_rate_limited():
"""Test webhook receives when rate limited."""
with app.test_request_context('/api/email/webhook', method='POST', data={
'sender': 'user@example.com',
'recipient': 'assistant-123@example.com',
'subject': 'Test',
'body-plain': 'Test message'
}), patch('email_integration.email_processor') as mock_processor:
mock_processor.rate_limiter.check_rate_limit = AsyncMock(
return_value=(False, "Rate limit exceeded")
)
response, status_code = await email_webhook_receive()
assert status_code == 429
assert "Rate limit exceeded" in response.get_json()["error"]
Utility Function Tests
File Sanitization Testing
Test Class: Tests in test_my_utils.py
Filename Sanitization
Special Character Handling:
def test_sanitize_file_with_special_chars():
"""Test sanitizing filenames with special characters"""
assert sanitize_file("My-File_Name!.txt") == "my-file-name"
def test_sanitize_file_with_consecutive_dashes():
"""Test sanitizing filenames with consecutive dashes"""
assert sanitize_file("my--file---name.txt") == "my-file-name"
def test_sanitize_file_with_leading_trailing_dashes():
"""Test sanitizing filenames with leading/trailing dashes"""
assert sanitize_file("-myfile-.txt") == "myfile"
def test_sanitize_file_empty_after_sanitization():
"""Test sanitizing filenames that become empty after sanitization"""
assert sanitize_file("!!!.txt") == "file"
def test_sanitize_file_length_limit():
"""Test sanitizing filenames with length exceeding the limit"""
long_name = "a" * 50 + ".txt"
assert len(sanitize_file(long_name)) <= 40
Content Processing Testing
Thinking Tag Processing
Comprehensive Tag Removal:
def test_strip_thinking_tags_multiple():
"""Test stripping multiple thinking blocks"""
text = """First part.
<thinking>First internal thought</thinking>
Middle part.
<thinking>
Second internal thought
spanning multiple lines
</thinking>
Final part."""
result = strip_thinking_tags(text)
assert "First internal thought" not in result
assert "Second internal thought" not in result
assert "First part." in result
assert "Middle part." in result
assert "Final part." in result
Case Insensitive Processing:
def test_strip_thinking_tags_case_insensitive():
"""Test stripping thinking tags with mixed case"""
text = "<THINKING>Internal reasoning</thinking>\n\nUser visible content"
expected = "User visible content"
assert strip_thinking_tags(text) == expected
Chat History Formatting
Multiple Message Processing:
def test_format_human_chat_history_multiple_messages():
"""Test formatting chat history with multiple messages"""
history = [
{"name": "user", "content": "Hello"},
{"name": "assistant", "content": "Hi there"},
{"name": "user", "content": "How are you?"}
]
expected = "user: Hello\nassistant: Hi there\nuser: How are you?"
assert format_human_chat_history(history) == expected
Async Function Testing
Callback Processing
Mock Callback Testing:
class MockCallback:
"""Mock callback class for testing"""
def __init__(self):
self.tokens = []
async def async_on_llm_new_token(self, token):
self.tokens.append(token)
@pytest.mark.asyncio
async def test_check_and_display_thinking_with_thinking_tags():
"""Test processing message with thinking tags"""
with patch('my_utils.log.info'): # Patch to avoid log.info call
callback = MockCallback()
await check_and_display_thinking("test <thinking>thought</thinking>", callback)
assert "<​thinking>" in callback.tokens[0]
assert "thought" in callback.tokens[0]
Error Handling:
@pytest.mark.asyncio
async def test_check_and_display_thinking_no_callback():
"""Test processing message with no callback"""
with patch('my_utils.log.error') as mock_log_error:
await check_and_display_thinking("test message", None)
mock_log_error.assert_called_once()
args, _ = mock_log_error.call_args
assert "No callback" in args[0]
Tool System Testing
Model Tools Testing
Test Class: TestCreateModelTools in test_tools_model_tools.py
Tool Creation Validation
Google Search Tool:
def test_create_model_tools_google_search_retrieval(self):
"""Test creating Google Search tool."""
tools = ["google_search_retrieval"]
result = create_model_tools(tools)
assert len(result) == 1
assert isinstance(result[0], types.Tool)
assert hasattr(result[0], 'google_search')
Multiple Tool Processing:
def test_create_model_tools_multiple_tools(self):
"""Test creating multiple tools."""
tools = ["google_search_retrieval", "code_execution", "url_processing"]
result = create_model_tools(tools)
assert len(result) == 3
assert all(isinstance(tool, types.Tool) for tool in result)
Unknown Tool Handling:
def test_create_model_tools_unknown_tool(self):
"""Test with unknown tool - should be ignored."""
tools = ["unknown_tool", "google_search_retrieval"]
result = create_model_tools(tools)
# Should only create the known tool
assert len(result) == 1
assert hasattr(result[0], 'google_search')
Tool Prompts Testing
Test Class: TestAddToolPrompts in test_tools_tool_prompts.py
Prompt Loading and Formatting
String Tools Processing:
@patch('tools.tool_prompts.langfuse')
def test_add_tool_prompts_string_tools(self, mock_langfuse):
"""Test with tools as strings."""
mock_prompt = MagicMock()
mock_prompt.compile.return_value = "Tool prompt content"
mock_langfuse.get_prompt.return_value = mock_prompt
tools = ["tool1", "tool2"]
result = add_tool_prompts(tools)
assert "Tool prompt content" in result
assert mock_langfuse.get_prompt.call_count == 2
Error Handling:
@patch('tools.tool_prompts.langfuse')
@patch('tools.tool_prompts.log')
def test_add_tool_prompts_error_handling(self, mock_log, mock_langfuse):
"""Test error handling when prompt loading fails."""
mock_langfuse.get_prompt.side_effect = Exception("Prompt not found")
tools = ["tool1", "tool2"]
result = add_tool_prompts(tools)
# Should return empty string when all prompts fail
assert result == ""
# Should log warnings for each failed tool
assert mock_log.warning.call_count == 2
Integration Testing Strategies
Email Processing Pipeline
Full Pipeline Test:
@pytest.mark.asyncio
async def test_complete_email_processing():
"""Test complete email processing from webhook to response."""
# Setup test data
test_email_data = {
'sender': 'user@example.com',
'recipient': 'assistant-123@email.aitana.chat',
'subject': 'Test question',
'body-plain': 'How do I create a React component?'
}
# Mock dependencies
with patch('email_integration.get_assistant_config') as mock_config, \
patch('email_integration.vac_stream') as mock_vac, \
patch('email_integration.send_email_response') as mock_send:
# Configure mocks
mock_config.return_value = {"name": "Test Assistant", "tools": []}
mock_vac.return_value = {"metadata": {"answer": "Test response"}}
mock_send.return_value = True
# Process email
processor = EmailProcessor()
result = await processor.process_email_message(
test_email_data['sender'],
'123',
test_email_data['body-plain'],
test_email_data['subject']
)
# Verify processing
assert result['status'] == 'processed'
assert result['email_sent'] is True
mock_vac.assert_called_once()
mock_send.assert_called_once()
Error Recovery Testing
Graceful Degradation:
@pytest.mark.asyncio
async def test_email_processing_with_failures():
"""Test email processing handles partial failures gracefully."""
with patch('email_integration.get_assistant_config') as mock_config, \
patch('email_integration.vac_stream') as mock_vac, \
patch('email_integration.QuartoExporter.generate_export') as mock_export:
# Configure successful AI response but failed export
mock_config.return_value = {"name": "Test Assistant"}
mock_vac.return_value = {"metadata": {"answer": "Test response"}}
mock_export.return_value = None # Export fails
processor = EmailProcessor()
result = await processor.process_email_message(
'user@example.com',
'123',
'Test message (export:pdf)', # Request export
'Test subject'
)
# Should still succeed despite export failure
assert result['status'] == 'processed'
assert result['email_sent'] is True
Performance Testing
Load Testing Strategies
Rate Limiting Under Load:
@pytest.mark.asyncio
async def test_rate_limiting_concurrent_requests():
"""Test rate limiting with concurrent requests."""
rate_limiter = EmailRateLimiter()
user_email = "loadtest@example.com"
# Simulate concurrent requests
tasks = []
for _ in range(10):
task = rate_limiter.check_rate_limit(user_email)
tasks.append(task)
results = await asyncio.gather(*tasks)
# Only first request should be allowed
allowed_count = sum(1 for allowed, _ in results if allowed)
assert allowed_count == 1
Memory Usage Testing:
def test_large_content_processing():
"""Test processing of large email content."""
# Generate large content
large_content = "x" * 100000 # 100KB content
processor = EmailProcessor()
cleaned = processor.clean_email_content(large_content)
# Should handle large content without memory issues
assert len(cleaned) > 0
assert isinstance(cleaned, str)
Test Environment Setup
Fixture Configuration
Rate Limiter Fixture:
@pytest.fixture
def rate_limiter():
with patch('email_integration.firestore.Client'):
return EmailRateLimiter()
Validator Fixture:
@pytest.fixture
def validator():
with patch.dict('os.environ', {'MAILGUN_WEBHOOK_SECRET': 'test-secret'}):
return EmailWebhookValidator()
Processor Fixture:
@pytest.fixture
def processor():
with patch('email_integration.firestore.Client'):
return EmailProcessor()
Mock Configuration
Firestore Mocking:
def setup_firestore_mocks():
"""Setup comprehensive Firestore mocking."""
mock_db = Mock()
mock_collection = Mock()
mock_document = Mock()
mock_doc_ref = Mock()
mock_db.collection.return_value = mock_collection
mock_collection.document.return_value = mock_doc_ref
mock_doc_ref.get.return_value = mock_document
return mock_db, mock_collection, mock_document, mock_doc_ref
VAC Service Mocking:
@patch('email_integration.vac_stream')
def mock_vac_service(mock_vac):
"""Mock VAC service responses."""
mock_vac.return_value = {
"metadata": {
"answer": "Mocked AI response",
"trace_id": "test-trace-123"
}
}
return mock_vac
Running Tests
Local Development
Run Email Integration Tests:
# All email integration tests
cd backend && python -m pytest tests/test_email_integration.py -v
# Specific test class
python -m pytest tests/test_email_integration.py::TestEmailRateLimiter -v
# Specific test method
python -m pytest tests/test_email_integration.py::TestEmailProcessor::test_clean_email_content -v
Run Utility Function Tests:
# All utility tests
python -m pytest tests/test_my_utils.py -v
# With coverage
python -m pytest tests/test_my_utils.py --cov=my_utils --cov-report=html
Run Tool System Tests:
# Model tools tests
python -m pytest tests/test_tools_model_tools.py -v
# Tool prompts tests
python -m pytest tests/test_tools_tool_prompts.py -v
CI/CD Integration
GitHub Actions Test Command:
cd backend && python -m pytest tests/ -v --tb=short --cov=. --cov-report=json
Coverage Requirements:
- Email integration: >90% coverage
- Utility functions: >95% coverage
- Tool system: >85% coverage
Test Data Management
Sample Email Data
Valid Email Payload:
VALID_EMAIL_PAYLOAD = {
'timestamp': '1234567890',
'token': 'test_token',
'signature': 'valid_signature',
'sender': 'user@example.com',
'recipient': 'assistant-123@email.aitana.chat',
'subject': 'Test Question',
'body-plain': 'How do I implement authentication?',
'attachment-count': '0'
}
Rate Limit Test Data:
RATE_LIMIT_DATA = {
'recent_email': datetime.utcnow() - timedelta(seconds=30),
'old_email': datetime.utcnow() - timedelta(minutes=5),
'user_email': 'ratelimit@example.com'
}
Mock Response Templates
Assistant Config Mock:
MOCK_ASSISTANT_CONFIG = {
'name': 'Test Assistant',
'avatar': '/avatars/test.png',
'tools': ['search', 'code_execution'],
'toolConfigs': {'search': {'enabled': True}},
'initialInstructions': 'You are a helpful assistant.'
}
VAC Service Response Mock:
MOCK_VAC_RESPONSE = {
'metadata': {
'answer': 'This is a test AI response with helpful information.',
'trace_id': 'trace_123456',
'model_used': 'claude-3-sonnet',
'processing_time': 2.5
}
}
Related Documentation
- Backend Email API - Technical API documentation
- Backend Utility Functions - Utility function documentation
- Email Integration - User-facing integration guide
- Testing Guide - General testing documentation
- VAC Service Architecture - Core AI processing pipeline