WhatsApp Integration

Overview

The WhatsApp integration allows users to interact with Aitana assistants through WhatsApp using Twilio’s WhatsApp Business API. This feature bridges WhatsApp messages with the existing assistant API endpoints.

Architecture

Message Flow

WhatsApp User → Twilio → /whatsapp/webhook → /vac/assistant/<id> → Twilio → WhatsApp User

Key Components

  • whatsapp_service.py - Main WhatsApp integration service
  • WhatsAppSessionManager - Manages phone number to assistant ID mappings
  • WhatsAppMessageHandler - Handles message transformation and commands
  • Webhook endpoint - /whatsapp/webhook receives Twilio messages

Features

Session Management

  • Maps WhatsApp phone numbers to assistant sessions
  • Stores session data in Firestore (whatsapp_sessions collection)
  • Maintains conversation continuity across messages
  • Automatic session creation for new users

Assistant Selection

Users can switch between assistants using WhatsApp commands:

  • /switch assistant-id - Switch to a different assistant
  • /list - Show available assistants
  • /current - Show current assistant info
  • /help - Show help information
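
The command syntax above can be sketched as a small parser. This is a minimal illustration, not the actual WhatsAppMessageHandler code: the names `ParsedCommand`, `parse_command`, and `KNOWN_COMMANDS` are hypothetical, and treating unknown slash commands as ordinary chat messages is an assumption.

```python
from typing import NamedTuple, Optional

# Commands documented above. Anything else is treated as a normal chat
# message (an assumption; the real handler may reply with an error instead).
KNOWN_COMMANDS = {"switch", "list", "current", "help"}


class ParsedCommand(NamedTuple):
    name: str                 # e.g. "switch"
    argument: Optional[str]   # e.g. "python-expert", or None when absent


def parse_command(body: str) -> Optional[ParsedCommand]:
    """Return a ParsedCommand if body is a known slash command, else None."""
    text = body.strip()
    if not text.startswith("/"):
        return None
    parts = text[1:].split(maxsplit=1)
    if not parts:
        return None  # a bare "/" carries no command name
    name = parts[0].lower()
    if name not in KNOWN_COMMANDS:
        return None
    argument = parts[1].strip() if len(parts) > 1 else None
    return ParsedCommand(name, argument)
```

For example, `parse_command("/switch python-expert")` yields the name `switch` with the argument `python-expert`, while a plain "Hello" returns `None` and would be forwarded to the assistant.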

Message Processing

  • Automatic message transformation between WhatsApp and Aitana API formats
  • Support for media attachments (URLs passed through)
  • Long message splitting (Twilio limits WhatsApp message bodies to 1600 characters)
  • Chat history persistence in Firestore
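
Long message splitting could look roughly like the sketch below. The function name `split_message` and the choice to break at whitespace are assumptions for illustration; the service's actual splitting strategy may differ.

```python
from typing import List

MAX_LEN = 1600  # per-message character limit cited in this document


def split_message(text: str, limit: int = MAX_LEN) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at the last newline or space inside each window and
    falls back to a hard cut when the window contains no whitespace.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = max(window.rfind("\n"), window.rfind(" "))
        if cut <= 0:
            cut = limit  # no usable whitespace: hard cut at the limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Breaking at whitespace keeps words intact across the separate WhatsApp messages; each chunk would then be sent as its own outbound message in order.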

Command System

Commands are prefixed with / and provide management functionality:

  • Assistant switching
  • Assistant listing
  • Help and status information

Configuration

Environment Variables

Required Twilio Variables:

TWILIO_ACCOUNT_SID=your_account_sid
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_WHATSAPP_NUMBER=whatsapp:+14155238886

Required Google Cloud Variables:

GOOGLE_CLOUD_PROJECT=your_project_id

Firestore Collections

whatsapp_sessions collection:

{
  "phone_number": "+1234567890",
  "assistant_id": "default-assistant",
  "session_id": "whatsapp-+1234567890",
  "user_email": "+1234567890@whatsapp.user",
  "display_name": "WhatsApp User 7890",
  "created_at": "timestamp",
  "last_active": "timestamp"
}
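
As an illustration of how the fields in this document relate to one another, the sketch below derives a session record from the `From` value Twilio sends. The helper names `normalize_phone` and `build_session_document` are hypothetical, and the rule that `display_name` uses the last four digits is inferred from the example above rather than confirmed by the source.

```python
from datetime import datetime, timezone
from typing import Any, Dict

WHATSAPP_PREFIX = "whatsapp:"


def normalize_phone(raw: str) -> str:
    """Strip Twilio's 'whatsapp:' prefix and surrounding whitespace."""
    value = raw.strip()
    if value.startswith(WHATSAPP_PREFIX):
        value = value[len(WHATSAPP_PREFIX):]
    return value.strip()


def build_session_document(
    raw_from: str, assistant_id: str = "default-assistant"
) -> Dict[str, Any]:
    """Build a dict shaped like a whatsapp_sessions document."""
    phone = normalize_phone(raw_from)
    now = datetime.now(timezone.utc)
    return {
        "phone_number": phone,
        "assistant_id": assistant_id,
        "session_id": f"whatsapp-{phone}",
        "user_email": f"{phone}@whatsapp.user",
        # Assumption: last four digits, matching "WhatsApp User 7890" above
        "display_name": f"WhatsApp User {phone[-4:]}",
        "created_at": now,
        "last_active": now,
    }
```

Because `session_id` and `user_email` are pure functions of the phone number, the same user always maps to the same session, which is what keeps the conversation continuous across messages.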

API Endpoints

POST /whatsapp/webhook

Twilio webhook endpoint that receives incoming WhatsApp messages.

Webhook Data:

  • From - WhatsApp number (e.g., whatsapp:+1234567890)
  • To - Twilio WhatsApp number
  • Body - Message text content
  • MediaUrl0 - First media attachment URL (optional)
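
Twilio delivers these fields as a form-encoded POST body. The sketch below shows one way to extract them with the standard library alone; the function name `parse_twilio_webhook` and the shape of the returned dict are assumptions. It relies on Twilio's `NumMedia` parameter to find `MediaUrl0`, `MediaUrl1`, and so on, and falls back to `MediaUrl0` alone if `NumMedia` is missing.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qs


def parse_twilio_webhook(raw_body: str) -> Dict[str, Any]:
    """Parse a form-encoded Twilio webhook body into a plain dict."""
    parsed = parse_qs(raw_body, keep_blank_values=True)
    fields = {key: values[0] for key, values in parsed.items()}

    # NumMedia says how many MediaUrl{N} fields accompany the message
    num_media_raw = fields.get("NumMedia", "0")
    num_media = int(num_media_raw) if num_media_raw.isdigit() else 0
    media_urls: List[str] = [
        fields[f"MediaUrl{i}"]
        for i in range(num_media)
        if f"MediaUrl{i}" in fields
    ]
    if not media_urls and "MediaUrl0" in fields:
        media_urls = [fields["MediaUrl0"]]

    return {
        "from": fields.get("From", ""),
        "to": fields.get("To", ""),
        "body": fields.get("Body", ""),
        "media_urls": media_urls,
    }
```

In form encoding a literal `+` decodes to a space, so a phone number's leading `+` must arrive percent-encoded as `%2B` (as Twilio sends it) to survive parsing.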

GET /whatsapp/health

Health check endpoint for monitoring integration status.

Response:

{
  "status": "healthy",
  "twilio_configured": true,
  "environment_vars": {
    "TWILIO_ACCOUNT_SID": true,
    "TWILIO_AUTH_TOKEN": true,
    "TWILIO_WHATSAPP_NUMBER": true,
    "GOOGLE_CLOUD_PROJECT": true
  }
}
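
A payload of this shape could be assembled as in the sketch below. The function name `build_health_payload` is hypothetical, and the `"unhealthy"` status value is an assumption, since the source only shows the healthy case.

```python
import os
from typing import Any, Dict, Mapping, Optional

REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_WHATSAPP_NUMBER",
    "GOOGLE_CLOUD_PROJECT",
)
TWILIO_VARS = REQUIRED_VARS[:3]


def build_health_payload(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Report which required variables are set, without exposing their values."""
    env = os.environ if env is None else env
    present = {name: bool(env.get(name)) for name in REQUIRED_VARS}
    return {
        # "unhealthy" is an assumed value for the failure case
        "status": "healthy" if all(present.values()) else "unhealthy",
        "twilio_configured": all(present[name] for name in TWILIO_VARS),
        "environment_vars": present,
    }
```

Reporting only booleans lets the endpoint confirm configuration without leaking the account SID or auth token.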

Security Considerations

  • Phone numbers are normalized and stored without the whatsapp: prefix
  • Session IDs are derived from phone numbers for consistency
  • User emails are generated in the {phone}@whatsapp.user format
  • Trace IDs include WhatsApp source identification
  • Media URLs are passed through but not automatically downloaded

Integration with Existing System

Assistant API Integration

  • Uses existing /vac/assistant/<assistant_id> endpoint
  • Leverages current tool ecosystem (file-browser, search, code execution)
  • Maintains compatibility with authentication and permissions system
  • Supports all existing assistant configurations

Message History

  • Uses existing Firestore message persistence
  • Messages tagged with "source": "whatsapp"
  • Maintains thinking content extraction
  • Supports chat history loading and saving

Limitations

  1. Media Handling - Media URLs are passed through to the assistant, but the files themselves are not downloaded or processed
  2. Streaming - WhatsApp doesn’t support real-time streaming responses
  3. Message Length - Long responses are split into multiple messages
  4. File Uploads - Document processing requires separate handling

Usage Examples

Basic Conversation

User: Hello, can you help me with Python?
Assistant: Hi! I'd be happy to help you with Python...

Switching Assistants

User: /list
Assistant: 📋 Available Assistants:
• Python Expert (python-expert)
• Data Analyst (data-analyst)
...

User: /switch python-expert
Assistant: ✅ Switched to assistant: Python Expert (python-expert)

Getting Help

User: /help
Assistant: 🤖 WhatsApp Aitana Commands:
/switch assistant-id - Switch to a different assistant
/list - Show available assistants
/help - Show this help message
/current - Show current assistant info

Testing

Unit Testing

Test individual components of the WhatsApp integration:

# Navigate to backend directory
cd backend

# Run WhatsApp service tests
python -m pytest tests/test_whatsapp_service.py -v

# Run specific test methods
python -m pytest tests/test_whatsapp_service.py::test_phone_number_normalization -v
python -m pytest tests/test_whatsapp_service.py::test_session_management -v
python -m pytest tests/test_whatsapp_service.py::test_message_commands -v

Integration Testing

Test the complete WhatsApp workflow:

# Test webhook endpoint
# --data-urlencode keeps the "+" in phone numbers (a raw "+" in form data decodes to a space)
curl -X POST http://localhost:1956/whatsapp/webhook \
  --data-urlencode "From=whatsapp:+1234567890" \
  --data-urlencode "To=whatsapp:+14155238886" \
  --data-urlencode "Body=Hello"

# Test health endpoint
curl http://localhost:1956/whatsapp/health

Manual Testing with Twilio

  1. Set Up Test WhatsApp Number
    # Join the Twilio WhatsApp Sandbox
    # Send "join <sandbox-keyword>" to +1 415 523 8886

  2. Test Basic Conversation
    User: Hello
    Expected: Assistant response based on default assistant

  3. Test Command System
    User: /list
    Expected: List of available assistants

    User: /switch python-expert
    Expected: Confirmation of assistant switch

    User: /help
    Expected: Command help information

  4. Test Media Handling
    User: [Send image]
    Expected: Response acknowledging media URL

Test Environment Variables

For testing, set up a separate .env.test file:

TWILIO_ACCOUNT_SID=test_account_sid
TWILIO_AUTH_TOKEN=test_auth_token
TWILIO_WHATSAPP_NUMBER=whatsapp:+14155238886
GOOGLE_CLOUD_PROJECT=test-project

Troubleshooting

Common Issues and Solutions

1. Webhook Not Receiving Messages

Symptoms:

  • Messages sent to WhatsApp number but no response
  • Webhook endpoint not being called

Causes & Solutions:

Public URL Issue

# Ensure your webhook URL is publicly accessible
# Use ngrok for local testing:
ngrok http 1956

# Update Twilio webhook URL to ngrok URL
# Example: https://abc123.ngrok.io/whatsapp/webhook

Firewall/Security Groups

  • Ensure port 1956 is open for inbound traffic
  • Check that the Google Cloud Run service allows inbound HTTP traffic
  • If an IP allowlist restricts inbound traffic, verify that it does not block Twilio's webhook requests

2. Environment Variables Not Set

Symptoms:

  • Health endpoint shows missing configuration
  • 500 errors on webhook requests

Solution:

# Check environment variables
curl http://localhost:1956/whatsapp/health

# Should return all environment_vars as true
# If false, check your .env file or deployment configuration

3. Session Management Issues

Symptoms:

  • Users can’t switch assistants
  • Conversation context lost

Debugging:

# Check the whatsapp_sessions collection in Firestore
# Look for the document matching the user's phone number
# Verify session_id format: "whatsapp-+1234567890"

Common Fix:

# Clear the problematic session
# Delete the affected user's document in the whatsapp_sessions collection
# The user will get a new session on their next message

4. Message Length Issues

Symptoms:

  • Long responses are cut off
  • Multiple partial messages received

Solution:

  • Twilio limits each WhatsApp message body to 1600 characters
  • Service automatically splits long responses
  • This is expected behavior for lengthy AI responses

5. Twilio Authentication Errors

Symptoms:

  • 401 errors in logs
  • Messages not being sent

Debugging:

# Test Twilio credentials (assumes both variables are exported in your shell)
curl -X GET "https://api.twilio.com/2010-04-01/Accounts/${TWILIO_ACCOUNT_SID}.json" \
  -u "${TWILIO_ACCOUNT_SID}:${TWILIO_AUTH_TOKEN}"

# Should return account information
# If 401, check TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN

6. Assistant Configuration Issues

Symptoms:

  • Default assistant not responding correctly
  • Tool permissions not working

Solution:

# Verify assistant exists in Firestore
# Check assistants collection for "default-assistant" document
# Ensure user permissions allow tool usage

Debugging Tools

Enable Debug Logging

Add to your environment:

WHATSAPP_DEBUG=true
VAC_DEBUG=true

Monitor Firestore Collections

Key collections to monitor:

  • whatsapp_sessions - User session data
  • messages - Chat history
  • assistants - Assistant configurations
  • user_permissions - User access rights

Twilio Debugger

Use Twilio Console Debugger to:

  • View webhook request/response logs
  • Check message delivery status
  • Monitor API usage and errors

Performance Monitoring

Key Metrics to Track

  1. Response Time: Time from WhatsApp message to first response
  2. Session Duration: How long users stay engaged
  3. Error Rate: Percentage of failed webhook requests
  4. Tool Usage: Which tools are used most frequently

Langfuse Integration

If Langfuse is configured, monitor:

  • WhatsApp conversation traces
  • Tool execution performance
  • Model response quality
  • Error patterns and frequencies

Future Enhancements

Planned Improvements

1. Media File Processing

  • Auto-download and process images sent via WhatsApp
  • Document analysis for PDFs and text files
  • Image analysis using vision models
  • File size and type validation

Possible implementation (download_media, analyze_image, and extract_pdf_text are placeholder helpers):

# Enhanced media handling (sketch)
async def process_whatsapp_media(media_url: str, media_type: str) -> str:
    if media_type.startswith('image/'):
        # Download and analyze image
        image_data = await download_media(media_url)
        analysis = await analyze_image(image_data)
        return f"I can see {analysis['description']}. {analysis['details']}"
    if media_type == 'application/pdf':
        # Process PDF document
        pdf_content = await extract_pdf_text(media_url)
        return f"I've analyzed your PDF. {len(pdf_content)} characters of text found."
    # Any other media type: reply explicitly instead of returning None
    return f"Sorry, I can't process {media_type} attachments yet."

2. Rich Message Formatting

  • Bold and italic text support
  • List formatting for structured responses
  • Link previews for shared URLs
  • Emoji integration for better user experience

Example:

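# Rich formatting helper
import re

def format_whatsapp_message(content: str) -> str:
    # Convert Markdown emphasis to WhatsApp formatting (italic first, then bold)
    content = re.sub(r'(?<!\*)\*(?![\s*])(.+?)(?<![\s*])\*(?!\*)', r'_\1_', content)  # *italic* -> _italic_
    content = re.sub(r'\*\*(.+?)\*\*', r'*\1*', content)  # **bold** -> *bold*
    content = re.sub(r'__(.+?)__', r'*\1*', content)  # __bold__ -> *bold*
    content = add_emojis(content)  # Smart emoji insertion (placeholder helper)
    return content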

3. Broadcast Messaging

  • Group messaging support
  • Broadcast lists for announcements
  • Scheduled messages for reminders
  • Template messages for common responses

4. Analytics and Monitoring

  • Usage analytics dashboard
  • Conversation insights and patterns
  • Performance metrics and optimization
  • User engagement tracking

Metrics to Track:

  • Daily/monthly active users
  • Average conversation length
  • Most used assistants and tools
  • Response time performance
  • User satisfaction scores

5. Advanced Features

  • Voice message transcription and response
  • Location sharing integration
  • Contact sharing for team features
  • Multi-language support
  • Business hours configuration

Migration Path

When implementing enhancements:

  1. Maintain backward compatibility with existing sessions
  2. Gradual feature rollout with feature flags
  3. Comprehensive testing before production deployment
  4. User communication about new features
  5. Fallback mechanisms for feature failures

Contributing

To contribute to WhatsApp integration development:

  1. Review existing code in backend/whatsapp_service.py
  2. Write comprehensive tests for new features
  3. Follow existing patterns for error handling and logging
  4. Update documentation with new capabilities
  5. Test with real WhatsApp numbers before submitting