Langfuse Distributed Tracing

This document describes how Langfuse tracing is implemented across the frontend and backend to capture end-to-end latency metrics and provide comprehensive observability.

Overview

The Langfuse integration uses a server-side proxy pattern to securely track the complete lifecycle of user interactions from the moment they click “Send” until the response is fully streamed back. This provides valuable insights into:

  • Frontend processing time (UI preparation, validation, etc.)
  • Network latency to backend
  • Backend processing and AI model time
  • Time to first token
  • Streaming performance
  • Assistant-specific analytics

Architecture

Security & Proxy Pattern

The system uses a server-side proxy (/api/langfuse/trace) to protect the Langfuse secret key:

  • Frontend: Uses public key only, makes authenticated requests to proxy
  • Backend Proxy: Handles secure Langfuse operations with secret key
  • Main Backend: Continues traces received from frontend

Trace Flow

  1. User clicks send → Frontend creates trace with assistant name as trace name
  2. Frontend processing → Multiple spans track UI preparation and validation
  3. Proxy flush → Events flushed to Langfuse before API call
  4. API call to backend → Backend call span tracks network latency
  5. Backend receives trace ID → Continues the same trace with assistant context
  6. Streaming response → Generation span tracks streaming performance
  7. Complete trace → Full end-to-end metrics captured with assistant attribution

Trace ID Format

  • Frontend-initiated: ui2-{timestamp}-{requestId}
  • Backend-initiated: backend-{trace_type}-{timestamp}
  • Session-based: All traces include sessionId for grouping

Trace Names: Now use assistant names (e.g., “Claude Assistant”, “Research Bot”) instead of generic names for better analytics.
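
For illustration, a trace ID in the frontend format above could be produced by a helper like this (the function is hypothetical, not taken from the codebase):

// Hypothetical helper illustrating the ui2-{timestamp}-{requestId} format
function makeFrontendTraceId(requestId: string): string {
  return `ui2-${Date.now()}-${requestId}`;
}

makeFrontendTraceId('abc123'); // => e.g. "ui2-1671234567890-abc123"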

Implementation Details

Frontend Tracing Architecture

The frontend uses a proxy-based architecture for secure Langfuse integration. All Langfuse operations go through the /api/langfuse/trace endpoint to protect the secret key.

1. Proxy Client (src/utils/langfuseProxy.ts)

class LangfuseProxy {
  async createTrace(params: CreateTraceParams): Promise<any> {
    const response = await fetch('/api/langfuse/trace', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        action: 'create_trace',
        data: { traceId: params.traceId, name: params.name /* ... */ }
      })
    });
    return response.json();
  }
  
  async createSpan(params: CreateSpanParams): Promise<any> { /* ... */ }
  async endSpan(params: EndSpanParams): Promise<any> { /* ... */ }
  async flush(): Promise<any> { /* ... */ }
}
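
A typical call sequence, assuming the interface above (the pre-call flush corresponds to step 3 of the trace flow):

const proxy = new LangfuseProxy();
const traceId = `ui2-${Date.now()}-abc123`;

// Create the trace, then push buffered events to Langfuse before calling the backend
await proxy.createTrace({ traceId, name: 'Research Assistant', sessionId: 'session-1' });
await proxy.flush();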

2. Enhanced Tracing (src/utils/langfuseEnhanced.ts)

All spans are prefixed with “frontend-” and include source metadata:

export async function createUserInteractionTrace(params: {
  traceId: string;
  sessionId: string;
  userId?: string;
  userEmail?: string;
  assistantId: string;
  assistantName: string;  // 🔑 Assistant name for trace naming
  message: string;
  selectedItemsCount?: number;
}) {
  await langfuseProxy.createTrace({
    traceId: params.traceId,
    name: params.assistantName,  // 🔑 Trace named with assistant
    sessionId: params.sessionId,
    userId: params.userEmail || params.userId,
    metadata: {
      assistantId: params.assistantId,
      assistantName: params.assistantName,
      messageLength: params.message.length,
      selectedItemsCount: params.selectedItemsCount || 0,
      source: 'frontend-click',  // 🔑 Frontend source labeling
      timestamp: new Date().toISOString(),
      environment: isBrowser ? 'browser' : 'server',
    },
    input: { message: params.message }
  });
}
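
A call site might look like the following (field values are illustrative):

await createUserInteractionTrace({
  traceId: `ui2-${Date.now()}-abc123`,
  sessionId: 'session-1',
  userEmail: 'user@example.com',
  assistantId: 'asst_123456',
  assistantName: 'Research Assistant',
  message: 'Summarize the latest results',
  selectedItemsCount: 2,
});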

3. Core Spans Created

a) UI Processing Span (frontend-ui-processing)

  • Captures message preparation, validation, UI updates
  • Created immediately when user clicks send

b) Backend API Call Span (frontend-backend-api-call)

  • Tracks network latency to main backend API
  • Includes request preparation time

c) Streaming Generation Span (streaming-response)

  • Monitors streaming performance and chunk delivery
  • Updates with progress metrics every 10 chunks
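
A minimal sketch of opening and closing one of these spans through the proxy; the exact CreateSpanParams/EndSpanParams shapes are assumptions, since only the method names appear in langfuseProxy.ts:

// Assumed parameter shapes; only the method names are shown in langfuseProxy.ts
const traceId = 'ui2-1671234567890-abc123';

await langfuseProxy.createSpan({
  traceId,
  name: 'frontend-ui-processing',
  metadata: { source: 'frontend' },
});
// ... message preparation, validation, UI updates ...
await langfuseProxy.endSpan({
  traceId,
  name: 'frontend-ui-processing',
  output: { validated: true },
});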

4. VacChat Integration (src/utils/vacChat.ts)

// Only create trace if none exists (prevents duplicates)
const shouldCreateNewTrace = !params.traceId && !params.langfuseTrace;

if (shouldCreateNewTrace) {
  await langfuseProxy.createTrace({
    traceId: traceId,
    name: (params.emissaryConfig?.name as string) ?? 'Assistant',
    sessionId: params.sessionId || undefined,
    userId: params.emissaryConfig?.currentUser?.email || undefined,
    metadata: {
      requestId,
      emissaryConfig: params.emissaryConfig,
    }
  });
}
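
This guard exists because MessageStreamingContext.tsx may already have created the trace for the interaction; vacChat only creates a new one when it receives neither a trace ID nor a live trace object. It is also the first thing to verify when debugging duplicate traces (see Common Issues below).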

5. Latency Tracking

A LatencyTracker instance tracks key milestones:

const latencyTracker = new LatencyTracker();
latencyTracker.mark('trace_created');
latencyTracker.mark('firestore_save_started'); 
latencyTracker.mark('pre_api_flush');          // 🔑 New: before API call
latencyTracker.mark('post_api_flush_success'); // 🔑 New: flush timing
latencyTracker.mark('first_backend_response');
latencyTracker.mark('first_chunk_received');
latencyTracker.mark('streaming_complete');
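
The tracker's implementation is not shown in this document; a minimal sketch consistent with the marks above and the latencies block in the sample metadata might look like:

class LatencyTracker {
  private readonly start = performance.now();
  private readonly marks: Record<string, number> = {};

  // Record elapsed milliseconds since construction under a milestone name
  mark(name: string): void {
    this.marks[name] = performance.now() - this.start;
  }

  // Snapshot suitable for attaching to trace metadata (e.g. the "latencies" object)
  snapshot(): Record<string, number> {
    return { ...this.marks };
  }
}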

Backend Proxy API (src/app/api/langfuse/trace/route.ts)

The backend proxy endpoint handles secure Langfuse operations:

import { NextRequest, NextResponse } from 'next/server';
// `langfuse` is a server-side Langfuse SDK client initialized with LANGFUSE_SECRET_KEY

export async function POST(request: NextRequest) {
  const body = await request.json();
  const { action, data } = body;

  switch (action) {
    case 'create_trace': {
      const { traceId, name, input, metadata, sessionId, userId } = data;
      const traceParams: any = {
        id: traceId,
        input,
        metadata: { ...metadata, source: 'frontend-proxy' },
        sessionId,
        userId
      };
      
      // Only add name if provided (supports assistant-specific naming)
      if (name) traceParams.name = name;
      
      const trace = langfuse.trace(traceParams);
      return NextResponse.json({ success: true, traceId: trace.id });
    }
    // ... other actions: create_span, end_span, create_generation, flush
  }
}
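
The elided flush action can be very small; a sketch, assuming the standard Langfuse JS SDK client, which exposes flushAsync():

// Inside the switch (action) above:
case 'flush': {
  await langfuse.flushAsync(); // push any buffered events to Langfuse immediately
  return NextResponse.json({ success: true });
}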

Main Backend Integration

The main backend (Python) receives the trace ID and continues the trace with assistant context:

# Backend continues the trace started by frontend
if kwargs.get("trace_id") is not None:
    trace_id = kwargs.get('trace_id')
    trace = langfuse.trace(
        id=trace_id, 
        input=question,
        name=f"backend-assistant-{assistant_name}",  # Assistant-specific naming
        metadata={
          "SERVICE_NAME": os.getenv("SERVICE_NAME","NO_SERVICE_NAME"),
          "assistant_id": assistant_id,
          "assistant_name": assistant_name,
          "source": "backend-api"
        }
    )

Key Metrics Captured

  1. Frontend Processing Time: UI preparation, validation, Firestore saves
  2. Proxy Flush Latency: Time to flush events before API call
  3. Network Latency: Frontend to backend API response time
  4. Time to First Token: Time until first streaming chunk received
  5. Streaming Performance: Chunk delivery rate and total duration
  6. Assistant Attribution: All metrics tagged with assistant name/ID
  7. Session Grouping: Related interactions grouped by sessionId
  8. Source Identification: Clear labeling of frontend vs backend spans

File References & Implementation Locations

Core Frontend Files

File                                       Purpose                    Key Functions
src/utils/langfuseProxy.ts                 Secure proxy client        createTrace(), createSpan(), flush()
src/utils/langfuseEnhanced.ts              Frontend tracing logic     createUserInteractionTrace(), LatencyTracker
src/utils/vacChat.ts                       Main chat integration      Trace creation, backend call spans
src/contexts/MessageStreamingContext.tsx   React context integration  User interaction trace creation
src/utils/chatStreaming.ts                 Streaming management       Trace continuation, completion
src/services/streamingService.ts           Service layer              Assistant name integration

Backend Files

File                                  Purpose                Description
src/app/api/langfuse/trace/route.ts   Secure proxy API       Handles all Langfuse operations with the secret key
Backend Python files                  Main API continuation  Continues traces with assistant context

Configuration

Environment Variables

# Frontend (proxy-based - only public key exposed)
NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY=pk-your-public-key
NEXT_PUBLIC_LANGFUSE_BASE_URL=https://analytics.aitana.chat

# Backend Proxy (Next.js API route - secret key protected)
LANGFUSE_SECRET_KEY=sk-your-secret-key

# Main Backend (Python - secret key for direct operations)
LANGFUSE_SECRET_KEY=sk-your-secret-key

Proxy-Based Security

The frontend never directly accesses the Langfuse secret key:

// ❌ OLD: Direct client (security risk)
const langfuseClient = new LangfuseWeb({
  publicKey: process.env.NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY,
  secretKey: process.env.LANGFUSE_SECRET_KEY  // 🚨 EXPOSED!
});

// ✅ NEW: Proxy-based (secure)
const langfuseProxy = new LangfuseProxy(); // Uses /api/langfuse/trace

Usage Example

Complete Flow with Assistant Context

  1. User clicks send → MessageStreamingContext.tsx creates trace with assistant name
  2. Frontend processing → Multiple spans track UI operations (all prefixed with “frontend-”)
  3. Firestore save → User message saved with trace ID
  4. Pre-API flush → Events flushed to Langfuse via proxy before backend call
  5. API call → Backend receives trace ID and assistant context
  6. Backend continuation → Python backend continues trace with assistant-specific naming
  7. Streaming response → Generation span tracks AI model performance
  8. Completion → Full end-to-end metrics captured with assistant attribution

Trace Structure in Dashboard

📊 Trace: "Claude Assistant" (or specific assistant name)
├── 🎯 frontend-ui-processing (UI preparation)
├── 🌐 frontend-backend-api-call (network latency)
├── 🤖 streaming-response (AI generation)
└── 📝 backend-assistant-{name} (backend processing)

Viewing Traces

Traces can be viewed in the Langfuse dashboard at:

  • URL: https://analytics.aitana.chat
  • Trace Names: Assistant names (e.g., “Research Assistant”, “Claude Helper”)
  • Span Prefixes: frontend-* for frontend spans, backend-* for backend spans
  • Grouping: By sessionId for related conversations
  • Metadata: Assistant ID, user info, message lengths, source labels

Sample Trace Metadata

{
  "assistantId": "asst_123456",
  "assistantName": "Research Assistant", 
  "messageLength": 45,
  "selectedItemsCount": 2,
  "source": "frontend-click",
  "environment": "browser",
  "latencies": {
    "trace_created": 1.2,
    "pre_api_flush": 15.4,
    "first_backend_response": 234.7,
    "streaming_complete": 2847.3
  }
}

Debugging & Troubleshooting

Console Logging

The client logs its proxy configuration and each tracing operation to the browser console:

Langfuse Enhanced client using server-side proxy: {
  environment: 'browser',
  baseUrl: 'https://analytics.aitana.chat',
  publicKeyConfigured: true,
  usingProxy: true
}

Created new langfuse trace via proxy ui2-1671234567890-abc123
Creating span via proxy: frontend-ui-processing
Span created via proxy: { success: true }

Health Check Endpoint

Check proxy configuration:

curl https://your-app.com/api/langfuse/trace

Expected response:

{
  "status": "healthy",
  "langfuse_configured": true,
  "public_key_available": true,
  "secret_key_available": true,
  "base_url": "https://analytics.aitana.chat"
}
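
A sketch of the GET handler that could produce this response (the actual implementation is not shown here; the field derivation is an assumption):

import { NextResponse } from 'next/server';

export async function GET() {
  const publicKey = process.env.NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY;
  const secretKey = process.env.LANGFUSE_SECRET_KEY;
  return NextResponse.json({
    status: 'healthy',
    langfuse_configured: Boolean(publicKey && secretKey),
    public_key_available: Boolean(publicKey),
    secret_key_available: Boolean(secretKey),
    base_url: process.env.NEXT_PUBLIC_LANGFUSE_BASE_URL,
  });
}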

Common Issues

Issue               Symptom                               Solution
Missing secret key  secret_key_available: false           Set LANGFUSE_SECRET_KEY in the environment
Proxy failures      Failed to create trace via proxy      Check the proxy endpoint and network connectivity
Duplicate traces    Multiple traces for same interaction  Verify the shouldCreateNewTrace logic

See Langfuse Troubleshooting for detailed debugging.

Benefits

  1. Assistant-Specific Analytics: Track performance per assistant type
  2. End-to-End Visibility: Frontend + backend in single trace
  3. Security: Proxy pattern protects sensitive credentials
  4. Performance Monitoring: Identify bottlenecks across the full stack
  5. User Experience: Measure actual user-perceived latency
  6. Session Analysis: Group related interactions for user journey insights

Future Enhancements

  1. Error Tracking: Add error spans with stack traces
  2. Tool Usage Analytics: Track specific tool invocations within traces
  3. A/B Testing: Tag traces with experiment variants
  4. Cost Tracking: Include token usage and costs in metadata
  5. Real-time Dashboards: Live performance monitoring