Langfuse Distributed Tracing

This document describes how Langfuse tracing is implemented across the frontend and backend to capture end-to-end latency metrics and provide comprehensive observability.

Overview

The Langfuse integration uses a server-side proxy pattern to securely track the complete lifecycle of user interactions from the moment they click “Send” until the response is fully streamed back. This provides valuable insights into:

  • Frontend processing time (UI preparation, validation, etc.)
  • Network latency to backend
  • Backend processing and AI model time
  • Time to first token
  • Streaming performance
  • Assistant-specific analytics

Architecture

Security & Proxy Pattern

The system uses a server-side proxy (/api/langfuse/trace) to protect the Langfuse secret key:

  • Frontend: Uses public key only, makes authenticated requests to proxy
  • Backend Proxy: Handles secure Langfuse operations with secret key
  • Main Backend: Continues traces received from frontend

Trace Flow

  1. User clicks send → Frontend creates trace with assistant name as trace name
  2. Frontend processing → Multiple spans track UI preparation and validation
  3. Proxy flush → Events flushed to Langfuse before API call
  4. API call to backend → Backend call span tracks network latency
  5. Backend receives trace ID → Continues the same trace with assistant context
  6. Streaming response → Generation span tracks streaming performance
  7. Complete trace → Full end-to-end metrics captured with assistant attribution

Trace ID Format

  • Frontend-initiated: ui2-{timestamp}-{requestId}
  • Backend-initiated: backend-{trace_type}-{timestamp}
  • Session-based: All traces include sessionId for grouping

Trace Names: Now use assistant names (e.g., “Claude Assistant”, “Research Bot”) instead of generic names for better analytics.
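
For illustration, a trace ID in the frontend format above could be produced by a helper like this (the function is hypothetical, not taken from the codebase):

// Hypothetical helper illustrating the ui2-{timestamp}-{requestId} format
function makeFrontendTraceId(requestId: string): string {
  return `ui2-${Date.now()}-${requestId}`;
}

makeFrontendTraceId('abc123'); // => e.g. "ui2-1671234567890-abc123"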

Implementation Details

Frontend Tracing Architecture

The frontend uses a proxy-based architecture for secure Langfuse integration. All Langfuse operations go through the /api/langfuse/trace endpoint to protect the secret key.

1. Proxy Client (src/utils/langfuseProxy.ts)

class LangfuseProxy {
  async createTrace(params: CreateTraceParams): Promise<any> {
    const response = await fetch('/api/langfuse/trace', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        action: 'create_trace',
        data: { traceId: params.traceId, name: params.name /* ... */ }
      })
    });
    return response.json();
  }
  
  async createSpan(params: CreateSpanParams): Promise<any> { /* ... */ }
  async endSpan(params: EndSpanParams): Promise<any> { /* ... */ }
  async flush(): Promise<any> { /* ... */ }
}
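
A typical call sequence, assuming the interface above (the pre-call flush corresponds to step 3 of the trace flow):

const proxy = new LangfuseProxy();
const traceId = `ui2-${Date.now()}-abc123`;

// Create the trace, then push buffered events to Langfuse before calling the backend
await proxy.createTrace({ traceId, name: 'Research Assistant', sessionId: 'session-1' });
await proxy.flush();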

2. Enhanced Tracing (src/utils/langfuseEnhanced.ts)

All spans are prefixed with “frontend-” and include source metadata:

export async function createUserInteractionTrace(params: {
  traceId: string;
  sessionId: string;
  userId?: string;
  userEmail?: string;
  assistantId: string;
  assistantName: string;  // 🔑 Assistant name for trace naming
  message: string;
  selectedItemsCount?: number;
}) {
  await langfuseProxy.createTrace({
    traceId: params.traceId,
    name: params.assistantName,  // 🔑 Trace named with assistant
    sessionId: params.sessionId,
    userId: params.userEmail || params.userId,
    metadata: {
      assistantId: params.assistantId,
      assistantName: params.assistantName,
      messageLength: params.message.length,
      selectedItemsCount: params.selectedItemsCount || 0,
      source: 'frontend-click',  // 🔑 Frontend source labeling
      timestamp: new Date().toISOString(),
      environment: isBrowser ? 'browser' : 'server',
    },
    input: { message: params.message }
  });
}
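
A call site might look like the following (field values are illustrative):

await createUserInteractionTrace({
  traceId: `ui2-${Date.now()}-abc123`,
  sessionId: 'session-1',
  userEmail: 'user@example.com',
  assistantId: 'asst_123456',
  assistantName: 'Research Assistant',
  message: 'Summarize the latest results',
  selectedItemsCount: 2,
});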

3. Core Spans Created

a) UI Processing Span (frontend-ui-processing)

  • Captures message preparation, validation, UI updates
  • Created immediately when user clicks send

b) Backend API Call Span (frontend-backend-api-call)

  • Tracks network latency to main backend API
  • Includes request preparation time

c) Streaming Generation Span (streaming-response)

  • Monitors streaming performance and chunk delivery
  • Updates with progress metrics every 10 chunks
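
A minimal sketch of opening and closing one of these spans through the proxy; the exact CreateSpanParams/EndSpanParams shapes are assumptions, since only the method names appear in langfuseProxy.ts:

// Assumed parameter shapes; only the method names are shown in langfuseProxy.ts
const traceId = 'ui2-1671234567890-abc123';

await langfuseProxy.createSpan({
  traceId,
  name: 'frontend-ui-processing',
  metadata: { source: 'frontend' },
});
// ... message preparation, validation, UI updates ...
await langfuseProxy.endSpan({
  traceId,
  name: 'frontend-ui-processing',
  output: { validated: true },
});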

4. VacChat Integration (src/utils/vacChat.ts)

// Only create trace if none exists (prevents duplicates)
const shouldCreateNewTrace = !params.traceId && !params.langfuseTrace;

if (shouldCreateNewTrace) {
  await langfuseProxy.createTrace({
    traceId: traceId,
    name: (params.emissaryConfig?.name as string) ?? 'Assistant',
    sessionId: params.sessionId || undefined,
    userId: params.emissaryConfig?.currentUser?.email || undefined,
    metadata: {
      requestId,
      emissaryConfig: params.emissaryConfig,
    }
  });
}
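
This guard exists because MessageStreamingContext.tsx may already have created the trace for the interaction; vacChat only creates a new one when it receives neither a trace ID nor a live trace object. It is also the first thing to verify when debugging duplicate traces (see Common Issues below).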

5. Latency Tracking

A LatencyTracker instance tracks key milestones:

const latencyTracker = new LatencyTracker();
latencyTracker.mark('trace_created');
latencyTracker.mark('firestore_save_started'); 
latencyTracker.mark('pre_api_flush');          // 🔑 New: before API call
latencyTracker.mark('post_api_flush_success'); // 🔑 New: flush timing
latencyTracker.mark('first_backend_response');
latencyTracker.mark('first_chunk_received');
latencyTracker.mark('streaming_complete');
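
The tracker's implementation is not shown in this document; a minimal sketch consistent with the marks above and the latencies block in the sample metadata might look like:

class LatencyTracker {
  private readonly start = performance.now();
  private readonly marks: Record<string, number> = {};

  // Record elapsed milliseconds since construction under a milestone name
  mark(name: string): void {
    this.marks[name] = performance.now() - this.start;
  }

  // Snapshot suitable for attaching to trace metadata (e.g. the "latencies" object)
  snapshot(): Record<string, number> {
    return { ...this.marks };
  }
}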

Backend Proxy API (src/app/api/langfuse/trace/route.ts)

The backend proxy endpoint handles secure Langfuse operations:

import { NextRequest, NextResponse } from 'next/server';
// `langfuse` is a server-side Langfuse SDK client initialized with LANGFUSE_SECRET_KEY

export async function POST(request: NextRequest) {
  const body = await request.json();
  const { action, data } = body;

  switch (action) {
    case 'create_trace': {
      const { traceId, name, input, metadata, sessionId, userId } = data;
      const traceParams: any = {
        id: traceId,
        input,
        metadata: { ...metadata, source: 'frontend-proxy' },
        sessionId,
        userId
      };
      
      // Only add name if provided (supports assistant-specific naming)
      if (name) traceParams.name = name;
      
      const trace = langfuse.trace(traceParams);
      return NextResponse.json({ success: true, traceId: trace.id });
    }
    // ... other actions: create_span, end_span, create_generation, flush
  }
}
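
The elided flush action can be very small; a sketch, assuming the standard Langfuse JS SDK client, which exposes flushAsync():

// Inside the switch (action) above:
case 'flush': {
  await langfuse.flushAsync(); // push any buffered events to Langfuse immediately
  return NextResponse.json({ success: true });
}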

Main Backend Integration

The main backend (Python) receives the trace ID and continues the trace with assistant context:

# Backend continues the trace started by frontend
if kwargs.get("trace_id") is not None:
    trace_id = kwargs.get('trace_id')
    trace = langfuse.trace(
        id=trace_id, 
        input=question,
        name=f"backend-assistant-{assistant_name}",  # Assistant-specific naming
        metadata={
          "SERVICE_NAME": os.getenv("SERVICE_NAME","NO_SERVICE_NAME"),
          "assistant_id": assistant_id,
          "assistant_name": assistant_name,
          "source": "backend-api"
        }
    )

Key Metrics Captured

  1. Frontend Processing Time: UI preparation, validation, Firestore saves
  2. Proxy Flush Latency: Time to flush events before API call
  3. Network Latency: Frontend to backend API response time
  4. Time to First Token: Time until first streaming chunk received
  5. Streaming Performance: Chunk delivery rate and total duration
  6. Assistant Attribution: All metrics tagged with assistant name/ID
  7. Session Grouping: Related interactions grouped by sessionId
  8. Source Identification: Clear labeling of frontend vs backend spans

File References & Implementation Locations

Core Frontend Files

File                                       Purpose                    Key Functions
src/utils/langfuseProxy.ts                 Secure proxy client        createTrace(), createSpan(), flush()
src/utils/langfuseEnhanced.ts              Frontend tracing logic     createUserInteractionTrace(), LatencyTracker
src/utils/vacChat.ts                       Main chat integration      Trace creation, backend call spans
src/contexts/MessageStreamingContext.tsx   React context integration  User interaction trace creation
src/utils/chatStreaming.ts                 Streaming management       Trace continuation, completion
src/services/streamingService.ts           Service layer              Assistant name integration

Backend Files

File                                  Purpose                Description
src/app/api/langfuse/trace/route.ts   Secure proxy API       Handles all Langfuse operations with the secret key
Backend Python files                  Main API continuation  Continues traces with assistant context

Configuration

Environment Variables

# Frontend (proxy-based - only public key exposed)
NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY=pk-your-public-key
NEXT_PUBLIC_LANGFUSE_BASE_URL=https://analytics.aitana.chat

# Backend Proxy (Next.js API route - secret key protected)
LANGFUSE_SECRET_KEY=sk-your-secret-key

# Main Backend (Python - secret key for direct operations)
LANGFUSE_SECRET_KEY=sk-your-secret-key

Proxy-Based Security

The frontend never directly accesses the Langfuse secret key:

// ❌ OLD: Direct client (security risk)
const langfuseClient = new LangfuseWeb({
  publicKey: process.env.NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY,
  secretKey: process.env.LANGFUSE_SECRET_KEY  // 🚨 EXPOSED!
});

// ✅ NEW: Proxy-based (secure)
const langfuseProxy = new LangfuseProxy(); // Uses /api/langfuse/trace

Usage Example

Complete Flow with Assistant Context

  1. User clicks send → MessageStreamingContext.tsx creates trace with assistant name
  2. Frontend processing → Multiple spans track UI operations (all prefixed with “frontend-”)
  3. Firestore save → User message saved with trace ID
  4. Pre-API flush → Events flushed to Langfuse via proxy before backend call
  5. API call → Backend receives trace ID and assistant context
  6. Backend continuation → Python backend continues trace with assistant-specific naming
  7. Streaming response → Generation span tracks AI model performance
  8. Completion → Full end-to-end metrics captured with assistant attribution

Trace Structure in Dashboard

📊 Trace: "Claude Assistant" (or specific assistant name)
├── 🎯 frontend-ui-processing (UI preparation)
├── 🌐 frontend-backend-api-call (network latency)
├── 🤖 streaming-response (AI generation)
└── 📝 backend-assistant-{name} (backend processing)

Viewing Traces

Traces can be viewed in the Langfuse dashboard at:

  • URL: https://analytics.aitana.chat
  • Trace Names: Assistant names (e.g., “Research Assistant”, “Claude Helper”)
  • Span Prefixes: frontend-* for frontend spans, backend-* for backend spans
  • Grouping: By sessionId for related conversations
  • Metadata: Assistant ID, user info, message lengths, source labels

Sample Trace Metadata

{
  "assistantId": "asst_123456",
  "assistantName": "Research Assistant", 
  "messageLength": 45,
  "selectedItemsCount": 2,
  "source": "frontend-click",
  "environment": "browser",
  "latencies": {
    "trace_created": 1.2,
    "pre_api_flush": 15.4,
    "first_backend_response": 234.7,
    "streaming_complete": 2847.3
  }
}

Debugging & Troubleshooting

Console Logging

The client logs its proxy configuration and each tracing operation to the browser console:

Langfuse Enhanced client using server-side proxy: {
  environment: 'browser',
  baseUrl: 'https://analytics.aitana.chat',
  publicKeyConfigured: true,
  usingProxy: true
}

Created new langfuse trace via proxy ui2-1671234567890-abc123
Creating span via proxy: frontend-ui-processing
Span created via proxy: { success: true }

Health Check Endpoint

Check proxy configuration:

curl https://your-app.com/api/langfuse/trace

Expected response:

{
  "status": "healthy",
  "langfuse_configured": true,
  "public_key_available": true,
  "secret_key_available": true,
  "base_url": "https://analytics.aitana.chat"
}
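
A sketch of the GET handler that could produce this response (the actual implementation is not shown here; the field derivation is an assumption):

import { NextResponse } from 'next/server';

export async function GET() {
  const publicKey = process.env.NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY;
  const secretKey = process.env.LANGFUSE_SECRET_KEY;
  return NextResponse.json({
    status: 'healthy',
    langfuse_configured: Boolean(publicKey && secretKey),
    public_key_available: Boolean(publicKey),
    secret_key_available: Boolean(secretKey),
    base_url: process.env.NEXT_PUBLIC_LANGFUSE_BASE_URL,
  });
}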

Common Issues

Issue               Symptom                               Solution
Missing secret key  secret_key_available: false           Set LANGFUSE_SECRET_KEY in the environment
Proxy failures      Failed to create trace via proxy      Check the proxy endpoint and network connectivity
Duplicate traces    Multiple traces for same interaction  Verify the shouldCreateNewTrace logic

See Langfuse Troubleshooting for detailed debugging.

Benefits

  1. Assistant-Specific Analytics: Track performance per assistant type
  2. End-to-End Visibility: Frontend + backend in single trace
  3. Security: Proxy pattern protects sensitive credentials
  4. Performance Monitoring: Identify bottlenecks across the full stack
  5. User Experience: Measure actual user-perceived latency
  6. Session Analysis: Group related interactions for user journey insights

Future Enhancements

  1. Error Tracking: Add error spans with stack traces
  2. Tool Usage Analytics: Track specific tool invocations within traces
  3. A/B Testing: Tag traces with experiment variants
  4. Cost Tracking: Include token usage and costs in metadata
  5. Real-time Dashboards: Live performance monitoring