Claude Code Telemetry - Development Guide

This guide helps you quickly understand and resume work on the telemetry bridge.

🏗️ Project Overview

Purpose: A bridge that captures Claude Code's telemetry and forwards it to Langfuse for LLM observability.

Architecture:

Claude Code → OTLP/HTTP → Bridge Server → Langfuse API
             (JSON logs)   (Parse & Map)   (Traces/Spans)
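To make the middle box of the diagram concrete, here is a minimal sketch of an OTLP/HTTP JSON receiver. It is not the code in src/server.js; routeOtlpRequest and startBridge are hypothetical names. The /v1/logs and /v1/metrics paths are the standard OTLP/HTTP paths, and /health mirrors the health check used in the Debugging section.

```javascript
const http = require('node:http');

// Decide how to answer one request. Kept pure so it can be tested without a socket.
function routeOtlpRequest(method, path, bodyText) {
  if (method === 'GET' && path === '/health') {
    return { status: 200, kind: 'health', payload: null };
  }
  if (method !== 'POST') {
    return { status: 405, kind: 'rejected', payload: null };
  }
  const kind = path === '/v1/logs' ? 'logs' : path === '/v1/metrics' ? 'metrics' : null;
  if (!kind) {
    return { status: 404, kind: 'rejected', payload: null };
  }
  try {
    // Claude Code is configured for http/json, so the body is parsed as JSON.
    return { status: 200, kind, payload: JSON.parse(bodyText) };
  } catch {
    return { status: 400, kind: 'rejected', payload: null };
  }
}

// Wire the router into a plain Node HTTP server (hypothetical helper).
function startBridge(port, onPayload) {
  const server = http.createServer((req, res) => {
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      const result = routeOtlpRequest(req.method, req.url, body);
      if (result.payload) onPayload(result.kind, result.payload);
      res.writeHead(result.status, { 'Content-Type': 'application/json' });
      res.end('{}');
    });
  });
  return server.listen(port);
}
```

Calling startBridge(4318, (kind, payload) => console.log(kind, payload)) would print every parsed payload, which is a convenient starting point before any Langfuse mapping is added.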

Quick Setup: Run ./quickstart.sh for a complete setup with:

Langfuse stack (PostgreSQL, ClickHouse, Redis, MinIO)
Unique credentials generated automatically
Telemetry bridge configured and ready

Key Modules:

src/server.js - Main OTLP server (port 4318)
src/sessionHandler.js - Session lifecycle management
src/eventProcessor.js - Maps Claude events to Langfuse
src/metricsProcessor.js - Handles cost/token metrics
src/requestHandlers.js - HTTP request processing
src/serverHelpers.js - Server utilities

🚨 Critical Knowledge

Claude's Non-Standard OTLP Implementation

MUST REMEMBER: Claude Code does NOT follow OpenTelemetry defaults!

No default endpoint - OTEL_EXPORTER_OTLP_ENDPOINT must be explicitly set
All 6 env vars required - No partial configuration works
JSON only - Protobuf not supported
Custom event names - Uses claude_code.* namespace

Required Environment Variables (All 6)

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_LOGS_EXPORTER=otlp
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_METRICS_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:4318  # NO DEFAULT!
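Because a partial configuration fails silently, a small check can report exactly which variable is missing or wrong. The sketch below is an illustration only; checkTelemetryEnv is a hypothetical helper, and the expected values are simply the ones listed above.

```javascript
// The six variables and the values listed above.
const REQUIRED_TELEMETRY_ENV = {
  CLAUDE_CODE_ENABLE_TELEMETRY: '1',
  OTEL_LOGS_EXPORTER: 'otlp',
  OTEL_METRICS_EXPORTER: 'otlp',
  OTEL_EXPORTER_OTLP_PROTOCOL: 'http/json',
  OTEL_EXPORTER_OTLP_METRICS_PROTOCOL: 'http/json',
  OTEL_EXPORTER_OTLP_ENDPOINT: null, // any non-empty value; there is no default
};

// Return a list of human-readable problems; an empty list means the env looks complete.
function checkTelemetryEnv(env) {
  const problems = [];
  for (const [name, expected] of Object.entries(REQUIRED_TELEMETRY_ENV)) {
    const actual = env[name];
    if (!actual) {
      problems.push(`${name} is not set`);
    } else if (expected !== null && actual !== expected) {
      problems.push(`${name} is "${actual}", expected "${expected}"`);
    }
  }
  return problems;
}
```

Running checkTelemetryEnv(process.env) in the shell that launches Claude Code gives a quick answer before any server-side debugging starts.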

Standard Attributes on All Events/Metrics

session.id - Unique session identifier
organization.id - Organization UUID
user.account_uuid - User account UUID
user.email - User email address
terminal.type - Terminal type (e.g., "vscode")
app.version - Claude Code version
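In OTLP JSON, attributes arrive as a list of { key, value } pairs where value is a typed wrapper such as { stringValue: "..." }. The following sketch shows one way to flatten that list and pull out the six standard attributes. The function names are hypothetical and this is not the extraction code used in src/.

```javascript
const STANDARD_ATTRIBUTE_KEYS = [
  'session.id', 'organization.id', 'user.account_uuid',
  'user.email', 'terminal.type', 'app.version',
];

// Unwrap an OTLP AnyValue such as { stringValue: "x" } or { intValue: "42" }.
function unwrapAnyValue(value) {
  if (!value) return undefined;
  if ('stringValue' in value) return value.stringValue;
  if ('intValue' in value) return Number(value.intValue); // int64 is a string in OTLP JSON
  if ('doubleValue' in value) return value.doubleValue;
  if ('boolValue' in value) return value.boolValue;
  return undefined; // arrays, kvlists and bytes are ignored in this sketch
}

// Turn [{ key, value }] into a plain object.
function flattenAttributes(attributes = []) {
  const flat = {};
  for (const { key, value } of attributes) flat[key] = unwrapAnyValue(value);
  return flat;
}

// Split out the standard attributes and report which ones are absent.
function extractStandardAttributes(attributes) {
  const flat = flattenAttributes(attributes);
  const standard = {};
  const missing = [];
  for (const key of STANDARD_ATTRIBUTE_KEYS) {
    if (flat[key] === undefined) missing.push(key);
    else standard[key] = flat[key];
  }
  return { standard, missing };
}
```

Returning the missing keys alongside the extracted ones makes it easy to log a warning when an event arrives without, say, user.email.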

📊 Data Flow

Events (via Logs)

claude_code.user_prompt - User input received
- Contains: prompt, prompt_length, event.timestamp
claude_code.api_request - Model calls (Haiku + Opus)
- Contains: model, tokens (input/output/cache), cost, duration, request_id
claude_code.tool_result - Tool execution results
- Contains: tool_name, success, duration_ms
claude_code.api_error - Failed API calls
- Contains: error_message, status_code, model
claude_code.tool_decision - Tool permission decisions
- Contains: decision, source, tool_name
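The events above arrive inside the nested OTLP logs envelope (resourceLogs, then scopeLogs, then logRecords). A minimal sketch of walking that envelope follows. One assumption is labeled in the code: that the event name is carried in the log record body. If your payloads carry it in an attribute instead, adjust the name lookup. The helper names are hypothetical.

```javascript
// Flatten OTLP key/value pairs (string, double, bool and int values only).
function toPlainAttributes(attributes = []) {
  const out = {};
  for (const { key, value = {} } of attributes) {
    out[key] = value.stringValue ?? value.doubleValue ?? value.boolValue
      ?? (value.intValue !== undefined ? Number(value.intValue) : undefined);
  }
  return out;
}

// Yield one simplified object per log record in an OTLP logs payload.
function* iterateLogRecords(payload) {
  for (const resourceLog of payload.resourceLogs ?? []) {
    for (const scopeLog of resourceLog.scopeLogs ?? []) {
      for (const record of scopeLog.logRecords ?? []) {
        yield {
          // Assumption: the event name (e.g. claude_code.api_request) is the log body.
          name: record.body?.stringValue,
          timeUnixNano: record.timeUnixNano,
          attributes: toPlainAttributes(record.attributes),
        };
      }
    }
  }
}

// Count how many records of each event name a payload contains.
function countEventsByName(payload) {
  const counts = {};
  for (const { name } of iterateLogRecords(payload)) {
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}
```

Logging countEventsByName(payload) for each incoming request is a cheap way to confirm which event types Claude Code is actually sending.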

Metrics

claude_code.cost.usage - USD per model
claude_code.token.usage - Token breakdown (input/output/cacheRead/cacheCreation)
claude_code.lines_of_code.count - Code changes (added/removed)
claude_code.commit.count - Git commits
claude_code.pull_request.count - PRs created
claude_code.code_edit_tool.decision - Tool accept/reject decisions
claude_code.active_time.total - Active interaction time
claude_code.session.count - Session starts
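Metrics use a parallel envelope (resourceMetrics, then scopeMetrics, then metrics, each with data points). The sketch below totals claude_code.token.usage by token type. It rests on two assumptions: that the token type is carried in a data point attribute named type, and that data points are deltas, so adding them up is meaningful. With cumulative temporality you would keep the latest value instead. The function names are hypothetical.

```javascript
// Yield { name, value, attributes } for every data point in an OTLP metrics payload.
function* iterateDataPoints(payload) {
  for (const resourceMetric of payload.resourceMetrics ?? []) {
    for (const scopeMetric of resourceMetric.scopeMetrics ?? []) {
      for (const metric of scopeMetric.metrics ?? []) {
        // Counters arrive as `sum`; gauges are accepted too for completeness.
        const points = metric.sum?.dataPoints ?? metric.gauge?.dataPoints ?? [];
        for (const point of points) {
          const attributes = Object.fromEntries(
            (point.attributes ?? []).map(({ key, value }) => [key, value?.stringValue]),
          );
          // asInt is a string in OTLP JSON; asDouble is a plain number.
          const value = point.asDouble ?? Number(point.asInt ?? 0);
          yield { name: metric.name, value, attributes };
        }
      }
    }
  }
}

// Sum token usage per type (assumed attribute name: "type").
function totalTokensByType(payload) {
  const totals = {};
  for (const point of iterateDataPoints(payload)) {
    if (point.name !== 'claude_code.token.usage') continue;
    const type = point.attributes.type ?? 'unknown';
    totals[type] = (totals[type] ?? 0) + point.value;
  }
  return totals;
}
```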

Session Lifecycle

Auto-created on first event
1-hour timeout for cleanup (configurable via SESSION_TIMEOUT)
Tracks: total cost, tokens, cache usage, tool usage, code changes
Creates session summary with quality and efficiency scores
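The lifecycle above can be sketched as an in-memory map with an idle sweep. This is an illustration, not the code in src/sessionHandler.js: SessionStore and its fields are hypothetical, and the timeout is assumed to be in milliseconds (check how SESSION_TIMEOUT is actually parsed before reusing the default).

```javascript
class SessionStore {
  // `now` is injectable so tests can control the clock.
  constructor({ timeoutMs = 60 * 60 * 1000, now = Date.now } = {}) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.sessions = new Map();
  }

  // Return the session for this id, creating it on the first event.
  touch(sessionId) {
    let session = this.sessions.get(sessionId);
    if (!session) {
      session = { id: sessionId, totalCost: 0, inputTokens: 0, outputTokens: 0, lastSeen: 0 };
      this.sessions.set(sessionId, session);
    }
    session.lastSeen = this.now();
    return session;
  }

  // Fold one API request into the running totals.
  recordApiRequest(sessionId, { cost = 0, inputTokens = 0, outputTokens = 0 } = {}) {
    const session = this.touch(sessionId);
    session.totalCost += cost;
    session.inputTokens += inputTokens;
    session.outputTokens += outputTokens;
    return session;
  }

  // Remove idle sessions, handing each to onExpire (for example, to write a summary).
  sweep(onExpire = () => {}) {
    const cutoff = this.now() - this.timeoutMs;
    let removed = 0;
    for (const [id, session] of this.sessions) {
      if (session.lastSeen <= cutoff) {
        this.sessions.delete(id);
        onExpire(session);
        removed += 1;
      }
    }
    return removed;
  }
}
```

A periodic setInterval(() => store.sweep(writeSummary), 60000) would drive the cleanup, where writeSummary is whatever creates the session summary in Langfuse.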

Langfuse Mapping

Traces: One per conversation + session summary
Generations: For each API call with full token/cost data
Events: For tools, decisions, errors, and milestones
Scores: Quality and efficiency on session summary
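As an illustration of the Generations mapping, the sketch below turns the flattened attributes of one claude_code.api_request event into a plain parameter object. The attribute names (input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, duration_ms) are assumptions about how the fields listed under Events are spelled, and the output shape is only loosely modeled on what a Langfuse generation call accepts. Verify both against real payloads and the SDK version in use.

```javascript
// Hypothetical mapper: flattened api_request attributes -> generation parameters.
function toGenerationParams(traceId, attrs) {
  const input = Number(attrs.input_tokens ?? 0);
  const output = Number(attrs.output_tokens ?? 0);
  return {
    traceId, // link by trace id rather than by parent observation
    name: 'claude_code.api_request',
    model: attrs.model,
    usage: { input, output, total: input + output },
    metadata: {
      costUsd: Number(attrs.cost_usd ?? 0),
      durationMs: Number(attrs.duration_ms ?? 0),
      cacheReadTokens: Number(attrs.cache_read_tokens ?? 0),
      cacheCreationTokens: Number(attrs.cache_creation_tokens ?? 0),
      requestId: attrs.request_id,
    },
  };
}
```

Keeping the mapper pure (attributes in, plain object out) lets unit tests cover it without mocking the Langfuse client at all.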

🧪 Testing

Test Structure

test/
├── unit/           # Mocked tests - fast, isolated
├── integration/    # Real Langfuse API tests
└── helpers/        # Test utilities and clients

Running Tests

# Unit tests only (no external dependencies)
npm run test:unit

# Integration tests (requires Langfuse)
export LANGFUSE_PUBLIC_KEY=xxx
export LANGFUSE_SECRET_KEY=xxx
npm run test:integration

# All tests
npm test

Manual Testing

# Terminal 1: Start server
npm start

# Terminal 2: Test with real Claude
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_LOGS_EXPORTER=otlp
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_METRICS_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:4318
export OTEL_LOG_USER_PROMPTS=1  # optional: include prompt text in user_prompt events
claude "What is 2+2?"

Integration Test Features

LangfuseTestClient: Direct API access for verification
OTLP Test Data Builders: Consistent test fixtures
E2E Tests: Run real Claude commands
Validation Helpers: Comprehensive data checking
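For readers who have not seen an OTLP fixture builder, here is a minimal sketch of one. It is not the builder shipped in test/helpers; buildLogsPayload and toAnyValue are hypothetical names, and placing the event name in the log body is an assumption about the payload shape.

```javascript
// Wrap a JavaScript value in the matching OTLP AnyValue field.
function toAnyValue(value) {
  if (typeof value === 'boolean') return { boolValue: value };
  if (Number.isInteger(value)) return { intValue: String(value) }; // int64 is a string in OTLP JSON
  if (typeof value === 'number') return { doubleValue: value };
  return { stringValue: String(value) };
}

// Build a one-record OTLP logs payload for the given event name and attributes.
function buildLogsPayload(eventName, attributes = {}, timeMs = Date.now()) {
  return {
    resourceLogs: [{
      scopeLogs: [{
        logRecords: [{
          // OTLP timestamps are nanoseconds since the epoch, encoded as a string.
          timeUnixNano: `${BigInt(timeMs) * 1000000n}`,
          body: { stringValue: eventName }, // assumption: event name lives in the body
          attributes: Object.entries(attributes).map(
            ([key, value]) => ({ key, value: toAnyValue(value) }),
          ),
        }],
      }],
    }],
  };
}
```

A fixture such as buildLogsPayload('claude_code.user_prompt', { 'session.id': 'test-1', prompt_length: 12 }) can be POSTed to /v1/logs in an integration test, keeping every test on the same payload shape.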

🐛 Debugging

No Telemetry Received

Verify all 6 env vars are set (check them against the server startup banner)
Check endpoint has no typos (common: missing http://)
Confirm server health: curl http://localhost:4318/health
Enable debug: LOG_LEVEL=debug npm start
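The missing http:// typo is easy to catch mechanically. The sketch below uses the standard URL class and is an illustration only; checkEndpoint is a hypothetical helper.

```javascript
// Return null when the endpoint looks usable, otherwise a description of the problem.
function checkEndpoint(raw) {
  if (!raw) return 'endpoint is empty';
  let url;
  try {
    url = new URL(raw);
  } catch {
    return `"${raw}" is not a valid URL (is the http:// prefix missing?)`;
  }
  // "localhost:4318" parses, but with "localhost:" as its scheme, so check explicitly.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return `"${raw}" has scheme "${url.protocol}", expected http:// or https://`;
  }
  return null;
}
```

Running checkEndpoint(process.env.OTEL_EXPORTER_OTLP_ENDPOINT) in the shell that launches Claude Code rules out this class of mistake before looking at the server.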

Missing Data in Langfuse

Run validation: node test/helpers/validate-langfuse.js
Check whether the missing data belongs on a conversation trace or on a session-summary trace
Verify all standard attributes are present
Ensure metrics are being sent (not just logs)

Common Issues

No generations: Check logs are being sent, not just metrics
No scores: Created only on session finalization
Missing cache tokens: Verify token metrics include all four types (input, output, cacheRead, cacheCreation)
No events: Tool usage creates events, simple prompts don't

Langfuse Connection Issues

Check .env has valid keys
Verify LANGFUSE_HOST is correct
Use scripts/debug-generations.js for manual testing

🔧 Development Workflow

Making Changes

Event Processing: Edit src/eventProcessor.js
Metrics: Edit src/metricsProcessor.js
Session Logic: Edit src/sessionHandler.js
Request Handling: Edit src/requestHandlers.js
New Endpoints: Edit src/server.js

Adding New Event Types

// In eventProcessor.js
case 'claude_code.new_event':
  return processNewEvent(attrs, standardAttrs, timestamp, session)

// Don't forget to:
// 1. Extract standard attributes
// 2. Pass to session handler
// 3. Create appropriate Langfuse entities
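To show how the three reminders fit together, here is a hedged sketch of what such a handler and its dispatch could look like. Only the processNewEvent call signature comes from the snippet above. The function bodies, the session fields (eventCount, traceId) and the returned shape are hypothetical, and the timestamp is assumed to be something the Date constructor accepts.

```javascript
// Hypothetical handler for the new event type.
function processNewEvent(attrs, standardAttrs, timestamp, session) {
  // 1. Standard attributes arrive already extracted; merge them into the metadata.
  const metadata = { ...standardAttrs, ...attrs };
  // 2. Let the session aggregate whatever it tracks for this event type.
  session.eventCount = (session.eventCount ?? 0) + 1;
  // 3. Describe the Langfuse entity to create (here, an event on the session's trace).
  return {
    type: 'event',
    traceId: session.traceId,
    name: 'claude_code.new_event',
    startTime: new Date(timestamp),
    metadata,
  };
}

// Hypothetical dispatcher showing where the new case slots in.
function processEvent(name, attrs, standardAttrs, timestamp, session) {
  switch (name) {
    case 'claude_code.new_event':
      return processNewEvent(attrs, standardAttrs, timestamp, session);
    default:
      return null; // unknown events are ignored in this sketch
  }
}
```

Returning a description of the entity rather than calling the Langfuse client directly is a design choice for this sketch: it keeps the handler testable with plain assertions.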

Testing Changes

Write unit test with mocks
Write integration test with real Langfuse
Test manually with Claude (see Quick Commands)
Check Langfuse dashboard for results

📝 Recent Changes & Context

Major Refactoring (Latest)

Comprehensive telemetry capture - All Claude Code events and metrics
Fixed Langfuse SDK usage - Use traceId, not parentObservationId
True integration tests - Test with real Langfuse API
Modular architecture - Separated concerns for testability
Complete metadata tracking - Organization, user, terminal info
Cache token support - Track cache read/creation separately

Architecture Decision

Stateful session management is the correct approach
Session aggregation provides the actual value customers need
Complexity is worth it for actionable insights (costs, efficiency, productivity)
Focus on making the aggregated data more valuable, not simpler

Test Coverage

96%+ coverage on business logic
Real integration tests catch actual issues
E2E tests validate full flow

Known Limitations

Sessions accumulate in memory (1-hour cleanup helps manage this)
Some metrics rarely observed (PR count, active time)
Session summary created on timeout or graceful shutdown

🎯 Future Improvements That Actually Matter

Customer-Focused Features

Cost Alerts: Notify when session exceeds threshold
Daily Reports: Email summary of team AI usage
Efficiency Tips: "You could save 80% by using cache"
Team Dashboard: Compare developer AI efficiency
ROI Calculator: Hours saved vs dollars spent

What NOT to Build

Alternative architectures (stateless, event sourcing, etc.)
Complex configuration options
Multiple storage backends
Theoretical performance optimizations

Remember: Customers need insights, not infrastructure.

💡 Quick Commands

# Full setup with Langfuse included
./quickstart.sh

# Start standalone server (requires existing Langfuse)
npm start

# Manage Langfuse services
./scripts/langfuse up    # Start
./scripts/langfuse down  # Stop
./scripts/langfuse logs  # View logs

# Run tests
npm test                      # All tests
npm run test:unit            # Unit tests only
npm run test:integration     # Integration tests (needs Langfuse)

# Test with real Claude
source claude-telemetry.env
claude "What is 2+2?"

# Debug mode
LOG_LEVEL=debug npm start

# Check what's running
lsof -i:4318

# Clean up everything
./scripts/cleanup-langfuse.sh

🔗 Key Resources

Remember:

Claude's OTLP implementation is non-standard - no defaults!
Always test with real Claude binary when making changes
Integration tests are your friend - they catch real issues
Check all standard attributes are captured and forwarded
