Now that your requests are flowing through Helicone, let’s explore the platform architecture and understand how everything works together.

Architecture Overview

Helicone comprises five core services that work together to provide its AI Gateway and observability capabilities:

Core Components

Worker (Cloudflare Workers)

The Worker is the edge proxy that intercepts and routes all LLM requests. It's deployed globally on Cloudflare's edge network for minimal latency.

Key responsibilities:
  • Route requests to 100+ LLM providers
  • Apply intelligent fallbacks when providers fail
  • Enforce rate limits and caching rules
  • Log request metadata to Jawn
  • Add <50ms overhead on average
Code location: /worker

How it works:
// Simplified worker flow
export default {
  async fetch(request: Request): Promise<Response> {
    // 1. Parse Helicone headers
    const heliconeAuth = request.headers.get("Helicone-Auth");

    // 2. Determine target provider and model from the request body
    const body = await request.clone().json();
    const { provider, model } = parseModel(body.model);

    // 3. Route to the correct provider
    const providerResponse = await routeToProvider({
      provider,
      model,
      request,
    });

    // 4. Log request/response to Jawn; cost, tokens, and latency
    //    are derived from the provider response
    await logToJawn({
      request,
      response: providerResponse,
      metadata: extractMetadata(providerResponse), // { cost, tokens, latency }
    });

    // 5. Return the provider response
    return providerResponse;
  },
};
Deployment:
  • Hosted on Cloudflare Workers (cloud)
  • Runs locally with wrangler dev (self-hosted)

Request Flow

Let’s trace a complete request through the system:
1. User sends LLM request

Your application sends a request to the Helicone AI Gateway:
const response = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
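That call reaches Helicone because the client is constructed against the gateway rather than the provider directly. A minimal setup, assuming the OpenAI SDK and Helicone's OpenAI proxy endpoint:

import OpenAI from "openai";

// Point the SDK at the Helicone gateway instead of api.openai.com
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});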
2. Worker routes to provider

The Worker (Cloudflare edge):
  • Authenticates the request
  • Determines target provider (OpenAI)
  • Applies any caching or rate limit rules
  • Forwards request to OpenAI
  • Receives response
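Caching and rate-limit rules are controlled per request via Helicone headers. A sketch using the Helicone-Cache-Enabled and Helicone-RateLimit-Policy headers (the policy value shown is illustrative, not a recommendation):

// Per-request Helicone headers, passed through the OpenAI SDK's options
const response = await openai.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",         // serve repeat requests from cache
      "Helicone-RateLimit-Policy": "1000;w=60", // e.g. 1000 requests per 60s window
    },
  }
);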
3. Provider responds

OpenAI processes the request and returns the completion. The Worker then:
  • Calculates cost and token usage
  • Streams response back to user (if streaming)
  • Logs request metadata to Jawn
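Cost is derived from the token counts the provider returns. A minimal sketch of that calculation (the per-token prices here are illustrative; the real values live in the cost registry under packages/cost):

// Hypothetical pricing table keyed by model, in dollars per million tokens
const PRICING = {
  "gpt-4o-mini": { inputPerMillion: 0.15, outputPerMillion: 0.6 },
};

function calculateCost(
  model: string,
  usage: { prompt_tokens: number; completion_tokens: number }
): number {
  const price = PRICING[model as keyof typeof PRICING];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (
    (usage.prompt_tokens / 1_000_000) * price.inputPerMillion +
    (usage.completion_tokens / 1_000_000) * price.outputPerMillion
  );
}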
4. Jawn stores data

Jawn receives the log and:
  • Stores metadata in Supabase (request ID, user, timestamp)
  • Writes metrics to ClickHouse (cost, tokens, latency)
  • Uploads request/response to MinIO (full bodies)
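A sketch of that tiered write; the client objects, table names, and bucket names below are illustrative stand-ins, not Jawn's actual code:

// Illustrative only: `supabase`, `clickhouse`, and `s3` are assumed,
// pre-configured clients mirroring the three storage layers above
async function handleLog(log: {
  requestId: string;
  userId: string;
  timestamp: string;
  cost: number;
  tokens: number;
  latencyMs: number;
  requestBody: unknown;
  responseBody: unknown;
}) {
  // 1. Metadata: small rows, relational lookups
  await supabase.from("request").insert({
    id: log.requestId,
    user_id: log.userId,
    created_at: log.timestamp,
  });

  // 2. Metrics: columnar storage built for aggregation
  await clickhouse.insert({
    table: "request_metrics",
    values: [{ cost: log.cost, tokens: log.tokens, latency_ms: log.latencyMs }],
  });

  // 3. Bodies: large blobs, fetched only on demand
  await s3.putObject({
    Bucket: "request-bodies",
    Key: `${log.requestId}.json`,
    Body: JSON.stringify({ request: log.requestBody, response: log.responseBody }),
  });
}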
5. Dashboard displays the data

The Web dashboard:
  • Queries the Jawn API for the request list
  • Displays new requests in near real time (<5 seconds)
  • Loads full request/response bodies from MinIO on click
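You can query the same data directly. A sketch, assuming a request-query endpoint at api.helicone.ai; check the API reference for the exact request shape:

// Fetch recent requests from the Jawn-backed API (body shape assumed)
const res = await fetch("https://api.helicone.ai/v1/request/query", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.HELICONE_API_KEY}`,
  },
  body: JSON.stringify({ filter: "all", limit: 10 }),
});
const requests = await res.json();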

Key Features Enabled by Architecture

Edge Routing

Worker on Cloudflare edge means:
  • <50ms latency overhead
  • Global availability
  • Automatic failover
  • DDoS protection

Scalable Analytics

ClickHouse columnar storage:
  • Query millions of requests
  • Real-time aggregations
  • Cost-effective at scale
  • Sub-second dashboard loads

Flexible Storage

Tiered storage approach:
  • Hot data in Supabase
  • Analytics in ClickHouse
  • Bodies in S3/MinIO
  • Optimized costs

Multi-Provider

Worker intelligence:
  • 100+ provider integrations
  • Automatic fallbacks
  • Smart load balancing
  • Unified observability
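The fallback logic itself follows a simple pattern: try providers in order until one succeeds. A minimal sketch of the idea, not the Worker's actual implementation:

// Generic fallback pattern: try each provider until one returns OK
async function routeWithFallbacks(
  providers: Array<(req: Request) => Promise<Response>>,
  request: Request
): Promise<Response> {
  let lastError: unknown;
  for (const callProvider of providers) {
    try {
      const response = await callProvider(request.clone());
      if (response.ok) return response;
      lastError = new Error(`Provider returned ${response.status}`);
    } catch (err) {
      lastError = err; // network error, timeout, etc. — try the next one
    }
  }
  throw lastError;
}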

Self-Hosting Options

Helicone is fully open source and can be self-hosted in multiple ways:

All-in-One Docker Container

Run everything in a single container for testing or small deployments:
docker pull helicone/helicone-all-in-one:latest
docker run -d \
  --name helicone \
  -p 3000:3000 \
  -p 8585:8585 \
  -p 9080:9080 \
  helicone/helicone-all-in-one:latest
Ports:
  • 3000 - Web dashboard
  • 8585 - Jawn API + LLM proxy
  • 9080 - MinIO S3 storage
Perfect for:
  • Local development
  • Testing integrations
  • Small teams
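Once the container is up, point your SDK at the local proxy instead of the cloud gateway. A sketch, assuming port 8585 exposes the same OpenAI-compatible interface; the exact path may differ by version, so check the self-hosting docs:

import OpenAI from "openai";

// Target the local all-in-one container (proxy path assumed)
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "http://localhost:8585/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});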

Development Setup

Want to contribute or customize Helicone? Here's the local development setup.

Prerequisites:
  • Docker - For infrastructure (Postgres, ClickHouse, MinIO)
  • Node.js 20+ - Use nvm to manage versions
  • Yarn - Package manager
  • Supabase CLI - For database management
  • Wrangler - For Cloudflare Worker development
# Clone the repo
git clone https://github.com/Helicone/helicone.git
cd helicone

# Install dependencies
nvm install 20 && nvm use 20
npm install -g yarn wrangler
yarn install

# Set up environment
cp .env.example .env
supabase start

# Start infrastructure
cd docker
docker compose -f docker-compose-local.yml up -d

# Start services (3 terminals)
cd valhalla/jawn && yarn dev    # Terminal 1
cd web && yarn dev:local         # Terminal 2
cd worker && yarn dev            # Terminal 3
Once all three services are running, the dashboard is at http://localhost:3000 and the Jawn API at http://localhost:8585.

Repository structure:
helicone/
├── web/                    # Next.js dashboard
├── worker/                 # Cloudflare Worker proxy
├── valhalla/jawn/          # API server
├── packages/
│   ├── cost/              # Cost calculation registry
│   ├── llm-mapper/        # Provider mappings
│   └── prompts/           # Prompt templates
├── supabase/              # Database migrations
├── clickhouse/            # Analytics schema
├── docker/                # Docker configs
├── examples/              # Integration examples
└── docs/                  # This documentation

Security & Compliance

SOC 2 Compliant

Type II certified for cloud hosting. Enterprise-grade security controls and annual audits.

GDPR Compliant

Full GDPR compliance with data residency options and user data controls.

Data Encryption

  • TLS 1.3 in transit
  • AES-256 at rest
  • Encrypted backups

Data Ownership

  • You own your data
  • Export anytime via API
  • Self-host for complete control

Performance Characteristics

Edge routing keeps overhead minimal:
  • p50: <25ms added latency
  • p95: <50ms added latency
  • p99: <100ms added latency
The Worker runs on Cloudflare's edge network (300+ locations), so each request enters at the location nearest your application and is routed to the provider from there.

What’s Next?

Now that you understand how Helicone works, explore the features:

Sessions & Agent Debugging

Track multi-step AI workflows with session trees

Gateway Fallbacks

Configure automatic failover when providers go down

Prompt Management

Version control and deploy prompts without code

Cost Tracking

Understand your LLM economics by user or feature

Questions?

Join our Discord or email help@helicone.ai for support.