Overview
For production workloads, Helicone provides a Helm chart that deploys all services to Kubernetes with:
- Horizontal auto-scaling
- High availability
- Resource management
- Service discovery
- Rolling updates
- Health checks and probes
Prerequisites
- Kubernetes 1.24 or later
- Helm 3.8 or later
- kubectl configured to access your cluster
- 16GB+ memory across nodes
- 100GB+ storage (persistent volumes)
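The tool requirements above can be checked before installing. The following is a minimal sketch, not part of the chart: `version_ge` and `preflight` are hypothetical helper names, the comparison assumes GNU `sort -V`, and the Kubernetes server version, memory, and storage are not checked.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes GNU sort, which supports -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: hypothetical helper that checks the Helm client version
# and that kubectl can reach the cluster.
preflight() {
  helm_ver="$(helm version --template '{{.Version}}')"  # e.g. v3.14.0
  helm_ver="${helm_ver#v}"
  if ! version_ge "$helm_ver" "3.8.0"; then
    echo "Helm 3.8 or later is required (found $helm_ver)" >&2
    return 1
  fi
  if ! kubectl cluster-info >/dev/null 2>&1; then
    echo "kubectl cannot reach the cluster" >&2
    return 1
  fi
  echo "preflight ok"
}
```

Run `preflight` from the shell you plan to install from; it returns non-zero on the first failed check.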
Getting the Helm Chart
The Helm chart is available to enterprise customers. Contact us to get access.
Quick Start
Once you have access to the Helm chart:
Add the Helm repository
helm repo add helicone https://charts.helicone.ai
helm repo update
Create a namespace
kubectl create namespace helicone
Configure values
Create a values.yaml file with your configuration:
# values.yaml
global:
  domain: helicone.your-domain.com
auth:
  secret: "your-secure-random-secret-key"
postgresql:
  enabled: true
  auth:
    password: "secure-postgres-password"
  primary:
    persistence:
      size: 100Gi
clickhouse:
  enabled: true
  persistence:
    size: 500Gi
minio:
  enabled: true
  auth:
    rootUser: admin
    rootPassword: "secure-minio-password"
  persistence:
    size: 1Ti
jawn:
  replicaCount: 3
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      cpu: 2000m
      memory: 4Gi
web:
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 2Gi
Install Helicone
helm install helicone helicone/helicone \
  -n helicone \
  -f values.yaml
Verify installation
# Check pod status
kubectl get pods -n helicone
# Check services
kubectl get svc -n helicone
# View logs
kubectl logs -n helicone -l app=jawn -f
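To turn the pod listing into a single pass/fail signal, the output can be summarized in a script. This is a minimal sketch: `count_not_ready` is a hypothetical helper that assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
# count_not_ready: reads `kubectl get pods --no-headers` output on stdin and
# prints how many pods are not fully ready (READY x/y with x != y, or STATUS
# other than Running). Pods from finished Jobs (STATUS Completed) count too.
count_not_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2] || $3 != "Running") n++ } END { print n + 0 }'
}

# Usage against a live cluster:
#   kubectl get pods -n helicone --no-headers | count_not_ready
```

A result of 0 means every listed pod is running with all containers ready. Alternatively, `kubectl wait --for=condition=Ready pod --all -n helicone --timeout=300s` blocks until the pods are ready or the timeout expires.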
Architecture on Kubernetes
Helicone deploys the following workloads:
┌─────────────────────────────────────────────────────┐
│               Ingress / Load Balancer               │
│             (helicone.your-domain.com)              │
└──────────────┬────────────────────┬─────────────────┘
               │                    │
         ┌─────▼──────┐       ┌─────▼─────┐
         │    Web     │       │   Jawn    │
         │ (Next.js)  │       │   (API)   │
         │ 2 replicas │       │ 3 replicas│
         └─────┬──────┘       └─────┬─────┘
               │                    │
       ┌───────▼────────────────────▼──────────┐
       │                                       │
┌──────▼──────┐     ┌─────────────┐     ┌──────▼──────┐
│ PostgreSQL  │     │ ClickHouse  │     │    MinIO    │
│ StatefulSet │     │ StatefulSet │     │ StatefulSet │
│ (Primary +  │     │  (Cluster)  │     │  (Cluster)  │
│  Replica)   │     │             │     │             │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐
│  PV: 100G   │     │  PV: 500G   │     │   PV: 1TB   │
└─────────────┘     └─────────────┘     └─────────────┘
Configuration Reference
Global Settings
global:
  # Domain for ingress
  domain: helicone.example.com
  # Image registry (optional)
  imageRegistry: docker.io
  # Storage class for PVCs
  storageClass: "fast-ssd"
Authentication
auth:
  # Secret key for session encryption (REQUIRED)
  secret: "change-me-to-random-32-char-string"
  # Existing secret (optional)
  existingSecret: "helicone-auth-secret"
  existingSecretKey: "auth-secret"
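One way to avoid keeping the session key in values.yaml is to generate it and store it in the Secret that `existingSecret` points to. This is a minimal sketch: `gen_secret` and `create_auth_secret` are hypothetical helper names, and it assumes the chart reads the key from that Secret when `existingSecret` and `existingSecretKey` are set, as the comments above suggest.

```shell
# gen_secret: prints 32 random hex characters (16 bytes from /dev/urandom).
gen_secret() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# create_auth_secret: hypothetical helper that stores a fresh key in the
# Secret named by existingSecret, under the key named by existingSecretKey.
create_auth_secret() {
  kubectl create secret generic helicone-auth-secret \
    -n helicone \
    --from-literal=auth-secret="$(gen_secret)"
}
```

Call `create_auth_secret` once, before installing the chart, so the Secret already exists when the pods start.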
PostgreSQL (Application Database)
postgresql:
  enabled: true  # Set to false to use external database
  auth:
    username: postgres
    password: "secure-password"
    database: helicone
  primary:
    persistence:
      enabled: true
      size: 100Gi
      storageClass: "fast-ssd"
    resources:
      requests:
        cpu: 2000m
        memory: 4Gi
      limits:
        cpu: 4000m
        memory: 8Gi
  # External database configuration
  external:
    host: postgres.external.com
    port: 5432
    database: helicone
    username: helicone_user
    password: "password"
ClickHouse (Analytics Database)
clickhouse:
  enabled: true  # Set to false to use external ClickHouse
  persistence:
    enabled: true
    size: 500Gi
    storageClass: "fast-ssd"
  resources:
    requests:
      cpu: 4000m
      memory: 8Gi
    limits:
      cpu: 8000m
      memory: 16Gi
  # Replication (for HA)
  replicaCount: 3
  # External ClickHouse
  external:
    host: clickhouse.external.com
    port: 8123
    user: default
    password: ""
MinIO (Object Storage)
minio:
  enabled: true  # Set to false to use S3/GCS
  auth:
    rootUser: admin
    rootPassword: "secure-password"
  persistence:
    enabled: true
    size: 1Ti
    storageClass: "standard"
  # For HA setup
  mode: distributed
  replicaCount: 4
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      cpu: 2000m
      memory: 4Gi
  # Or use external S3-compatible storage
  s3:
    endpoint: https://s3.amazonaws.com
    region: us-east-1
    bucket: helicone-storage
    accessKeyId: "AKIA..."
    secretAccessKey: "secret..."
Jawn (Backend API)
jawn:
  replicaCount: 3
  image:
    repository: helicone/jawn
    tag: latest
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      cpu: 2000m
      memory: 4Gi
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
    targetMemoryUtilizationPercentage: 80
  env:
    LOG_LEVEL: info
    NODE_ENV: production
Web (Frontend)
web:
  replicaCount: 2
  image:
    repository: helicone/web
    tag: latest
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 2Gi
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
Ingress
ingress:
  enabled: true
  className: nginx  # or 'traefik', 'alb', etc.
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - host: helicone.example.com
      paths:
        - path: /
          pathType: Prefix
          service: web
        - path: /v1
          pathType: Prefix
          service: jawn
  tls:
    - secretName: helicone-tls
      hosts:
        - helicone.example.com
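The two path rules overlap, since `/` is a prefix of every path. Under the Kubernetes Ingress specification, the longest matching Prefix rule wins and matching is done on whole path segments. The sketch below models that rule for this configuration only. `route_for_path` is a hypothetical helper, and individual ingress controllers may differ in edge cases.

```shell
# route_for_path: prints the backend service that the two Prefix rules above
# would select for a request path. Prefix matching works on whole path
# segments, so /v1 and /v1/... go to jawn, while /v1abc falls through to web.
route_for_path() {
  case "$1" in
    /v1|/v1/*) echo jawn ;;
    *)         echo web ;;
  esac
}
```

For example, `route_for_path /v1/request` prints `jawn`, and `route_for_path /dashboard` prints `web`.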
Using External Managed Services
For production, we recommend using managed services:
AWS Example
# Disable bundled databases
postgresql:
  enabled: false
  external:
    host: helicone.abc123.us-east-1.rds.amazonaws.com
    port: 5432
    database: helicone
    username: helicone
    password: "${POSTGRES_PASSWORD}"  # Use secret
clickhouse:
  enabled: false
  external:
    host: clickhouse.abc123.us-east-1.amazonaws.com
    port: 8443
    secure: true
minio:
  enabled: false
  s3:
    endpoint: https://s3.us-east-1.amazonaws.com
    region: us-east-1
    bucket: helicone-prod-storage
    accessKeyId: "${AWS_ACCESS_KEY_ID}"
    secretAccessKey: "${AWS_SECRET_ACCESS_KEY}"
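Helm itself does not expand `${VAR}` placeholders in a values file, so unless the chart or your deployment pipeline substitutes them, the strings above are passed through literally. One option is to supply the real values at install time. The sketch below is an assumption-laden example: `require_env` and `upgrade_with_secrets` are hypothetical helpers, and the value path `postgresql.external.password` is taken from the example above rather than from the chart's documented schema.

```shell
# require_env NAME...: succeeds only if every named environment variable is
# exported and non-empty; prints the missing names to stderr.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# upgrade_with_secrets: hypothetical wrapper that passes the database password
# on the command line instead of storing it in values.yaml.
# Note: --set-string treats commas and backslashes specially, so values
# containing them need escaping.
upgrade_with_secrets() {
  require_env POSTGRES_PASSWORD || return 1
  helm upgrade --install helicone helicone/helicone \
    -n helicone \
    -f values.yaml \
    --set-string postgresql.external.password="$POSTGRES_PASSWORD"
}
```

The S3 credentials can be passed the same way. Values given on the command line are visible in the local process list, so a pre-created Kubernetes Secret is the safer choice where the chart supports one.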
Monitoring and Observability
The Helm chart includes Prometheus metrics and health checks:
monitoring:
  enabled: true
  serviceMonitor:
    enabled: true
    namespace: monitoring
  grafana:
    enabled: true
    dashboards:
      enabled: true
Available Metrics
- Request latency (p50, p95, p99)
- Request volume
- Error rates
- Database connection pools
- Cache hit rates
Backup and Disaster Recovery
PostgreSQL Backups
postgresql:
  backup:
    enabled: true
    schedule: "0 2 * * *"  # Daily at 2 AM
    retention: 30  # Keep 30 days
    destination: s3://helicone-backups/postgres
ClickHouse Backups
clickhouse:
  backup:
    enabled: true
    schedule: "0 3 * * *"
    retention: 7
    destination: s3://helicone-backups/clickhouse
Scaling
Manual Scaling
# Scale Jawn replicas
kubectl scale deployment/helicone-jawn -n helicone --replicas=5
# Scale Web replicas
kubectl scale deployment/helicone-web -n helicone --replicas=3
Auto-Scaling
HPA (Horizontal Pod Autoscaler) is configured in values.yaml:
jawn:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 80
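To see how these targets translate into replica counts, recall the core rule the Horizontal Pod Autoscaler applies per metric: desired = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The sketch below reproduces only that arithmetic. `desired_replicas` is a hypothetical helper, and it ignores the controller's tolerance band and stabilization window.

```shell
# desired_replicas CURRENT METRIC TARGET MIN MAX
# Computes ceil(CURRENT * METRIC / TARGET) with integer arithmetic, then clamps
# the result to [MIN, MAX]. METRIC and TARGET are utilization percentages.
desired_replicas() {
  current=$1 metric=$2 target=$3 min=$4 max=$5
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Example: 3 replicas at 90% CPU against a 70% target, bounds 2..20
#   desired_replicas 3 90 70 2 20   # prints 4
```

When several metrics are configured, as with CPU and memory here, the autoscaler computes a count for each and uses the largest. The live state can be inspected with `kubectl get hpa -n helicone`.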
Upgrading
# Update Helm repo
helm repo update
# Check what will change (requires the helm-diff plugin)
helm diff upgrade helicone helicone/helicone \
  -n helicone \
  -f values.yaml
# Perform upgrade
helm upgrade helicone helicone/helicone \
  -n helicone \
  -f values.yaml \
  --wait
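If an upgrade fails partway, the release can be left in a failed state. The sketch below is one possible way to guard against that, not a procedure prescribed by the chart: `run` and `safe_upgrade` are hypothetical helpers, and `--atomic` is the Helm 3 flag that rolls the release back automatically when the upgrade fails.

```shell
# run: executes its arguments, or only prints them when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# safe_upgrade: hypothetical wrapper around the upgrade command above.
# --atomic rolls the release back automatically if the upgrade fails.
safe_upgrade() {
  run helm upgrade helicone helicone/helicone \
    -n helicone \
    -f values.yaml \
    --wait --atomic
}

# Manual rollback, if needed:
#   helm history helicone -n helicone
#   helm rollback helicone <revision> -n helicone
```

Setting `DRY_RUN=1` before calling `safe_upgrade` prints the command without touching the cluster, which is useful for reviewing it in CI logs.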
Troubleshooting
Check pod events and logs:
kubectl describe pod -n helicone <pod-name>
kubectl logs -n helicone <pod-name> --previous
Database connection errors
Verify database connectivity:
# Test from a debug pod
kubectl run -it --rm debug --image=postgres:17 -n helicone -- \
  psql -h helicone-postgresql -U postgres
Check storage class and PVC status:
kubectl get pvc -n helicone
kubectl describe pvc -n helicone <pvc-name>
Production Checklist
Next Steps
- Architecture: Understand the system architecture
- Enterprise Support: Get help with your production deployment