Track and optimize your LLM costs across all providers. Helicone provides detailed cost analytics and optimization tools to help you manage your AI budget effectively.
How We Calculate Costs
Helicone uses two systems for cost calculation depending on your integration method:
AI Gateway (100% Accurate)
When using Helicone’s AI Gateway, we have complete visibility into model usage and calculate costs precisely using our Model Registry v2 system.
Best Effort (Without Gateway)
For direct provider integrations, we use our open-source cost repository with pricing for 300+ models. This provides best-effort cost estimates based on model detection and token counts.
Cost not showing? If your model costs aren’t supported, join our Discord or email help@helicone.ai and we’ll add support quickly.
Understanding Unit Economics
The most critical aspect of cost tracking is understanding your unit economics: what drives costs in your application and how to optimize them.
Sessions: Your Cost Foundation
Sessions group related requests to show the true cost of user interactions. Instead of seeing individual API calls, you see complete workflows. For example:
- A support chat costs $0.12 on average with 5 API calls
- Document analysis workflows cost $0.45 with 12 API calls
- Quick queries cost $0.02 with a single call
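To get per-workflow figures like these, every request in a workflow has to carry the same session identifier. The sketch below builds those per-request headers. The header names (`Helicone-Session-Id`, `Helicone-Session-Path`, `Helicone-Session-Name`) are assumed from Helicone's session header convention and should be checked against the current header reference; the helper name and the example paths are hypothetical.

```python
import uuid


def session_headers(session_id: str, path: str, name: str) -> dict[str, str]:
    """Build headers that attach one request to a session.

    Header names are assumptions based on Helicone's session headers.
    """
    if not path.startswith("/"):
        raise ValueError("session path should start with '/'")
    return {
        "Helicone-Session-Id": session_id,
        "Helicone-Session-Path": path,
        "Helicone-Session-Name": name,
    }


# One session id is shared by every call in the workflow.
session_id = str(uuid.uuid4())
classify = session_headers(session_id, "/support/classify", "Support Chat")
reply = session_headers(session_id, "/support/reply", "Support Chat")
```

Each dictionary can be passed as extra headers on the corresponding request, for instance through the `extra_headers` argument of the OpenAI Python SDK.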
Segmentation That Matters
Use custom properties to slice costs by the dimensions that matter to your business, so you can answer questions like:
- Do premium users justify their higher usage costs?
- Which features are cost-efficient vs. cost-intensive?
- How much are we spending on development vs. production?
- Which regions have the highest per-user costs?
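Answering these questions requires each request to be tagged with the relevant dimension. A minimal sketch, assuming custom properties are sent as `Helicone-Property-<Name>` headers; the property names and values here are hypothetical and chosen to mirror the four questions above.

```python
def property_headers(properties: dict[str, str]) -> dict[str, str]:
    """Turn {"Plan": "premium"} into {"Helicone-Property-Plan": "premium"}.

    The "Helicone-Property-" prefix is an assumption; verify it in the docs.
    """
    return {f"Helicone-Property-{key}": str(value) for key, value in properties.items()}


headers = property_headers(
    {
        "Plan": "premium",            # premium vs. free usage
        "Feature": "doc-analysis",    # cost per feature
        "Environment": "production",  # development vs. production spend
        "Region": "eu-west",          # per-region costs
    }
)
```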
Practical Cost Analysis
Track Baseline Costs
Start by understanding your current spending patterns. After a week, review your dashboard to identify:
- Daily average costs
- Cost per user/session
- Most expensive features
- Peak usage times
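If you prefer to compute the first two baseline figures yourself from exported request data, the arithmetic is simple. The sketch below assumes a hypothetical record shape (`date`, `session_id`, `cost_usd`), which is not Helicone's export schema, and uses made-up sample values.

```python
from collections import defaultdict


def baseline(records: list[dict]) -> dict[str, float]:
    """Summarize daily average cost and average cost per session."""
    if not records:
        raise ValueError("need at least one record")
    by_day: dict[str, float] = defaultdict(float)
    by_session: dict[str, float] = defaultdict(float)
    for record in records:
        by_day[record["date"]] += record["cost_usd"]
        by_session[record["session_id"]] += record["cost_usd"]
    total = sum(by_day.values())
    return {
        "daily_average": total / len(by_day),
        "cost_per_session": total / len(by_session),
    }


# Made-up sample data: two days, two sessions, three requests.
records = [
    {"date": "2024-05-01", "session_id": "a", "cost_usd": 0.10},
    {"date": "2024-05-01", "session_id": "a", "cost_usd": 0.02},
    {"date": "2024-05-02", "session_id": "b", "cost_usd": 0.04},
]
summary = baseline(records)
```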
Identify Cost Drivers
Use custom properties to pinpoint expensive operations. Filter by properties in your dashboard to see:
- Which document sizes cost the most
- If long documents justify their cost
- Where to optimize token usage
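To compare costs by document size, the size has to be recorded as a property when the request is made. One possible approach is to bucket the document length and send the bucket as a custom property. The thresholds, the `DocumentSize` property name, and the `Helicone-Property-` prefix are all assumptions for illustration.

```python
def size_bucket(char_count: int) -> str:
    """Map a document length to a coarse bucket (thresholds are arbitrary)."""
    if char_count < 2_000:
        return "small"
    if char_count < 20_000:
        return "medium"
    return "large"


def document_headers(document: str, feature: str) -> dict[str, str]:
    """Tag a request with its feature and document size bucket."""
    return {
        "Helicone-Property-Feature": feature,
        "Helicone-Property-DocumentSize": size_bucket(len(document)),
    }


headers = document_headers("x" * 5_000, "document-analysis")
```

Coarse buckets keep the number of distinct property values small, which makes dashboard filters easier to read than raw character counts.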
Implement Cost Controls
Set up rate limits and alerts. Configure alerts in the Helicone dashboard, for example:
- Daily spending threshold: $100
- User spending threshold: $10/day
- Error rate threshold: 5%
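The rate-limit half of these controls can be expressed per request as a policy header. The sketch below assumes the policy string format `[quota];w=[window_seconds];u=[unit];s=[segment]` on a `Helicone-RateLimit-Policy` header, with `cents` as a cost-based unit and `user` segmentation keyed on `Helicone-User-Id`; confirm the exact format and limits in the rate limiting documentation. The helper name and user id are hypothetical.

```python
from typing import Optional


def rate_limit_policy(
    quota: int,
    window_seconds: int,
    unit: str = "request",
    segment: Optional[str] = None,
) -> str:
    """Format a rate-limit policy string (format assumed, see lead-in)."""
    if unit not in ("request", "cents"):
        raise ValueError("unit must be 'request' or 'cents'")
    policy = f"{quota};w={window_seconds};u={unit}"
    if segment:
        policy += f";s={segment}"
    return policy


# A cap comparable to $10/day per user: 1000 cents over 86,400 seconds.
headers = {
    "Helicone-User-Id": "user-123",  # hypothetical user id
    "Helicone-RateLimit-Policy": rate_limit_policy(
        1000, 86_400, unit="cents", segment="user"
    ),
}
```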
Optimize with Caching
Enable caching for repetitive queries. Best caching candidates:
- FAQ responses (90%+ savings)
- Product descriptions (85% savings)
- Static content generation (80% savings)
- Development/testing environments (95% savings)
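As a rough sketch, caching is switched on per request with headers, and the savings follow directly from the cache hit rate. The header names (`Helicone-Cache-Enabled`, `Cache-Control`) are assumed from Helicone's caching headers, and the cost model assumes a cache hit incurs no provider cost; the monthly figure is made up.

```python
def cache_headers(max_age_seconds: int = 3600) -> dict[str, str]:
    """Headers that enable caching for a request (names are assumptions)."""
    return {
        "Helicone-Cache-Enabled": "true",
        "Cache-Control": f"max-age={max_age_seconds}",
    }


def cost_with_cache(uncached_cost: float, hit_rate: float) -> float:
    """Estimate spend when a fraction of requests is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return uncached_cost * (1.0 - hit_rate)


faq_headers = cache_headers(max_age_seconds=86_400)  # cache FAQ answers for a day
faq_cost = cost_with_cache(200.0, 0.90)  # $200 of FAQ traffic at a 90% hit rate
```

Under these assumptions, a 90% hit rate on $200 of monthly FAQ traffic leaves roughly $20 of provider spend.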
AI Gateway Cost Optimization
The AI Gateway doesn’t just track costs; it actively optimizes them through intelligent routing.
Automatic Model Selection
The Model Registry shows all supported models with real-time pricing across providers. The AI Gateway automatically routes to the cheapest option.
How Automatic Optimization Works
- BYOK Priority - Uses your existing credits first (AWS, Azure, etc.)
- Cost-Based Routing - Automatically selects the cheapest available provider
- Smart Fallbacks - If one provider fails, routes to the next cheapest option
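The three steps above can be illustrated with a small routing sketch. This is not the AI Gateway's implementation, only one way the described ordering could work: providers where you bring your own key come first, the rest are sorted by price, and a failure falls through to the next candidate. Provider names and prices are made up.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_1k_tokens: float  # made-up price for illustration
    byok: bool = False         # True if you bring your own key or credits


def routing_order(providers: list[Provider]) -> list[Provider]:
    """BYOK providers first, then ascending price within each group."""
    return sorted(providers, key=lambda p: (not p.byok, p.cost_per_1k_tokens))


def call_with_fallback(
    providers: list[Provider], send: Callable[[Provider], str]
) -> str:
    """Try providers in routing order; fall through to the next on failure."""
    last_error: Optional[Exception] = None
    for provider in routing_order(providers):
        try:
            return send(provider)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


providers = [
    Provider("provider-a", 0.50),
    Provider("provider-b", 0.30),
    Provider("own-cloud-credits", 0.60, byok=True),
]


def fake_send(provider: Provider) -> str:
    """Stand-in for a real request; simulates an outage on the BYOK provider."""
    if provider.name == "own-cloud-credits":
        raise ConnectionError("simulated outage")
    return f"served by {provider.name}"


result = call_with_fallback(providers, fake_send)
```

Here the BYOK provider is tried first despite its higher price, fails, and the request falls back to the cheapest remaining provider.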
Cost-Based Model Selection
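Route to different models based on query complexity: send simple queries to cheaper models and reserve more capable, more expensive models for complex ones.
Cost Prevention & Alerts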
Setting Smart Alerts
Configure cost alerts to catch spending issues before they become problems:
- Graduated thresholds - Alert at 50%, 80%, 95% of budget
- Environment-specific limits - Higher for production, lower for dev
- User-level alerts - Track individual user spending
- Feature-level alerts - Monitor expensive features separately
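The graduated, environment-specific thresholds above amount to a simple comparison of spend against budget. A minimal sketch, with hypothetical daily budgets; in practice the alerts themselves are configured in the Helicone dashboard rather than in code.

```python
def crossed_thresholds(
    spend: float,
    budget: float,
    levels: tuple[float, ...] = (0.50, 0.80, 0.95),
) -> list[float]:
    """Return the threshold fractions that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spend / budget
    return [level for level in levels if used >= level]


# Hypothetical daily budgets in USD: higher for production, lower for dev.
BUDGETS = {"production": 100.0, "development": 10.0}

prod_alerts = crossed_thresholds(spend=85.0, budget=BUDGETS["production"])
dev_alerts = crossed_thresholds(spend=3.0, budget=BUDGETS["development"])
```

With $85 spent against a $100 production budget, the 50% and 80% levels have fired and the 95% level has not; the development environment at $3 of $10 has not crossed any level.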
Cost alerts rely on accurate cost data. See How We Calculate Costs above. If you see “cost not supported” for your model, contact us to add support.
Rate Limiting for Cost Control
Prevent runaway costs with rate limits.
Analyzing Cost Trends
Query Session Costs
Retrieve cost data programmatically.
Export for Analysis
Export cost data for deeper analysis.
Automated Cost Reports
Get regular cost summaries delivered to your inbox or Slack channels.
What Reports Include
- Weekly spending summaries and trends
- Model usage breakdown by cost
- Top cost drivers and expensive requests
- Week-over-week comparisons
- Optimization recommendations
Setting Up Reports
Configure automated reports in Settings → Reports to receive them via:
- Email - Weekly digests to any email address
- Slack - Post to your team channels
Reports help you stay on top of costs without checking the dashboard daily. Perfect for finance teams and engineering managers tracking AI spend.
Best Practices
Start with Sessions
Always track complete workflows with sessions to understand true unit economics, not just per-request costs.
Tag Everything
Use custom properties liberally - you can filter by them later but can’t add them retroactively.
Set Graduated Alerts
Alert at 50%, 80%, and 95% of budget to give yourself time to respond without alert fatigue.
Cache Aggressively in Dev
Use 100% caching in development environments to eliminate unnecessary costs during testing.
Review Weekly
Set a recurring calendar event to review cost trends and identify optimization opportunities.
Next Steps
Set Up Alerts
Configure spending thresholds before they become problems
Enable Caching
Start saving immediately on repetitive requests
Configure Gateway
Let automatic routing optimize your costs
Track Sessions
Understand your true unit economics
