## Overview

Helicone alerts monitor your AI application metrics and notify you when specific conditions are met. Set thresholds for cost, latency, errors, and custom metrics to stay informed about your system’s health and prevent issues before they impact users.

Alerts help you:
- Monitor spending and prevent budget overruns
- Detect performance degradation early
- Track error rates and quality issues
- Ensure SLAs are maintained
- Get notified of unusual patterns
## Key Benefits

- **Flexible Metrics**: Alert on cost, latency, error rates, token usage, and custom properties
- **Smart Aggregation**: Use sum, average, percentile, or count aggregations with time windows
- **Multi-Channel**: Receive notifications via email, Slack, or both
- **Advanced Filtering**: Apply filters to monitor specific users, models, or properties
## Alert Types

### Cost Alerts
Monitor spending to stay within budget.
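A minimal sketch, assuming the field names shown in the Alert Configuration tables below (the exact schema may differ; see the API reference):

```typescript
// Hypothetical cost alert: fire when total spend in a day exceeds $100.
// Field names are illustrative, based on the tables under Alert Configuration.
const costAlert = {
  name: "Daily cost over $100",
  metric: "cost",     // total cost of requests, in USD
  aggregation: "sum", // total value in the time window
  threshold: 100,     // USD
  time_window: "1d",  // daily budget window
};
```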
### Latency Alerts

Detect performance issues.
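A sketch of a tail-latency alert under the same assumed schema:

```typescript
// Hypothetical latency alert: fire when P95 latency over 15 minutes
// exceeds 5000 ms.
const latencyAlert = {
  name: "P95 latency over 5s",
  metric: "latency",  // request latency, in milliseconds
  aggregation: "p95", // tail latency
  threshold: 5000,    // ms
  time_window: "15m",
};
```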
### Error Rate Alerts

Monitor reliability.
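A sketch of an error-rate alert; `minimum_request_count` is covered under Advanced Configuration below:

```typescript
// Hypothetical error-rate alert: fire when more than 5% of requests fail
// within an hour. error_rate is a decimal between 0 and 1.
const errorRateAlert = {
  name: "Error rate above 5%",
  metric: "error_rate",
  aggregation: "average",
  threshold: 0.05,           // 5%
  time_window: "1h",
  minimum_request_count: 50, // skip quiet periods (see Advanced Configuration)
};
```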
### Token Usage Alerts

Track token consumption.
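A sketch of a daily token budget under the same assumed schema:

```typescript
// Hypothetical token-usage alert: fire when total tokens in a day
// exceed 10 million.
const tokenAlert = {
  name: "Daily tokens over 10M",
  metric: "total_tokens", // sum of input and output tokens
  aggregation: "sum",
  threshold: 10_000_000,
  time_window: "1d",
};
```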
## Creating Alerts

### Via API
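A hedged sketch of creating an alert over HTTP; the endpoint path and payload shape are assumptions, so confirm both against the Helicone API reference:

```typescript
// Hypothetical creation call. The endpoint path and payload schema are
// assumptions; consult the Helicone API reference for the exact contract.
const res = await fetch("https://api.helicone.ai/v1/alerts", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.HELICONE_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Daily cost over $100",
    metric: "cost",
    aggregation: "sum",
    threshold: 100,
    time_window: "1d",
    emails: ["oncall@example.com"],
  }),
});
if (!res.ok) throw new Error(`Alert creation failed: ${res.status}`);
```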
### Via Dashboard

1. **Navigate to Alerts**: Go to your Helicone Dashboard and click **Alerts** in the sidebar.
2. **Create New Alert**: Click **Create Alert** and configure:
   - Alert name
   - Metric to monitor
   - Threshold value
   - Aggregation method
   - Time window
3. **Set Filters (Optional)**: Apply filters to monitor specific segments:
   - Model name
   - User ID
   - Custom properties
   - Request path
## Alert Configuration

### Metrics
| Metric | Description | Unit |
|---|---|---|
| `cost` | Total cost of requests | USD |
| `latency` | Request latency | milliseconds |
| `error_rate` | Percentage of failed requests | decimal (0-1) |
| `total_tokens` | Sum of input and output tokens | count |
| `prompt_tokens` | Input tokens only | count |
| `completion_tokens` | Output tokens only | count |
| `request_count` | Number of requests | count |
### Aggregation Methods
| Method | Description | Use Case |
|---|---|---|
| `sum` | Total value in time window | Cost, token usage |
| `average` | Mean value | Error rate, average latency |
| `p50` | 50th percentile | Median latency |
| `p75` | 75th percentile | Above-average latency |
| `p95` | 95th percentile | Tail latency, outliers |
| `p99` | 99th percentile | Worst-case latency |
| `count` | Number of occurrences | Request volume |
### Time Windows
| Window | Format | Use Case |
|---|---|---|
| 5 minutes | `5m` | Real-time monitoring |
| 15 minutes | `15m` | Short-term trends |
| 1 hour | `1h` | Hourly budgets |
| 6 hours | `6h` | Business hours |
| 1 day | `1d` | Daily budgets |
| 1 week | `7d` | Weekly planning |
## Advanced Configuration

### Grouping

Group alerts by dimension to get per-segment notifications:

- `user` - Alert per user ID
- `model` - Alert per model
- Custom properties (e.g., `team`, `environment`)
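For example, a sketch of a per-model grouped alert; the `group_by` field name is an assumption:

```typescript
// Hypothetical grouped alert: one notification per model that crosses
// the threshold. The group_by field name is illustrative.
const groupedAlert = {
  name: "Cost per model over $25/day",
  metric: "cost",
  aggregation: "sum",
  threshold: 25,
  time_window: "1d",
  group_by: "model", // or "user", or a custom property such as "team"
};
```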
### Minimum Request Count

Avoid false positives from low traffic.
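For example, a sketch that only evaluates the alert once enough traffic has accumulated:

```typescript
// Only evaluate once at least 100 requests fall inside the window, so a
// couple of failures during quiet hours don't trigger the alert.
const errorAlert = {
  name: "Error rate above 5% (min 100 requests)",
  metric: "error_rate",
  aggregation: "average",
  threshold: 0.05,
  time_window: "15m",
  minimum_request_count: 100,
};
```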
### Filters

Monitor specific segments using filter expressions.
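A sketch of a filtered alert; the `filters` shape below is illustrative, not a documented schema:

```typescript
// Hypothetical filter expression: only count GPT-4o requests from
// production. The filters structure is illustrative.
const filteredAlert = {
  name: "GPT-4o production cost over $50/day",
  metric: "cost",
  aggregation: "sum",
  threshold: 50,
  time_window: "1d",
  filters: [
    { property: "model", operator: "equals", value: "gpt-4o" },
    { property: "environment", operator: "equals", value: "production" },
  ],
};
```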
## Notification Channels

### Email Notifications

Add one or more email addresses.
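For example (the `emails` field name is an assumption):

```typescript
// Hypothetical email configuration: every listed address is notified.
const alert = {
  name: "Daily cost over $100",
  metric: "cost",
  aggregation: "sum",
  threshold: 100,
  time_window: "1d",
  emails: ["oncall@example.com", "finance@example.com"],
};
```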
### Slack Notifications

Connect your Slack workspace and specify channels:

1. Install the Helicone Slack app in your workspace
2. Invite the bot to the desired channels: `/invite @Helicone`
3. Use the channel names in your alert configuration
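A sketch of the resulting configuration (the `slack_channels` field name is an assumption):

```typescript
// Hypothetical Slack configuration: the bot must already be a member of
// each channel (see the steps above).
const alert = {
  name: "P95 latency over 5s",
  metric: "latency",
  aggregation: "p95",
  threshold: 5000,
  time_window: "15m",
  slack_channels: ["#ai-alerts", "#oncall"],
};
```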
## Managing Alerts

### List Alerts
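A hedged sketch; the endpoint path is an assumption:

```typescript
// Hypothetical listing call; the endpoint path is an assumption.
const res = await fetch("https://api.helicone.ai/v1/alerts", {
  headers: { Authorization: `Bearer ${process.env.HELICONE_API_KEY}` },
});
const alerts = await res.json();
console.log(alerts);
```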
### Delete Alert
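Likewise, a hedged sketch of deleting an alert by ID:

```typescript
// Hypothetical deletion by ID; the endpoint path is an assumption.
const alertId = "alert_abc123"; // illustrative ID returned at creation time
await fetch(`https://api.helicone.ai/v1/alerts/${alertId}`, {
  method: "DELETE",
  headers: { Authorization: `Bearer ${process.env.HELICONE_API_KEY}` },
});
```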
## Common Alert Patterns

### Budget protection

Set multiple cost alerts with increasing urgency.
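A sketch of warning, critical, and emergency tiers under the assumed schema used throughout this page:

```typescript
// Hypothetical warning → critical → emergency tiers on daily spend.
const tiers = [
  { name: "Spend warning", threshold: 50, slack_channels: ["#ai-alerts"] },
  { name: "Spend critical", threshold: 100, slack_channels: ["#oncall"] },
  { name: "Spend emergency", threshold: 200, emails: ["leadership@example.com"] },
].map((tier) => ({
  metric: "cost",
  aggregation: "sum",
  time_window: "1d",
  ...tier,
}));
```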
### SLA monitoring

Track P95 latency to ensure performance SLAs.
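A sketch under the same assumed schema:

```typescript
// Hypothetical SLA alert: notify if P95 latency over an hour breaches
// a 2-second target.
const slaAlert = {
  name: "P95 latency SLA breach",
  metric: "latency",
  aggregation: "p95",
  threshold: 2000, // ms
  time_window: "1h",
  minimum_request_count: 100,
};
```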
### Model-specific monitoring

Alert on expensive models separately.
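A sketch combining a cost metric with an illustrative model filter:

```typescript
// Hypothetical model-specific alert; the filter shape is illustrative.
const gpt4Alert = {
  name: "GPT-4 daily spend over $200",
  metric: "cost",
  aggregation: "sum",
  threshold: 200,
  time_window: "1d",
  filters: [{ property: "model", operator: "contains", value: "gpt-4" }],
};
```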
### User quota enforcement

Track per-user usage.
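A sketch that groups a token alert by user ID (`group_by` is an assumption, as above):

```typescript
// Hypothetical per-user quota: grouping by user yields one notification
// per user ID that crosses the threshold.
const quotaAlert = {
  name: "User over 1M tokens/day",
  metric: "total_tokens",
  aggregation: "sum",
  threshold: 1_000_000,
  time_window: "1d",
  group_by: "user",
};
```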
## Best Practices

- **Set meaningful thresholds**: Base thresholds on historical data and business requirements
- **Use minimum request counts**: Avoid noise from low-traffic periods
- **Layer alerts**: Create warning, critical, and emergency tiers
- **Monitor trends**: Use longer time windows to catch gradual increases
- **Test alerts**: Verify notification delivery before relying on alerts
- **Document runbooks**: Include action items for each alert type
- **Review regularly**: Adjust thresholds as usage patterns change
## Troubleshooting

### Not receiving notifications
- Verify email addresses and Slack channels are correct
- Check spam folders for email notifications
- Ensure Helicone bot is in Slack channels
- Confirm alert is active and not deleted
### Too many false positives

- Increase `minimum_request_count` to filter low-traffic noise
- Adjust threshold based on normal variance
- Use longer time windows for smoother trends
- Add filters to focus on relevant requests
### Missing critical alerts
- Lower threshold to catch issues earlier
- Use shorter time windows for faster detection
- Remove `minimum_request_count` if appropriate
- Verify filters aren’t excluding relevant data
## Related Features

- **Webhooks**: Build custom notification systems with real-time webhooks
- **Cost Tracking**: Analyze spending patterns and optimize costs
