The Requests page is your central hub for monitoring and debugging LLM requests. Every API call flowing through Helicone is captured with complete context, allowing you to trace issues, analyze performance, and understand how your AI application behaves in production.

## What’s Captured
For every LLM request, Helicone records:

**Request Details**

- Full request body (messages, parameters)
- Model and provider information
- Custom properties and metadata
- User ID and session information

**Response Details**

- Complete response body
- Generated text and function calls
- Finish reason and stop sequences
- Token counts and cost

**Performance Metrics**

- Total latency (start to finish)
- Time to first token (TTFT)
- Tokens per second
- Request and response timestamps

**Metadata**

- Request ID (for reference)
- HTTP status codes
- Error messages (if any)
- Cache hit/miss status
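
All of this is captured automatically once your traffic is routed through Helicone. As a minimal sketch of that setup (assuming the `openai` Node SDK and Helicone's standard OpenAI proxy base URL from the quickstart; adjust if you use a different integration):

```typescript
import OpenAI from "openai";

// Route traffic through Helicone's proxy so every call is logged.
// Base URL and Helicone-Auth header follow Helicone's standard
// OpenAI proxy setup; substitute your own keys.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

// This call now appears on the Requests page with its full
// request/response bodies, latency, token counts, and cost.
const completion = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
```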

## Accessing Requests

### Dashboard View

Visit [helicone.ai/requests](https://helicone.ai/requests) to see all your requests in a table view:

- **Real-time updates**: New requests appear automatically
- **Sortable columns**: Click column headers to sort by any field
- **Quick filters**: Filter by model, status, user, or date range
- **Request drawer**: Click any row to see full request details

### Request Details Drawer

Click on any request to open a detailed view with four tabs: **Messages**, **Request Body**, **Response Body**, and **Metadata**.

**Messages** shows the conversation in a chat-like format:

- System prompts and instructions
- User messages with role indicators
- Assistant responses with streaming indicators
- Function/tool calls and responses

**Request Body** shows the complete JSON request sent to the LLM provider:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 150
}
```

**Response Body** shows the complete JSON response from the provider:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
}
```

**Metadata** shows performance metrics and request metadata:

```
Request ID: req_abc123xyz
Created: 2024-03-10 14:32:15 UTC
Latency: 1,234 ms
Time to First Token: 234 ms
Status: 200 OK
Provider: OpenAI
Model: gpt-4o-mini
Cost: $0.0042
User ID: user-123
Custom Properties: Environment: production, Feature: chat
```

## Filtering Requests

### Built-in Filters

Use the dashboard’s filter interface to narrow down requests:

**Time Range**

- Last hour, day, week, month
- Custom date range picker
- Timezone-aware filtering

**Model & Provider**

- Filter by specific model (e.g., gpt-4o-mini)
- Filter by provider (OpenAI, Anthropic, etc.)
- Include/exclude specific models

**Status**

- Success (2xx responses)
- Client errors (4xx)
- Server errors (5xx)
- Specific status codes

**User & Properties**

- Filter by user ID
- Filter by any custom property
- Combine multiple property filters

### Advanced Filtering

For complex queries, use the filter builder:

```json
// Example: Production errors from last 24 hours
{
  "AND": [
    { "status": { "gte": 400 } },
    { "properties.Environment": { "equals": "production" } },
    { "created_at": { "gte": "2024-03-09T14:00:00Z" } }
  ]
}
```

## Querying via API

Retrieve requests programmatically using the REST API.

### Basic Query

```bash
curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
    "filter": {
      "request_response_rmt": {
        "model": {
          "equals": "gpt-4o-mini"
        }
      }
    },
    "limit": 100
  }'
```
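
The same query from TypeScript, as a minimal sketch (assuming Node 18+, where `fetch` is available globally):

```typescript
// Same basic query as the curl example above, via fetch.
const res = await fetch("https://api.helicone.ai/v1/request/query-clickhouse", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.HELICONE_API_KEY}`,
  },
  body: JSON.stringify({
    filter: {
      request_response_rmt: {
        model: { equals: "gpt-4o-mini" },
      },
    },
    limit: 100,
  }),
});
const requests = await res.json();
```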

### Filter by Custom Properties

**Important**: When filtering by custom properties, you MUST wrap the properties filter inside a `request_response_rmt` object.

```bash
curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
    "filter": {
      "request_response_rmt": {
        "properties": {
          "Environment": {
            "equals": "production"
          }
        }
      }
    },
    "limit": 100
  }'
```

### Complex Filters

Combine multiple conditions using AND/OR operators:

```bash
curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
    "filter": {
      "left": {
        "request_response_rmt": {
          "request_created_at": {
            "gte": "2024-03-01T00:00:00Z"
          }
        }
      },
      "operator": "and",
      "right": {
        "left": {
          "request_response_rmt": {
            "model": {
              "equals": "gpt-4o-mini"
            }
          }
        },
        "operator": "and",
        "right": {
          "request_response_rmt": {
            "properties": {
              "Environment": {
                "equals": "production"
              }
            }
          }
        }
      }
    },
    "limit": 1000
  }'
```
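
Deeply nested `left`/`operator`/`right` trees are tedious to write by hand. One option is to fold a list of leaf filters into that shape; `combineFilters` below is a hypothetical convenience helper, not part of any Helicone SDK:

```typescript
// Hypothetical helper: fold an array of leaf filters into the nested
// { left, operator, right } tree the query endpoint expects.
type LeafFilter = { request_response_rmt: Record<string, unknown> };
type FilterNode =
  | LeafFilter
  | { left: FilterNode; operator: "and" | "or"; right: FilterNode };

// `filters` must be non-empty; reduceRight builds the same
// right-nested tree as the curl example above.
function combineFilters(operator: "and" | "or", filters: FilterNode[]): FilterNode {
  return filters.reduceRight((right, left) => ({ left, operator, right }));
}

// Reproduces the filter from the curl example.
const filter = combineFilters("and", [
  { request_response_rmt: { request_created_at: { gte: "2024-03-01T00:00:00Z" } } },
  { request_response_rmt: { model: { equals: "gpt-4o-mini" } } },
  { request_response_rmt: { properties: { Environment: { equals: "production" } } } },
]);
```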

### Export Large Datasets

For exporting large amounts of data, use the CLI tool:

```bash
# Export all requests from the last 30 days
HELICONE_API_KEY="your-api-key" \
  npx @helicone/export \
  --start-date 2024-02-01 \
  --limit 100000 \
  --include-body

# Export with a property filter to CSV
HELICONE_API_KEY="your-api-key" \
  npx @helicone/export \
  --property Environment=production \
  --format csv \
  --include-body
```

## Common Use Cases

### Debug Failed Requests

- Filter by status code (4xx or 5xx)
- Look for patterns in error messages
- Check request parameters and prompts
- Verify custom properties (environment, version)

```typescript
// Add debugging context to every request
const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-Property-Environment": process.env.NODE_ENV,
      "Helicone-Property-Version": packageJson.version,
      "Helicone-Property-RequestType": "user_chat",
      "Helicone-User-Id": userId
    }
  }
);
```

### Analyze Slow Requests

- Sort by latency (descending)
- Identify patterns in slow requests
- Check prompt length and token counts
- Compare across models and providers

```typescript
// Query slow requests via the API
const slowRequests = await fetch('https://api.helicone.ai/v1/request/query-clickhouse', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${HELICONE_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    filter: {
      request_response_rmt: {
        latency: { gte: 5000 } // >= 5 seconds
      }
    },
    limit: 100
  })
});
```

### Track User-Specific Issues

- Filter by user ID
- Review their request history
- Check for error patterns
- Analyze usage patterns

```typescript
// Tag all requests with user ID
const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-User-Id": userId,
      "Helicone-Property-UserTier": userTier,
      "Helicone-Property-Feature": featureName
    }
  }
);
```

### Monitor Cost by Feature

- Filter by custom property (e.g., Feature)
- Sum costs across requests
- Compare costs across features
- Identify cost optimization opportunities

```typescript
// Tag requests by feature
const features = ['chat', 'summarize', 'translate', 'analyze'];
for (const feature of features) {
  await client.chat.completions.create(
    { /* request */ },
    {
      headers: {
        "Helicone-Property-Feature": feature,
        "Helicone-Property-Environment": "production"
      }
    }
  );
}
// Query costs by feature via the dashboard or API
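
To do the aggregation programmatically, you could query per feature and sum the returned costs. The sketch below assumes the query endpoint returns an array of rows with a numeric `cost` field; verify the actual response shape before relying on it:

```typescript
// Sketch: sum request costs per feature via the query endpoint.
// ASSUMPTION: each returned row carries a numeric `cost` field;
// inspect a real response to confirm the shape.
async function costForFeature(feature: string): Promise<number> {
  const res = await fetch("https://api.helicone.ai/v1/request/query-clickhouse", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.HELICONE_API_KEY}`,
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          properties: { Feature: { equals: feature } },
        },
      },
      limit: 1000,
    }),
  });
  const rows: Array<{ cost?: number }> = await res.json();
  return rows.reduce((sum, row) => sum + (row.cost ?? 0), 0);
}

// Compare features side by side
for (const feature of ["chat", "summarize", "translate", "analyze"]) {
  console.log(feature, await costForFeature(feature));
}
```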

## Custom Request IDs

Provide your own request ID for easy reference:

```typescript
import { randomUUID } from "crypto";

const requestId = randomUUID();

const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-Request-Id": requestId
    }
  }
);

// Later, query by this ID (authenticated, like other API calls)
const requestDetails = await fetch(
  `https://api.helicone.ai/v1/request/${requestId}`,
  {
    headers: { Authorization: `Bearer ${HELICONE_API_KEY}` }
  }
);
```

## Excluding Sensitive Data

Omit request or response bodies for sensitive data:

```typescript
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Sensitive information..." }]
  },
  {
    headers: {
      "Helicone-Omit-Request": "true",  // Don't log the request body
      "Helicone-Omit-Response": "true"  // Don't log the response body
    }
  }
);
```

## Time to First Token (TTFT)

For streaming requests, Helicone tracks when the first token arrives:

```typescript
const stream = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Write a story..." }],
    stream: true
  },
  {
    headers: {
      "Helicone-Property-Feature": "story_generation"
    }
  }
);

// TTFT is automatically tracked and visible in the dashboard
```
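
If you want to sanity-check TTFT on the client, you can time the first streamed chunk yourself. A sketch (client-side numbers will run slightly higher than the dashboard's, since they include network overhead on your side):

```typescript
// Client-side TTFT cross-check: time from sending the request
// to the first content chunk arriving.
const start = performance.now();

const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Write a story..." }],
  stream: true,
});

for await (const chunk of stream) {
  if (chunk.choices[0]?.delta?.content) {
    console.log(`TTFT (client-side): ${Math.round(performance.now() - start)} ms`);
    break; // stop after the first token; this is just a measurement
  }
}
```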

## Latency Analysis

Analyze latency patterns:

- **p50 (median)**: Typical latency
- **p95**: 95th percentile; catches slow outliers
- **p99**: 99th percentile; identifies worst-case performance
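
As a quick illustration of how these percentiles fall out of raw latency numbers (a self-contained sketch using the nearest-rank method over a sample):

```typescript
// Nearest-rank percentile over a sample of latencies (ms).
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const latencies = [120, 340, 450, 560, 800, 1200, 1500, 2300, 5400, 9800];
console.log(`p50: ${percentile(latencies, 50)} ms`); // 800 ms
console.log(`p95: ${percentile(latencies, 95)} ms`); // 9800 ms
console.log(`p99: ${percentile(latencies, 99)} ms`); // 9800 ms
```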

## Related Features

- **Sessions**: Group related requests into sessions for workflow tracking
- **Custom Properties**: Add metadata to requests for filtering and analysis
- **User Metrics**: Analyze per-user costs and usage patterns
- **Alerts**: Get notified about errors, rate limits, or cost thresholds

## Questions?

Need help or have questions? We’re here to help.