Requests

The Requests page is your central hub for monitoring and debugging LLM requests. Every API call flowing through Helicone is captured with complete context, allowing you to trace issues, analyze performance, and understand how your AI application behaves in production.

What’s Captured

For every LLM request, Helicone records:

Request Details

  • Full request body (messages, parameters)
  • Model and provider information
  • Custom properties and metadata
  • User ID and session information

Response Details

  • Complete response body
  • Generated text and function calls
  • Finish reason and stop sequences
  • Token counts and cost

Performance Metrics

  • Total latency (start to finish)
  • Time to first token (TTFT)
  • Tokens per second
  • Request and response timestamps

Metadata

  • Request ID (for reference)
  • HTTP status codes
  • Error messages (if any)
  • Cache hit/miss status
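
Taken together, each logged request can be thought of as one record. A minimal sketch of that record as a TypeScript type; the field names here are illustrative, not Helicone's exact API schema:

// Illustrative shape of a captured request (field names are assumptions)
interface LoggedRequest {
  // Metadata
  requestId: string;
  status: number;                       // HTTP status code
  cacheHit: boolean;
  error?: string;

  // Request details
  model: string;
  provider: string;
  requestBody: unknown;                 // messages, parameters
  userId?: string;
  properties: Record<string, string>;   // custom properties

  // Response details
  responseBody: unknown;                // generated text, function calls
  finishReason?: string;

  // Performance metrics
  latencyMs: number;                    // start to finish
  timeToFirstTokenMs?: number;          // streaming requests only
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
}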

Accessing Requests

Dashboard View

Visit helicone.ai/requests to see all your requests in a table view:
  • Real-time updates: New requests appear automatically
  • Sortable columns: Click column headers to sort by any field
  • Quick filters: Filter by model, status, user, or date range
  • Request drawer: Click any row to see full request details

Request Details Drawer

Click any request to open a detailed view that renders the conversation in a chat-like format:
  • System prompts and instructions
  • User messages with role indicators
  • Assistant responses with streaming indicators
  • Function/tool calls and responses

Filtering Requests

Built-in Filters

Use the dashboard’s filter interface to narrow down requests:
Time Range
  • Last hour, day, week, month
  • Custom date range picker
  • Timezone-aware filtering
Model & Provider
  • Filter by specific model (e.g., gpt-4o-mini)
  • Filter by provider (OpenAI, Anthropic, etc.)
  • Include/exclude specific models
Status
  • Success (2xx responses)
  • Client errors (4xx)
  • Server errors (5xx)
  • Specific status codes
User & Properties
  • Filter by user ID
  • Filter by any custom property
  • Combine multiple property filters

Advanced Filtering

For complex queries, use the filter builder:
// Example: Production errors from last 24 hours
{
  "AND": [
    { "status": { "gte": 400 } },
    { "properties.Environment": { "equals": "production" } },
    { "created_at": { "gte": "2024-03-09T14:00:00Z" } }
  ]
}

Querying via API

Retrieve requests programmatically using the REST API:

Basic Query

curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
  "filter": {
    "request_response_rmt": {
      "model": {
        "equals": "gpt-4o-mini"
      }
    }
  },
  "limit": 100
}'
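
The same query from Node or TypeScript using fetch; the parsing line at the end is illustrative, since the exact response shape is documented in the API reference:

const res = await fetch("https://api.helicone.ai/v1/request/query-clickhouse", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.HELICONE_API_KEY}`
  },
  body: JSON.stringify({
    filter: {
      request_response_rmt: {
        model: { equals: "gpt-4o-mini" }
      }
    },
    limit: 100
  })
});

const requests = await res.json(); // see the API reference for the response shape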

Filter by Custom Properties

Important: When filtering by custom properties, you MUST wrap the properties filter inside a request_response_rmt object.
curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
  "filter": {
    "request_response_rmt": {
      "properties": {
        "Environment": {
          "equals": "production"
        }
      }
    }
  },
  "limit": 100
}'

Complex Filters

Combine multiple conditions using AND/OR operators:
curl --request POST \
  --url https://api.helicone.ai/v1/request/query-clickhouse \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $HELICONE_API_KEY" \
  --data '{
  "filter": {
    "left": {
      "request_response_rmt": {
        "request_created_at": {
          "gte": "2024-03-01T00:00:00Z"
        }
      }
    },
    "operator": "and",
    "right": {
      "left": {
        "request_response_rmt": {
          "model": {
            "equals": "gpt-4o-mini"
          }
        }
      },
      "operator": "and",
      "right": {
        "request_response_rmt": {
          "properties": {
            "Environment": {
              "equals": "production"
            }
          }
        }
      }
    }
  },
  "limit": 1000
}'
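
Deeply nested left/operator/right trees are tedious to write by hand. A small helper (hypothetical, not part of any Helicone SDK) can fold a flat list of request_response_rmt conditions into the nested shape shown above:

// Fold a flat list of conditions into a right-nested AND tree
// (assumes at least one condition)
function andAll(conditions: object[]): object {
  const leaves = conditions.map((c) => ({ request_response_rmt: c }));
  return leaves.reduceRight((right, left) => ({ left, operator: "and", right }));
}

// Builds the same filter as the curl example above
const filter = andAll([
  { request_created_at: { gte: "2024-03-01T00:00:00Z" } },
  { model: { equals: "gpt-4o-mini" } },
  { properties: { Environment: { equals: "production" } } }
]);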

Export Large Datasets

For exporting large amounts of data, use the CLI tool:
# Export all requests since 2024-02-01
HELICONE_API_KEY="your-api-key" \
  npx @helicone/export \
  --start-date 2024-02-01 \
  --limit 100000 \
  --include-body

# Export with property filter to CSV
HELICONE_API_KEY="your-api-key" \
  npx @helicone/export \
  --property Environment=production \
  --format csv \
  --include-body

Common Use Cases

Debug Failed Requests

  1. Filter by status code (4xx or 5xx)
  2. Look for patterns in error messages
  3. Check request parameters and prompts
  4. Verify custom properties (environment, version)
// Add debugging context to every request
const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-Property-Environment": process.env.NODE_ENV,
      "Helicone-Property-Version": packageJson.version,
      "Helicone-Property-RequestType": "user_chat",
      "Helicone-User-Id": userId
    }
  }
);

Analyze Slow Requests

  1. Sort by latency (descending)
  2. Identify patterns in slow requests
  3. Check prompt length and token counts
  4. Compare across models and providers
// Query slow requests via API
const slowRequests = await fetch('https://api.helicone.ai/v1/request/query-clickhouse', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${HELICONE_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    filter: {
      request_response_rmt: {
        latency: { gte: 5000 } // >= 5 seconds
      }
    },
    limit: 100
  })
});

Track User-Specific Issues

  1. Filter by user ID
  2. Review their request history
  3. Check for error patterns
  4. Analyze usage patterns
// Tag all requests with user ID
const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-User-Id": userId,
      "Helicone-Property-UserTier": userTier,
      "Helicone-Property-Feature": featureName
    }
  }
);
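
You can also pull a user's history programmatically through the same query endpoint. A sketch, assuming the request_response_rmt filter exposes the user ID as user_id (check the API reference for the exact field name):

// Fetch this user's recent requests (the user_id field name is an assumption)
const history = await fetch("https://api.helicone.ai/v1/request/query-clickhouse", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.HELICONE_API_KEY}`
  },
  body: JSON.stringify({
    filter: {
      request_response_rmt: {
        user_id: { equals: userId }
      }
    },
    limit: 100
  })
});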

Monitor Cost by Feature

  1. Filter by custom property (e.g., Feature)
  2. Sum costs across requests
  3. Compare costs across features
  4. Identify cost optimization opportunities
// Tag requests by feature
const features = ['chat', 'summarize', 'translate', 'analyze'];

for (const feature of features) {
  await client.chat.completions.create(
    { /* request */ },
    {
      headers: {
        "Helicone-Property-Feature": feature,
        "Helicone-Property-Environment": "production"
      }
    }
  );
}

// Query costs by feature via dashboard or API
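
One way to do the aggregation in code is to query the requests tagged with each feature and sum their cost. This is a sketch: the assumption that the endpoint returns an array of rows with a numeric cost field should be checked against the API reference:

// Sum per-request cost for one feature
// (assumes the response is an array of rows with a numeric `cost` field)
async function costForFeature(feature: string): Promise<number> {
  const res = await fetch("https://api.helicone.ai/v1/request/query-clickhouse", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.HELICONE_API_KEY}`
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          properties: { Feature: { equals: feature } }
        }
      },
      limit: 1000
    })
  });
  const rows = await res.json();
  return rows.reduce((sum: number, row: { cost?: number }) => sum + (row.cost ?? 0), 0);
}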

Request Metadata

Custom Request IDs

Provide your own request ID for easy reference:
import { randomUUID } from "crypto";

const requestId = randomUUID();

const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-Request-Id": requestId
    }
  }
);

// Later, query by this ID (the API requires your Helicone key)
const requestDetails = await fetch(
  `https://api.helicone.ai/v1/request/${requestId}`,
  { headers: { Authorization: `Bearer ${process.env.HELICONE_API_KEY}` } }
);

Excluding Sensitive Data

Omit request or response bodies for sensitive data:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Sensitive information..." }]
  },
  {
    headers: {
      "Helicone-Omit-Request": "true",   // Don't log request body
      "Helicone-Omit-Response": "true"   // Don't log response body
    }
  }
);

Performance Metrics

Time to First Token (TTFT)

For streaming requests, Helicone tracks when the first token arrives:
const stream = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Write a story..." }],
    stream: true
  },
  {
    headers: {
      "Helicone-Property-Feature": "story_generation"
    }
  }
);

// Consume the stream; the first chunk to arrive marks the TTFT
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

// TTFT is automatically tracked and visible in the dashboard

Latency Analysis

Analyze latency patterns:
  • p50 (median): Typical latency
  • p95: 95th percentile - catches slow outliers
  • p99: 99th percentile - identifies worst-case performance
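
If you export raw latencies (for example with the CLI above), these percentiles are easy to compute yourself. A minimal sketch using the nearest-rank method:

// Nearest-rank percentile over latencies in milliseconds
function percentile(latencies: number[], p: number): number {
  const sorted = [...latencies].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const latencies = [820, 950, 1100, 1240, 4800, 9300]; // example values
console.log(percentile(latencies, 50)); // p50: typical latency
console.log(percentile(latencies, 95)); // p95: slow outliers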

Related Features

Sessions

Group related requests into sessions for workflow tracking

Custom Properties

Add metadata to requests for filtering and analysis

User Metrics

Analyze per-user costs and usage patterns

Alerts

Get notified about errors, rate limits, or cost thresholds
