Helicone provides multiple ways to integrate with OpenAI, from simple proxy routing to zero-latency async logging.

Integration Methods

Proxy Integration

Route requests through Helicone’s proxy. Simple setup with minimal code changes.

AI Gateway

Use the AI Gateway for access to multiple providers, including OpenAI (see the sketch after this list).

Async Logging

Zero-latency logging using OpenLLMetry. No proxy required.

Azure OpenAI

Special integration for Azure-hosted OpenAI models.
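
If you'd like a feel for the AI Gateway route before reading its docs, here is a minimal sketch. It assumes the gateway endpoint is https://ai-gateway.helicone.ai/v1 and that it authenticates with your Helicone API key directly; check the AI Gateway documentation for the exact base URL and model naming.
import { OpenAI } from "openai";

// Minimal AI Gateway sketch. The base URL and authentication scheme below are
// assumptions from the lead-in above, not confirmed by this page.
const gateway = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai/v1",
  apiKey: process.env.HELICONE_API_KEY, // the gateway is assumed to take your Helicone key
});

const reply = await gateway.chat.completions.create({
  model: "gpt-4o-mini", // the gateway may expect a provider-prefixed name such as "openai/gpt-4o-mini"
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(reply.choices[0].message.content);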

Quick Start

1. Get your API keys

You'll need an OpenAI API key and a Helicone API key (available from your Helicone dashboard).

2. Update your base URL

Change the base URL to route requests through Helicone:
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://oai.helicone.ai/v1",
  apiKey: process.env.OPENAI_API_KEY,
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

3. Make requests as usual

Use the OpenAI SDK normally:
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
});

console.log(response.choices[0].message.content);

4. View your logs

All requests are now logged to your Helicone dashboard.

Streaming Support

Helicone fully supports OpenAI streaming:
const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Write a poem" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Custom Properties

Track custom metadata with your requests:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-User-Id": "user-123",
      "Helicone-Session-Id": "session-456",
      "Helicone-Property-Environment": "production",
      "Helicone-Property-App": "chatbot",
    },
  }
);

Session Tracking

Track multi-turn conversations and agent workflows:
import { v4 as uuidv4 } from "uuid";

const sessionId = uuidv4();

// First request
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/chat/greeting",
      "Helicone-Session-Name": "Customer Chat",
    },
  }
);

// Follow-up request in same session
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [
      { role: "user", content: "Hello!" },
      { role: "assistant", content: "Hi! How can I help?" },
      { role: "user", content: "Tell me about your services" },
    ],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/chat/services",
    },
  }
);
Learn more about Session Tracking.

Response Caching

Reduce costs and latency with response caching:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "What is 2+2?" }],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
    },
  }
);

// Subsequent identical requests return cached responses
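
If you also want to bound how long a cached response is reused, the caching feature accepts a cache duration header; a hedged sketch, assuming the Cache-Control / max-age form from the caching docs:
const cached = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "What is 2+2?" }],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": "max-age=3600", // reuse the cached response for up to one hour (assumed header)
    },
  }
);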
Learn more about Response Caching.

Rate Limiting

Control usage per user or API key:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-RateLimit-Policy": "100;w=60;s=user", // 100 requests per 60 seconds per user
      "Helicone-User-Id": "user-123",
    },
  }
);
Learn more about Rate Limiting.

Function Calling

Helicone fully supports OpenAI function calling:
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city name",
            },
          },
          required: ["location"],
        },
      },
    },
  ],
});

// Function calls are logged with full details
console.log(response.choices[0].message.tool_calls);

Vision Models

Use vision-capable models such as GPT-4o with Helicone:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url: {
            url: "https://example.com/image.jpg",
          },
        },
      ],
    },
  ],
});

Embeddings

Track embedding requests:
const embedding = await client.embeddings.create(
  {
    model: "text-embedding-3-small",
    input: "The quick brown fox jumps over the lazy dog",
  },
  {
    headers: {
      "Helicone-Property-Purpose": "document-indexing",
    },
  }
);

Azure OpenAI

Integrate with Azure-hosted OpenAI:
import { AzureOpenAI } from "openai";

const client = new AzureOpenAI({
  baseURL: "https://azure.helicone.ai",
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Azure-Base-Url": process.env.AZURE_OPENAI_ENDPOINT,
  },
  apiVersion: "2024-02-01",
});

const response = await client.chat.completions.create({
  model: "gpt-4", // Your Azure deployment name
  messages: [{ role: "user", content: "Hello!" }],
});

Error Tracking

Helicone automatically tracks errors and retry attempts:
try {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  });
} catch (error) {
  // Error is logged to Helicone with full context
  console.error("Request failed:", error);
}
View errors in your dashboard with:
  • Error message and type
  • Request parameters
  • Retry attempts
  • User and session context
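
Helicone can also retry failed requests at the proxy layer; a minimal sketch, assuming the Helicone-Retry-Enabled header from the retries feature (check the retries docs for exact header names and backoff options):
const retried = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-Retry-Enabled": "true", // retry transient failures such as 429s at the proxy (assumed header)
    },
  }
);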

Zero-Latency Integration

For production applications where latency is critical, use async logging:
import { HeliconeAsyncLogger } from "@helicone/async";
import OpenAI from "openai";

const logger = new HeliconeAsyncLogger({
  apiKey: process.env.HELICONE_API_KEY,
  providers: { openAI: OpenAI },
});
logger.init();

// Use OpenAI SDK normally - no proxy latency
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
Learn more about Async Logging.

Best Practices

Never hardcode API keys. Use environment variables:
# .env
OPENAI_API_KEY=sk-...
HELICONE_API_KEY=sk-helicone-...
Always include user and session IDs for better observability:
headers: {
  "Helicone-User-Id": userId,
  "Helicone-Session-Id": sessionId,
}
Add context with custom properties:
headers: {
  "Helicone-Property-Environment": "production",
  "Helicone-Property-Feature": "chatbot",
  "Helicone-Property-Version": "v1.2.3",
}
Reduce costs with caching:
headers: {
  "Helicone-Cache-Enabled": "true",
}

Troubleshooting

Requests not appearing in your dashboard:
  1. Verify your Helicone API key is correct
  2. Check that the Helicone-Auth header is formatted correctly: Bearer sk-helicone-...
  3. Ensure you’re using the correct base URL: https://oai.helicone.ai/v1
  4. Check that your network allows connections to helicone.ai

Authentication errors:
  • Make sure you’re passing your OpenAI API key (not your Helicone key) as the apiKey
  • Pass your Helicone API key in the Helicone-Auth header
  • Format: Helicone-Auth: Bearer sk-helicone-...

Latency concerns:
  • The Helicone proxy adds ~20-50ms of latency
  • For zero added latency, use async logging
  • Check your geographic region - we have global endpoints

Next Steps

Session Tracking

Track multi-turn conversations

Custom Properties

Add custom metadata

Response Caching

Reduce costs with caching

Prompt Management

Version and deploy prompts