Chat Completions API

The Chat Completions endpoint follows the OpenAI-compatible format, making it a drop-in replacement for applications already using the OpenAI API. All requests flow through the AI gateway for routing, DLP scanning, and audit logging.

Endpoint

POST https://api.workjet.dev/v1/chat/completions

Authentication

Include your API key in the Authorization header. See Authentication for details.

Authorization: Bearer wj_live_...

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., claude-4-sonnet) or tier name (standard, premium, enterprise) |
| messages | array | Yes | Array of message objects, each with a role and content |
| stream | boolean | No | Enable streaming responses. Default: false |
| max_tokens | integer | No | Maximum number of tokens to generate. Capped at the gateway limit when that limit is lower. |
| temperature | number | No | Sampling temperature (0.0 to 2.0). Default: 1.0 |
| top_p | number | No | Nucleus sampling parameter. Default: 1.0 |

Message Object

| Field | Type | Description |
|---|---|---|
| role | string | One of: system, user, assistant |
| content | string | The message content |

Request Example

curl -X POST https://api.workjet.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wj_live_a1b2c3d4..." \
  -d '{
    "model": "claude-4-sonnet",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant that provides concise answers."
      },
      {
        "role": "user",
        "content": "What is the Model Context Protocol?"
      }
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'
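For applications that do not shell out to curl, the same request can be assembled in Python. The following is a minimal sketch using only the standard library; the helper names build_request and send are illustrative and not part of any Workjet SDK, and the key passed in is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.workjet.dev/v1/chat/completions"


def build_request(api_key, model, messages, **options):
    """Build (but do not send) a POST request for the Chat Completions endpoint."""
    body = {"model": model, "messages": messages}
    # Copy only the optional fields listed in the Request Body table;
    # anything left out falls back to the documented default.
    for name in ("stream", "max_tokens", "temperature", "top_p"):
        if name in options:
            body[name] = options[name]
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping construction separate from sending makes the payload easy to inspect or log before it reaches the gateway. A call such as send(build_request(key, "claude-4-sonnet", messages, max_tokens=1024, temperature=0.7)) would mirror the curl example above.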

Response Format

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1713100335,
  "model": "claude-4-sonnet",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The Model Context Protocol (MCP) is an open standard..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 156,
    "total_tokens": 198
  }
}
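Most callers need only the reply text and the token counts from this structure. A short sketch of pulling them out in Python follows; summarize_completion is a hypothetical helper name, and it assumes a single choice, as in the sample above.

```python
import json


def summarize_completion(raw):
    """Extract the reply text, finish reason, and token usage from a response body."""
    data = json.loads(raw)
    choice = data["choices"][0]  # assumes at least one choice is present
    usage = data.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# A trimmed copy of the sample response shown above.
SAMPLE = """{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The Model Context Protocol (MCP) is an open standard..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 156, "total_tokens": 198}
}"""
```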

Streaming

Set "stream": true to receive server-sent events (SSE) as the model generates tokens. Each event contains a partial response chunk:

curl -X POST https://api.workjet.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wj_live_a1b2c3d4..." \
  -d '{
    "model": "claude-4-sonnet",
    "messages": [{ "role": "user", "content": "Hello!" }],
    "stream": true
  }'

Streamed response (SSE format):

data: {"id":"chatcmpl_abc123","choices":[{"delta":{"role":"assistant","content":"Hello"},"index":0}]}

data: {"id":"chatcmpl_abc123","choices":[{"delta":{"content":"!"},"index":0}]}

data: {"id":"chatcmpl_abc123","choices":[{"delta":{"content":" How"},"index":0}]}

data: {"id":"chatcmpl_abc123","choices":[{"delta":{"content":" can I help?"},"index":0,"finish_reason":"stop"}]}

data: [DONE]
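A client reassembles the full reply by concatenating the content of each delta until it sees the [DONE] sentinel. The sketch below shows one way to do that in Python; iter_deltas is an illustrative name, and it assumes the stream has already been decoded into text lines.

```python
import json


def iter_deltas(lines):
    """Yield the text fragments carried by a sequence of SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def collect_text(lines):
    """Join all streamed fragments into the complete reply."""
    return "".join(iter_deltas(lines))
```

In a live client, lines would come from iterating over the HTTP response and decoding each line as UTF-8; printing each fragment as it arrives gives the familiar typing effect.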

Error Responses

| Status | Code | Description |
|---|---|---|
| 400 | bad_request | Invalid request body (missing messages, invalid model, etc.) |
| 401 | unauthorized | Missing or invalid authentication |
| 403 | dlp_blocked | Request blocked by DLP policy |
| 429 | rate_limited | Rate limit exceeded |
| 502 | provider_error | Upstream AI provider returned an error |
| 503 | provider_unavailable | All configured providers for the requested tier are unavailable |
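Clients usually treat these statuses differently: some call for fixing the request, others may succeed on a later attempt. The sketch below shows one possible policy in Python. Note that the choice of which statuses to retry (429, 502, 503) and the exponential backoff values are assumptions made for illustration; this page does not specify a retry policy.

```python
# Status codes and error codes as listed in the table above.
ERROR_CODES = {
    400: "bad_request",
    401: "unauthorized",
    403: "dlp_blocked",
    429: "rate_limited",
    502: "provider_error",
    503: "provider_unavailable",
}

# Assumption: rate limits and upstream failures are treated as transient;
# malformed, unauthenticated, or DLP-blocked requests are not retried.
RETRYABLE = {429, 502, 503}


def should_retry(status):
    """Return True if a request with this status may be worth retrying."""
    return status in RETRYABLE


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based), doubling each time."""
    return min(cap, base * 2 ** attempt)
```

A caller would loop while should_retry(status) holds, sleeping for backoff_delay(attempt) between attempts and giving up after a fixed number of tries.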

OpenAI compatibility: If your application already uses the OpenAI SDK, you can point it at https://api.workjet.dev/v1 and use your Workjet API key as a drop-in replacement. The request and response formats are compatible.
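With the official OpenAI Python SDK (the third-party openai package, version 1 or later), the switch could look roughly like the sketch below. The helper names client_kwargs and ask are illustrative, and the model ID is the one used in the examples on this page.

```python
BASE_URL = "https://api.workjet.dev/v1"


def client_kwargs(api_key):
    """Constructor arguments that point the OpenAI SDK at the Workjet gateway."""
    return {"base_url": BASE_URL, "api_key": api_key}


def ask(api_key, prompt, model="claude-4-sonnet"):
    """Send a single user message and return the assistant's reply text."""
    # Imported here so the module loads even when the SDK is not installed.
    from openai import OpenAI  # third-party: pip install openai

    client = OpenAI(**client_kwargs(api_key))
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content
```

Only the base URL and the key change; the rest of an existing SDK-based application should be able to stay as it is, provided it uses the fields listed in the Request Body section.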

Next Steps