AI Gateway

The Workjet AI Gateway is the single chokepoint through which every AI request flows. Whether a message comes from interactive chat, a background assistant, or a scheduled engine, it passes through the gateway before reaching any AI provider. This architecture gives IT and compliance teams complete visibility and control over how AI is used across the organization.

How Requests Flow

Every AI interaction follows the same path:

  1. Desktop App sends a chat completion request to the gateway
  2. Gateway authenticates the request (session cookie or API key)
  3. DLP pipeline scans the request for sensitive data (credit cards, SSNs, API keys, etc.)
  4. Rate limiter checks per-user request limits
  5. Model router selects the appropriate AI provider based on tier and priority rules
  6. AI Provider (Anthropic, OpenAI, Google, or Ollama) processes the request
  7. Gateway scans the response for sensitive data
  8. Audit logger records the full interaction (user, model, tokens, cost, DLP result)
  9. Response streams back to the desktop app

Desktop App                    AI Gateway                      AI Provider
+-----------+    request    +----------------+    routed    +-------------+
|  Chat /   | -----------> | Auth           | ----------> | Anthropic   |
| Assistant | <----------- | DLP Scan       | <---------- | OpenAI      |
|  Engine   |   response   | Rate Limit     |   response  | Google      |
+-----------+              | Route          |             | Ollama      |
                           | Audit Log      |             +-------------+
                           +----------------+

Why a Single Chokepoint?

Routing all AI traffic through one gateway provides several critical benefits:

  • Visibility: Every AI interaction is logged with user identity, model, tokens, cost, and DLP results
  • Cost control: Set monthly budgets and track spending by model and user
  • Data protection: Scan every request and response for sensitive data before it reaches external providers
  • Model governance: Control which models are available and how requests are routed
  • Rate limiting: Prevent abuse with per-user request limits
  • Compliance: Immutable audit trail for regulatory requirements
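The visibility and compliance benefits hinge on what each log entry captures. A plausible shape for one audit record, covering the fields listed above (the interface and field names are illustrative, not the gateway's actual schema):

```typescript
// Hypothetical shape of one audit record; field names are illustrative.
interface AuditRecord {
  timestamp: string;   // ISO 8601
  userId: string;      // who made the request
  model: string;       // which model served it
  inputTokens: number;
  outputTokens: number;
  costUsd: number;     // computed from per-model rates
  dlpResult: "clean" | "redacted" | "blocked" | "warned";
}

function toAuditLine(record: AuditRecord): string {
  // One JSON object per line, suitable for append-only storage.
  return JSON.stringify(record);
}
```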

Architecture: The gateway runs on Cloudflare Workers at the edge, providing low-latency request processing. Authentication state is managed via Cloudflare KV, DLP policies are stored in D1, and audit logs are written to R2 for durable storage.
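A Wrangler configuration wiring up those three storage services might look like the sketch below. The binding names and IDs are placeholders for illustration, not the actual deployment's values.

```toml
# Hypothetical Wrangler config; binding names and IDs are placeholders.
name = "ai-gateway"
main = "src/index.ts"

[[kv_namespaces]]        # authentication state
binding = "AUTH_KV"
id = "<kv-namespace-id>"

[[d1_databases]]         # DLP policies
binding = "DLP_DB"
database_id = "<d1-database-id>"

[[r2_buckets]]           # audit logs
binding = "AUDIT_LOGS"
bucket_name = "gateway-audit-logs"
```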

Gateway Features

Feature       | Description                                                 | Learn More
--------------|-------------------------------------------------------------|--------------
Model Routing | Tier-based routing with provider fallback and token limits  | Model Routing
DLP           | Pattern-based scanning with redact, block, and warn actions | DLP Policies
Audit Logging | Immutable record of every AI interaction                    | Audit Logging
Cost Controls | Per-model rates, monthly budgets, usage breakdown           | Cost Controls
Rate Limiting | Per-user limits with configurable windows and thresholds    | Rate Limiting
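Tier-based routing with provider fallback, the first feature in the table, amounts to walking an ordered provider list per tier. A minimal sketch, assuming hypothetical tier names and a health-check callback (neither is part of the documented API):

```typescript
// Sketch of tier-based routing with provider fallback.
// Tier names and priority lists here are hypothetical examples.
type Provider = "anthropic" | "openai" | "google" | "ollama";

const TIER_ROUTES: Record<string, Provider[]> = {
  fast: ["google", "ollama"],
  balanced: ["openai", "anthropic"],
  premium: ["anthropic", "openai"],
};

function routeRequest(
  tier: string,
  isHealthy: (p: Provider) => boolean
): Provider {
  const candidates = TIER_ROUTES[tier] ?? TIER_ROUTES["balanced"];
  // Walk the priority list; fall back to the next provider if one is down.
  for (const provider of candidates) {
    if (isHealthy(provider)) return provider;
  }
  throw new Error(`no healthy provider for tier "${tier}"`);
}
```

With this shape, an outage at the first-choice provider degrades to the next entry in the tier's list rather than failing the request outright.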

Managed vs. Self-Hosted

Workjet offers two deployment options for the gateway:

  • Managed (default): The gateway runs at api.workjet.dev, fully managed by Workjet. No infrastructure to maintain.
  • Self-hosted (Enterprise): Deploy the gateway on your own Cloudflare account for complete data sovereignty. Available on the Enterprise plan.

Next Steps