DLP Pipeline

The Data Loss Prevention (DLP) pipeline is a core security component of the Workjet gateway. It scans every AI request and response for sensitive data patterns and applies a configurable action (redact, block, or warn) to each match, so that sensitive information is caught before it reaches an external AI provider or is returned to the client.

Pipeline Flow

  1. Request arrives: A chat completion or automation request enters the gateway
  2. Authentication: The request is authenticated (session or API key)
  3. Request scan: The DLP pipeline scans the message content against all active patterns
  4. Action applied: If a match is found, the configured action is applied (redact, block, or warn)
  5. Processing: If not blocked, the request proceeds to the AI provider
  6. Response scan: The AI response is scanned against the same DLP patterns
  7. Action applied: If a match is found in the response, the configured action is applied
  8. Audit logged: The DLP result (clean, redacted, blocked, warned) is recorded
  9. Response delivered: The (potentially redacted) response is sent to the client

  Request → Auth → DLP Scan → [Redact/Block/Pass] → AI Provider
                                                          ↓
  Client  ← Audit Log ← DLP Scan ← [Redact/Block/Pass] ← Response
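
The scan-and-act portion of the flow (steps 3 to 9) can be sketched roughly as follows. This is a minimal illustration, not the Workjet implementation: the PATTERNS table, the ScanResult type, the scan and handle functions, and the "[REDACTED:name]" placeholder are all hypothetical names, the regexes are simplified stand-ins for the built-in patterns, and authentication (step 2) is left out.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical pattern table: name -> (compiled regex, action).
# The regexes are simplified stand-ins, not the gateway's built-in patterns.
PATTERNS = {
    "email": (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "redact"),
    "aws_key": (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "block"),
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "warn"),
}


@dataclass
class ScanResult:
    text: str    # content after any redaction
    status: str  # "clean", "redacted", "blocked", or "warned"


def scan(text: str) -> ScanResult:
    """Run every pattern against the full text and apply its action."""
    matched = [(name, rx, action)
               for name, (rx, action) in PATTERNS.items() if rx.search(text)]
    # A single "block" match stops the message outright.
    if any(action == "block" for _, _, action in matched):
        return ScanResult("", "blocked")
    status = "clean"
    for name, rx, action in matched:
        if action == "redact":
            text = rx.sub(f"[REDACTED:{name}]", text)
            status = "redacted"
        elif action == "warn" and status == "clean":
            status = "warned"  # content passes through unchanged
    return ScanResult(text, status)


def handle(message: str,
           provider: Callable[[str], str],
           audit_log: List[Tuple[str, str]]) -> Optional[str]:
    """Scan the request, call the provider, scan the response, log both results."""
    req = scan(message)
    audit_log.append(("request", req.status))
    if req.status == "blocked":
        return None  # never forwarded to the AI provider
    resp = scan(provider(req.text))
    audit_log.append(("response", resp.status))
    return None if resp.status == "blocked" else resp.text
```

In this sketch the provider is any callable that takes the (possibly redacted) request text and returns the response text, which makes the flow easy to exercise with a stub in place of a real AI provider.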
  

Pattern Matching

The DLP pipeline uses regex-based pattern matching. Each pattern runs against the full message content. Built-in patterns are optimized for common sensitive data types:

  • Credit cards: Luhn-validated number patterns for Visa, MasterCard, Amex, Discover
  • SSNs: US Social Security Number format with context-aware matching
  • Email addresses: RFC-compliant email pattern
  • Phone numbers: US and international formats with common separators
  • AWS keys: AKIA prefix pattern for AWS access key IDs
  • API secrets: Common API key prefixes (sk_live, bearer, api_key, etc.)
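
As an illustration of what "Luhn-validated" means for the credit card pattern, the sketch below pairs a candidate regex with a Luhn checksum so that digit runs which merely look like card numbers are discarded. The built-in regexes are not published in this document, so CARD_CANDIDATE, luhn_valid, and find_cards are hypothetical names and the regex is an assumption.

```python
import re
from typing import List

# Assumed candidate shape: 13 to 19 digits, optionally separated by
# single spaces or hyphens. The real built-in pattern may differ.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> List[str]:
    """Return regex candidates that also pass the Luhn check."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text)
            if luhn_valid(m.group())]
```

The checksum step is what keeps order numbers, tracking IDs, and other long digit strings from being flagged as cards, at the cost of one extra pass over each candidate.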

Performance

The DLP pipeline adds minimal latency to request processing:

  • Pattern matching runs in under 1 ms for typical messages
  • All patterns are compiled once at worker startup
  • Scanning is performed in-memory with no disk I/O
  • The pipeline processes both request and response in the same worker execution
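
The compile-once approach described above can be sketched as follows. Module import time stands in for worker startup here, and COMPILED, scan_all, and time_scan are hypothetical names with simplified regexes. The timing helper is only a way to measure scan cost on your own messages; it does not reproduce the gateway's figures.

```python
import re
import time

# Compiled once when the module loads (a stand-in for worker startup),
# then reused for every request and response.
COMPILED = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # SSN-like format (illustrative)
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),    # AWS access key ID prefix
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email
]


def scan_all(text: str) -> int:
    """Return how many patterns match; purely in-memory, no disk I/O."""
    return sum(1 for rx in COMPILED if rx.search(text))


def time_scan(text: str, runs: int = 1000) -> float:
    """Return the average scan time in milliseconds over `runs` scans."""
    start = time.perf_counter()
    for _ in range(runs):
        scan_all(text)
    return (time.perf_counter() - start) * 1000 / runs
```

Calling time_scan on a representative message gives a per-scan average that can be compared against the latency budget for a given pattern set.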

Configuration

DLP policies are managed per tenant in the Portal gateway admin. See DLP Policies for configuration details, including custom patterns, actions, and per-tenant policies.

Testing tip: Use the "warn" action initially when deploying new patterns. This logs detections without disrupting user workflows, letting you tune patterns for accuracy before switching to "redact" or "block."
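
One way to tune a new pattern before relying on it is to run it over sample messages and review what it would have flagged, which mirrors what the "warn" action does in production. The sketch below is an offline approximation only: dry_run is a hypothetical helper, not a gateway API, and the sample pattern is illustrative.

```python
import re
from collections import Counter
from typing import Iterable


def dry_run(pattern: str, samples: Iterable[str]) -> Counter:
    """Count each distinct match of `pattern` across the sample messages.

    Nothing is redacted or blocked; the result is only a report to review.
    """
    rx = re.compile(pattern)
    hits = Counter()
    for text in samples:
        for m in rx.finditer(text):
            hits[m.group()] += 1
    return hits


# Example: review what an SSN-like pattern would flag.
samples = [
    "My SSN is 123-45-6789",
    "Ticket 555-12-3456 was closed",
    "No sensitive data here",
]
report = dry_run(r"\b\d{3}-\d{2}-\d{4}\b", samples)
```

Reviewing the report for matches that are not actually sensitive (such as the ticket number in the second sample) shows where a pattern needs tightening before it is switched to "redact" or "block."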

Next Steps