Open Source · Apache 2.0

The AI Gateway
Control Plane.

Route, govern, and observe every LLM call — across Azure, AWS, Anthropic, GCP, and Ollama. Single binary. No Kubernetes. Ships in 30 seconds.

Book a Demo View on GitHub

OpenAI CompatiblePolicy EnforcementCost TrackingMCP GatewayRBAC + Teams

6Providers

1Binary

30sSetup

Live alpha — request access

localhost:8000 · Knull Admin

AI GATEWAY · LAST 30 DAYS

$47.85Total Spend

Requests68.4k

Tokens5.2M

Tok / Req76

Request Throughput

Daily — last 7 days

MTWTFSS

Cost Flow

Provider routing

~0msAPI Key Lookup

~0msBudget Check

6+LLM Providers

Drop-inOpenAI Compatible

SQLite / PGDatabase

Single FileBinary Size

NeverK8s Required

Built-inMCP Tools

EnterpriseRBAC

Real-timeCost Tracking

Unix SocketExtProc Transport

NativeAgents

~0msAPI Key Lookup

~0msBudget Check

6+LLM Providers

Drop-inOpenAI Compatible

SQLite / PGDatabase

Single FileBinary Size

NeverK8s Required

Built-inMCP Tools

EnterpriseRBAC

Real-timeCost Tracking

Unix SocketExtProc Transport

NativeAgents

CAPABILITIES

[ CAPABILITIES ]

EVERYTHING AN
AI GATEWAY NEEDS.

Purpose-built for teams shipping AI to production. Not a SaaS. Not a platform. A binary you run anywhere.

Data Plane

Multi-Provider Routing

Send OpenAI-format requests to any LLM provider. Azure, AWS Bedrock, Anthropic, GCP Vertex, Cohere, or local Ollama — Knull transforms and routes transparently.

Zero code changes
Automatic format translation
Fallback routing

Control Plane

Policy Enforcement

Per-key budget limits, model allow/deny lists, and token quotas. Enforced at the request layer in the ExtProc — before traffic ever reaches your provider.

Budget limits (USD/tokens)
Allow/deny model lists
Real-time enforcement

Access Control

RBAC & Teams

Teams, users, API keys, and roles. Enterprise-grade access control with full audit logs. Manage who can call which models, with what budget.

Team-scoped API keys
Role-based permissions
Immutable audit trail

Tool Execution

MCP Gateway

Proxy Model Context Protocol tool servers through the data plane. GitHub, Filesystem, custom tools — all with API key auth, usage tracking, and policy enforcement.

SSE & HTTP transport
Session management
Auth-enforced tool calls

Agentic Loops

Built-in Agents

Native agentic loop with tool use. Run agents via the REST API — model call, tool execution, result injection, repeat — until completion. No external orchestrator needed.

Multi-turn loops
MCP tool integration
Configurable max iterations

Analytics

Full Observability

Token usage, USD cost tracking, latency histograms, and model performance — all in real-time. Prometheus metrics and a built-in analytics dashboard.

Cost per model/key/team
Latency percentiles
Prometheus-compatible

WORKFLOW

[ HOW IT WORKS ]

FROM CONFIG TO COMPLETION IN 30 SECONDS.

Step 01 / 04

Write your config

One YAML file. List your providers with API keys, set your gateway port. No CRDs, no Helm charts, no service mesh.

Write your config

Start the binary

Send requests

Observe everything

write_your_config.sh

gateways:
  - name: aigw
    port: 1975
models:
  - id: gpt-4o-mini
    provider: azure
    apiKey: ${AZURE_API_KEY}
  - id: claude-3-5-sonnet
    provider: anthropic
    apiKey: ${ANTHROPIC_KEY}
  - id: gemini-2
    provider: gcp_vertex
    gcpProject: ${GCP_PROJECT}

SCROLL FOR NEXT STEP

INTEGRATION

[ CODE FIRST ]

WORKS WITH EVERY
OPENAI CLIENT.

Knull is a drop-in replacement for any OpenAI-compatible client. Zero code changes. Route to Azure, AWS, Anthropic, GCP, or Ollama — all through the same endpoint.

→OpenAI SDK, LangChain, LlamaIndex — all work unchanged

→API keys never leave your infra — Knull holds credentials

→Switch providers without touching application code

→Streaming, function calling, and JSON mode supported

1975

Data Plane Port

8000

Admin API Port

1064

Metrics Port

HTTP/1.1

Transport

# Knull Core Configuration
# Works with any OpenAI-compatible client

gateways:
  - name: aigw-run
    port: 1975

models:
  # Azure OpenAI
  - id: gpt-4o-mini
    provider: azure
    endpoint: ${AZURE_OPENAI_ENDPOINT_HOSTNAME}
    apiKey: ${AZURE_OPENAI_API_KEY}

  # AWS Bedrock (Claude 3.5)
  - id: claude-3-5-sonnet
    provider: aws_anthropic
    endpoint: bedrock-runtime.us-east-1.amazonaws.com
    awsRegion: us-east-1
    awsAccessKey: ${AWS_ACCESS_KEY_ID}
    awsSecretKey: ${AWS_SECRET_ACCESS_KEY}

  # Anthropic Direct
  - id: claude-opus-4
    provider: anthropic
    apiKey: ${ANTHROPIC_API_KEY}

  # GCP Vertex
  - id: gemini-2-pro
    provider: gcp_vertex
    endpoint: us-central1-aiplatform.googleapis.com
    gcpProject: ${GCP_PROJECT_ID}
    gcpRegion: us-central1

  # Local Ollama
  - id: llama-3
    provider: openai_compatible
    endpoint: localhost:11434

PROVIDERS

[ PROVIDER SUPPORT ]

ROUTE TO ANY PROVIDER.
ONE ENDPOINT.

One port. One API format. Every provider. Knull handles request transformation, credential injection, and protocol differences transparently.

OpenAI

OpenAI native

GPT-4o, GPT-4o-mini and all OpenAI models

SUPPORTED

Azure OpenAI

HTTP/1.1 enforced

Azure deployments with auto API key injection

SUPPORTED

AWS Bedrock

SigV4 signing

Claude, Llama, Titan via inference profiles

SUPPORTED

Anthropic

Anthropic native

Direct Anthropic API with streaming

SUPPORTED

GCP Vertex

OAuth2 bearer

Gemini models via Vertex AI endpoint

SUPPORTED

Cohere

Cohere native

Command models with full chat support

SUPPORTED

Ollama

OpenAI compatible

Local models — Llama, Mistral, Phi, etc.

SUPPORTED

Custom

OpenAI compatible

Any OpenAI-compatible endpoint with schema override

SUPPORTED

Any OpenAI-compatible endpoint works via provider: openai_compatible

INTERNALS

[ CONTROL PLANE ]

Six layers between your request and the provider.

Policy enforcement, cost attribution, smart routing, and full observability — built into every call. Not bolted on afterward.

[ REQUEST TRACE · LIVE ]

6 control layers, every request

01 · API Gateway

02 · Policy Engine

03 · Cost Aggregator

04 · Smart Router

05 · Observability

06 · MCP Gateway

API Gateway · Active

→  INTERCEPTED
POST /v1/chat/completions
key: sk-prod-az-8f3c…
team: ml-infra · model: gpt-4o
✓  routed to policy engine

Overhead< 2ms

Layer 01· OPENAI-COMPATIBLE · PORT 1975

Drop-in. No code changes.

Your apps keep calling the OpenAI API. Knull intercepts every request at the edge — adding auth, routing, and policy — without a single line of application code changing.

OpenAI SDK compatible, zero refactoring

Virtual key auth + TLS termination

Sub-2ms latency overhead

Overhead< 2ms

Layer 02· RBAC · RATE LIMITS · ALLOWLISTS

Enforce rules before tokens fire.

Define exactly who can call what model, when, and how much — at the team, key, or request level. Policy violations are rejected inline. No exceptions.

Team + key-level RBAC, model allowlists

Per-team budget caps with hard enforcement

Rate limiting — configurable per key

Eval latency< 0.5ms

Layer 03· REAL-TIME SPEND ATTRIBUTION

No more end-of-month bill surprises.

Every token counted, attributed, and capped in real time — by team, by key, by model. Automatic budget stops before overruns happen. Export to finance tools, not just dashboards.

Token-level cost per team / key / model

Hard budget stops — no partial overruns

Normalized cost across all providers

Granularity$0.0001

Layer 04· FAILOVER · COST ROUTING · CANARY

The right provider, every request.

Route by cost, latency, or custom rules. Fail over automatically when a provider is down. A/B test models in production. All config-driven — no deploys.

Automatic provider failover < 50ms

Cost-optimized and latency-aware routing

Canary routing for model experiments

Failover< 50ms

Layer 05· AUDIT · ANALYTICS · ANOMALIES

Full audit trail. Zero extra setup.

Every LLM call logged with full context — model, team, tokens, cost, latency. Export to Datadog or Grafana natively. Detect spend anomalies before they become incidents.

Immutable audit log, every request

Prometheus metrics + Grafana dashboards

Anomaly detection with configurable alerts

RetentionConfigurable

Layer 06· TOOL PROXY · AGENT ACCESS CONTROL

AI agents that use tools — safely.

Proxy any MCP tool server through the same control plane. Policy, cost tracking, and audit logging for agent tool calls — not just model calls. One gateway for all AI activity.

Unified MCP tool registry for all agents

Per-agent access control policies

Full tool call session audit trail

ProtocolMCP 2024-11

Exposed ports

:1975DATA PLANELLM proxy ingress

:8000ADMIN APIConfig + management

:9856MCP PROXYTool call gateway

:1064METRICSPrometheus endpoint

[ QUICKSTART ]

START IN 30 SECONDS.

Single binary. Self-contained. Write your config, run the binary, send requests. That's the entire setup.

$
# Build from source
git clone https://github.com/knull-sh
cd knull
make build

# Run with your config
./bin/knull run examples/knull.yaml

Book a Demo View on GitHub

Apache 2.0 License · Free to use, self-host, and modify

The AI GatewayControl Plane.

EVERYTHING ANAI GATEWAY NEEDS.

Multi-Provider Routing

Policy Enforcement

RBAC & Teams

MCP Gateway

Built-in Agents

Full Observability

FROM CONFIG TO COMPLETION IN 30 SECONDS.

Write your config

WORKS WITH EVERYOPENAI CLIENT.

ROUTE TO ANY PROVIDER.ONE ENDPOINT.

Six layers between your request and the provider.

Drop-in. No code changes.

Enforce rules before tokens fire.

No more end-of-month bill surprises.

The right provider, every request.

Full audit trail. Zero extra setup.

AI agents that use tools — safely.

START IN 30 SECONDS.

The AI Gateway
Control Plane.

EVERYTHING AN
AI GATEWAY NEEDS.

WORKS WITH EVERY
OPENAI CLIENT.

ROUTE TO ANY PROVIDER.
ONE ENDPOINT.