One API.
Every model.
A unified, model-agnostic gateway that routes to OpenAI, Anthropic, Google, and more. Add MCP tools from our marketplace with a single parameter.
How the API works
Your app talks to one endpoint. We handle authentication, provider routing, MCP tool execution, and streaming -- all behind a single OpenAI-compatible interface.
AI Providers
MCP Servers
Every major provider, one interface
Switch between models with a single parameter. No SDK changes, no provider-specific code. The same request format works everywhere.
OpenAI
GPT-5, o3, DALL-E, Whisper
Anthropic
Opus 4.6, Sonnet, Haiku
Google
Gemini 3.0 Flash, Pro
xAI
Grok 4, Grok 4 Mini
DeepSeek
Chat, Coder, Reasoner
Mistral
Large, Medium
Plus Groq, Fireworks, and more. New providers added regularly.
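To sketch what "a single parameter" means in practice: the request body stays identical across providers, and only the `model` string changes. The payloads below are illustrative, and the `anthropic/claude-opus-4.6` slug is an assumed example, not a confirmed model ID.

```python
def chat_request(model: str) -> dict:
    """Build an OpenAI-compatible chat payload; only `model` varies per provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "Summarize today's AI news"}],
        "stream": True,
    }

openai_req = chat_request("openai/gpt-5")
anthropic_req = chat_request("anthropic/claude-opus-4.6")  # illustrative slug

# Everything except the model string is identical -- no provider-specific code.
a, b = dict(openai_req), dict(anthropic_req)
a.pop("model")
b.pop("model")
assert a == b
```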
Everything you need in one gateway
Drop-in OpenAI compatibility plus the features you actually need for production agents.
Multi-provider routing
Native MCP support
Bring Your Own Key
Structured outputs
Vision & multimodal
Real-time streaming
Multi-model handoffs
Rate limiting & metering
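As one example from the list above, structured outputs in OpenAI-compatible APIs are typically requested via a `response_format` field carrying a JSON schema. Whether this gateway uses these exact field names is an assumption; the payload below is a sketch of the convention, not a confirmed request shape.

```python
# Hypothetical structured-output request using the OpenAI-style
# `response_format` convention; the schema and model slug are illustrative.
payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "Extract the company and amount"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "invoice",
            "schema": {
                "type": "object",
                "properties": {
                    "company": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["company", "amount"],
            },
        },
    },
}
```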
Your secrets never leave your machine
DAuth is our managed authorization system. Remote MCP servers use your local credentials without ever seeing them -- credentials are isolated in a sealed execution boundary.
Zero secret leakage
Credentials are encrypted client-side and decrypted only inside a sealed execution boundary. Your code never sees raw secrets.
Sender-constrained tokens
Demonstrating Proof-of-Possession (DPoP) binds tokens cryptographically to the client. A stolen token is useless without the private key.
Networkless execution
Credential decryption and API calls happen entirely within an isolated enclave. Raw secrets never traverse the network.
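For a concrete sense of what sender-constrained tokens look like, RFC 9449 defines a DPoP proof as a JWT whose claims bind it to one HTTP method and URL. The sketch below builds only the claim set (the real proof is signed with the client's private key, which is omitted here); the URL is a placeholder, not a real endpoint.

```python
import time
import uuid

def dpop_claims(method: str, url: str) -> dict:
    """Claim set of a DPoP proof per RFC 9449 (signing omitted)."""
    return {
        "jti": str(uuid.uuid4()),  # unique ID so a proof cannot be replayed
        "htm": method,             # HTTP method the proof is bound to
        "htu": url,                # target URL the proof is bound to
        "iat": int(time.time()),   # issued-at timestamp, checked for freshness
    }

claims = dpop_claims("POST", "https://api.example.com/v1/chat/completions")
```

Because `htm` and `htu` are inside the signed proof, a token stolen from one request cannot be replayed against a different endpoint, and without the private key no new proof can be minted at all.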
Lifecycle of a request
Every API call follows the same five-stage pipeline. Click a stage to see what happens under the hood.
Validate API key, check org status, load tier limits and rate quotas.
Select the target provider, map model parameters, apply BYOK overrides if present.
Resolve MCP slugs, establish server connections, run tool calls server-side.
Stream incremental deltas back to the client over SSE as they are generated.
Meter token usage, emit rate-limit headers, return the final structured response.
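The five stages above can be sketched as a chain of functions that each annotate a request as it flows through. All logic here is stubbed for illustration; the real pipeline's internals are not public.

```python
# Illustrative pipeline sketch; stage names mirror the lifecycle above.
def authenticate(req):
    req["org"] = "acme"                      # validate key, load tier limits
    return req

def route(req):
    provider, _, _ = req["model"].partition("/")
    req["provider"] = provider               # select provider, map parameters
    return req

def run_tools(req):
    req["tools_resolved"] = req.get("mcp_servers", [])  # resolve MCP slugs
    return req

def stream(req):
    req["streamed"] = req.get("stream", False)  # SSE deltas back to client
    return req

def meter(req):
    req["usage"] = {"tokens": 0}             # meter usage, emit rate-limit headers
    return req

request = {"model": "openai/gpt-5", "mcp_servers": ["tsion/exa"], "stream": True}
for stage in (authenticate, route, run_tools, stream, meter):
    request = stage(request)
```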
Start in minutes
Drop-in compatible with the OpenAI SDK. Switch your base URL and you're done.
from dedalus_labs import Dedalus

client = Dedalus(api_key="your-api-key")

response = client.chat.completions.create(
    model="openai/gpt-5",
    messages=[
        {"role": "user", "content": "Search for the latest AI news"}
    ],
    mcp_servers=["tsion/exa"],
    stream=True,
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Endpoints at a glance
Chat completions, embeddings, image generation, audio, OCR, and more. Every endpoint follows the same auth and streaming patterns.
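The shared streaming pattern is standard SSE: each event is a `data:` line carrying a JSON chunk, terminated by a `[DONE]` sentinel. The sketch below parses a fabricated stream to show the shape; the payload lines are illustrative, not captured output.

```python
import json

# Fabricated SSE lines in the OpenAI-compatible chunk format.
raw_events = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]

text = ""
for line in raw_events:
    body = line.removeprefix("data: ")
    if body == "[DONE]":  # sentinel marking the end of the stream
        break
    delta = json.loads(body)["choices"][0]["delta"]
    text += delta.get("content", "")

print(text)  # -> Hello
```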
Core
/v1/chat/completions -- Chat with any model, stream responses, call MCP tools
/v1/models -- List all available models across providers
/v1/embeddings -- Generate vector embeddings with OpenAI or Google

Media
/v1/images/generations -- Generate images with DALL-E and GPT Image
/v1/audio/speech -- Text-to-speech, transcription, and translation
/v1/ocr -- Extract text from images and documents

Management
/v1/private/keys -- Create, rotate, and manage API keys
/v1/private/subscription/status -- Check subscription tier, rate limits, and usage

Ship your first request in 30 seconds
Get an API key, pick a model, attach MCP tools. That's it.