Unified API access
Talk to OpenAI, Anthropic, Groq, Cohere, Ollama, DeepSeek, Cloudflare, Google, Mistral and Moonshot through one OpenAI-compatible endpoint.
Supported providers
Open-source, cloud-native proxy unifying OpenAI, Anthropic, Groq, Cohere, Ollama, DeepSeek, Cloudflare, Google, Mistral and Moonshot behind a single OpenAI-compatible API.

~10.8 MB static binary. Minimal CPU and memory footprint. Designed to scale horizontally with the Horizontal Pod Autoscaler (HPA) in Kubernetes.
No analytics, no telemetry phoning home. Self-host anywhere - on-prem, cloud, or air-gapped.
Building against multiple LLM providers means juggling SDKs, API quirks, auth schemes, and streaming protocols that drift constantly. Inference Gateway sits in front of every provider and exposes a single, stable, OpenAI-compatible surface so your application code never has to care which model is on the other end.
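To make the "single surface" idea concrete, here is a minimal shell sketch in which the request shape stays fixed and only the model string changes. It assumes the gateway listens on http://localhost:8080 and that models are addressed as provider/model, as in the MCP example on this page. GATEWAY_URL, build_chat_body, and chat are hypothetical names introduced for illustration, and the sketch does no JSON escaping.

```shell
# Assumption: the gateway is reachable here; override GATEWAY_URL if not.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:8080}"

# Build an OpenAI-style chat completion body for a model ($1) and prompt ($2).
# Inputs must not contain double quotes or backslashes (no JSON escaping here).
build_chat_body() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Send the body to the chat completions endpoint.
# Only the model string differs between providers.
chat() {
  curl -sS -X POST "$GATEWAY_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_body "$1" "$2")"
}

# Usage (requires a running gateway):
#   chat "deepseek/deepseek-v4-flash" "Hello"
```

Switching providers would then mean passing a different provider/model string to chat, with no other change to the calling code.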
Inference Gateway acts as an intermediary between your applications and various LLM providers, standardising the API interactions between them.
Native support for the Model Context Protocol (MCP) lets LLMs automatically access external tools and data sources. Enable it by pointing the gateway at one or more MCP servers:
# Enable MCP with multiple servers
export MCP_ENABLE=true
export MCP_SERVERS="http://filesystem-server:8081/mcp,http://search-server:8082/mcp"
# LLMs automatically get access to all available tools
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek/deepseek-v4-flash", "messages": [{"role": "user", "content": "List files and search for recent AI news"}]}'

Learn more about MCP Integration and explore the examples.
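Because MCP_SERVERS is a single comma-separated string, a typo in one entry is easy to miss. The sketch below is one possible pre-flight check, not part of the gateway itself: check_mcp_servers is a hypothetical helper that splits the list and flags entries that do not look like http(s) URLs. It does not contact the servers.

```shell
# Split a comma-separated server list ($1) and report each entry.
# Returns non-zero if any entry is not an http:// or https:// URL.
check_mcp_servers() {
  printf '%s\n' "$1" | tr ',' '\n' | {
    rc=0
    while IFS= read -r url; do
      case "$url" in
        http://*|https://*) printf 'ok %s\n' "$url" ;;
        *) printf 'invalid %s\n' "$url"; rc=1 ;;
      esac
    done
    [ "$rc" -eq 0 ]
  }
}

# Usage, after exporting the variables shown above:
#   check_mcp_servers "$MCP_SERVERS" || echo "fix MCP_SERVERS before starting" >&2
```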
Agent-to-Agent (A2A) support lets LLMs coordinate with multiple specialised agents in a single conversation.
The best way to use A2A is through the Inference Gateway CLI, which provides seamless integration with A2A agents:
# Install the CLI
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash
# Initialize and start chatting
infer init
infer chat
# Delegate tasks to A2A agents
> "Schedule a team meeting for tomorrow at 2 PM"
> "Check my calendar for conflicts this week"

Learn more about A2A Integration and see how to build your own agents.
Prefer to define an agent as code? The Agent Definition Language (ADL) describes an entire A2A agent - provider, model, tools, skills, server, and deployment - in a single declarative agent.yaml file. The ADL CLI turns that manifest into an enterprise-ready Go or Rust project, so the agent stays version-controlled and reproducible.
# Scaffold, validate, and generate an A2A agent from a declarative manifest
adl init my-weather-agent
adl validate agent.yaml
adl generate --file agent.yaml --output ./my-weather-agent

Read the Agent Definition Language overview to see how ADL, the ADL CLI, and the ADK fit together, or jump straight to the canonical spec at adl.inference-gateway.com.
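To keep generated projects reproducible, the validate and generate steps can be chained so that generation only runs on a manifest that passed validation. The sketch below uses only the two adl commands shown above. regenerate_agent is a hypothetical wrapper, and it assumes that adl validate exits with a non-zero status on an invalid manifest, which this page does not confirm.

```shell
# Regenerate an agent project from a manifest ($1) into an output directory ($2).
# Assumption: `adl validate` signals an invalid manifest via a non-zero exit status.
regenerate_agent() {
  manifest="$1"
  output_dir="$2"
  if ! adl validate "$manifest"; then
    echo "validation failed for $manifest; skipping generation" >&2
    return 1
  fi
  adl generate --file "$manifest" --output "$output_dir"
}

# Usage:
#   regenerate_agent agent.yaml ./my-weather-agent
```

A wrapper like this could run in CI on every change to agent.yaml, so the checked-in manifest and the generated project are less likely to drift apart.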
Inference Gateway is an open-source project maintained by a growing community. Contributions are welcome on GitHub.