Unified API access
Talk to OpenAI, Anthropic, Groq, Cohere, Ollama, DeepSeek, Cloudflare, Google, Mistral and Moonshot through one OpenAI-compatible endpoint.
Supported providers
Open-source, cloud-native proxy unifying OpenAI, Anthropic, Groq, Cohere, Ollama, DeepSeek, Cloudflare, Google, Mistral and Moonshot behind a single OpenAI-compatible API.

~10.8 MB static binary with a minimal CPU and memory footprint. Designed to scale horizontally with the Horizontal Pod Autoscaler (HPA) in Kubernetes.
No analytics, no telemetry phoning home. Self-host anywhere - on-prem, cloud, or air-gapped.
Building against multiple LLM providers means juggling SDKs, API quirks, auth schemes, and streaming protocols that drift constantly. Inference Gateway sits in front of every provider and exposes a single, stable, OpenAI-compatible surface so your application code never has to care which model is on the other end.
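As a rough sketch of what a single surface means in practice, the shell helpers below send the same chat request to any provider by changing only the model string. The endpoint path and the openai/gpt-4o model ID come from the examples on this page; the GATEWAY_URL variable and the build_chat_body and chat helper names are assumptions made for illustration, not part of the gateway.

```shell
# Sketch only: GATEWAY_URL and both helper names are hypothetical.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:8080}"

# Build an OpenAI-style chat completion body for a "provider/model" ID.
# The prompt is inserted verbatim, so it must not contain double quotes
# or backslashes; use a real JSON encoder for arbitrary input.
build_chat_body() {
  local model="$1" prompt="$2"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}' \
    "$model" "$prompt"
}

# Send a prompt through the gateway; only the model argument varies
# between providers in this sketch.
chat() {
  curl -sS -X POST "$GATEWAY_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_body "$1" "$2")"
}
```

With a gateway running locally, chat "openai/gpt-4o" "Hello" would target OpenAI. Under the same assumptions, passing a model ID with a different provider prefix should be the only change needed to reach another provider.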
Inference Gateway acts as an intermediary between your applications and various LLM providers. By standardising the API interactions, it lets you:
Native support for the Model Context Protocol lets LLMs automatically access external tools and data sources. With MCP integration, you can:
# Enable MCP with multiple servers
export MCP_ENABLE=true
export MCP_SERVERS="http://filesystem-server:8081/mcp,http://search-server:8082/mcp"
# LLMs automatically get access to all available tools
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "List files and search for recent AI news"}]}'
Learn more about MCP Integration and explore the examples.
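MCP_SERVERS in the example above is a comma-separated list of server URLs. When the list grows, it can be easier to assemble it from one URL per line. A minimal sketch follows; the join_servers helper is hypothetical, and the URLs are the ones from the example above.

```shell
# Hypothetical helper: join its arguments with commas, producing the
# comma-separated form used for MCP_SERVERS in the example above.
join_servers() {
  local IFS=','
  printf '%s' "$*"
}

export MCP_ENABLE=true
export MCP_SERVERS="$(join_servers \
  "http://filesystem-server:8081/mcp" \
  "http://search-server:8082/mcp")"
```

Setting IFS locally inside the function keeps the comma separator from leaking into the rest of the shell session.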
Agent-to-Agent support lets LLMs coordinate with multiple specialised agents in a single conversation. Agents can:
The best way to use A2A is through the Inference Gateway CLI, which provides seamless integration with A2A agents:
# Install the CLI
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash
# Initialize and start chatting
infer init
infer chat
# Delegate tasks to A2A agents
> "Schedule a team meeting for tomorrow at 2 PM"
> "Check my calendar for conflicts this week"
Learn more about A2A Integration and see how to build your own agents.
Inference Gateway is an open-source project maintained by a growing community. Contributions are welcome on GitHub.