Architecture Overview
This document provides a high-level overview of the Inference Gateway's architecture. The Inference Gateway is designed to be modular and extensible, allowing new models and providers to be integrated easily.
General Overview
Kubernetes Setup
The Inference Gateway is designed to run on Kubernetes. The following diagram shows its high-level architecture when deployed on a Kubernetes cluster.
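To make the Kubernetes setup concrete, the manifest below sketches one way the gateway might be deployed: a Deployment running the gateway container behind a single Service. This is a minimal illustration only; the image name, container port, and environment variable are assumptions and do not represent the project's official manifests.

```yaml
# Minimal sketch of running the Inference Gateway on Kubernetes.
# The image name, port, and environment variable are assumptions
# for illustration, not the project's official deployment manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-gateway
  labels:
    app: inference-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-gateway
  template:
    metadata:
      labels:
        app: inference-gateway
    spec:
      containers:
        - name: inference-gateway
          image: inference-gateway:latest   # assumed image name
          ports:
            - containerPort: 8080           # assumed listen port
          env:
            - name: OPENAI_API_KEY          # example provider credential (assumed)
              valueFrom:
                secretKeyRef:
                  name: provider-secrets
                  key: openai-api-key
---
apiVersion: v1
kind: Service
metadata:
  name: inference-gateway
spec:
  selector:
    app: inference-gateway
  ports:
    - port: 80
      targetPort: 8080
```

Fronting the gateway with a single Service gives clients one stable endpoint, while the gateway itself routes requests to whichever models and providers have been configured behind it.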