Architecture Overview
This document provides a high-level overview of the Inference Gateway's architecture. The Inference Gateway is designed to be modular and extensible, allowing new models and providers to be integrated easily.
General Overview
Kubernetes Setup
The Inference Gateway is designed to run on Kubernetes. The following diagram shows its high-level architecture when deployed on a Kubernetes cluster.
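To make the Kubernetes setup concrete, the manifest below sketches one way the gateway might be deployed: a Deployment running the gateway container behind a single Service. This is a minimal illustration only; the image name, container port, and environment variable are assumptions and do not represent the project's official manifests.

```yaml
# Minimal sketch of running the Inference Gateway on Kubernetes.
# The image name, port, and environment variable are assumptions
# for illustration, not the project's official deployment manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-gateway
  labels:
    app: inference-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-gateway
  template:
    metadata:
      labels:
        app: inference-gateway
    spec:
      containers:
        - name: inference-gateway
          image: inference-gateway:latest   # assumed image name
          ports:
            - containerPort: 8080           # assumed listen port
          env:
            - name: OPENAI_API_KEY          # example provider credential (assumed)
              valueFrom:
                secretKeyRef:
                  name: provider-secrets
                  key: openai-api-key
---
apiVersion: v1
kind: Service
metadata:
  name: inference-gateway
spec:
  selector:
    app: inference-gateway
  ports:
    - port: 80
      targetPort: 8080
```

Fronting the gateway with a single Service gives clients one stable endpoint, while the gateway itself routes requests to whichever models and providers have been configured behind it.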