Deployment with Helm
This guide explains how to deploy the Inference Gateway using the official Helm charts.
Prerequisites
- Kubernetes cluster (v1.30+)
- Helm installed (v3.8+)
- kubectl configured to access your cluster
Installation Options
There are two primary deployment options:
- Basic Gateway Deployment: Only deploys the Inference Gateway service
- UI with Gateway Deployment: Deploys both the Inference Gateway UI and the Gateway service
Option 1: Basic Gateway Deployment
Install the chart directly from OCI registry:
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
--create-namespace \
--version 0.5.0
To enable ingress for the basic deployment:
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
--create-namespace \
--set ingress.enabled=true \
--version 0.5.0
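Instead of repeating `--set` flags, the same settings can be collected in a values file. A minimal sketch (the key name mirrors the flag above; confirm the chart's actual keys with `helm show values` before relying on it):

```yaml
# values.yaml — sketch only; verify key names against `helm show values`
ingress:
  enabled: true
```

Pass it to the install command with `-f values.yaml` in place of the `--set` flag.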
Option 2: UI with Gateway Deployment
Setting Up Required Secrets
Before deploying the UI with the Gateway, set up any secrets the gateway needs, such as provider API keys:
# Create namespace if it doesn't exist
kubectl create namespace inference-gateway --dry-run=client -o yaml | kubectl apply -f -
# Create secret for API keys (example for DeepSeek)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: inference-gateway
  namespace: inference-gateway
  annotations:
    meta.helm.sh/release-name: inference-gateway-ui
    meta.helm.sh/release-namespace: inference-gateway
  labels:
    app.kubernetes.io/managed-by: Helm
type: Opaque
stringData:
  DEEPSEEK_API_KEY: your-deepseek-api-key
EOF
Basic UI Deployment
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway
UI Deployment with Ingress
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway \
--set ingress.enabled=true \
--set ingress.className=nginx \
--set "ingress.hosts[0].host=ui.inference-gateway.local" \
--set "ingress.hosts[0].paths[0].path=/" \
--set "ingress.hosts[0].paths[0].pathType=Prefix" \
--set "ingress.tls[0].secretName=inference-gateway-ui-tls" \
--set "ingress.tls[0].hosts[0]=ui.inference-gateway.local"
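A long `--set` list like this is easy to mistype, especially with the required quoting around array indices. The same configuration can be expressed as a values file; this sketch simply mirrors the flags above (verify the key names against `helm show values` for your chart version):

```yaml
# values.yaml — mirrors the --set flags above
gateway:
  enabled: true
  envFrom:
    secretRef: inference-gateway
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: ui.inference-gateway.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: inference-gateway-ui-tls
      hosts:
        - ui.inference-gateway.local
```

Apply it with `-f values.yaml` instead of the `--set` flags.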
TLS Certificate Management
Option 1: Manual TLS Secret Creation
If you already have certificates, you can create the TLS secret manually:
kubectl create secret tls inference-gateway-ui-tls \
--namespace inference-gateway \
--cert=/path/to/tls.crt \
--key=/path/to/tls.key
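If you don't yet have certificates and only want to exercise the TLS path locally, a self-signed pair can stand in. This is a sketch for local testing only, using the example hostname from the ingress section above; never use self-signed certificates in production:

```shell
# Generate a throwaway self-signed certificate and key for the example host
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=ui.inference-gateway.local"

# Then create the secret from the generated files (requires cluster access):
# kubectl create secret tls inference-gateway-ui-tls \
#   --namespace inference-gateway --cert=tls.crt --key=tls.key
```

Browsers will warn about the self-signed certificate; that is expected for local testing.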
Option 2: cert-manager (Recommended)
For production deployments, we recommend cert-manager, which automates certificate issuance and renewal:
- Install cert-manager if not already installed:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.17.1 \
--set crds.enabled=true
- Create an Issuer or ClusterIssuer (example with Let's Encrypt):
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
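While testing your setup, it can help to start with Let's Encrypt's staging environment, since the production endpoint rate-limits repeated failed attempts. A staging ClusterIssuer has the same shape as above, with a different name and server URL:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx
```

Point the `cert-manager.io/cluster-issuer` annotation at `letsencrypt-staging` until issuance works, then switch back to `letsencrypt-prod`.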
- Update your Helm installation to use cert-manager annotations:
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway \
--set ingress.enabled=true \
--set ingress.className=nginx \
--set "ingress.hosts[0].host=your-domain.example.com" \
--set "ingress.hosts[0].paths[0].path=/" \
--set "ingress.hosts[0].paths[0].pathType=Prefix" \
--set "ingress.tls[0].secretName=inference-gateway-ui-tls" \
--set "ingress.tls[0].hosts[0]=your-domain.example.com" \
--set "ingress.annotations.cert-manager\.io/cluster-issuer=letsencrypt-prod"
With this setup, cert-manager will automatically request and manage TLS certificates from Let's Encrypt.
Configuration
To see all available configuration values for a chart, run:
# For basic gateway
helm show values oci://ghcr.io/inference-gateway/charts/inference-gateway
# For UI with gateway
helm show values oci://ghcr.io/inference-gateway/charts/inference-gateway-ui
Common Configuration Options
Resource Limits
--set resources.limits.cpu=500m \
--set resources.limits.memory=512Mi \
--set resources.requests.cpu=100m \
--set resources.requests.memory=128Mi
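The same limits can live in a values file, which is easier to review and version-control (this mirrors the flags above):

```yaml
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
```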
Environment Variables for UI
--set-string "env[0].name=NODE_ENV" \
--set-string "env[0].value=production" \
--set-string "env[1].name=NEXT_TELEMETRY_DISABLED" \
--set-string "env[1].value=1" \
--set-string "env[2].name=INFERENCE_GATEWAY_URL" \
--set-string "env[2].value=http://inference-gateway:8080/v1"
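As a values-file equivalent of the `--set-string` flags above (note the quoted `"1"`, which keeps the value a string rather than a YAML integer):

```yaml
env:
  - name: NODE_ENV
    value: production
  - name: NEXT_TELEMETRY_DISABLED
    value: "1"
  - name: INFERENCE_GATEWAY_URL
    value: http://inference-gateway:8080/v1
```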
Accessing the UI
If you haven't configured ingress, you can use port forwarding:
kubectl port-forward -n inference-gateway svc/inference-gateway-ui 3000:80
Then access the UI at http://localhost:3000.
Upgrading
To upgrade an existing installation:
# For basic gateway
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
-f values.yaml \
--version 0.5.0
# For UI with gateway
helm upgrade --install inference-gateway-ui oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--namespace inference-gateway \
-f values.yaml \
--version 0.5.0
Uninstalling
To completely remove the installation:
# For basic gateway
helm uninstall my-release --namespace inference-gateway
# For UI with gateway
helm uninstall inference-gateway-ui --namespace inference-gateway
# Remove the namespace
kubectl delete namespace inference-gateway
Checking Deployment Status
# Check running pods
kubectl get pods -n inference-gateway
# Check services
kubectl get svc -n inference-gateway
# Check ingress (if enabled)
kubectl get ingress -n inference-gateway
Chart Sources
The Helm chart sources are available at: