Deployment with Helm
This guide explains how to deploy the Inference Gateway using the official Helm charts.
Prerequisites
- Kubernetes cluster (v1.30+)
- Helm installed (v3.8+)
- kubectl configured to access your cluster
Installation Options
There are two primary deployment options:
- Basic Gateway Deployment: Only deploys the Inference Gateway service
- UI with Gateway Deployment: Deploys both the Inference Gateway UI and the Gateway service
Option 1: Basic Gateway Deployment
Install the chart directly from OCI registry:
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
--create-namespace \
--version 0.5.0
To enable ingress for the basic deployment:
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
--create-namespace \
--set ingress.enabled=true \
--version 0.5.0
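Instead of repeating `--set` flags, the same settings can be collected in a values file. A minimal sketch (the key name mirrors the flag above; confirm the chart's actual keys with `helm show values` before relying on it):

```yaml
# values.yaml — sketch only; verify key names against `helm show values`
ingress:
  enabled: true
```

Pass it to the install command with `-f values.yaml` in place of the `--set` flag.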
Option 2: UI with Gateway Deployment
Setting Up Required Secrets
Before deploying the UI with the Gateway, set up any secrets the gateway needs, such as provider API keys:
# Create namespace if it doesn't exist
kubectl create namespace inference-gateway --dry-run=client -o yaml | kubectl apply -f -
# Create secret for API keys (example for DeepSeek)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: inference-gateway
  namespace: inference-gateway
  annotations:
    meta.helm.sh/release-name: inference-gateway-ui
    meta.helm.sh/release-namespace: inference-gateway
  labels:
    app.kubernetes.io/managed-by: Helm
type: Opaque
stringData:
  DEEPSEEK_API_KEY: your-deepseek-api-key
EOF
Basic UI Deployment
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway
UI Deployment with Ingress
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway \
--set ingress.enabled=true \
--set ingress.className=nginx \
--set "ingress.hosts[0].host=ui.inference-gateway.local" \
--set "ingress.hosts[0].paths[0].path=/" \
--set "ingress.hosts[0].paths[0].pathType=Prefix" \
--set "ingress.tls[0].secretName=inference-gateway-ui-tls" \
--set "ingress.tls[0].hosts[0]=ui.inference-gateway.local"
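A long `--set` list like this is easy to mistype, especially with the required quoting around array indices. The same configuration can be expressed as a values file; this sketch simply mirrors the flags above (verify the key names against `helm show values` for your chart version):

```yaml
# values.yaml — mirrors the --set flags above
gateway:
  enabled: true
  envFrom:
    secretRef: inference-gateway
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: ui.inference-gateway.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: inference-gateway-ui-tls
      hosts:
        - ui.inference-gateway.local
```

Apply it with `-f values.yaml` instead of the `--set` flags.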
TLS Certificate Management
Option 1: Manual TLS Secret Creation
If you already have certificates, you can create the TLS secret manually:
kubectl create secret tls inference-gateway-ui-tls \
--namespace inference-gateway \
--cert=/path/to/tls.crt \
--key=/path/to/tls.key
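If you don't yet have certificates and only want to exercise the TLS path locally, a self-signed pair can stand in. This is a sketch for local testing only, using the example hostname from the ingress section above; never use self-signed certificates in production:

```shell
# Generate a throwaway self-signed certificate and key for the example host
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=ui.inference-gateway.local"

# Then create the secret from the generated files (requires cluster access):
# kubectl create secret tls inference-gateway-ui-tls \
#   --namespace inference-gateway --cert=tls.crt --key=tls.key
```

Browsers will warn about the self-signed certificate; that is expected for local testing.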
Option 2: cert-manager (Recommended)
For production deployments, we recommend cert-manager, which automates certificate issuance and renewal:
- Install cert-manager if not already installed:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.17.1 \
--set crds.enabled=true
- Create an Issuer or ClusterIssuer (example with Let's Encrypt):
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
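While testing your setup, it can help to start with Let's Encrypt's staging environment, since the production endpoint rate-limits repeated failed attempts. A staging ClusterIssuer has the same shape as above, with a different name and server URL:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx
```

Point the `cert-manager.io/cluster-issuer` annotation at `letsencrypt-staging` until issuance works, then switch back to `letsencrypt-prod`.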
- Update your Helm installation to use cert-manager annotations:
helm upgrade --install inference-gateway-ui \
oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--version 0.5.0 \
--create-namespace \
--namespace inference-gateway \
--set gateway.enabled=true \
--set gateway.envFrom.secretRef=inference-gateway \
--set ingress.enabled=true \
--set ingress.className=nginx \
--set "ingress.hosts[0].host=your-domain.example.com" \
--set "ingress.hosts[0].paths[0].path=/" \
--set "ingress.hosts[0].paths[0].pathType=Prefix" \
--set "ingress.tls[0].secretName=inference-gateway-ui-tls" \
--set "ingress.tls[0].hosts[0]=your-domain.example.com" \
--set "ingress.annotations.cert-manager\.io/cluster-issuer=letsencrypt-prod"
With this setup, cert-manager will automatically request and manage TLS certificates from Let's Encrypt.
Configuration
To see all available configuration values for a chart, run:
# For basic gateway
helm show values oci://ghcr.io/inference-gateway/charts/inference-gateway
# For UI with gateway
helm show values oci://ghcr.io/inference-gateway/charts/inference-gateway-ui
Common Configuration Options
Resource Limits
--set resources.limits.cpu=500m \
--set resources.limits.memory=512Mi \
--set resources.requests.cpu=100m \
--set resources.requests.memory=128Mi
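The same limits can live in a values file, which is easier to review and version-control (this mirrors the flags above):

```yaml
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
```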
Environment Variables for UI
--set-string "env[0].name=NODE_ENV" \
--set-string "env[0].value=production" \
--set-string "env[1].name=NEXT_TELEMETRY_DISABLED" \
--set-string "env[1].value=1" \
--set-string "env[2].name=INFERENCE_GATEWAY_URL" \
--set-string "env[2].value=http://inference-gateway:8080/v1"
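As a values-file equivalent of the `--set-string` flags above (note the quoted `"1"`, which keeps the value a string rather than a YAML integer):

```yaml
env:
  - name: NODE_ENV
    value: production
  - name: NEXT_TELEMETRY_DISABLED
    value: "1"
  - name: INFERENCE_GATEWAY_URL
    value: http://inference-gateway:8080/v1
```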
Accessing the UI
If you haven't configured ingress, you can use port forwarding:
kubectl port-forward -n inference-gateway svc/inference-gateway-ui 3000:80
Then access the UI at http://localhost:3000.
Upgrading
To upgrade an existing installation:
# For basic gateway
helm upgrade --install my-release oci://ghcr.io/inference-gateway/charts/inference-gateway \
--namespace inference-gateway \
-f values.yaml \
--version 0.5.0
# For UI with gateway
helm upgrade --install inference-gateway-ui oci://ghcr.io/inference-gateway/charts/inference-gateway-ui \
--namespace inference-gateway \
-f values.yaml \
--version 0.5.0
Uninstalling
To completely remove the installation:
# For basic gateway
helm uninstall my-release --namespace inference-gateway
# For UI with gateway
helm uninstall inference-gateway-ui --namespace inference-gateway
# Remove the namespace
kubectl delete namespace inference-gateway
Checking Deployment Status
# Check running pods
kubectl get pods -n inference-gateway
# Check services
kubectl get svc -n inference-gateway
# Check ingress (if enabled)
kubectl get ingress -n inference-gateway
Chart Sources
The Helm chart sources are available at: