/ KUBERNETES SERVICE

Managed Kubernetes built for AI

Tscale Kubernetes Service delivers production-grade, GPU-aware managed Kubernetes with one-click GPU operators, intelligent auto-scaling, and multi-tenant isolation. It is standard upstream Kubernetes, optimised for AI workloads.


Cluster: tscale-k8s-prod-01 · Healthy

inference-llm-7b-9d8f7 · Running
embedding-bge-2c4a1 · Running
training-job-4218-7b1e · Running
vllm-router-6f3a2 · Running
notebook-jupyter-89c4 · Running
data-loader-spark-44a · Pending

5 running · 1 pending · 6 / 24 pods

Upstream Kubernetes

Vanilla upstream Kubernetes, no forks. Every kubectl, every Helm chart, every operator you already use works out of the box — no proprietary lock-in.

GPU Native

NVIDIA Device Plugin, GPU Operator, and DCGM come pre-installed. Schedule pods onto specific GPUs, share devices with MIG, and monitor utilisation out of the box.
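As a rough illustration of GPU scheduling, the sketch below builds a Pod manifest that asks for one whole GPU through the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin. The pod name and image are hypothetical, and the resource name for a MIG slice depends on the MIG strategy and profile configured on the cluster, so it is not shown here.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests whole GPUs via the device plugin."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Extended resources such as GPUs are declared under limits;
                    # Kubernetes applies the same value as the request.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical name and image, used only for illustration.
pod = gpu_pod("gpu-smoke-test", "example.com/cuda-sample:latest")
print(json.dumps(pod, indent=2))
```

Because kubectl accepts JSON as well as YAML, the output can be piped straight into `kubectl apply -f -`.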

Production Hardened

HA control plane, secrets encrypted at rest, CIS-benchmarked nodes, and 24/7 SRE coverage. Production-grade from day one, with a 99.95% uptime SLA.

/ GPU OPERATORS

One-click operators for the GPU stack

Tscale ships every operator your AI workloads need, pre-configured and tested together. No more wiring up NVIDIA drivers, runtime configs, and device plugins at 2am — just enable and ship.

NVIDIA GPU Operator — drivers, container toolkit, device plugin, and DCGM exporter, all from a single Helm install.
MIG manager — slice H100/A100 GPUs into isolated instances for multi-tenant inference workloads.
KubeRay operator — first-class Ray cluster support for distributed training and serving.
Volcano scheduler — gang scheduling, fair-share, and topology-aware placement for batch AI jobs.
cert-manager + sealed-secrets — encrypted GitOps-friendly secret management, out of the box.
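To make gang scheduling concrete, here is a minimal sketch of a Volcano Job manifest in which `minAvailable` equals the worker count, so the scheduler only starts the job once every worker can be placed. The job name, image, and the use of Volcano's `default` queue are assumptions for illustration, not Tscale defaults.

```python
import json


def gang_job(name: str, image: str, workers: int,
             gpus_per_worker: int = 1, queue: str = "default") -> dict:
    """Build a Volcano Job whose workers are scheduled all together or not at all."""
    return {
        "apiVersion": "batch.volcano.sh/v1alpha1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "schedulerName": "volcano",
            "queue": queue,
            # Gang constraint: do not start until this many pods can run.
            "minAvailable": workers,
            "tasks": [
                {
                    "name": "worker",
                    "replicas": workers,
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "trainer",
                                    "image": image,
                                    "resources": {
                                        "limits": {"nvidia.com/gpu": gpus_per_worker}
                                    },
                                }
                            ],
                        }
                    },
                }
            ],
        },
    }


# Hypothetical job name and image.
job = gang_job("train-demo", "example.com/trainer:latest", workers=4)
print(json.dumps(job, indent=2))
```

Setting `minAvailable` below the worker count would let a job start partially placed, which is usually not what a distributed training run wants.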

nvidia-gpu-operator · v23.9 · namespace: gpu-operator · Ready
mig-manager · v0.6 · 14 MIG slices · Ready
kuberay-operator · v1.2 · 3 Ray clusters · Ready
volcano-scheduler · v1.9 · 8 queues · Ready
cert-manager · v1.14 · 27 certs issued · Ready
sealed-secrets · v0.27 · 142 secrets · Ready

Cluster pods · 24h · 142 / 200

Avg Util 78% · Scale Up 42s · Cost / hr $52

/ AUTO-SCALING

Right-size automatically to your traffic

Tscale’s auto-scaler understands GPU workloads. Scale up fast when a request spike hits your inference endpoint, scale down to zero when notebooks go idle — and never over-provision a single GPU.
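How a request spike turns into a replica count can be sketched with the formula the standard Kubernetes Horizontal Pod Autoscaler documents: the current replica count is multiplied by the ratio of observed to target metric and rounded up, and no change is made when the ratio is within a tolerance of 1.0 (0.1 by default upstream). The function name and the example metric values are hypothetical, and any Tscale-specific tuning is not modelled.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Replica count an HPA-style controller would aim for."""
    ratio = metric / target
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# Example: 4 replicas, 180 requests/s per pod observed, 100 requests/s target.
spike = desired_replicas(4, 180.0, 100.0, max_replicas=20)
print(spike)
```

The same calculation applies whether the metric is CPU, request rate, or a GPU utilisation figure exported to the metrics pipeline.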

HPA + Cluster Autoscaler

Standard Kubernetes HPA scales pods. Cluster Autoscaler scales nodes. Both tuned for GPU workloads.

KEDA event-driven scaling

Scale from queue depth, request rate, Kafka lag, or any custom Prometheus metric.

Scale to zero

Dev environments and bursty workloads scale to zero — and warm back up in seconds when a request arrives.
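Event-driven scaling and scale to zero can be combined in a single KEDA ScaledObject. The sketch below targets a Deployment, uses a Prometheus trigger, and sets `minReplicaCount` to 0 so the workload is removed when the query stays below the threshold. The Deployment name, Prometheus address, and query are hypothetical placeholders.

```python
import json


def scaled_object(name: str, deployment: str, prometheus_url: str,
                  query: str, threshold: int, max_replicas: int = 8) -> dict:
    """Build a KEDA ScaledObject that scales a Deployment between 0 and max_replicas."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": 0,  # allow scale to zero when idle
            "maxReplicaCount": max_replicas,
            "cooldownPeriod": 300,  # seconds of inactivity before scaling to zero
            "triggers": [
                {
                    "type": "prometheus",
                    "metadata": {
                        "serverAddress": prometheus_url,
                        "query": query,
                        # KEDA trigger metadata values are strings.
                        "threshold": str(threshold),
                    },
                }
            ],
        },
    }


# Hypothetical names, address, and query.
so = scaled_object(
    name="inference-llm-scaler",
    deployment="inference-llm",
    prometheus_url="http://prometheus.monitoring.svc:9090",
    query='sum(rate(http_requests_total{service="inference-llm"}[1m]))',
    threshold=5,
)
print(json.dumps(so, indent=2))
```

How quickly a workload warms back up from zero depends on image pull and model load time, which this manifest does not control.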

/ PLATFORM

A complete cloud-native stack

Tscale’s Kubernetes Service ships with the entire ecosystem of CNCF projects you’d otherwise have to install, configure, and maintain yourself. Production-grade from the moment you create a cluster.

Core Platform

Kubernetes 1.30+ (latest stable)
HA control plane (3 etcd replicas)
Cilium CNI with eBPF
CoreDNS with split-horizon
Gateway API ingress

GPU & Acceleration

NVIDIA GPU Operator
MIG slicing support
DCGM + Prometheus exporter
AMD ROCm operator
Time-slicing & MPS

Storage

High-performance CSI driver
ReadWriteMany (RWX) volumes
S3-compatible object storage
Lustre CSI for HPC
Snapshots & clones

Operators

cert-manager
sealed-secrets
Volcano (gang scheduling)
KubeRay
Knative (serverless)

Observability

Prometheus + Grafana
Loki log aggregation
Jaeger distributed tracing
DCGM GPU dashboards

Developer Tooling

kubectl + Helm + Argo CD
Tscale Radar (REST + CLI)
Terraform provider
VS Code extension

Performance

99.95% UPTIME SLA

Production control plane

Multi-AZ control plane with automatic failover and 24/7 SRE on call.
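For a sense of scale, the 99.95% figure can be converted into a downtime budget. This is plain arithmetic, assuming the SLA is measured over the whole period with no exclusions; the actual measurement window and terms are not stated here.

```python
from decimal import Decimal


def downtime_budget_minutes(sla_percent: str, days: int) -> Decimal:
    """Minutes of downtime allowed over `days` at the given uptime percentage."""
    total_minutes = Decimal(days * 24 * 60)
    return total_minutes * (Decimal(100) - Decimal(sla_percent)) / Decimal(100)


monthly = downtime_budget_minutes("99.95", 30)   # 30-day month
yearly = downtime_budget_minutes("99.95", 365)   # non-leap year
print(f"{monthly:.1f} min per 30 days, {yearly:.1f} min per year")
```

That works out to about 21.6 minutes per 30-day month, or roughly 4.4 hours per year.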

42-SECOND SCALE

Pod to ready in 42s

From CPU spike to new pod scheduled and serving traffic in under a minute.

ZERO DOWNTIME

Rolling upgrades

K8s version upgrades happen without dropping a single connection.
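On the workload side, the standard Kubernetes mechanism for staying available while nodes are drained is a PodDisruptionBudget, which caps how many replicas an eviction may take down at once. The sketch below is an assumption about how a team might protect a service during upgrades, not a description of how Tscale performs them; the names and label are hypothetical.

```python
import json


def disruption_budget(name: str, app_label: str, min_available: int) -> dict:
    """Build a PodDisruptionBudget that keeps at least min_available pods running."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Evictions that would drop below this count are refused.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Hypothetical service name and label.
pdb = disruption_budget("vllm-router-pdb", "vllm-router", min_available=2)
print(json.dumps(pdb, indent=2))
```

A budget only helps when the workload runs more replicas than `minAvailable`; a single-replica service cannot be drained without interruption.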

CIS BENCHMARKED

Hardened by default

Every node ships hardened to CIS Kubernetes Benchmark Level 1 — no extra work for you.

Built for AI-native teams

Multi-tenant Isolation

Namespaces, RBAC, network policies, and resource quotas out of the box. Run dozens of teams on one cluster without stepping on each other.
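As a minimal sketch of per-team isolation, the code below builds two standard Kubernetes objects for a tenant namespace: a ResourceQuota that caps GPU requests and pod count, and a NetworkPolicy that denies all ingress by default. The namespace name and the limits are hypothetical, and the policies Tscale applies by default are not specified here.

```python
import json


def tenant_quota(namespace: str, max_gpus: int, max_pods: int) -> dict:
    """ResourceQuota capping GPU requests and pod count in one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "team-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Quota for extended resources uses the requests. prefix.
                "requests.nvidia.com/gpu": str(max_gpus),
                "pods": str(max_pods),
            }
        },
    }


def default_deny_ingress(namespace: str) -> dict:
    """NetworkPolicy that selects every pod and allows no ingress traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


# Hypothetical tenant namespace and limits.
manifests = [tenant_quota("team-vision", max_gpus=4, max_pods=50),
             default_deny_ingress("team-vision")]
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Teams then open only the traffic they need with additional, narrower NetworkPolicy objects.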


Secure & Compliant

SOC 2 Type II controls, encrypted secrets, audit logging, and network policies — for teams shipping AI in regulated industries.


Pair with the rest of the stack

Kubernetes Service is the runtime for everything Tscale builds. Combine it with managed Slurm, inference, and fine-tuning to run the full AI lifecycle on one platform.

/ KUBERNETES SERVICE

Production K8s, without the operator tax


