Kubernetes Service — Tscale | Managed K8s for AI Workloads
/ KUBERNETES SERVICE

Managed Kubernetes built for AI

Tscale Kubernetes Service delivers production-grade, GPU-aware managed Kubernetes with one-click GPU operators, intelligent auto-scaling, and multi-tenant isolation. It is standard upstream Kubernetes, optimised for AI workloads.

Upstream Kubernetes

Vanilla upstream Kubernetes, no forks. Every kubectl command, Helm chart, and operator you already use works out of the box, with no proprietary lock-in.

GPU Native

The NVIDIA GPU Operator, device plugin, and DCGM come pre-installed. Schedule pods onto specific GPU types, partition devices with MIG, and monitor utilisation out of the box.
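
As a rough illustration of what GPU scheduling looks like from the user's side, the sketch below builds a minimal Pod manifest that asks for one GPU. It assumes the cluster advertises the standard `nvidia.com/gpu` extended resource, which is what the NVIDIA device plugin normally exposes. The helper name `gpu_pod_manifest`, the pod name, and the image are hypothetical and not part of any Tscale API.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1,
                     resource: str = "nvidia.com/gpu") -> dict:
    """Build a minimal Pod manifest that requests `gpus` devices.

    `resource` is the extended resource name advertised by the device
    plugin; "nvidia.com/gpu" is assumed here.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources such as GPUs are set under limits;
                # Kubernetes treats the limit as the request.
                "resources": {"limits": {resource: gpus}},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    manifest = gpu_pod_manifest("cuda-smoke-test", "example.com/cuda-test:latest")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be piped to `kubectl apply -f -`. On a cluster that exposes MIG slices as separate resources, the `resource` argument could name a MIG profile instead, though the exact resource name depends on how the cluster is configured.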

Production Hardened

HA control plane, secrets encrypted at rest, CIS-benchmarked nodes, and 24/7 SRE coverage. Production-grade from day one, with a 99.95% uptime SLA.

/ GPU OPERATORS

One-click operators for the GPU stack

Tscale ships every operator your AI workloads need, pre-configured and tested together. No more wiring up NVIDIA drivers, runtime configs, and device plugins at 2am — just enable and ship.

  • NVIDIA GPU Operator: drivers, container toolkit, device plugin, and DCGM exporter, all from a single Helm install.
  • MIG manager: slice H100/A100 GPUs into isolated instances for multi-tenant inference workloads.
  • KubeRay operator: first-class Ray cluster support for distributed training and serving.
  • Volcano scheduler: gang scheduling, fair-share queues, and topology-aware placement for batch AI jobs.
  • cert-manager + sealed-secrets: automated TLS certificates and encrypted, GitOps-friendly secret management, out of the box.

/ AUTO-SCALING

Right-size automatically to your traffic

Tscale’s auto-scaler understands GPU workloads. Scale up fast when a request spike hits your inference endpoint, scale down to zero when notebooks go idle — and never over-provision a single GPU.

  • HPA + Cluster Autoscaler

    Standard Kubernetes HPA scales pods. Cluster Autoscaler scales nodes. Both tuned for GPU workloads.

  • KEDA event-driven scaling

    Scale from queue depth, request rate, Kafka lag, or any custom Prometheus metric.

  • Scale to zero

    Dev environments and bursty workloads scale to zero — and warm back up in seconds when a request arrives.
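
To make the pod-scaling step concrete, here is a small sketch of the replica calculation documented for the upstream Horizontal Pod Autoscaler: the desired count is the current count multiplied by the ratio of observed to target metric, rounded up, with no change when the ratio is within a tolerance (0.1 by default upstream). The function name and the clamping parameters are illustrative assumptions. This is not Tscale's tuned configuration, and it leaves out upstream details such as stabilisation windows and pod readiness handling.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 100, tolerance: float = 0.1) -> int:
    """Approximate the upstream HPA replica calculation.

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0, then clamped to
    [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% utilisation against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

With 4 replicas running at 90% against a 60% target, the ratio is 1.5 and the sketch returns 6. Scaling all the way to zero is normally handled by an event-driven scaler such as KEDA rather than by this formula alone.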

/ PLATFORM

A complete cloud-native stack

Tscale’s Kubernetes Service ships with the CNCF projects you’d otherwise have to install, configure, and maintain yourself. Production-grade from the moment you create a cluster.

Core Platform

  • Kubernetes 1.30+ (latest stable)
  • HA control plane (3 etcd replicas)
  • Cilium CNI with eBPF
  • CoreDNS with split-horizon
  • Gateway API ingress

GPU & Acceleration

  • NVIDIA GPU Operator
  • MIG slicing support
  • DCGM + Prometheus exporter
  • AMD ROCm operator
  • Time-slicing & MPS

Storage

  • High-performance CSI driver
  • ReadWriteMany (RWX) volumes
  • S3-compatible object storage
  • Lustre CSI for HPC
  • Snapshots & clones

Operators

  • cert-manager
  • sealed-secrets
  • Volcano (gang scheduling)
  • KubeRay
  • Knative (serverless)

Observability

  • Prometheus + Grafana
  • Loki log aggregation
  • Jaeger distributed tracing
  • DCGM GPU dashboards

Developer Tooling

  • kubectl + Helm + Argo CD
  • Tscale Radar (REST + CLI)
  • Terraform provider
  • VS Code extension

Performance

99.95% UPTIME SLA
Production control plane

Multi-AZ control plane with automatic failover and 24/7 SRE on call.
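
For a sense of scale, the arithmetic behind an uptime percentage can be sketched as below. The helper name is hypothetical, and the measurement window and any exclusions are assumptions. The actual terms are whatever the Tscale SLA defines.

```python
def downtime_budget_minutes(sla_percent: float, period_days: float) -> float:
    """Return the downtime allowed by an uptime SLA over a period, in minutes."""
    if not 0 < sla_percent <= 100:
        raise ValueError("sla_percent must be in (0, 100]")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # Assumed windows: a 30-day month and a 365-day year.
    print(f"{downtime_budget_minutes(99.95, 30):.1f} minutes per 30-day month")
    print(f"{downtime_budget_minutes(99.95, 365) / 60:.2f} hours per 365-day year")
```

Under those assumptions, 99.95% works out to roughly 21.6 minutes of allowable downtime in a 30-day month, or about 4.4 hours over a year.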

42-SECOND SCALE
Pod to ready in 42s

From CPU spike to new pod scheduled and serving traffic in under a minute.

ZERO DOWNTIME
Rolling upgrades

K8s version upgrades happen without dropping a single connection.

CIS BENCHMARKED
Hardened by default

Every node ships hardened to CIS Kubernetes Benchmark Level 1 — no extra work for you.

Built for AI-native teams

Multi-tenant Isolation

Namespaces, RBAC, network policies, and resource quotas out of the box. Run dozens of teams on one cluster without stepping on each other.
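
One way per-team limits like these are commonly expressed in upstream Kubernetes is a ResourceQuota per namespace. The sketch below generates one that caps GPU requests. It assumes the `nvidia.com/gpu` resource name, and the helper, quota name, and namespace are hypothetical. It is not a description of how Tscale provisions tenants.

```python
import json
from typing import Optional


def gpu_quota_manifest(namespace: str, max_gpus: int,
                       max_pods: Optional[int] = None) -> dict:
    """Build a ResourceQuota that caps GPU requests in one namespace."""
    if max_gpus < 0:
        raise ValueError("max_gpus must not be negative")
    # Quotas on extended resources use the "requests." prefix.
    hard = {"requests.nvidia.com/gpu": str(max_gpus)}
    if max_pods is not None:
        hard["pods"] = str(max_pods)
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "gpu-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


if __name__ == "__main__":
    # Hypothetical team namespace limited to 8 GPUs and 50 pods.
    print(json.dumps(gpu_quota_manifest("team-vision", 8, max_pods=50), indent=2))
```

Applying one such quota per team namespace means that a pod pushing the namespace's total GPU requests past the cap is rejected at admission time instead of competing with other teams for devices.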

Secure & Compliant

SOC 2 Type II controls, encrypted secrets, audit logging, and network policies — for teams shipping AI in regulated industries.

Pair with the rest of the stack

Kubernetes Service is the runtime for everything Tscale builds. Combine it with managed Slurm, inference, and fine-tuning to run the full AI lifecycle on one platform.

Production K8s, without the operator tax
