60-Second Provisioning
From `tscale instances create` to SSH-ready, your GPU is online in under a minute. Cold-start a cluster of 64 H100s in the time it takes to make coffee.
Tscale Instances deliver dedicated, single-tenant GPU servers — H100, H200, A100, L40S, MI300X — ready in under 60 seconds, billed by the second, no per-API markup. The cloud GPU, finally done right.
No 60-minute minimum, no rounding up, no per-API markup. Stop a job at 14:23:07 and billing stops at 14:23:07. The cloud GPU billing experience you’ve been waiting for.
Every instance runs on dedicated physical GPUs: no noisy neighbours, no shared MIG partitions, no oversubscription. Whatever you run, you get the full FLOPS.
From the budget-friendly L40S to the new H200 and NVIDIA Blackwell, Tscale’s catalog covers every workload profile. Mix and match across regions: every instance of a given type is identical, predictable, and dedicated.
H200
The newest Hopper generation: 141GB HBM3e, 4.8TB/s bandwidth, optimised for the largest LLM training runs.
H100
The workhorse of modern AI: 80GB HBM3, NVLink, transformer engine, FP8 precision. The default for serious workloads.
A100
The proven default: 80GB HBM2e, NVLink, ideal for training and inference at the best dollar per FLOP.
L40S
The inference sweet spot: 48GB GDDR6, AV1 encoder, and tensor cores. Perfect for vLLM at 100+ tok/s.
MI300X
AMD’s flagship: 192GB HBM3, ROCm 6.2, and exceptional FP8 throughput. Run massive models without model sharding.
NVIDIA Blackwell
The next generation: 192GB HBM3e, 8TB/s, 5th-gen tensor cores, FP4/FP6 precision for trillion-parameter training.
Tscale’s pricing model is built for AI workloads that ramp up and down. Per-second billing, no API markup, no data egress fees, no surprise tier-2 storage charges — just the GPU.
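To make the per-second claim concrete, here is a minimal sketch of the arithmetic, comparing exact per-second billing with a model that rounds usage up to whole hours. The hourly rate, the timestamps, and the function names are illustrative assumptions, not Tscale's published prices or billing implementation.

```python
from datetime import datetime
from decimal import Decimal

SECONDS_PER_HOUR = 3600


def elapsed_seconds(start: datetime, stop: datetime) -> int:
    """Whole seconds between instance start and stop."""
    return int((stop - start).total_seconds())


def per_second_cost(start: datetime, stop: datetime, hourly_rate: Decimal) -> Decimal:
    """Charge only for the seconds actually used."""
    return elapsed_seconds(start, stop) * hourly_rate / SECONDS_PER_HOUR


def hourly_rounded_cost(start: datetime, stop: datetime, hourly_rate: Decimal) -> Decimal:
    """Charge for usage rounded up to the next whole hour (the model being contrasted)."""
    hours = -(-elapsed_seconds(start, stop) // SECONDS_PER_HOUR)  # integer ceiling
    return hours * hourly_rate


# Hypothetical job: starts 13:05:00, stops 14:23:07, at an assumed 3.60 per GPU-hour.
start = datetime(2025, 1, 1, 13, 5, 0)
stop = datetime(2025, 1, 1, 14, 23, 7)
rate = Decimal("3.60")

exact = per_second_cost(start, stop, rate)
rounded = hourly_rounded_cost(start, stop, rate)
print(f"per-second: {exact}  hourly-rounded: {rounded}")
```

Under these assumed numbers the job runs for 4,687 seconds, so per-second billing comes to 4.687 while hourly rounding charges two full hours, 7.20. `Decimal` is used instead of floats so that currency amounts stay exact.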
Instances aren’t just bare servers — they’re a complete cloud platform. Storage, networking, snapshots, and integrations come standard with every instance you launch.
Average cold-start time across all instance types. Hot pools for popular SKUs cut this to under 15 seconds.
Every instance gets the full 200/400 Gbps line rate. No oversubscription, no shared uplinks.
Same NVIDIA H100s, same SXM5 boards, 80% lower bill. No data egress, no per-API markup.
Hot-standby failover for control plane, redundant power & networking, 24/7 SRE on call.
Standard Linux. Root access. Bring your own kernel modules. No proprietary abstraction layer between you and the silicon.
Data residency in your jurisdiction. Encryption at rest and in transit. Optional dedicated tenancy for sensitive workloads.
Instances are the building blocks. Combine them with managed Slurm, Kubernetes, and inference to run the full AI lifecycle on one platform, without leaving Tscale.