Networking — Tscale | High-Bandwidth Fabric for AI Workloads
/ NETWORKING

High-bandwidth fabric built for AI

Tscale Networking delivers the fabric that AI workloads actually need — 200/400/800 Gbps Ethernet, InfiniBand NDR, RoCE v2, and intelligent routing. Sub-microsecond latency, zero packet loss, full bisection bandwidth.

Sub-µs Latency

Cut-through switching, RoCE v2 zero-copy, and adaptive routing deliver sub-microsecond hop latency. Your NCCL all-reduces finish faster — your training runs shorter.

Full Bisection

Non-blocking Clos topology at 200/400/800 Gbps. Every server can talk to every other server at full line rate — no oversubscription, no hotspots.
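
As a rough illustration of what "no oversubscription" means on a leaf switch, the sketch below compares server-facing capacity with spine-facing capacity. The port counts and speeds are hypothetical, not Tscale's actual layout, and a 1:1 ratio at the leaf is a necessary condition for a non-blocking fabric rather than a complete proof of one.

```python
def oversubscription_ratio(downlinks: int, downlink_gbps: int,
                           uplinks: int, uplink_gbps: int) -> float:
    """Server-facing capacity divided by spine-facing capacity for one leaf."""
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("a leaf needs at least one uplink with positive speed")
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


def is_non_blocking(ratio: float) -> bool:
    """True when uplink capacity covers every downlink at line rate."""
    return ratio <= 1.0


# Hypothetical port layouts, chosen only to illustrate the arithmetic.
balanced = oversubscription_ratio(32, 400, 16, 800)       # 12,800G down, 12,800G up
oversubscribed = oversubscription_ratio(48, 200, 8, 400)  # 9,600G down, 3,200G up

print(balanced, is_non_blocking(balanced))
print(oversubscribed, is_non_blocking(oversubscribed))
```

A ratio above 1.0 means some servers must share uplink capacity whenever all of them send across the spine at once.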

Secure By Default

Wire-speed MACsec encryption on every link. Private VPCs, BGP communities, and segmentation baked into the fabric — no overlay tax.

/ FABRIC OPTIONS

Pick the right interconnect for the job

Different AI workloads have different network profiles. Tscale lets you mix and match fabrics per workload: RDMA over Converged Ethernet (RoCE v2) for inference, InfiniBand for tightly coupled training, and standard Ethernet for everything else.

  • 200/400/800 Gbps Ethernet — the universal fabric, with RoCE v2 for GPU-direct RDMA.
  • InfiniBand NDR / HDR — sub-microsecond latency for the largest multi-GPU training jobs.
  • NVLink / NVSwitch — 900 GB/s GPU-to-GPU bandwidth inside the chassis.
  • Adaptive routing — packets re-route around congestion in real time, no dropped flows.

/ INTERCONNECT PERFORMANCE

Distributed training at line rate

When 64 GPUs run an all-reduce, fabric latency sits on the critical path of every training step, so microseconds per operation add up to minutes over a long job. Tscale’s interconnect sustains 94% of peak throughput across the entire 64-GPU job, in the worst case and not only on average.
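
To see how microseconds turn into minutes, here is a back-of-the-envelope sketch. It assumes the extra latency is fully exposed on the critical path (not hidden behind compute), and the step and all-reduce counts are hypothetical.

```python
def added_job_seconds(extra_latency_us: int,
                      allreduces_per_step: int,
                      steps: int) -> float:
    """Total wall-clock time added by extra per-all-reduce latency.

    Assumes every all-reduce is on the critical path, so the extra
    latency is paid once per all-reduce, per step.
    """
    total_us = extra_latency_us * allreduces_per_step * steps
    return total_us / 1_000_000


# Hypothetical job: 60 all-reduces per step, 1,000,000 steps.
seconds = added_job_seconds(extra_latency_us=1,
                            allreduces_per_step=60,
                            steps=1_000_000)
print(f"{seconds:.0f} s added per extra microsecond of latency")
```

Under these assumptions, each additional microsecond per all-reduce costs about one minute over the whole run. Overlapping communication with compute reduces the real figure.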

  • 94% peak throughput

    Sustained across 64-GPU all-reduces, validated on real training workloads.

  • 142 µs all-reduce

    Across 64 GPUs on a 200G RoCE fabric. Faster means shorter training runs.

  • Zero packet loss

    PFC + ECN on every port, tuned for lossless RDMA. No silent retransmits.
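
For readers unfamiliar with how ECN keeps an RDMA fabric from filling its buffers, the sketch below shows the common RED-style marking curve used with RoCE v2 congestion control: no marking below a minimum queue depth, marking every packet above a maximum, and a linear ramp in between. The function name and threshold values are hypothetical and are not Tscale's tuning. In typical deployments the PFC pause threshold sits above the ECN maximum, so senders slow down before a pause is needed.

```python
def ecn_mark_probability(queue_kb: float, kmin_kb: float,
                         kmax_kb: float, pmax: float) -> float:
    """Probability that a packet is ECN-marked at a given queue depth."""
    if not 0 <= kmin_kb < kmax_kb:
        raise ValueError("thresholds must satisfy 0 <= kmin < kmax")
    if not 0.0 <= pmax <= 1.0:
        raise ValueError("pmax must be a probability")
    if queue_kb <= kmin_kb:
        return 0.0                      # shallow queue: never mark
    if queue_kb >= kmax_kb:
        return 1.0                      # deep queue: mark every packet
    # Linear ramp from 0 at kmin up to pmax just below kmax.
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


# Hypothetical thresholds for illustration only.
KMIN_KB, KMAX_KB, PMAX = 100, 400, 0.5

for depth in (50, 250, 500):
    print(depth, ecn_mark_probability(depth, KMIN_KB, KMAX_KB, PMAX))
```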

/ PLATFORM

Everything around the fabric

Networking is more than wires and switches. Tscale’s platform includes routing, security, observability, and automation — all built for AI workloads from day one.

Topology

  • Clos (leaf-spine) fabric
  • Non-blocking at all speeds
  • Adaptive routing (DLB)
  • Explicit congestion control
  • Cut-through switching

Routing

  • BGP communities
  • Private VPCs
  • Floating IPs & anycast
  • Load balancers (L4/L7)
  • Direct Connect to your DC

Security

  • Wire-speed MACsec
  • Wire-speed IPsec
  • Microsegmentation
  • DDoS protection
  • Network policy as code

Observability

  • Per-flow telemetry (sFlow)
  • Streaming metrics
  • RDMA counter dashboards
  • Congestion heatmaps
  • Alerting & PagerDuty

Interconnects

  • NVLink / NVSwitch
  • InfiniBand NDR / HDR
  • RoCE v2 (RDMA over Converged Ethernet)
  • Standard TCP/IP
  • Direct RDMA to storage

Automation

  • Terraform provider
  • BGP API (Radar)
  • Auto-VPC provisioning
  • Cable & patch management
  • Capacity planning

Performance

800 Gbps LINE RATE
Every port, every link

Non-blocking Clos fabric — no oversubscription, no per-rack limits, no surprise throttling.

0.6 µs HOP LATENCY
With InfiniBand NDR

Sub-microsecond per hop, validated on real training jobs in production.

ZERO PACKET LOSS
Lossless RDMA fabric

PFC + ECN end-to-end, tuned for NCCL and RCCL. No silent retransmits, no stalls.

99.99% FABRIC SLA
Backed by enterprise SLA

Redundant spines, hitless upgrades, 24/7 SRE on call for fabric-wide incidents.

For teams that need predictable performance

Multi-region Peering

Private interconnects between Tscale regions with sub-10ms RTT. Run a multi-region training job without the public internet in the path.
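
A quick way to reason about a sub-10ms RTT budget is to convert it into fibre distance. The sketch below assumes roughly 5 µs of one-way propagation delay per kilometre of fibre (a common rule of thumb) and ignores switching and queuing delay, so real reach is shorter.

```python
def max_fibre_km(rtt_ms: float, us_per_km: float = 5.0) -> float:
    """Upper bound on one-way fibre path length for a given round-trip time.

    us_per_km is an assumed propagation delay; about 5 us/km is typical
    for light in optical fibre.
    """
    if rtt_ms < 0 or us_per_km <= 0:
        raise ValueError("rtt_ms must be >= 0 and us_per_km must be > 0")
    one_way_us = rtt_ms * 1000 / 2
    return one_way_us / us_per_km


reach_km = max_fibre_km(10)
print(f"A 10 ms RTT allows at most about {reach_km:.0f} km of fibre path")
```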

Direct Connect

Dedicated 10/100/400 Gbps private lines from your data centre to Tscale. Predictable latency, no shared infrastructure, no surprises.
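
To compare the three line speeds, the sketch below estimates how long a bulk transfer takes at each one. The 100 TB dataset size is hypothetical, and the `efficiency` parameter is an assumed fraction of line rate left after protocol overhead, not a measured figure.

```python
def transfer_seconds(size_tb: float, link_gbps: float,
                     efficiency: float = 1.0) -> float:
    """Time to move size_tb terabytes (decimal) over a link of link_gbps."""
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and 0 < efficiency <= 1")
    bits = size_tb * 8 * 10**12
    return bits / (link_gbps * 10**9 * efficiency)


# Hypothetical 100 TB dataset over each Direct Connect speed.
for gbps in (10, 100, 400):
    hours = transfer_seconds(100, gbps) / 3600
    print(f"{gbps:>3} Gbps: {hours:.1f} h")
```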

The fabric is the foundation

Networking ties every Tscale service together. Pair it with compute, instances, and managed Slurm to run distributed training without bottlenecks.

/ NETWORKING

Train at line rate

Talk to a Network Engineer