AI training fabric design

Pick the right fabric for your training cluster

InfiniBand or RoCE? We design, build, and tune the interconnect for large-scale GPU training — from intra-server NVLink all the way to cross-site WAN, for on-prem, colocated, or ApeTops-hosted clusters.

Three planes of communication

A training cluster moves traffic across three distinct planes — each with different bandwidth, latency, and topology constraints.

Inside the server

NVLink, NVSwitch, and PCIe traffic between GPUs, CPUs, and NICs within a single chassis — where raw bandwidth dominates.

Inside the cluster

Server-to-server traffic across the training fabric — where collective-op latency and non-blocking bisection bandwidth determine throughput.
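For intuition on why collective latency and fabric bandwidth both matter, here is a minimal back-of-envelope sketch using the standard ring all-reduce cost model; the message size, rank count, link rate, and latency figures are illustrative assumptions, not measurements from a specific fabric.

```python
# Ring all-reduce estimate: 2*(N-1) steps, each moving S/N bytes per rank.
# All inputs below are illustrative assumptions.

def ring_allreduce_time_s(message_bytes: float,
                          num_ranks: int,
                          link_bw_GBps: float,
                          per_hop_latency_us: float) -> float:
    steps = 2 * (num_ranks - 1)
    bw_time = steps * (message_bytes / num_ranks) / (link_bw_GBps * 1e9)
    lat_time = steps * per_hop_latency_us * 1e-6
    return bw_time + lat_time

# Example: a 1 GB gradient bucket across 256 ranks on a 25 GB/s (HDR) fabric.
t = ring_allreduce_time_s(1e9, 256, 25.0, 5.0)
print(f"~{t * 1e3:.0f} ms per all-reduce")  # ~82 ms under these assumptions
```

The bandwidth term approaches 2·S/B as rank count grows, while the latency term grows linearly with ranks, which is why both non-blocking bisection bandwidth and per-hop latency show up in training throughput.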

Across clusters

WAN traffic between sites for federated training, data staging, and DR — where deterministic latency and throughput guarantees matter.
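As a rough illustration of why WAN throughput guarantees matter for data staging, a back-of-envelope transfer-time sketch; the checkpoint size, link rate, and efficiency factor are assumptions for illustration only.

```python
# Bulk cross-site transfer estimate; `efficiency` hedges protocol and
# encapsulation overhead on the WAN path.

def transfer_time_hours(data_TB: float, wan_gbps: float,
                        efficiency: float = 0.85) -> float:
    bits = data_TB * 1e12 * 8
    return bits / (wan_gbps * 1e9 * efficiency) / 3600

# Example: staging a 3 TB checkpoint over a 100 Gbps inter-site link.
print(f"~{transfer_time_hours(3, 100) * 60:.0f} min")  # roughly 5 minutes
```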

Reference: A100 server interconnect

What a single DGX-class node looks like under the hood — the baseline we design fabrics around.

Per-server components

  • 2× CPU sockets
  • 8× A100 GPUs
  • 6× NVSwitch chips
  • 4× PCIe Gen4 switch chips
  • 8× InfiniBand compute NICs
  • 2× InfiniBand storage NICs (BlueField-3 DPU)

Link bandwidths

Link                      Bandwidth
NVLink (A100, per GPU)    600 GB/s
NVLink (A800, per GPU)    400 GB/s
GPU ↔ NIC over PCIe       32 GB/s
InfiniBand HDR            200 Gbps (25 GB/s)
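Putting the component counts and the table together, a quick sketch of the resulting bandwidth hierarchy; the figures are taken from the table above, and the point is the ratio rather than the exact numbers.

```python
# Bandwidth hierarchy in the reference node (values from the table above).
NVLINK_A100_GBPS = 600   # per-GPU NVLink aggregate
PCIE_GEN4_GBPS   = 32    # GPU <-> NIC path over PCIe
IB_HDR_GBPS      = 25    # one 200 Gbps HDR port
COMPUTE_NICS     = 8     # one HDR NIC per GPU in the reference design

server_egress = COMPUTE_NICS * IB_HDR_GBPS          # 200 GB/s for the whole chassis
nvlink_vs_fabric = NVLINK_A100_GBPS / IB_HDR_GBPS   # ~24x per GPU

print(f"Per-server fabric egress : {server_egress} GB/s")
print(f"NVLink vs fabric per GPU : {nvlink_vs_fabric:.0f}x")
# The ~24x gap is why collectives are arranged to keep as much traffic as
# possible on NVLink, and why rail-optimized fabric topologies pay off.
```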

InfiniBand vs. RoCE

Both are viable at scale. We pick based on cluster size, budget envelope, and operational maturity.

InfiniBand

  • Closed, vertically integrated architecture
  • A collective-op performance edge of roughly 20% or more over RoCE
  • Higher CapEx — component pricing materially above Ethernet
  • Best fit: small-to-medium training clusters where every percent matters

RoCE

  • Open Ethernet-based ecosystem with broad vendor choice
  • Significantly lower cost per port at scale
  • Fast-moving technology curve — 800G fabrics now shipping
  • Best fit: mid-to-large training clusters with strong network ops teams (see the NCCL transport sketch below)
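As a taste of the operational difference, a hedged sketch of NCCL transport settings for an IB versus RoCE fabric; the HCA name, GID index, and interface prefix are site-specific assumptions and will differ on any real cluster.

```python
import os

# Illustrative transport selection for a job launcher; every value here is a
# placeholder to adapt, not a recommendation for a specific cluster.
FABRIC = "roce"   # or "ib"

env = {
    "NCCL_DEBUG": "WARN",
    "NCCL_IB_HCA": "mlx5",            # restrict NCCL to the RDMA-capable HCAs
}
if FABRIC == "roce":
    env.update({
        "NCCL_IB_GID_INDEX": "3",     # RoCEv2 GID index (site-specific assumption)
        "NCCL_SOCKET_IFNAME": "eth",  # out-of-band bootstrap interface prefix
    })
os.environ.update(env)
# Launch as usual afterwards (torchrun, mpirun, ...); NCCL reads these at init.
```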

What we deliver

End-to-end fabric engagement — from napkin sketch to production-ready cluster.

  • Topology design: fat-tree, dragonfly+, rail-optimized
  • Non-blocking 400G IB and 800G RoCE fabric builds
  • Rack-and-stack, cabling, and optical budget validation
  • Congestion-control and QoS tuning (ECN, DCQCN, adaptive routing)
  • NCCL- and SHARP-aware collective optimization
  • End-to-end benchmarking: all-reduce, all-to-all, MFU targets (estimation sketch below)
  • Cross-site WAN interconnect for federated training
  • Private and public deployments — on your floor or in ours
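For the MFU targets mentioned above, a minimal estimation sketch using the common 6 × params × tokens/sec FLOP approximation for dense transformers; the model size, throughput, and GPU count are illustrative assumptions, and the peak figure is the A100 dense BF16 rating.

```python
# Model FLOPs utilization: achieved FLOPs/s over aggregate peak FLOPs/s.
# 312 TFLOPS is the A100 dense BF16 peak; swap in your GPU's figure.

def mfu(params_billion: float, tokens_per_sec: float, num_gpus: int,
        peak_tflops_per_gpu: float = 312.0) -> float:
    achieved = 6 * params_billion * 1e9 * tokens_per_sec
    peak = num_gpus * peak_tflops_per_gpu * 1e12
    return achieved / peak

# Illustration only: a 70B-parameter model at 400k tokens/s on 1,024 GPUs.
print(f"MFU ~ {mfu(70, 4.0e5, 1024):.0%}")   # ~53% under these assumed numbers
```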

Planning a training cluster?

Share your GPU count, model size, and training targets — we'll come back within two business days with a fabric recommendation, BOM, and deployment timeline.