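GPU Specifications and Comparison

This page compares GPU specifications: memory capacity, memory bandwidth, FLOPS, interconnect topology, and power specifications for accelerators from NVIDIA, AMD, and other vendors.

Model, GPU, framework, and metric names in the charts follow common industry usage.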

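GPU Specifications

Hardware specifications of the GPUs used in the InferenceX™ benchmarks, including compute performance, memory bandwidth, and interconnect details.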

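Compute and memory:

| GPU | Vendor | Memory | Memory bandwidth | FP4 TFLOP/s¹ | FP8 TFLOP/s¹ | BF16 TFLOP/s¹ |
|---|---|---|---|---|---|---|
| H100 SXM | NVIDIA | 80 GB | 3.35 TB/s | - | 1,979 | 989 |
| H200 SXM | NVIDIA | 141 GB | 4.8 TB/s | - | 1,979 | 989 |
| B200 SXM | NVIDIA | 180 GB | 8 TB/s | 9,000 | 4,500 | 2,250 |
| B300 SXM | NVIDIA | 268 GB | 8 TB/s | 13,500 | 4,500 | 2,250 |
| GB200 NVL72 | NVIDIA | 192 GB | 8 TB/s | 10,000 | 5,000 | 2,500 |
| GB300 NVL72 | NVIDIA | 288 GB | 8 TB/s | 15,000 | 5,000 | 2,500 |
| MI300X | AMD | 192 GB | 5.3 TB/s | - | 2,615 | 1,307 |
| MI325X | AMD | 256 GB | 6 TB/s | - | 2,615 | 1,307 |
| MI355X | AMD | 288 GB | 8 TB/s | 10,066 | 5,033 | 2,516 |

Scale-up:

| GPU | Scale-up technology | Scale-up bandwidth | GPUs per domain | Scale-up domain memory | Scale-up domain memory bandwidth | Scale-up topology | Scale-up switch |
|---|---|---|---|---|---|---|---|
| H100 SXM | NVLink 4.0 | 450 GB/s | 8 | 0.64 TB | 26.8 TB/s | Switched 4-rail Optimized | 7.2 Tbit/s NVSwitch Gen 3.0 |
| H200 SXM | NVLink 4.0 | 450 GB/s | 8 | 1.13 TB | 38.4 TB/s | Switched 4-rail Optimized | 7.2 Tbit/s NVSwitch Gen 3.0 |
| B200 SXM | NVLink 5.0 | 900 GB/s | 8 | 1.44 TB | 64 TB/s | Switched 2-rail Optimized | 28.8 Tbit/s NVSwitch Gen 4.0 |
| B300 SXM | NVLink 5.0 | 900 GB/s | 8 | 2.14 TB | 64 TB/s | Switched 2-rail Optimized | 28.8 Tbit/s NVSwitch Gen 4.0 |
| GB200 NVL72 | NVLink 5.0 | 900 GB/s | 72 | 13.82 TB | 576 TB/s | Switched 18-rail Optimized | 28.8 Tbit/s NVSwitch Gen 4.0 |
| GB300 NVL72 | NVLink 5.0 | 900 GB/s | 72 | 20.74 TB | 576 TB/s | Switched 18-rail Optimized | 28.8 Tbit/s NVSwitch Gen 4.0 |
| MI300X | Infinity Fabric | 448 GB/s | 8 | 1.54 TB | 42.4 TB/s | Full Mesh | - |
| MI325X | Infinity Fabric | 448 GB/s | 8 | 2.05 TB | 48 TB/s | Full Mesh | - |
| MI355X | 5th Gen Infinity Fabric | 576 GB/s | 8 | 2.3 TB | 64 TB/s | Full Mesh | - |

Scale-out:

| GPU | Scale-out bandwidth per GPU | Scale-out technology | Scale-out switch | Scale-out topology | NIC |
|---|---|---|---|---|---|
| H100 SXM | 400 Gbit/s | RoCEv2 Ethernet | 25.6T Arista Tomahawk4 7060DX5-64S | 8-rail optimized | ConnectX-7 2x200GbE |
| H200 SXM | 400 Gbit/s | InfiniBand NDR | 25.6T NVIDIA Quantum-2 QM9790 | 8-rail optimized | ConnectX-7 400G |
| B200 SXM | 400 Gbit/s | gIB RoCEv2 Ethernet | 12.8T Whitebox Leaf Tomahawk3 & 25.6T Whitebox Tomahawk4 | 4-rail optimized | ConnectX-7 400GbE |
| B300 SXM | 800 Gbit/s | RoCEv2 Ethernet | 51.2T NVIDIA Spectrum-X SN5600 | 8-rail optimized | ConnectX-8 2x400GbE |
| GB200 NVL72 | N/A² | N/A² | N/A² | N/A² | N/A² |
| GB300 NVL72 | N/A² | N/A² | N/A² | N/A² | N/A² |
| MI300X | 400 Gbit/s | RoCEv2 Ethernet | 51.2T Tomahawk5 | 8-rail optimized | Pollara 400GbE |
| MI325X | 400 Gbit/s | RoCEv2 Ethernet | 51.2T Tomahawk5 | 8-rail optimized | Pollara 400GbE |
| MI355X | 400 Gbit/s | RoCEv2 Ethernet | 51.2T Arista Tomahawk5 DCS-7060X6-64PE | 8-rail optimized | Pollara 400GbE |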

¹ Peak dense Tensor Core TFLOP/s (excluding sparsity acceleration).

² InferenceX™ rack-scale tests do not use scale-out networking.
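The two scale-up domain columns appear to follow directly from the per-GPU figures: each is the per-GPU value multiplied by the number of GPUs in the domain. The sketch below reproduces them under that assumption, using decimal units (1 TB = 1000 GB). The names `SPECS` and `domain_totals` are illustrative only and are not part of InferenceX™.

```python
# Per-GPU figures copied from the table:
# (memory in GB, memory bandwidth in TB/s, GPUs per scale-up domain)
SPECS = {
    "H100 SXM": (80, 3.35, 8),
    "H200 SXM": (141, 4.8, 8),
    "B200 SXM": (180, 8.0, 8),
    "B300 SXM": (268, 8.0, 8),
    "GB200 NVL72": (192, 8.0, 72),
    "GB300 NVL72": (288, 8.0, 72),
    "MI300X": (192, 5.3, 8),
    "MI325X": (256, 6.0, 8),
    "MI355X": (288, 8.0, 8),
}


def domain_totals(name):
    """Return (domain memory in TB, domain memory bandwidth in TB/s)."""
    mem_gb, bw_tbs, gpus = SPECS[name]
    # Assumption: decimal units, so 1 TB = 1000 GB.
    return mem_gb * gpus / 1000, bw_tbs * gpus


for name in SPECS:
    mem_tb, bw_tbs = domain_totals(name)
    print(f"{name:12s} {mem_tb:6.2f} TB  {bw_tbs:6.1f} TB/s")
```

For example, `domain_totals("GB200 NVL72")` multiplies 192 GB and 8 TB/s by 72 GPUs, which lines up with the 13.82 TB and 576 TB/s listed for that system.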

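Scale-Out Topology Diagrams

Scale-out network topology for each server, showing how GPUs connect to NICs and then to leaf switches (GPU → NIC → leaf switch).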

H100 SXM

8-rail optimized · RoCEv2 Ethernet

H200 SXM

8-rail optimized · InfiniBand NDR

B200 SXM

4-rail optimized · gIB RoCEv2 Ethernet

B300 SXM

8-rail optimized · RoCEv2 Ethernet

MI300X

8-rail optimized · RoCEv2 Ethernet

MI325X

8-rail optimized · RoCEv2 Ethernet

MI355X

8-rail optimized · RoCEv2 Ethernet
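The GPU → NIC → leaf switch wiring can be illustrated with a small sketch. It assumes the common rail-optimized convention, in which endpoints with the same rail index on every server attach to the same leaf switch. How a 4-rail layout groups eight GPUs is also an assumption here (GPU index modulo the rail count) and is not taken from the diagrams. The function names `rail_for_gpu` and `build_rails` are hypothetical.

```python
from collections import defaultdict


def rail_for_gpu(gpu_index: int, num_rails: int) -> int:
    """Map a server-local GPU index to a rail (leaf switch) index.

    Assumption: GPUs are spread over rails by index modulo the rail count.
    """
    return gpu_index % num_rails


def build_rails(num_servers: int, gpus_per_server: int, num_rails: int):
    """Group (server, gpu) endpoints by the leaf switch they attach to."""
    rails = defaultdict(list)
    for server in range(num_servers):
        for gpu in range(gpus_per_server):
            rails[rail_for_gpu(gpu, num_rails)].append((server, gpu))
    return dict(rails)


# Two 8-GPU servers in an 8-rail layout: GPU 3 of each server shares leaf switch 3.
rails = build_rails(num_servers=2, gpus_per_server=8, num_rails=8)
print(rails[3])  # [(0, 3), (1, 3)]
```

Under this model, traffic between GPUs with the same rail index on different servers crosses a single leaf switch, which is the usual motivation for rail-optimized designs.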

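Scale-Up Topology Diagrams

Intra-node scale-up interconnect topology, showing whether GPUs connect through NVSwitch (GPU → NVSwitch) or directly to one another.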

H100 SXM

Switched 4-rail Optimized · NVLink 4.0

H200 SXM

Switched 4-rail Optimized · NVLink 4.0

B200 SXM

Switched 2-rail Optimized · NVLink 5.0

B300 SXM

Switched 2-rail Optimized · NVLink 5.0

GB200 NVL72

Switched 18-rail Optimized · NVLink 5.0

GB300 NVL72

Switched 18-rail Optimized · NVLink 5.0

MI300X

Full Mesh · Infinity Fabric

MI325X

Full Mesh · Infinity Fabric

MI355X

Full Mesh · 5th Gen Infinity Fabric
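The difference between the switched and full-mesh layouts can be sketched by counting connections. This is a simplified model, not a description of the actual wiring: it assumes that "N-rail" means N switches and that there is one logical connection per (GPU, switch) pair. The names `full_mesh_links` and `switched_links` are illustrative.

```python
from itertools import combinations


def full_mesh_links(num_gpus: int):
    """Full mesh: every pair of GPUs has its own direct link."""
    return list(combinations(range(num_gpus), 2))


def switched_links(num_gpus: int, num_switches: int):
    """Switched: every GPU connects to every switch, never directly to a peer.

    Assumption: one logical connection per (GPU, switch) pair.
    """
    return [(gpu, f"switch{sw}")
            for gpu in range(num_gpus)
            for sw in range(num_switches)]


mesh = full_mesh_links(8)
print(len(mesh))      # 28 direct GPU-to-GPU links, 7 peers per GPU

switched = switched_links(8, 4)
print(len(switched))  # 32 GPU-to-switch connections
```

In the full-mesh case each GPU's scale-up bandwidth is divided among its 7 peers, whereas in the switched case a GPU can in principle direct its full link bandwidth at any single peer through the switches.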