MiniMax M3 428B · GPU comparison

MiniMax M3 428B — GB200 NVL72 vs MI355X

A head-to-head AI inference benchmark comparison of GB200 NVL72 (NVIDIA Blackwell) and MI355X (AMD CDNA 4) on MiniMax M3 428B, covering latency, throughput, and cost across LLM workloads. Use the chart controls below to switch sequence lengths, precisions, and metrics; the interactions are the same as on the main inference chart.

At a target of 52 tok/s/user on MiniMax M3 428B, GB200 NVL72 produces 1598 tok/s/GPU ($0.38 per million tokens) and MI355X produces 2784 tok/s/GPU ($0.15). MI355X delivers 74% more tok/s/GPU and is about 61% cheaper per token (equivalently, GB200 NVL72 costs about 157% more per token).

At 87 tok/s/user interactivity on MiniMax M3 428B, GB200 NVL72 delivers 638 tok/s/GPU at $1.00 per million tokens; MI355X delivers 1636 tok/s/GPU at $0.25. MI355X delivers 156% more tok/s/GPU at this point and is about 75% cheaper per token (equivalently, GB200 NVL72 costs about 296% more per token).

At 122 tok/s/user on MiniMax M3 428B, GB200 NVL72 posts 211 tok/s/GPU for $2.86 per million tokens; MI355X posts 1048 tok/s/GPU for $0.39. MI355X delivers 397% more tok/s/GPU and is about 86% cheaper per token (equivalently, GB200 NVL72 costs about 633% more per token). (Numbers reflect the default 1k/1k · fp8 selection for this URL; the table and chart below update if you change the sequence, precision, or model in the controls.)
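The relative figures above can be re-derived from the quoted throughput and cost values. The sketch below is a minimal illustration, not the site's own calculation: the names `POINTS`, `pct_more`, `pct_less`, and `compare` are hypothetical, and the inputs are the three-decimal values from the comparison table. It reports the cost gap both ways, as the saving of the cheaper option and as the premium of the pricier one, because the two percentages differ and a saving can never exceed 100%.

```python
# Quoted values for the default 1k/1k fp8 selection.
# Key: target interactivity (tok/s/user); value: (tok/s/GPU, $ per million tokens).
POINTS = {
    52: {"GB200 NVL72": (1598.2, 0.382), "MI355X": (2783.6, 0.149)},
    87: {"GB200 NVL72": (638.3, 0.996), "MI355X": (1636.2, 0.251)},
    122: {"GB200 NVL72": (210.9, 2.862), "MI355X": (1048.2, 0.391)},
}


def pct_more(a: float, b: float) -> float:
    """Percent by which a exceeds b (can be above 100)."""
    return (a / b - 1.0) * 100.0


def pct_less(a: float, b: float) -> float:
    """Percent by which a falls short of b (at most 100 for positive values)."""
    return (1.0 - a / b) * 100.0


def compare(points: dict) -> list:
    """Return one summary row per target interactivity."""
    rows = []
    for target, by_gpu in sorted(points.items()):
        gb_tput, gb_cost = by_gpu["GB200 NVL72"]
        mi_tput, mi_cost = by_gpu["MI355X"]
        rows.append({
            "target": target,
            # How much more throughput per GPU MI355X delivers.
            "mi355x_throughput_gain_pct": pct_more(mi_tput, gb_tput),
            # How much less MI355X costs per token.
            "mi355x_cost_saving_pct": pct_less(mi_cost, gb_cost),
            # How much more GB200 NVL72 costs per token.
            "gb200_cost_premium_pct": pct_more(gb_cost, mi_cost),
        })
    return rows


if __name__ == "__main__":
    for row in compare(POINTS):
        print(row)
```

Because the table values are rounded, the computed premiums can differ by a point or so from percentages computed on unrounded data.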


The table values are interpolated from real benchmark data. Edit the target interactivity values to compare at different operating points.
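The page does not say which interpolation method it uses. As a rough illustration only, the sketch below assumes piecewise-linear interpolation between neighboring points on a throughput-versus-interactivity curve. The function `interpolate` and the curve `MI355X_CURVE` are hypothetical; the three knots are the MI355X throughput values shown for 52, 87, and 122 tok/s/user, standing in for the measured benchmark points, which are not listed here.

```python
from bisect import bisect_right

# Hypothetical curve: (interactivity in tok/s/user, throughput in tok/s/GPU),
# sorted by interactivity. These knots are the table values, used as stand-ins
# for the underlying measured points.
MI355X_CURVE = [(52.0, 2783.6), (87.0, 1636.2), (122.0, 1048.2)]


def interpolate(curve: list, x: float) -> float:
    """Piecewise-linear interpolation of y at x over a curve sorted by x.

    Raises ValueError outside the covered range rather than extrapolating.
    """
    xs = [point[0] for point in curve]
    ys = [point[1] for point in curve]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError(f"{x} is outside the range [{xs[0]}, {xs[-1]}]")
    if x == xs[-1]:
        return ys[-1]
    # Index of the segment whose left endpoint is the last knot <= x.
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


if __name__ == "__main__":
    # Estimated throughput at a target between two knots.
    print(interpolate(MI355X_CURVE, 70.0))
```

Refusing to extrapolate is a deliberate choice in this sketch: a target interactivity outside the benchmarked range has no data on one side to support an estimate.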

Metric	GPU	At 52 tok/s/user	At 87 tok/s/user	At 122 tok/s/user
Throughput (tok/s/GPU)	GB200 NVL72	1598.2	638.3	210.9
Throughput (tok/s/GPU)	MI355X	2783.6	1636.2	1048.2
Cost ($/M tok)	GB200 NVL72	$0.382	$0.996	$2.862
Cost ($/M tok)	MI355X	$0.149	$0.251	$0.391
tok/s/MW	GB200 NVL72	761024	303973	100447
tok/s/MW	MI355X	1050417	617433	395557
Concurrency	GB200 NVL72	~180	~74	~18
Concurrency	MI355X	~112	~41	~19
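The rows of the table are related by simple unit conversions, which makes it possible to back out the per-GPU power and hourly price that the figures imply. The sketch below is an inference from the table, not a statement of the site's inputs. It assumes that tok/s/MW is per-GPU throughput divided by a fixed power draw per GPU, and that $/M tok is an hourly GPU price divided by tokens produced per hour. The names `TABLE`, `implied_kw_per_gpu`, and `implied_usd_per_gpu_hour` are hypothetical.

```python
# Table values for the default 1k/1k fp8 selection, one tuple per operating
# point (52, 87, 122 tok/s/user): (tok/s/GPU, $ per million tokens, tok/s/MW).
TABLE = {
    "GB200 NVL72": [
        (1598.2, 0.382, 761024),
        (638.3, 0.996, 303973),
        (210.9, 2.862, 100447),
    ],
    "MI355X": [
        (2783.6, 0.149, 1050417),
        (1636.2, 0.251, 617433),
        (1048.2, 0.391, 395557),
    ],
}


def implied_kw_per_gpu(tok_s_gpu: float, tok_s_mw: float) -> float:
    """Power per GPU in kW, assuming tok/s/MW = tok/s/GPU * (GPUs per MW)."""
    gpus_per_mw = tok_s_mw / tok_s_gpu
    return 1000.0 / gpus_per_mw


def implied_usd_per_gpu_hour(tok_s_gpu: float, usd_per_mtok: float) -> float:
    """Hourly price per GPU, assuming $/M tok = hourly price / (M tok per hour)."""
    mtok_per_hour = tok_s_gpu * 3600.0 / 1e6
    return usd_per_mtok * mtok_per_hour


if __name__ == "__main__":
    for gpu, points in TABLE.items():
        for tput, cost, per_mw in points:
            kw = implied_kw_per_gpu(tput, per_mw)
            usd = implied_usd_per_gpu_hour(tput, cost)
            print(f"{gpu}: {kw:.2f} kW/GPU, ${usd:.2f}/GPU-hour")
```

On these numbers the implied power is about 2.1 kW per GPU for GB200 NVL72 and about 2.65 kW per GPU for MI355X at all three operating points. The implied hourly price is steadier for MI355X (about $1.48 to $1.49) than for GB200 NVL72 (about $2.17 to $2.29), which is consistent with cost and throughput being interpolated and rounded separately.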

Inference Performance

Inference performance metrics across different models, hardware configurations, and serving parameters.

Chart controls: Model, ISL / OSL, Precision, Y-Axis Metric, GPU Config. Quick Filters: Vendor, Aggregation, Spec Decoding.