MiniMax M3 428B — MI325X vs MI355X

Head-to-head AI inference benchmark comparison of MI325X (AMD CDNA 3) and MI355X (AMD CDNA 4) on MiniMax M3 428B, covering latency, throughput, and cost across LLM workloads.

At 39 tok/s/user on MiniMax M3 428B, MI325X posts 787 tok/s/GPU at $0.45 per million tokens; MI355X posts 1574 tok/s/GPU at $0.26. MI355X is 43% cheaper per token (equivalently, MI325X costs 75% more) and delivers 100% more tok/s/GPU.

At 69 tok/s/user on MiniMax M3 428B, MI325X hits 351 tok/s/GPU and MI355X hits 510. Per-million-token costs land at $1.02 and $0.81 respectively. MI355X is 21% cheaper per token (equivalently, MI325X costs 27% more) and delivers 45% more tok/s/GPU.

At 100 tok/s/user on MiniMax M3 428B the ranking flips: MI325X / MI355X post 194 / 183 tok/s/GPU at $1.83 / $2.24 per million tokens. MI325X is 18% cheaper per token (equivalently, MI355X costs 22% more) and delivers 6% more tok/s/GPU. (All figures reflect the default 1k/1k sequence lengths at fp8 precision.)
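A price gap can be quoted two ways, and the two percentages differ: "X% cheaper" is measured against the more expensive GPU, while "costs X% more" is measured against the cheaper one. The sketch below recomputes both from the unrounded table values on this page. The helper names `pct_more` and `pct_less` are illustrative and not part of any benchmark tooling.

```python
def pct_more(a: float, b: float) -> float:
    """Percent by which a exceeds b, measured relative to b."""
    return (a / b - 1.0) * 100.0


def pct_less(a: float, b: float) -> float:
    """Percent by which a falls below b, measured relative to b."""
    return (1.0 - a / b) * 100.0


# (throughput in tok/s/GPU, cost in $/M tok), keyed by interactivity in tok/s/user.
# Values are copied from the comparison table on this page.
TABLE = {
    39: {"MI325X": (786.6, 0.452), "MI355X": (1574.0, 0.258)},
    69: {"MI325X": (351.1, 1.021), "MI355X": (509.8, 0.806)},
    100: {"MI325X": (194.3, 1.832), "MI355X": (182.9, 2.239)},
}


def summarize(interactivity: int) -> dict:
    """Name the cheaper and the faster GPU at one operating point, with gaps in percent."""
    row = TABLE[interactivity]
    cheap, costly = sorted(row, key=lambda gpu: row[gpu][1])
    slow, fast = sorted(row, key=lambda gpu: row[gpu][0])
    return {
        "cheaper_gpu": cheap,
        "pct_cheaper": pct_less(row[cheap][1], row[costly][1]),
        "pct_costlier": pct_more(row[costly][1], row[cheap][1]),
        "faster_gpu": fast,
        "pct_more_throughput": pct_more(row[fast][0], row[slow][0]),
    }


if __name__ == "__main__":
    for level in TABLE:
        print(level, summarize(level))
```

Because the two percentages use different denominators, a gap quoted as "costs 75% more" corresponds to only about "43% cheaper"; mixing the two overstates the saving.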

Interpolated from real benchmark data at three target interactivity levels.

Metric                   GPU      39 tok/s/user   69 tok/s/user   100 tok/s/user
Throughput (tok/s/GPU)   MI325X   786.6           351.1           194.3
                         MI355X   1574.0          509.8           182.9
Cost ($/M tok)           MI325X   $0.452          $1.021          $1.832
                         MI355X   $0.258          $0.806          $2.239
tok/s/MW                 MI325X   360812          161059          89139
                         MI355X   593952          192372          69008
Concurrency              MI325X   ~87             ~22             ~8
                         MI355X   ~86             ~33             ~8
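The page says its figures are interpolated from benchmark data but does not say how. As a rough illustration only, the sketch below assumes piecewise-linear interpolation between measured (interactivity, throughput) points; the site may use a different scheme. The function name `interpolate` is hypothetical, and the knots are the MI325X operating points from the table above, reused as stand-ins because the raw measurements are not shown on this page.

```python
from bisect import bisect_left


def interpolate(points: list[tuple[float, float]], x: float) -> float:
    """Piecewise-linear interpolation over (x, y) pairs sorted by ascending x.

    Raises ValueError outside the measured range instead of extrapolating.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("target is outside the measured range")
    i = bisect_left(xs, x)  # index of the first knot with xs[i] >= x
    if xs[i] == x:
        return ys[i]  # exact hit on a knot
    # x lies strictly between knots i-1 and i; blend their y values.
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


# Stand-in knots: (tok/s/user, tok/s/GPU) for MI325X, taken from the table above.
MI325X_THROUGHPUT = [(39.0, 786.6), (69.0, 351.1), (100.0, 194.3)]

if __name__ == "__main__":
    # Estimate throughput at an operating point the table does not list.
    print(interpolate(MI325X_THROUGHPUT, 54.0))
```

Throughput-versus-interactivity curves are typically not straight lines, so a linear estimate between widely spaced knots is only a rough approximation. Refusing to extrapolate keeps the estimate inside the range that was actually measured.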

Inference Performance

Inference performance metrics across different models, hardware configurations, and serving parameters.