GPU Precision Comparisons

34 head-to-head precision comparisons across DeepSeek V4 Pro 1.6T, DeepSeek R1, Kimi K2.5/K2.6/K2.7-Code 1T, GLM 5/5.1, MiniMax M3 428B, MiniMax M2.5/M2.7, Qwen 3.5 397B-A17B, and Llama 3.3 70B. See how precision formats such as FP4, FP8, BF16, and INT4 affect throughput, cost, and interactivity on the same GPU. Each page shows the inference chart and an interpolated comparison table.

DeepSeek V4 Pro 1.6T

1 precision comparison with benchmark data on DeepSeek V4 Pro 1.6T.

DeepSeek R1

5 precision comparisons with benchmark data on DeepSeek R1.

GB200 NVL72

NVIDIA · Blackwell

GB300 NVL72

NVIDIA · Blackwell

Kimi K2.5/K2.6/K2.7-Code 1T

3 precision comparisons with benchmark data on Kimi K2.5/K2.6/K2.7-Code 1T.

GLM 5/5.1

4 precision comparisons with benchmark data on GLM 5/5.1.

GB300 NVL72

NVIDIA · Blackwell

MiniMax M3 428B

3 precision comparisons with benchmark data on MiniMax M3 428B.

MiniMax M2.5/M2.7

5 precision comparisons with benchmark data on MiniMax M2.5/M2.7.

GB200 NVL72

NVIDIA · Blackwell

GB300 NVL72

NVIDIA · Blackwell

Llama 3.3 70B

2 precision comparisons with benchmark data on Llama 3.3 70B.