DeepSeek V4 Pro 1.6T · GPU comparison

DeepSeek V4 Pro 1.6T — B300 vs GB200 NVL72

Head-to-head AI inference benchmark comparison of B300 (NVIDIA Blackwell) and GB200 NVL72 (NVIDIA Blackwell) on DeepSeek V4 Pro 1.6T, covering latency, throughput, and cost across LLM workloads. Use the chart controls below to switch sequences, precisions, and metrics; they work the same way as on the main inference chart.

Near the low end of the 6–152 tok/s/user interactivity band, at 43 tok/s/user on DeepSeek V4 Pro 1.6T, B300 runs 2131 tok/s/GPU at $0.30/M tokens and GB200 NVL72 runs 2401 tok/s/GPU at $0.26/M tokens. GB200 NVL72 is about 16% cheaper per token (equivalently, B300 costs 19% more) and delivers 13% more tok/s/GPU.

With 79 tok/s/user as the target on DeepSeek V4 Pro 1.6T, B300 produces 1272 tok/s/GPU ($0.51 per million tokens) and GB200 NVL72 produces 583 tok/s/GPU ($1.05 per million tokens). GB200 NVL72 costs 106% more per token, which makes B300 about 51% cheaper; B300 also delivers 118% more tok/s/GPU.

At 116 tok/s/user interactivity on DeepSeek V4 Pro 1.6T, B300 delivers 656 tok/s/GPU at $1.00 per million tokens; GB200 NVL72 delivers 195 tok/s/GPU at $3.37. GB200 NVL72 costs 237% more per token, which makes B300 about 70% cheaper; B300 also delivers 236% more tok/s/GPU at this point. (Numbers reflect the default 8k/1k · fp4 selection for this URL; the table and chart below update if you change sequence, precision, or model in the controls.)
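A relative cost gap can be stated two ways: how much more the pricier option costs, or how much cheaper the less expensive one is. The two percentages differ, and only the first can exceed 100%. The following is a minimal sketch of that arithmetic using the unrounded table values for this page; the function names `pct_more` and `pct_less` are illustrative, not part of any benchmark tooling.

```python
def pct_more(a: float, b: float) -> float:
    """Percent by which a exceeds b (can be above 100)."""
    return (a / b - 1.0) * 100.0


def pct_less(a: float, b: float) -> float:
    """Percent by which a falls below b (always under 100 for positive values)."""
    return (1.0 - a / b) * 100.0


# (tok/s/user, B300 tok/s/GPU, GB200 tok/s/GPU, B300 $/M tok, GB200 $/M tok)
POINTS = [
    (43, 2130.6, 2401.3, 0.305, 0.257),
    (79, 1272.0, 582.9, 0.511, 1.051),
    (116, 655.8, 195.4, 0.999, 3.368),
]

for target, b300_tput, gb200_tput, b300_cost, gb200_cost in POINTS:
    # Order each pair so "hi" is the larger value and "lo" the smaller.
    hi_cost, lo_cost = max(b300_cost, gb200_cost), min(b300_cost, gb200_cost)
    hi_tput, lo_tput = max(b300_tput, gb200_tput), min(b300_tput, gb200_tput)
    print(
        f"{target} tok/s/user: "
        f"pricier option costs {pct_more(hi_cost, lo_cost):.0f}% more, "
        f"cheaper option is {pct_less(lo_cost, hi_cost):.0f}% cheaper, "
        f"faster option delivers {pct_more(hi_tput, lo_tput):.0f}% more tok/s/GPU"
    )
```

Stating the gap as "costs X% more" keeps large ratios meaningful, since a price cannot be more than 100% cheaper than another positive price.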


Interpolated from real benchmark data. Edit target interactivity values below to compare at different operating points.
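The page does not say which interpolation method it uses. As a rough illustration only, the sketch below assumes plain linear interpolation between measured (interactivity, throughput) points and refuses to extrapolate outside the measured range. The `interpolate` helper and the sample points are hypothetical placeholders, not real benchmark measurements.

```python
from bisect import bisect_left


def interpolate(target: float, xs: list[float], ys: list[float]) -> float:
    """Linearly interpolate y at `target`, given xs sorted in ascending order."""
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")
    if target < xs[0] or target > xs[-1]:
        raise ValueError("target lies outside the measured range")
    i = bisect_left(xs, target)
    if xs[i] == target:
        return ys[i]  # exact hit on a measured point
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (target - x0) / (x1 - x0)


# Placeholder points (NOT real data): tok/s/user -> tok/s/GPU
measured_interactivity = [10.0, 50.0, 100.0]
measured_throughput = [3000.0, 2000.0, 500.0]

print(interpolate(30.0, measured_interactivity, measured_throughput))
print(interpolate(75.0, measured_interactivity, measured_throughput))
```

Rejecting out-of-range targets is a deliberate choice here: it keeps every reported value bracketed by two actual measurements.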
Metric                   GPU            43 tok/s/user   79 tok/s/user   116 tok/s/user
Throughput (tok/s/GPU)   B300           2130.6          1272.0          655.8
                         GB200 NVL72    2401.3          582.9           195.4
Cost ($/M tok)           B300           $0.305          $0.511          $0.999
                         GB200 NVL72    $0.257          $1.051          $3.368
tok/s/MW                 B300           981833          586191          302216
                         GB200 NVL72    1143485         277571          93050
Concurrency              B300           ~28             ~8              ~4
                         GB200 NVL72    ~125            ~47             ~7
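The tok/s/MW row appears to be the per-GPU throughput scaled by a fixed power figure per GPU. The sketch below back-computes that implied figure from the table values. This is an inference from the numbers, not a power specification stated on the page, and the helper name `implied_watts_per_gpu` is hypothetical.

```python
def implied_watts_per_gpu(tok_s_per_gpu: float, tok_s_per_mw: float) -> float:
    """Watts per GPU implied by the ratio of tok/s/GPU to tok/s/MW."""
    megawatts_per_gpu = tok_s_per_gpu / tok_s_per_mw
    return megawatts_per_gpu * 1_000_000.0


# (tok/s/GPU, tok/s/MW) pairs at 43, 79, and 116 tok/s/user
TABLE = {
    "B300": [(2130.6, 981833), (1272.0, 586191), (655.8, 302216)],
    "GB200 NVL72": [(2401.3, 1143485), (582.9, 277571), (195.4, 93050)],
}

for gpu, rows in TABLE.items():
    watts = [implied_watts_per_gpu(tput, per_mw) for tput, per_mw in rows]
    print(gpu, [round(w) for w in watts])
```

If the implied wattage comes out the same at all three operating points for a given GPU, that supports reading tok/s/MW as a constant multiple of tok/s/GPU rather than an independently measured quantity.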

Inference Performance

Inference performance metrics across different models, hardware configurations, and serving parameters.