GPU comparison
B300 vs GB200 NVL72
Head-to-head AI inference benchmark comparison of B300 (NVIDIA Blackwell) and GB200 NVL72 (NVIDIA Blackwell), covering latency, throughput, and cost across LLM workloads. Use the chart controls below to switch models, sequences, precisions, and metrics; the interactions are the same as in the main inference chart.
Values are interpolated from real benchmark data. Edit the target interactivity values below to compare at different operating points.
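The page does not say how the interpolation is done. As a minimal sketch, assuming plain linear interpolation between neighbouring measured points, an estimate at a target interactivity could be computed as below. The function name `interpolate_at` and the sample points are hypothetical placeholders, not benchmark data.

```python
from bisect import bisect_left


def interpolate_at(points, target):
    """Linearly interpolate y at x == target.

    `points` is a list of (x, y) pairs sorted by ascending x, e.g.
    (interactivity in tok/s/user, throughput in tok/s/gpu).
    Assumption: linear interpolation; the page's actual method is not stated.
    """
    xs = [x for x, _ in points]
    if not xs[0] <= target <= xs[-1]:
        # Refuse to extrapolate beyond the measured range.
        raise ValueError("target outside measured range")
    i = bisect_left(xs, target)
    if xs[i] == target:
        return points[i][1]  # exact match with a measured point
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (target - x0) / (x1 - x0)


# Hypothetical measurements, for illustration only (NOT real benchmark data).
measured = [(20.0, 8000.0), (50.0, 4000.0), (100.0, 1000.0)]
estimate = interpolate_at(measured, 75.0)
print(estimate)
```

Raising an error outside the measured range is a deliberate choice in this sketch: extrapolated throughput figures would not be backed by any benchmark run.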
| Metric | GPU | Interactivity (tok/s/user) | Interactivity (tok/s/user) | Interactivity (tok/s/user) |
|---|---|---|---|---|
| Throughput (tok/s/gpu) | B300 | 8611.9 | 1855.1 | 1064.2 |
| Throughput (tok/s/gpu) | GB200 NVL72 | 11779.6 | 4506.3 | 1028.9 |
| Cost ($/M tok) | B300 | $0.075 | $0.351 | $0.604 |
| Cost ($/M tok) | GB200 NVL72 | $0.052 | $0.135 | $0.601 |
| tok/s/MW | B300 | 3968598 | 854865 | 490420 |
| tok/s/MW | GB200 NVL72 | 5609342 | 2145869 | 489935 |
| Concurrency | B300 | ~222 | ~46 | ~34 |
| Concurrency | GB200 NVL72 | ~943 | ~203 | ~19 |
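To make the table easier to read, the short sketch below computes the GB200 NVL72 to B300 ratio for each metric at each of the three operating points. The numbers are copied from the table; the dictionary keys and the helper name `gb200_over_b300` are illustrative choices, not part of the source.

```python
# Values copied from the comparison table; each list holds the three
# operating points (columns) in left-to-right order.
TABLE = {
    "throughput_tok_s_gpu": {
        "B300": [8611.9, 1855.1, 1064.2],
        "GB200 NVL72": [11779.6, 4506.3, 1028.9],
    },
    "cost_usd_per_m_tok": {
        "B300": [0.075, 0.351, 0.604],
        "GB200 NVL72": [0.052, 0.135, 0.601],
    },
    "tok_s_per_mw": {
        "B300": [3968598, 854865, 490420],
        "GB200 NVL72": [5609342, 2145869, 489935],
    },
}


def gb200_over_b300(metric):
    """Return the GB200 NVL72 / B300 ratio at each operating point."""
    rows = TABLE[metric]
    return [g / b for g, b in zip(rows["GB200 NVL72"], rows["B300"])]


ratios = {metric: gb200_over_b300(metric) for metric in TABLE}
for metric, values in ratios.items():
    print(metric, [round(v, 2) for v in values])
```

For throughput and tok/s/MW a ratio above 1 favours GB200 NVL72, while for cost a ratio below 1 does. On these figures GB200 NVL72 leads clearly at the first two operating points (throughput ratios of about 1.37 and 2.43), and the two systems are nearly level at the third, where B300 has slightly higher per-GPU throughput.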
Inference Performance
Inference performance metrics across different models, hardware configurations, and serving parameters.