Open Source Continuous Inference Benchmark Trusted by GigaWatt Token Factories
Full Dashboard
Every model, GPU, framework, and metric. Fully configurable inference benchmark charts with date ranges, concurrency sweeps, and raw data export.
Compare NVIDIA B200, H200, H100, AMD MI355X, MI325X, MI300X, and more across DeepSeek, gpt-oss, Llama, Qwen, and other models.
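As a sketch of what the raw data export enables, the snippet below reduces exported benchmark rows to the best observed per-GPU throughput for each accelerator. The column names (`gpu`, `model`, `tokens_per_sec_per_gpu`) and the inline sample rows are hypothetical stand-ins, not real benchmark results; adjust them to match the dashboard's actual export schema.

```python
# Sketch: summarizing an exported benchmark CSV by GPU type.
# Column names and sample numbers are hypothetical, for illustration only.
import csv
import io

# Stand-in for a raw-data export from the dashboard (values are made up).
SAMPLE_EXPORT = """gpu,model,tokens_per_sec_per_gpu
B200,DeepSeek R1,9200
H200,DeepSeek R1,4100
B200,DeepSeek R1,9350
"""

def best_throughput_per_gpu(raw_csv: str) -> dict[str, float]:
    """Return the best observed per-GPU throughput for each GPU type."""
    best: dict[str, float] = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        gpu = row["gpu"]
        tps = float(row["tokens_per_sec_per_gpu"])
        best[gpu] = max(best.get(gpu, 0.0), tps)
    return best

print(best_throughput_per_gpu(SAMPLE_EXPORT))
```

The same reduction extends naturally to grouping by model, framework, or precision once the real export columns are known.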
Quick Comparisons
Jump straight into the most popular GPU inference benchmark comparisons, curated and ready to explore.
GB200 NVL72 vs B200 — Multi-Node vs Single-Node
GB200 NVL72 Dynamo TRT vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP4.
B200 vs H200 — Blackwell vs Hopper
Blackwell B200 vs Hopper H200 Dynamo TRT throughput per GPU on DeepSeek R1 (8k/1k) at FP8.
AMD MI300X → MI325X → MI355X
Three generations of AMD Instinct GPUs on SGLang at FP8, showing generational throughput scaling on DeepSeek R1 (8k/1k).
H100 vs GB300 Disagg — DeepSeek
H100 FP8 disagg vs GB300 FP8 disagg vs GB300 FP4 disagg on DeepSeek R1 (8k/1k).
Disagg B200 SGLang vs MI355X vs B200 TRT
Disaggregated B200 Dynamo SGLang vs MI355X MoRI SGLang vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP8.
MI355X SGLang Disagg Over Time — DeepSeek (FP8)
MI355X SGLang disaggregated inference on DeepSeek R1 (8k/1k) FP8. Tracks throughput improvements over time.