Open Source Continuous Inference Benchmark Trusted by GigaWatt Token Factories
Full Dashboard
Every model, GPU, framework, and metric. Fully configurable inference benchmark charts with date ranges, concurrency sweeps, and raw data export.
Compare NVIDIA B200, H200, H100, AMD MI355X, MI325X, MI300X, and more across DeepSeek, gpt-oss, Llama, Qwen, and other models.
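As a sketch of what the raw data export enables, the snippet below reduces exported benchmark rows to the best observed per-GPU throughput for each accelerator. The column names (`gpu`, `model`, `tokens_per_sec_per_gpu`) and the inline sample rows are hypothetical stand-ins, not real benchmark results; adjust them to match the dashboard's actual export schema.

```python
# Sketch: summarizing an exported benchmark CSV by GPU type.
# Column names and sample numbers are hypothetical, for illustration only.
import csv
import io

# Stand-in for a raw-data export from the dashboard (values are made up).
SAMPLE_EXPORT = """gpu,model,tokens_per_sec_per_gpu
B200,DeepSeek R1,9200
H200,DeepSeek R1,4100
B200,DeepSeek R1,9350
"""

def best_throughput_per_gpu(raw_csv: str) -> dict[str, float]:
    """Return the best observed per-GPU throughput for each GPU type."""
    best: dict[str, float] = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        gpu = row["gpu"]
        tps = float(row["tokens_per_sec_per_gpu"])
        best[gpu] = max(best.get(gpu, 0.0), tps)
    return best

print(best_throughput_per_gpu(SAMPLE_EXPORT))
```

The same reduction extends naturally to grouping by model, framework, or precision once the real export columns are known.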
Quick Comparisons
Jump straight into the most popular GPU inference benchmark comparisons, curated and ready to explore.
GB200 NVL72 vs B200 — Multi-Node vs Single-Node
GB200 NVL72 Dynamo TRT vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP4.
B200 vs H200 — Blackwell vs Hopper
Blackwell B200 vs Hopper H200 Dynamo TRT throughput per GPU on DeepSeek R1 (8k/1k) at FP8.
AMD MI300X → MI325X → MI355X
Three generations of AMD Instinct GPUs on SGLang at FP8, showing generational throughput scaling on DeepSeek R1 (8k/1k).
H100 vs GB300 Disagg — DeepSeek
H100 FP8 disagg vs GB300 FP8 disagg vs GB300 FP4 disagg on DeepSeek R1 (8k/1k).
Disagg B200 SGLang vs MI355X vs B200 TRT
Disaggregated B200 Dynamo SGLang vs MI355X MoRI SGLang vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP8.
MI355X SGLang Disagg Over Time — DeepSeek (FP8)
MI355X SGLang disaggregated inference on DeepSeek R1 (8k/1k) FP8. Tracks throughput improvements over time.