
Open Source Continuous Inference Benchmark Trusted by GigaWatt Token Factories

Full Dashboard

Every model, GPU, framework, and metric. Fully configurable inference benchmark charts with date ranges, concurrency sweeps, and raw data export.

Compare NVIDIA B200, H200, H100, AMD MI355X, MI325X, MI300X and more across DeepSeek, gpt-oss, Llama, Qwen, and other models.

Open Dashboard
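As a sketch of what working with the dashboard's raw data export might look like: the snippet below filters flat benchmark records by model and framework and picks the best-performing GPU. The field names (`model`, `gpu`, `framework`, `tput_per_gpu`) and the values are illustrative assumptions, not the dashboard's actual export schema or real benchmark results.

```python
# Hypothetical raw-data-export rows; field names and numbers are
# placeholders for illustration, not InferenceMAX's actual schema.
records = [
    {"model": "DeepSeek R1", "gpu": "B200", "framework": "TRT", "tput_per_gpu": 1.0},
    {"model": "DeepSeek R1", "gpu": "H200", "framework": "TRT", "tput_per_gpu": 0.5},
    {"model": "Llama", "gpu": "MI355X", "framework": "SGLang", "tput_per_gpu": 0.8},
]

def filter_rows(rows, **criteria):
    """Keep rows whose fields match every key=value criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]

# Slice one comparison out of the export, then rank by per-GPU throughput.
deepseek_trt = filter_rows(records, model="DeepSeek R1", framework="TRT")
best = max(deepseek_trt, key=lambda r: r["tput_per_gpu"])
print(best["gpu"])
```

The same filter-then-rank pattern applies to any of the sweeps below (by precision, framework, or date range).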

Quick Comparisons

Jump straight into the most popular GPU inference benchmark comparisons, curated and ready to explore.

GB200 NVL72 vs B200 — Multi vs Single Node

GB200 NVL72 Dynamo TRT vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP4.

DeepSeek · GB200 · B200 · Dynamo · FP4 · NVL72
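Comparing a 72-GPU rack against a single node only makes sense on a per-GPU axis. A minimal sketch of that normalization, with made-up placeholder throughput figures (not benchmark results):

```python
def per_gpu_throughput(total_tokens_per_s: float, num_gpus: int) -> float:
    """Normalize aggregate throughput by GPU count so systems of
    different scale (e.g. a 72-GPU NVL72 rack vs an 8-GPU node)
    land on the same axis."""
    return total_tokens_per_s / num_gpus

# Placeholder aggregates, chosen only to show the normalization.
rack = per_gpu_throughput(72_000.0, 72)  # GB200 NVL72: 72 GPUs
node = per_gpu_throughput(8_800.0, 8)    # single B200 node: 8 GPUs
print(rack, node)
```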

B200 vs H200 — Blackwell vs Hopper

Blackwell B200 vs Hopper H200 Dynamo TRT throughput per GPU on DeepSeek R1 (8k/1k) at FP8.

DeepSeek · B200 · H200 · Dynamo · FP8

AMD MI300X → MI325X → MI355X

Three generations of AMD Instinct on SGLang at FP8. Generational throughput scaling on DeepSeek R1 (8k/1k).

DeepSeek · MI300X · MI325X · MI355X · SGLang · FP8

H100 vs GB300 Disagg — DeepSeek

H100 FP8 disagg vs GB300 FP8 disagg vs GB300 FP4 disagg on DeepSeek R1 (8k/1k).

DeepSeek · H100 · GB300 · Disagg · FP8 · FP4

Disagg B200 SGLang vs MI355X vs B200 TRT

Disaggregated B200 Dynamo SGLang vs MI355X MoRI SGLang vs B200 Dynamo TRT on DeepSeek R1 (8k/1k) at FP8.

DeepSeek · B200 · MI355X · Dynamo · MoRI · FP8 · Disagg

MI355X SGLang Disagg Over Time — DeepSeek (FP8)

MI355X SGLang disaggregated inference on DeepSeek R1 (8k/1k) FP8. Tracks throughput improvements over time.

DeepSeek · MI355X · SGLang · FP8 · Disagg · Timeline

Continuous open-source inference benchmarking. Real-world, reproducible, auditable performance data trusted by trillion-dollar AI infrastructure operators such as OpenAI, Oracle, and Microsoft.

SemiAnalysis: Main Site · Newsletter · About
Legal: Privacy Policy · Cookie Policy
Contribute: Benchmarks · Frontend

If this data helps your work, consider starring us on GitHub or sharing with your network.

© 2026 semianalysis.com. All rights reserved.