InferenceX

(formerly InferenceMAX)

By SemiAnalysis

Articles

Insights on AI inference benchmarking, GPU performance, and ML infrastructure.

February 16, 2026 · 45 min read

InferenceX v2: NVIDIA Blackwell vs AMD vs Hopper - Formerly InferenceMAX

GB300 NVL72, MI355X, B200, H100, Disaggregated Serving, Wide Expert Parallelism, Large Mixture of Experts, SGLang, vLLM, TRTLLM

benchmark · gpu · inference · announcement
October 9, 2025 · 37 min read

InferenceMAX: Open Source Inference Benchmarking

NVIDIA GB200 NVL72, AMD MI355X, Throughput in Tokens per GPU, Latency in Tok/s/user, Perf per Dollar, Cost per Million Tokens, Tokens per Provisioned Megawatt, DeepSeek R1 670B, GPT-OSS 120B, Llama 3 70B

benchmark · gpu · inference · announcement
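The throughput and cost metrics listed above are related by a simple conversion. A minimal sketch of that relationship (the GPU hourly price and throughput below are placeholder numbers for illustration, not benchmark results):

```python
# Sketch: deriving a "Cost per Million Tokens" figure from per-GPU
# throughput and an hourly GPU rental price. All numbers here are
# illustrative placeholders, not InferenceMAX/InferenceX data.

def cost_per_million_tokens(gpu_cost_per_hour: float,
                            tokens_per_second_per_gpu: float) -> float:
    """Dollars to generate one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second_per_gpu * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000

# Example with placeholder numbers: a $2.00/hr GPU sustaining
# 1000 tok/s costs about $0.56 per million tokens.
print(round(cost_per_million_tokens(2.00, 1000.0), 2))
```

Because throughput per GPU varies with the latency (tok/s/user) target, a benchmark reports this cost across the whole latency-throughput frontier rather than as a single number.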

Continuous open-source inference benchmarking. Real-world, reproducible, auditable performance data trusted by trillion-dollar AI infrastructure operators such as OpenAI, Oracle, and Microsoft.


If this data helps your work, consider starring us on GitHub or sharing with your network.

© 2026 semianalysis.com. All rights reserved.