GPU Speculative Decoding Comparisons

49 speculative decoding comparisons across DeepSeek V4 Pro 1.6T, DeepSeek R1, GLM 5/5.1, MiniMax M3 428B, and Qwen 3.5 397B-A17B. Each page compares inference with a speculative decoding method (MTP, EAGLE, etc.) enabled versus disabled on the same model and GPU, reporting throughput, cost, and interactivity at matched operating points.