Accuracy Evals

Benchmark results showing the trade-off between model quality and throughput across GPUs, quantization levels, and inference configurations.