GeistHaus
log in · sign up

InferenceMAX™: Open Source Inference Benchmarking

newsletter.semianalysis.com

NVIDIA GB200 NVL72, AMD MI355X, Throughput Token per GPU, Latency Tok/s/user, Perf per Dollar, Tokens per Provisioned Megawatt, DeepSeek R1 670B, GPTOSS 120B, Llama3 70B

3 pages link to this URL