MLPerf Inference tests see the new Azure ND GB300 v6 VMs achieve token performance that ‘fundamentally alters the calculus of ...
The result represents a 27% improvement over the previous Azure ND GB200 v6 benchmark of 865,000 tokens per second. Each ...
Microsoft sets AI inference speed record with Azure ND GB300 v6 VMs, achieving 1.1M tokens/sec using NVIDIA GB300 GPUs.
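As a quick sanity check on the quoted figures, the relative gain can be recomputed from the two throughput numbers. This is a minimal sketch: the function and variable names are illustrative, and the 1.1M value is the rounded figure from the text, so the result is only approximate.

```python
# Throughput figures as quoted in the text (tokens per second).
# The 1.1M value is rounded, so the computed gain is approximate.
previous_tokens_per_sec = 865_000    # Azure ND GB200 v6 benchmark
current_tokens_per_sec = 1_100_000   # Azure ND GB300 v6 result (rounded)


def percent_improvement(old: float, new: float) -> float:
    """Return the relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


gain = percent_improvement(previous_tokens_per_sec, current_tokens_per_sec)
print(f"Improvement: {gain:.1f}%")
```

With the rounded inputs this comes out at roughly 27%, in line with the reported improvement.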
An industry record made possible by our longstanding co-innovation with NVIDIA and expertise in running AI at production ...