NVIDIA Blackwell Tops MLPerf Training 6.0 Across Every AI Benchmark

June 16, 2026

NVIDIA said its Blackwell platform led every category in MLPerf Training 6.0, the latest peer-reviewed industry benchmark for AI training performance, with the fastest time to train across all seven tests and the largest Blackwell-based training run submitted to the benchmark so far.

The results focused on NVIDIA’s GB200 NVL72 and GB300 NVL72 rack-scale systems, which are designed to support large AI training workloads by connecting 72 GPUs into a unified pool of compute and memory through fifth-generation NVIDIA NVLink Switches. The company said that architecture helped its systems handle increasingly demanding workloads, including new mixture-of-experts pretraining benchmarks added in this round.

MLPerf Training 6.0 introduced DeepSeek-V3 671B and GPT-OSS-20B workloads, reflecting the growing use of mixture-of-experts models. NVIDIA said its platform was the only one submitted across all seven benchmarks and produced the fastest training time in each category.

NVIDIA also reported that GB300 NVL72 delivered up to 1.6 times faster training than GB200 NVL72 at the same scale. The company attributed the improvement to Blackwell Ultra capabilities including higher compute density with NVFP4, expanded memory and a higher power ceiling that allows the GPU to sustain peak performance.
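To put the "up to 1.6 times faster" figure in terms of wall-clock time, the short sketch below converts a speedup factor into a training time and a percentage of time saved. The function names and the 10-minute baseline are hypothetical and chosen only for illustration. They are not taken from the MLPerf results.

```python
def time_after_speedup(baseline_minutes: float, speedup: float) -> float:
    """Return the training time after applying a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


def percent_time_saved(speedup: float) -> float:
    """Return the share of wall-clock time removed by a speedup, in percent."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


if __name__ == "__main__":
    # Hypothetical 10-minute baseline, not a published result.
    print(time_after_speedup(10.0, 1.6))
    print(percent_time_saved(1.6))
```

Under these assumptions, a 1.6 times speedup cuts a run to about 62.5% of its original duration, a saving of roughly 37.5%. Because NVIDIA's figure is stated as "up to," actual gains on a given workload may be smaller.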

Scale was another focus of the benchmark results. On DeepSeek-V3 671B, NVIDIA said it scaled to 8,192 GPUs using GB200 NVL72 systems, marking the largest Blackwell-based submission in MLPerf Training to date. The company also submitted results using 5,120 GPUs on Llama 3.1 405B with GB200 NVL72 systems.

Partner submissions played a major role in the results. Microsoft Azure scaled Llama 3.1 405B training to 8,192 GPUs with GB200 NVL72 systems and reached the benchmark’s reference quality target in 7.07 minutes, which NVIDIA said was the fastest time to train for that test. CoreWeave reached the quality target for DeepSeek-V3 671B in 2.02 minutes at 8,192-GPU scale using GB300 NVL72 systems connected with Spectrum-X Ethernet.

NVIDIA framed the benchmark performance around both speed and reliability. The company said production AI training jobs can run for weeks or months across large GPU clusters, making uptime and recovery critical to overall throughput. Its platform includes manufacturing tests, chip-level monitoring, self-healing capabilities and network rerouting features intended to reduce interruptions.

For recovery, NVIDIA pointed to its Resiliency Extension, or NVRx, which is designed to detect faults, monitor cluster health and resume jobs from recent checkpoints when interruptions occur rather than restarting an entire training run.
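The general idea of resuming from a recent checkpoint, rather than restarting a run, can be illustrated with a toy training loop. This is a minimal sketch using only the Python standard library. It does not use or represent the NVRx API, and every name in it (`save_checkpoint`, `load_checkpoint`, `train`, `run_with_recovery`, `CHECKPOINT_EVERY`) is hypothetical.

```python
import json
import os
import tempfile

CHECKPOINT_EVERY = 10  # hypothetical checkpoint interval, in steps


def save_checkpoint(path, step, state):
    """Write the checkpoint atomically so a crash never leaves a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, state) from the last checkpoint, or a fresh start."""
    if not os.path.exists(path):
        return 0, 0.0
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(path, total_steps, fail_at=None):
    """Run a toy loop, resuming from the checkpoint at `path` if one exists.

    Returns the number of steps executed during this call.
    """
    step, state = load_checkpoint(path)
    executed = 0
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated fault at step {step}")
        state += 1.0  # stand-in for one optimizer step
        step += 1
        executed += 1
        if step % CHECKPOINT_EVERY == 0:
            save_checkpoint(path, step, state)
    save_checkpoint(path, step, state)
    return executed


def run_with_recovery(total_steps=100, fail_at=47):
    """Simulate one fault, then resume.

    Returns (step of the last checkpoint, steps executed after resuming).
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "ckpt.json")
        checkpoint_step = 0
        try:
            train(path, total_steps, fail_at=fail_at)
        except RuntimeError:
            checkpoint_step, _ = load_checkpoint(path)
        resumed_steps = train(path, total_steps)
        return checkpoint_step, resumed_steps


if __name__ == "__main__":
    print(run_with_recovery())  # (40, 60)
```

In this toy run, the fault at step 47 costs only the seven steps completed since the checkpoint at step 40. The resumed job executes the remaining 60 steps instead of all 100. A shorter checkpoint interval reduces the work lost per fault at the cost of more frequent writes.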

NVIDIA said 19 ecosystem partners submitted results in this MLPerf round, including Microsoft Azure, CoreWeave, Dell Technologies, Google Cloud, Cisco, Fujitsu, Hewlett Packard Enterprise, Lambda, Nebius, Supermicro and others. The company also highlighted customers using Blackwell infrastructure for demanding AI workloads, including Cohere, Midjourney, Thinking Machines Lab and Higgsfield.

The results reinforce NVIDIA’s pitch that faster training infrastructure can shorten model development cycles, reduce training costs and help AI companies move more quickly from experimentation to deployment.

This analysis is based on reporting from NVIDIA.


This article was generated with AI assistance and reviewed for accuracy and quality.

Last updated: June 16, 2026
