Supermicro's NVIDIA HGX B200 Systems Lead MLPerf Inference Benchmarks

Supermicro's NVIDIA HGX B200 systems have achieved top performance in the MLPerf Inference v5.0 benchmarks, generating more than three times the tokens per second of the previous generation.

Supermicro has announced that its NVIDIA HGX B200 systems achieved industry-leading performance in the MLPerf Inference v5.0 benchmarks, according to a press release. The systems, featuring 8-GPU configurations, delivered more than three times the token generation per second of previous-generation systems.

The benchmarks highlighted the performance of both air-cooled and liquid-cooled systems, with the air-cooled B200 system matching the liquid-cooled system's performance within normal operating margins. Supermicro's systems posted strong results across a range of benchmarks, including Llama2-70B and Llama3.1-405B, with significant gains in token generation rates.

Supermicro's systems, such as the SYS-421GE-NBRT-LCC and SYS-A21GE-NBRT, took top positions in several categories, including the Mixtral 8x7B mixture-of-experts inference benchmark. Both the air-cooled and liquid-cooled NVIDIA B200 systems generated over 1,000 tokens per second on large models such as Llama3.1-405B.

The company continues to offer a comprehensive AI portfolio of over 100 GPU-optimized systems, with both air-cooled and liquid-cooled options to meet diverse workload requirements. Supermicro's collaboration with NVIDIA ensures that its systems remain at the forefront of AI performance and innovation.
