
CoreWeave Sets New AI Benchmark with NVIDIA GB200 Superchips
CoreWeave has set a new record in AI inference benchmarking using NVIDIA GB200 Grace Blackwell Superchips, the company announced in a press release. CoreWeave reported 800 tokens per second (TPS) on Llama 3.1 405B, one of the largest open-source models, using a CoreWeave instance equipped with two NVIDIA Grace CPUs and four NVIDIA Blackwell GPUs.
CoreWeave also submitted results for NVIDIA H200 GPU instances, achieving 33,000 TPS on the Llama 2 70B model, a 40% throughput improvement over earlier NVIDIA H100 instances. These results underscore CoreWeave's position as a leading provider of cloud infrastructure optimized for AI workloads.
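As a rough illustration of what a tokens-per-second figure measures (a minimal sketch, not CoreWeave's or MLPerf's actual benchmark methodology; the `generate` callable here is a hypothetical stand-in for a real inference endpoint):

```python
import time

def measure_tps(generate, prompts):
    """Aggregate throughput: total tokens generated / wall-clock seconds."""
    start = time.perf_counter()
    total_tokens = sum(len(generate(p)) for p in prompts)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Hypothetical stand-in: a real benchmark would call the model server
# and count the tokens in each completion.
def fake_generate(prompt):
    return list(range(128))  # pretend every prompt yields 128 tokens

tps = measure_tps(fake_generate, ["example prompt"] * 10)
```

For scale, a 40% improvement over H100 implies the earlier instances delivered on the order of 33,000 / 1.4 ≈ 23,600 TPS on the same Llama 2 70B workload.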