NVIDIA Unveils Rubin CPX GPU for Long-Context AI Inference
NVIDIA has announced the Rubin CPX GPU, a new class of graphics processing unit designed for massive-context inference. Part of NVIDIA's Rubin series, the GPU is optimized for processing context windows larger than 1 million tokens, making it suitable for tasks such as video generation and large-scale software development.
The Rubin CPX is integrated into the NVIDIA Vera Rubin NVL144 CPX platform, which delivers 8 exaflops of AI performance and 100 TB of fast memory in a single rack. NVIDIA says the platform offers 7.5 times the AI performance of its previous-generation systems, positioning it to help companies monetize AI applications at far greater scale.
AI innovators like Cursor, Runway, and Magic are exploring the potential of Rubin CPX to accelerate their applications. The GPU is expected to be available by the end of 2026, marking a significant advancement in NVIDIA's AI computing capabilities.