NVIDIA Unveils Rubin CPX GPU for Long-Context AI Inference
NVIDIA has announced the Rubin CPX GPU, a new class of graphics processing unit designed for massive-context inference. Part of NVIDIA's Rubin series, the GPU is optimized for processing context windows larger than 1 million tokens, making it suitable for tasks such as video generation and large-scale software development.
The Rubin CPX is integrated into the NVIDIA Vera Rubin NVL144 CPX platform, which delivers 8 exaflops of AI performance and 100 TB of fast memory in a single rack. NVIDIA says the platform offers 7.5 times the AI performance of its previous-generation systems, positioning it to help companies monetize AI applications at far greater scale.
AI innovators like Cursor, Runway, and Magic are exploring the potential of Rubin CPX to accelerate their applications. The GPU is expected to be available by the end of 2026, marking a significant advancement in NVIDIA's AI computing capabilities.