AWS and Cerebras Partner to Deliver Fastest AI Inference on Bedrock

March 16, 2026
Amazon Web Services and Cerebras Systems are collaborating to deploy Cerebras CS-3 systems in AWS data centers, combining them with AWS Trainium chips to deliver the fastest AI inference speeds through Amazon Bedrock.
Amazon Web Services and Cerebras Systems announced a partnership to provide what they describe as the fastest AI inference available in the cloud, according to a press release. The collaboration will deploy Cerebras CS-3 systems in AWS data centers and make them accessible through Amazon Bedrock in the coming months.

The joint solution combines AWS Trainium processors, optimized for prefill computation, with Cerebras CS-3 hardware, optimized for decode operations. These components are connected through AWS's Elastic Fabric Adapter (EFA) networking, enabling a disaggregated inference architecture that separates the two stages of AI inference—prefill and decode—for greater efficiency and speed.

As detailed by Cerebras, this configuration allows Trainium to handle compute-intensive prefill tasks while the CS-3 focuses on generating output tokens. The companies claim this setup will deliver up to five times more high-speed token capacity within the same hardware footprint compared to traditional GPU-based systems.
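The disaggregated pattern described above can be illustrated with a toy sketch: a compute-heavy prefill stage processes the whole prompt in one pass and hands its state (a stand-in for a KV cache) to a separate, latency-sensitive decode stage that emits tokens one at a time. This is purely illustrative, not the AWS/Cerebras implementation; the function names and the toy token rule are invented for the example.

```python
def prefill(prompt_tokens):
    """Compute-heavy stage: process all prompt tokens at once.
    In the announced design, this role falls to Trainium."""
    return {"context": list(prompt_tokens)}

def decode(state, max_new_tokens):
    """Latency-sensitive stage: emit one token per step from the
    handed-off state. In the announced design, the CS-3's role."""
    output = []
    for _ in range(max_new_tokens):
        # Toy rule: the "next token" is just the current context length.
        next_token = len(state["context"])
        state["context"].append(next_token)
        output.append(next_token)
    return output

state = prefill([101, 102, 103])  # prefill runs once over the prompt
tokens = decode(state, 3)         # decode loops, one token per step
print(tokens)  # [3, 4, 5]
```

The point of the split is that the two stages have different performance profiles: prefill is a large parallel computation, while decode is a sequential loop where per-token latency dominates, so each stage can run on hardware suited to it.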

The service, available via Amazon Bedrock, will support leading open-source large language models as well as Amazon’s own Nova models. AWS plans to roll out the integrated Trainium and CS-3 inference capability globally later this year.
