OpenAI Adds New Voice Intelligence Models to Its API

May 08, 2026
OpenAI has introduced three new voice intelligence models to its API, enabling developers to build apps that can converse, translate, and transcribe in real time. The models include GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper.

OpenAI announced the new voice intelligence capabilities in a press release. The update introduces three models, one each for live conversation, translation, and transcription.

The new GPT-Realtime-2 model uses GPT-5-class reasoning to manage complex spoken interactions while maintaining conversational flow. It supports features such as parallel tool calls, adjustable reasoning levels, and longer context windows. The model is built to handle tasks like scheduling, answering questions, or executing commands through natural speech.

GPT-Realtime-Translate enables live translation across more than 70 input languages and 13 output languages, maintaining conversational pace and fluency. It is designed for use in customer support, education, media, and multilingual events.

GPT-Realtime-Whisper provides streaming speech-to-text transcription for live scenarios, allowing captions, meeting notes, and other text outputs to appear as people speak. The Realtime API also includes safeguards that monitor sessions and can halt conversations that are detected as violating harmful-content policies.

All three models are available through the Realtime API. GPT-Realtime-2 is billed by token usage, while GPT-Realtime-Translate and GPT-Realtime-Whisper are billed per minute.
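
For developers budgeting a voice app, the difference between the two billing schemes can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's pricing logic: the class names are hypothetical, and the rates used in the test values are placeholders, since the announcement as summarized here does not state prices.

```python
from dataclasses import dataclass


@dataclass
class TokenBilling:
    """Per-token billing, the scheme described for GPT-Realtime-2."""
    usd_per_million_tokens: float  # hypothetical rate; not from the announcement

    def cost(self, tokens: int) -> float:
        # Cost scales with the number of tokens processed.
        return tokens / 1_000_000 * self.usd_per_million_tokens


@dataclass
class MinuteBilling:
    """Per-minute billing, the scheme described for Translate and Whisper."""
    usd_per_minute: float  # hypothetical rate; not from the announcement

    def cost(self, seconds: float) -> float:
        # Cost scales with session duration, independent of how much is said.
        return seconds / 60 * self.usd_per_minute
```

Under per-minute billing a session's cost depends only on its length, while under token billing it depends on how much audio and text the model actually processes, so a quiet ten-minute call and a dense one would cost the same under the former but not the latter.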
