OpenAI's GDPval Benchmark Evaluates AI in Real-World Jobs

September 25, 2025
OpenAI has introduced GDPval, a new benchmark to assess AI models like GPT-5 against human professionals across various industries. The evaluation spans 44 occupations and aims to measure AI's performance on economically valuable tasks.

OpenAI has introduced a new benchmark called GDPval to evaluate how its AI models, including GPT-5, perform compared to human professionals across a wide range of industries. According to the company's announcement, GDPval spans 44 occupations within nine industries that contribute significantly to U.S. GDP, such as healthcare, finance, and manufacturing.

The GDPval benchmark is designed to test AI models on real-world tasks that are economically valuable. Tasks include drafting legal briefs, producing engineering blueprints, and handling customer support conversations, and experienced professionals evaluate the resulting work. The initial results show that GPT-5 and Anthropic's Claude Opus 4.1 are approaching the quality of work produced by industry experts: GPT-5 was ranked as better than or on par with experts 40.6% of the time.
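
To make the "better than or on par with" figure concrete, here is a minimal sketch of how such a rate could be computed from head-to-head grader judgments. The function name, the "win"/"tie"/"loss" labels, and the sample data are assumptions made for illustration; they are not OpenAI's grading code or GDPval results.

```python
from collections import Counter


def win_or_tie_rate(judgments):
    """Share of comparisons where the model's output was rated
    better than ("win") or on par with ("tie") the expert's output."""
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    return (counts["win"] + counts["tie"]) / len(judgments)


# Made-up judgments for illustration only; not GDPval data.
sample = ["win", "loss", "tie", "loss", "win",
          "loss", "loss", "tie", "loss", "loss"]

rate = win_or_tie_rate(sample)
print(f"{rate:.1%}")  # prints "40.0%" (2 wins + 2 ties out of 10)
```

Counting ties together with wins is what separates this kind of figure from a pure win rate, so the two numbers should not be compared directly.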

OpenAI acknowledges that GDPval is an early step and plans to expand the benchmark to include more interactive workflows and context-rich tasks. This initiative aims to provide a clearer picture of how AI can support professionals in their day-to-day work, potentially allowing them to focus on more meaningful tasks.
