Patronus AI Introduces Multimodal LLM-as-a-Judge for Image Evaluation

Patronus AI has launched the industry's first Multimodal LLM-as-a-Judge (MLLM-as-a-Judge), a tool for evaluating image-to-text applications, the company announced in a press release. The new capability lets developers score and optimize multimodal AI systems against criteria such as text presence, grid structure, spatial orientation, and object identification.

The Judge-Image tool, powered by Google Gemini, offers several out-of-the-box evaluation criteria, including caption hallucination detection, object description verification, and object location accuracy. It can also check OCR extraction accuracy for tabular data, the accuracy of AI-generated brand assets, and the validity of scene descriptions.

Etsy, a leading e-commerce platform, is already using Patronus AI's MLLM-as-a-Judge to reduce AI hallucinations in product image captions and optimize its multimodal AI system. Patronus AI plans to expand its evaluation capabilities to include audio and vision features in future releases.
