LLMs Enhance Error Detection in Radiology Reports
Fine-tuned large language models (LLMs) can improve error detection in radiology reports, according to a recent study published in Radiology. The research evaluated models including GPT-4 and Llama-3-70B-Instruct on a dataset of 614 chest X-ray reports, some containing errors and some error-free.
The study, led by researchers from Weill Cornell Medicine, fine-tuned three models (BiomedBERT, Llama-3-8B-Instruct, and Llama-3-70B-Instruct) on a combination of synthetic and real-world data. The fine-tuned Llama-3-70B-Instruct model performed best, with an overall macro F1 score of 0.780, and did well across error types including negation, left/right, interval change, and transcription errors.
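For readers unfamiliar with the metric, macro F1 is the unweighted average of the per-class F1 scores, so rare error types count as much as common ones. The sketch below shows one way to compute it; the function name `macro_f1` and the toy labels are illustrative assumptions, not the study's evaluation code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only (hypothetical, not from the study).
y_true = ["negation", "negation", "left/right", "none"]
y_pred = ["negation", "none", "left/right", "none"]
print(round(macro_f1(y_true, y_pred), 3))
```

Because each class contributes equally, a model that ignores an uncommon error type is penalized even if its overall accuracy is high.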
The use of synthetic data was crucial in training these models, allowing for a diverse and comprehensive dataset that maintained patient privacy. Despite the promising results, the study's authors noted that further evaluation is needed to assess the models' applicability and reliability in daily clinical practice. They emphasized the importance of fine-tuning and prompt design in optimizing LLM performance for specific medical tasks.
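The article does not say how the synthetic reports were produced. Purely as an illustration of the general idea, the sketch below injects a left/right error into an error-free sentence with a rule-based swap, yielding a labeled training pair. The function `swap_laterality`, the label strings, and the example sentence are all hypothetical and should not be read as the authors' method.

```python
import re

# Hypothetical mapping for an illustrative laterality swap.
_SWAP = {"left": "right", "right": "left"}


def swap_laterality(report: str) -> str:
    """Swap every standalone 'left'/'right', preserving initial capitalization."""
    def repl(match):
        word = match.group(0)
        swapped = _SWAP[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"\b(left|right)\b", repl, report, flags=re.IGNORECASE)


def make_pair(report: str):
    """Return (clean, label) and (corrupted, label) examples for training."""
    return [(report, "no_error"), (swap_laterality(report), "left_right_error")]


clean = "Small left pleural effusion. Right lung is clear."
for text, label in make_pair(clean):
    print(label, "|", text)
```

A rule like this needs no patient data beyond a template sentence, which is one reason synthetic examples can help preserve privacy.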