Predicting AI Performance - The Imaging Wire

How can you predict whether an AI algorithm will fall short for a particular clinical use case such as detecting cancer? Researchers writing in Radiology took a crack at this conundrum by developing what they call an “uncertainty quantification” metric to predict when an AI algorithm might be less accurate.

AI is rapidly moving into wider clinical use, with a number of exciting studies published in just the last few months showing how AI can help radiologists interpret screening mammograms or direct which women should get supplemental breast MRI.

But AI isn’t infallible. And unlike a human radiologist who might be less confident in a particular diagnosis, an AI algorithm doesn’t have a built-in hedging mechanism.

So researchers from Denmark and the Netherlands decided to build one. They took publicly available AI algorithms and tweaked their code so they produced “uncertainty quantification” scores with their predictions.

They then tested how well the scores predicted AI performance in a dataset of 13k images for three common tasks covering some of the deadliest types of cancer:

1) detecting pancreatic ductal adenocarcinoma on CT
2) detecting clinically significant prostate cancer on MRI
3) predicting pulmonary nodule malignancy on low-dose CT

Researchers classified the 80% of AI predictions with the highest certainty as “certain,” and the remaining 20% as “uncertain,” and compared AI’s accuracy in both groups, finding …

AI was significantly more accurate in the “certain” group than in the “uncertain” group for pancreatic cancer (80% vs. 59%), prostate cancer (90% vs. 63%), and pulmonary nodule malignancy prediction (80% vs. 51%)
AI accuracy was comparable to that of clinicians when its predictions were “certain” (80% vs. 78%, P=0.07), but much worse when “uncertain” (50% vs. 68%, P<0.001)
Using AI to triage “uncertain” cases produced overall accuracy improvements for pancreatic and prostate cancer (+5%) and lung nodule malignancy prediction (+6%) compared to a no-triage scenario
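
To make the certain/uncertain split and the triage comparison concrete, here is a minimal sketch on made-up toy data. It is not the researchers' code: the function names, the ten example cases, and the assumption that “uncertain” cases are simply handed to a human reader are all hypothetical, chosen only to illustrate the 80/20 idea described above.

```python
import numpy as np

def split_by_certainty(uncertainty, certain_fraction=0.8):
    """Boolean mask: True for the most-certain fraction of cases.

    Assumption: a lower uncertainty score means a more certain prediction.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    cutoff = np.quantile(uncertainty, certain_fraction)
    return uncertainty <= cutoff

def accuracy(pred, truth):
    """Fraction of predictions that match the ground truth."""
    return float(np.mean(np.asarray(pred) == np.asarray(truth)))

def triage_accuracy(ai_pred, reader_pred, truth, certain_mask):
    """Accuracy when AI handles certain cases and a reader handles the rest.

    Assumption: this hand-off rule is a simplification for illustration.
    """
    combined = np.where(certain_mask, ai_pred, reader_pred)
    return accuracy(combined, truth)

# Hypothetical toy data: 10 cases, binary labels (1 = cancer, 0 = no cancer).
uncertainty = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.80, 0.90])
truth       = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
ai_pred     = np.array([1, 0, 1, 0, 1, 0, 1, 1, 0, 1])
reader_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0, 1, 0])

certain = split_by_certainty(uncertainty)

acc_certain = accuracy(ai_pred[certain], truth[certain])
acc_uncertain = accuracy(ai_pred[~certain], truth[~certain])
acc_ai_alone = accuracy(ai_pred, truth)
acc_triage = triage_accuracy(ai_pred, reader_pred, truth, certain)

print(f"AI accuracy, certain cases:   {acc_certain:.3f}")
print(f"AI accuracy, uncertain cases: {acc_uncertain:.3f}")
print(f"AI alone, all cases:          {acc_ai_alone:.3f}")
print(f"AI + reader triage:           {acc_triage:.3f}")
```

In this toy example the AI's errors cluster in the uncertain cases, so routing those cases to the reader lifts overall accuracy. Real gains would depend on how well the uncertainty score actually tracks AI errors.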

How would uncertainty quantification be used in clinical practice? It could play a triage role, deprioritizing radiologist review of easier cases so that radiologists can focus on more challenging studies. It’s a concept similar to the MASAI study of mammography AI.

The Takeaway

Like MASAI, the new findings present exciting new possibilities for AI implementation. They also present a framework within which AI can be implemented more safely by alerting clinicians to cases in which AI’s analysis might fall short – and enabling humans to step in and pick up the slack.
