A team of Duke University radiologists and computer engineers unveiled a new mammography AI platform that could be an important step towards developing truly interpretable AI.
Explainable History – Healthcare leaders have been calling for explainable imaging AI for some time, but explainability efforts have been mainly limited to saliency / heat maps that show what part of an image influenced a model’s prediction (not how or why).
Duke’s Interpretable Model – Duke’s new AI platform analyzes mammography exams for potentially cancerous lesions to help physicians determine if a patient should receive a biopsy, while supporting its predictions with image and case-based explanations.
Training Interpretability – The Duke team trained their AI platform to locate and evaluate lesions by following the same process that radiology educators and students use:
- First, they trained the AI model to detect suspicious lesions and to ignore healthy tissues
- Then they had radiologists label the edges of the lesions
- Then they trained the model to compare those lesion edges with lesion edges from an archive of images with confirmed outcomes
Interpretable Predictions – This training process allowed the AI model to identify suspicious lesions, highlight the classification-relevant parts of the image, and explain its findings by referencing confirmed images.
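To make the "explain by referencing confirmed images" idea concrete, here is a minimal sketch of a case-based explanation. It is not the Duke team's actual method: it swaps in a plain nearest-neighbor lookup, and the `ArchivedCase` type, the `margin_features` vectors, and every value in the toy archive are hypothetical stand-ins for whatever lesion-edge representation a real model would learn.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class ArchivedCase:
    """Hypothetical archive entry: a lesion with a confirmed outcome."""
    case_id: str
    margin_features: tuple[float, ...]  # assumed numeric summary of the lesion edge
    malignant: bool                     # confirmed outcome (e.g. from biopsy)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def explain_by_cases(query_features, archive, k=3):
    """Return a malignancy score plus the k archived cases that justify it."""
    ranked = sorted(
        archive,
        key=lambda case: cosine_similarity(query_features, case.margin_features),
        reverse=True,
    )
    nearest = ranked[:k]
    # Score = share of the most similar confirmed cases that were malignant.
    score = sum(case.malignant for case in nearest) / len(nearest)
    return score, nearest


# Toy archive with made-up feature values, for illustration only.
archive = [
    ArchivedCase("A", (0.9, 0.1), True),
    ArchivedCase("B", (0.8, 0.2), True),
    ArchivedCase("C", (0.1, 0.9), False),
    ArchivedCase("D", (0.2, 0.8), False),
]

score, evidence = explain_by_cases((0.85, 0.15), archive, k=3)
print(f"malignancy score: {score:.2f}")
for case in evidence:
    print(f"  similar to case {case.case_id} (malignant={case.malignant})")
```

The point of the design is that the score never arrives alone: it comes with the specific confirmed cases it was derived from, which a radiologist can inspect and disagree with.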
Interpretable Results – Like many AI models, this early version could not identify cancerous lesions as accurately as human radiologists. However, it matched the performance of existing “black box” AI systems and the team was able to see why their AI model made its mistakes.
It seems like concerns over AI performance are growing at about the same pace as actual AI adoption, making explainability / interpretability increasingly important. Duke’s interpretable AI platform might be in its early stages, but its use of previous cases to explain findings seems like a promising (and straightforward) way to achieve that goal, while improving diagnosis in the process.
Many folks view explainability as a crucial next step for AI, but a new Lancet paper from a team of AI heavyweights argues that explainability might do more harm than good in the short term, and that AI stakeholders would be better off increasing their focus on validation.
The Old Theory – For as long as we’ve been covering AI, really smart and well-intentioned people have warned about the “black-box” nature of AI decision making and forecasted that explainable AI will lead to more trust, less bias, and greater adoption.
The New Theory – These black-box concerns and explainable AI forecasts might be logical, but they aren’t currently realistic, especially for patient-level decision support. Here’s why:
- Explainability methods describe how AI systems work, not how decisions are made
- AI explanations can be unreliable and/or superficial
- Most medical AI decisions are too complex to explain in an understandable way
- Humans over-trust computers, so explanations can hurt their ability to catch AI mistakes
- AI explainability methods (e.g., heat maps) require human interpretation, risking confirmation bias
- Explainable AI adds more potential error sources (AI tool + AI explanation + human interpretation)
- Although we still can’t fully explain how acetaminophen works, we don’t question whether it works, because we’ve tested it extensively
The Explainability Alternative – Until suitable explainability methods emerge, the authors call for “rigorous internal and external validation of AI models” to make sure AI tools are consistently making the right recommendations. They also advised clinicians to remain cautious when referencing AI explanations and warned that policymakers should resist making explainability a requirement.
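As a rough illustration of what "internal and external validation" might look like in practice, the sketch below scores the same model on a held-out internal set and on a set from a different site, then reports sensitivity and specificity for each. The paper does not prescribe these metrics or this code; the function names, the threshold model, and all of the data are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) from boolean labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def validate(model, datasets):
    """Run the model on every named dataset and collect its metrics."""
    report = {}
    for name, (inputs, labels) in datasets.items():
        predictions = [model(x) for x in inputs]
        report[name] = sensitivity_specificity(labels, predictions)
    return report


def toy_model(risk_score):
    """Hypothetical stand-in for an AI tool: flags any score of 0.5 or more."""
    return risk_score >= 0.5


# Made-up data: "internal" resembles the training site, "external" does not.
datasets = {
    "internal": ([0.9, 0.8, 0.2, 0.1], [True, True, False, False]),
    "external": ([0.9, 0.4, 0.6, 0.1], [True, True, False, False]),
}

report = validate(toy_model, datasets)
for name, (sens, spec) in report.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the model looks perfect internally and drops to 0.50 on both metrics externally, which is the kind of gap that external validation exists to expose, with or without an explanation attached.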
Explainability’s Short-Term Role – Explainability definitely still has a role in AI safety, as it’s “incredibly useful” for model troubleshooting and systems audits, which can improve model performance and identify failure modes or biases.
The Takeaway – It appears we might not be close enough to explainable AI to make it a part of short-term AI strategies, policies, or procedures. That might be hard to accept for the many people who view the need for AI explainability as undebatable, and it makes AI validation and testing more important than ever.