How Should AI Be Monitored?

Once an AI algorithm has been approved and moves into clinical use, how should its performance be monitored? This question was top of mind at last week’s meeting of the FDA’s new Digital Health Advisory Committee.

AI has the potential to radically reshape healthcare and help clinicians manage more patients with fewer staff and other resources. 

  • But AI also represents a regulatory challenge because it’s constantly learning, such that after a few years an AI algorithm might be operating much differently from the version first approved by the FDA – especially with generative AI. 

This conundrum was a point of discussion at last week’s DHAC meeting, which was called specifically to focus on regulation of generative AI, and could result in new rules covering all AI algorithms. (An executive summary that outlines the FDA’s thinking is available for download.)

Radiology was well-represented at DHAC, understandable given it has the lion’s share of authorized algorithms (73% of 950 devices at last count). 

  • A half-dozen radiology AI experts gave presentations over two days, including Parminder Bhatia of GE HealthCare; Nina Kottler, MD, of Radiology Partners; Pranav Rajpurkar, PhD, of Harvard; and Keith Dreyer, DO, PhD, and Bernardo Bizzo, MD, PhD, both of Mass General Brigham and the ACR’s Data Science Institute.  

Dreyer and Bizzo directly addressed the question of post-market AI surveillance, discussing ongoing efforts to track AI performance, including … 

The Takeaway

Last week’s DHAC meeting offers a fascinating glimpse at the issues the FDA is wrestling with as it contemplates stronger regulation of generative AI. Fortunately, radiology has blazed a trail in setting up structures like ARCH-AI and Assess-AI to monitor AI performance, and the FDA is likely to follow the specialty’s lead as it develops a regulatory framework.

Real-World Stroke AI Implementation

Time is brain. That simple saying encapsulates the urgency in diagnosing and treating stroke, when just a few hours can mean a huge difference in a patient’s recovery. A new study in Clinical Radiology shows the potential for Nicolab’s StrokeViewer AI software to improve stroke diagnosis, but also underscores the challenges of real-world AI implementation.

Early stroke research recommended that patients receive treatment – such as with mechanical thrombectomy – within 6-8 hours of stroke onset. 

  • CT is a favored modality to diagnose patients, and the time element is so crucial that some health networks have implemented mobile stroke units with ambulances outfitted with on-board CT scanners. 

AI is another technology that can help speed time to diagnosis. 

  • AI analysis of CT angiography scans can help identify cases of acute ischemic stroke missed by radiologists, in particular cases of large vessel occlusion, for which one study found a 20% miss rate. 

The U.K.’s National Health Service has been looking closely at AI to provide 24/7 LVO detection and improve accuracy in an era of workforce shortages.

  • StrokeViewer is a cloud-based AI solution that analyzes non-contrast CT, CT angiography, and CT perfusion scans and notifies clinicians when a suspected LVO is detected. Reports can be viewed via PACS or with a smartphone.  

In the study, NHS researchers shared their experiences with StrokeViewer, which included difficulties with its initial implementation but ultimately improved performance after tweaks to the software.  

  • For example, researchers encountered what they called “technical failures” in the first phase of implementation, mostly related to issues like different protocol names radiographers used for CTA scans that weren’t recognized by the software. 

Nicolab was notified of the issue, and the company performed training sessions with radiographers. A second implementation took place, and researchers found that across 125 suspected stroke cases  … 

  • Sensitivity was 93% in both phases of the study.
  • Specificity rose from the first to second implementation (91% to 94%).
  • The technical failure rate dropped (25% to 17%).
  • Only two cases of technical failure occurred in the last month of the study.

The Takeaway

The new study is a warts-and-all description of a real-world AI implementation. It shows the potential of AI to improve clinical care for a debilitating condition, but also that success may require additional work on the part of both clinicians and AI developers.

Time to Embrace X-Ray AI for Early Lung Cancer Detection

Each year approximately 2 billion chest X-rays are performed globally. They are fast, noninvasive, and a relatively inexpensive radiological examination for front-line diagnostics in outpatient, emergency, or community settings. 

  • But beyond the simplicity of CXR lies a secret weapon in the fight against lung cancer: artificial intelligence. 

Be it serendipitous screening, opportunistic detection, or incidental identification, there is potential for AI incorporated into CXR to screen patients for disease when they are getting an unrelated medical examination. 

  • This could include the patient in the ER undergoing a CXR for suspected broken ribs after a fall, or an individual referred by their doctor for a CXR with suspected pneumonia. These people, without symptoms, may unknowingly have small yet growing pulmonary nodules. 

AI can find these abnormalities and flag them to clinicians as a suspicious finding for further investigation. 

  • This has the potential to find nodules earlier, in the very early stages of lung cancer when it is easier to biopsy or treat. 

Indeed, only 5.8% of eligible ex-smoking Americans undergo CT-based lung cancer screening. 

  • So the ability to cast the detection net wider through incidental pulmonary nodule detection has significant merits. 

Early global studies into the power of AI for incidental pulmonary nodules (IPNs) show exciting promise.

  • The latest evidence, showing one lung cancer detected for every 1,120 CXRs, has major implications for diagnosing and treating people earlier – and potentially saving lives. 

The qXR-LN chest X-ray AI algorithm from Qure.ai is raising the bar for incidental pulmonary nodule detection. In a retrospective study performed on missed or mislabelled US CXR data, qXR-LN achieved an impressive negative predictive value of 96% and an AUC score of 0.99 for detection of pulmonary nodules. 

  • By acting as a second pair of eyes for radiologists, qXR-LN can help detect subtle anatomical anomalies that may otherwise go unnoticed, particularly in asymptomatic patients.

The FDA-cleared solution serves as a crucial second reader, assisting in the review of chest radiographs on the frontal projection. 

  • In another multicenter study involving 40 sites from across the U.S., the qXR-LN algorithm demonstrated an impressive AUC of 0.94 for scan-level nodule detection, highlighting its potential to significantly impact patient outcomes by identifying early signs of lung cancer that can be easily missed. 

The Takeaway 

By harnessing the power of AI for opportunistic lung cancer surveillance, healthcare providers can adopt a proactive approach to early detection without significant new investment, and ultimately improve patient survival rates.

Qure.ai will be exhibiting at RSNA 2024, December 1-4. Visit booth #4941 for discussion, debate, and demonstrations.

Sources

AI-based radiodiagnosis using Chest X-rays: A review. Big Data Analytics for Social Impact, Volume 6 – 2023

Results from a feasibility study for integrated TB & lung cancer screening in Vietnam, Abstract presentation UNION CONF 2024: 2560   

Performance of a Chest Radiography AI Algorithm for Detection of Missed or Mislabelled Findings: A Multicenter Study. Diagnostics 12, no. 9 (2022): 2086

Qure.ai. Qure.ai’s AI-Driven Chest X-ray Solution Receives FDA Clearance for Enhanced Lung Nodule Detection. Qure.ai, January 7, 2024

Mammography AI Predicts Cancer Before It’s Detected

A new study highlights the predictive power of AI for mammography screening – before cancers are even detected. Researchers in a study in JAMA Network Open found that risk scores generated by Lunit’s Insight MMG algorithm predicted which women would develop breast cancer – years before radiologists found it on mammograms. 

Mammography image analysis has always been one of the most promising use cases for AI – even dating back to the days of computer-aided detection in the early 2000s. 

  • Most mammography AI developers have focused on helping radiologists identify suspicious lesions on mammograms, or triage low-risk studies so they don’t require extra review.

But a funny thing has happened during clinical use of these algorithms – radiologists found that AI-generated risk scores appeared to predict future breast cancers before they could be seen on mammograms. 

  • Insight MMG marks areas of concern and generates a risk score of 0-100 for the presence of breast cancer (higher numbers are worse). 

Researchers decided to investigate the risk scores’ predictive power by applying Insight MMG to screening mammography exams acquired in the BreastScreen Norway program over three biennial rounds of screening from 2004 to 2018. 

  • They then correlated AI risk scores to clinical outcomes in exams for 116k women for up to six years after the initial screening round.

Major findings of the study included … 

  • AI risk scores were higher for women who later developed cancer, 4-6 years before the cancer was detected.
  • The difference in risk scores increased over three screening rounds, from 21 points in the first round to 79 points in the third round.
  • Risk scores had very high accuracy by the third round (AUC=0.93).
  • AI scores were more accurate than existing risk tools like the Tyrer-Cuzick model.

How could AI risk scores be used in clinical practice? 

  • Women without detectable cancer but with high scores could be directed to shorter screening intervals or screening with supplemental modalities like ultrasound or MRI.

The Takeaway

It’s hard to overstate the significance of the new results. While AI for direct mammography image interpretation still seems to be having trouble catching on (just like CAD did), risk prediction is a use case that could direct more effective breast screening. The study is also a major coup for Lunit, continuing a string of impressive clinical results with the company’s technology.

AI Recon Cuts CT Radiation Dose

Artificial intelligence got its start in radiology as a tool to help medical image interpretation, but much of AI’s recent progress is in data reconstruction: improving images before radiologists even get to see them. Two new studies underscore the potential of AI-based reconstruction to reduce CT radiation dose while preserving image quality. 

Radiology vendors and clinicians have been remarkably successful in reducing CT radiation dose over the past two decades, but there’s always room for improvement. 

  • In addition to adjusting CT scanning protocols like tube voltage and current, data reconstruction protocols have been introduced to take images acquired at lower radiation levels and “boost” them to look like full-dose images. 

The arrival of AI and other deep learning-based technologies has turbocharged these efforts. 

In the first study, researchers compared deep learning image reconstruction (DLIR) operating at high strength to GE’s older ASiR-V protocol in CCTA scans with lower tube voltage (80 kVp), finding that deep learning reconstruction led to …

  • 42% reduction in radiation dose (2.36 mSv vs. 4.07 mSv).
  • 13% reduction in contrast dose (50 mL vs. 58 mL).
  • Better signal- and contrast-to-noise ratios.
  • Higher image quality ratings.

In the second study, researchers from China including two employees of United Imaging Healthcare used a deep learning reconstruction algorithm to test ultralow-dose CT scans for coronary artery calcium scoring. 

  • They wanted to see if CAC scoring could be performed with lower tube voltage and current (80 kVp/20 mAs) and how the protocol compared to existing low-dose scans.

In tests with 156 patients, they found the ultralow-dose protocol produced …

  • Lower radiation dose (0.09 vs. 0.49 mSv).
  • No difference in CAC scoring or risk categorization. 
  • Higher contrast-to-noise ratio.
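
For readers who want to check the dose arithmetic in the two studies, the relative reductions can be recomputed from the quoted figures. This is only a sketch: the helper name `percent_reduction` is ours, and the inputs are the rounded values printed above rather than the studies' raw measurements.

```python
def percent_reduction(baseline: float, reduced: float) -> float:
    """Relative reduction of `reduced` versus `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - reduced) / baseline * 100.0


# Rounded figures quoted in the article (not raw study data).
ccta_dose = percent_reduction(4.07, 2.36)      # CCTA radiation dose, mSv
ccta_contrast = percent_reduction(58.0, 50.0)  # CCTA contrast volume, mL
cac_dose = percent_reduction(0.49, 0.09)       # CAC scoring radiation dose, mSv

print(f"CCTA radiation dose reduction: {ccta_dose:.1f}%")
print(f"CCTA contrast dose reduction:  {ccta_contrast:.1f}%")
print(f"CAC radiation dose reduction:  {cac_dose:.1f}%")
```

With these rounded inputs the CCTA radiation dose reduction comes out at about 42% and the ultralow-dose CAC protocol at roughly 82% below the existing low-dose scan. The contrast figure lands a little under 14%, so the 13% quoted above presumably reflects unrounded volumes.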

The Takeaway

AI-based data reconstruction gives radiologists the best of both worlds: lower radiation dose with better-quality images. These two new studies illustrate AI’s potential for lowering CT dose to previously unheard-of levels, with major benefits for patients.

Imaging News from ESC 2024

The European Society of Cardiology annual meeting concluded on September 2 in London, with around 32k clinicians from 171 countries attending some 4.4k presentations. Organizers reported that attendance finally rebounded to pre-COVID numbers. 

While much of ESC 2024 focused on treatments for cardiovascular disease, diagnosis with medical imaging still played a prominent role. 

  • Cardiac CT dominated many ESC sessions, and AI showed it is nearly as hot in cardiology as it is in radiology. 

Major imaging-related ESC presentations included…

  • A track on cardiac CT that underscored CT’s prognostic value:
    • Myocardial revascularization patients who got FFR-CT had lower hazard ratios for MACE and all-cause mortality (HR=0.73 and 0.48).
    • Incidental coronary artery anomalies appeared on 1.45% of CCTA scans for patients with suspected coronary artery disease.
  • AI flexed its muscles in a machine learning track:
    • AI of low-dose CT scans had an AUC of 0.95 for predicting pulmonary congestion, a sign of acute heart failure. 
    • Echocardiography AI identified HFpEF with higher AUC than clinical models (0.75 vs. 0.69).
    • AI of transthoracic echo detected hypertrophic cardiomyopathy with AUC=0.85.

Another ESC hot topic was CT for calculating coronary artery calcium (CAC) scores, a possible predictor of heart disease. Sessions found … 

  • AI-generated volumetry of cardiac chambers based on CAC scans better predicted cardiovascular events than Agatston scores over 15 years of follow-up in an analysis of 5.8k patients from the MESA study. 
  • AI-CAC with CT was comparable to cardiac MRI read by humans for predicting atrial fibrillation (0.802 vs. 0.798) and stroke (0.762 vs. 0.751) over 15 years, which could give an edge to AI-CAC given its automated nature.
  • An AI algorithm enabled opportunistic screening of CAC quantification from non-gated chest CT scans of 631 patients, finding high CAC scores in 13%. Many got statins, while 22 got additional imaging and two underwent intervention.
  • AI-generated CAC scores were also highlighted in a Polish study, detecting CAC on contrast CT at a rate comparable to humans on non-contrast CT (77% vs. 79%), possibly eliminating the need for additional non-contrast CT.  

The Takeaway

This week’s ESC 2024 sessions demonstrate the vital role of imaging in diagnosing and treating cardiovascular disease. While radiologists may not control the patients, they can always apply knowledge of advances in other disciplines to their work.

AI Detects Interval Cancer on Mammograms

In yet another demonstration of AI’s potential to improve mammography screening, a new study in Radiology shows that Lunit’s Insight MMG algorithm detected nearly a quarter of interval cancers missed by radiologists on regular breast screening exams. 

Breast screening is one of healthcare’s most challenging cancer screening exams, and for decades has been under attack by skeptics who question its life-saving benefit relative to “harms” like false-positive biopsies.  

  • But AI has the potential to change the cost-benefit equation by detecting a higher percentage of early-stage cancers and improving breast cancer survival rates. 

Indeed, 2024 has been a watershed year for mammography AI. 

U.K. researchers used Insight MMG (also used in the BreastScreen Norway trial) to analyze 2.1k screening mammograms, of which 25% were interval cancers (cancers occurring between screening rounds) and the rest normal. 

  • The AI algorithm generates risk scores from 0-100, with higher scores indicating a greater likelihood of malignancy. For this study the threshold was set at 96% specificity, equivalent to the average 4% recall rate in the U.K. national breast screening program.
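
To make the link between a specificity threshold and a recall rate concrete, here is a minimal sketch of how a score cutoff could be chosen. It is not Lunit's or the researchers' method: the function names are ours, and the scores are synthetic values drawn from an arbitrary distribution purely for illustration.

```python
import math
import random


def threshold_for_specificity(normal_scores, target_specificity):
    """Return a cutoff such that at least `target_specificity` of the
    normal exams score at or below it (and so are not flagged)."""
    if not normal_scores:
        raise ValueError("need at least one normal exam score")
    if not 0.0 < target_specificity <= 1.0:
        raise ValueError("target_specificity must be in (0, 1]")
    ordered = sorted(normal_scores)
    keep = math.ceil(target_specificity * len(ordered))  # exams left unflagged
    return ordered[keep - 1]


def flagged_fraction(scores, cutoff):
    """Fraction of exams scoring strictly above the cutoff."""
    return sum(score > cutoff for score in scores) / len(scores)


# Synthetic 0-100 risk scores for 1,000 normal exams (illustrative only).
random.seed(0)
synthetic_normals = [100 * random.betavariate(2, 8) for _ in range(1000)]

cutoff = threshold_for_specificity(synthetic_normals, 0.96)
recall = flagged_fraction(synthetic_normals, cutoff)
print(f"cutoff score: {cutoff:.1f}, normal exams flagged: {recall:.1%}")
```

Because nearly all exams in a screening population are normal, flagging about 4% of normal exams translates into a recall rate of about 4%, which is why a 96% specificity setting lines up with the U.K. program's average.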

In analyzing the results, researchers found … 

  • AI flagged 24% of the interval cancers and correctly localized 77%.
  • AI localized a higher proportion of node-positive than node-negative cancers (24% vs. 16%).
  • Invasive tumors had higher median risk scores than noninvasive (62 vs. 33), with median scores of 26 for normal mammograms.

Researchers also tested AI at a lower specificity threshold of 90%. 

  • AI detected more interval cancers at this level, but in real-world practice this would bump up recall rates.  

It’s also worth noting that Insight MMG is designed for the analysis of 2D digital mammography, which is more common in Europe than DBT. 

  • For the U.S., Lunit is emphasizing its recently cleared Insight DBT algorithm, which may perform differently.  

The Takeaway

As with the MASAI and BreastScreen Norway results, the new study points to an exciting role for AI in making mammography screening more accurate with less drain on radiologist resources. But as with those studies, the new results must be interpreted against Europe’s double-reading paradigm, which differs from the single-reading protocol used in the U.S. 

FDA Keeps Pace on AI Approvals

The FDA has updated its list of AI- and machine learning-enabled medical devices that have received regulatory authorization. The list is a closely watched barometer of the health of the AI sector, and the update shows the FDA is keeping a brisk pace of authorizations.

The FDA has maintained double-digit growth of AI authorizations for the last several years, a pace that reflects the growing number of submissions it’s getting from AI developers. 

  • Indeed, data compiled by regulatory expert Bradley Merrill Thompson show how the number of FDA authorizations has been growing rapidly since the dawn of the medical AI era in around 2016 (see also our article on AI safety below). 

The new FDA numbers show that …

  • The FDA has now authorized 950 AI/ML-enabled devices since it began keeping track
  • Device authorizations are up 15% for the first half of 2024 compared to the same period the year before (107 vs. 93)
  • The pace could grow even faster in late 2024 – in the second half of 2023, the FDA authorized 126 devices, up 35% over the first half
  • At that pace, the FDA should hit just over 250 total authorizations in 2024 
  • This would represent 14% growth over 220 authorizations in 2023, and compares to growth of 14% in 2022 and 15% in 2021
  • As with past updates, radiology makes up the lion’s share of AI/ML authorizations, but had a 73% share in the first half, down from 80% for all of 2023
  • Siemens Healthineers led in all H1 2024 clearances with 11, bringing its total to 70 (66 for Siemens and four for Varian). GE HealthCare remains the leader with 80 total clearances after adding three in H1 2024 (GE’s total includes companies it has acquired, like Caption Health and MIM Software). There’s a big drop off after GE and Siemens, including Canon Medical (30), Aidoc (24), and Philips (24).
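
The projection in the bullets above can be reproduced with a few lines of arithmetic. The sketch below assumes that the second half of 2024 repeats 2023's half-over-half uplift, which is our reading of "at that pace"; the variable names are illustrative, and only the half-year counts come from the FDA figures quoted above.

```python
# Half-year authorization counts quoted in the article.
h1_2023, h2_2023 = 93, 126
h1_2024 = 107

total_2023 = h1_2023 + h2_2023          # the article rounds this to 220
h1_growth = h1_2024 / h1_2023 - 1       # first half of 2024 vs. first half of 2023
h2_uplift = h2_2023 / h1_2023 - 1       # second half of 2023 vs. first half of 2023

# Assumption: H2 2024 repeats the 2023 half-over-half uplift.
projected_h2_2024 = h1_2024 * (1 + h2_uplift)
projected_total_2024 = h1_2024 + projected_h2_2024
projected_growth = projected_total_2024 / total_2023 - 1

print(f"H1 growth: {h1_growth:.1%}")
print(f"H2 2023 uplift over H1 2023: {h2_uplift:.1%}")
print(f"Projected 2024 total: {projected_total_2024:.0f}")
print(f"Projected annual growth: {projected_growth:.1%}")
```

The projected total comes out just over 250, in line with the bullet above. The annual growth figure lands between 14% and 15% depending on whether the rounded inputs (35% uplift, 220 authorizations in 2023) or the exact counts are used.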

The FDA’s list includes both software-only algorithms as well as hardware devices like scanners that have built-in AI capabilities, such as a mobile X-ray unit that can alert users to emergent conditions. 

  • Indeed, many of the authorizations on the FDA’s list are for updated versions of already-cleared products rather than brand-new solutions – a trend that tends to inflate radiology’s share of approvals.

The Takeaway

The new FDA numbers on AI/ML regulatory authorizations are significant not only for revealing the growth in approvals, but also because the agency appears to be releasing the updates more frequently – perhaps a sign it is practicing what it preaches when it comes to AI openness and transparency. 

Better Prostate MRI with AI

A homegrown AI algorithm was able to detect clinically significant prostate cancer on MRI scans with the same accuracy as experienced radiologists. In a new study in Radiology, researchers say the algorithm could improve radiologists’ ability to detect prostate cancer on MRI, with fewer false positives.

In past issues of The Imaging Wire, we’ve discussed the need to improve on existing tools like PSA tests to make prostate cancer screening more precise with fewer false positives and less need for patient work-up.

  • Adding MRI to prostate screening protocols is a step forward, but MRI is an expensive technology that requires experienced radiologists to interpret.

Could AI help? In the new study, researchers tested a deep learning algorithm developed at the Mayo Clinic to detect clinically significant prostate cancer on multiparametric MRI (mpMRI) scans.

  • In an interesting wrinkle, the Mayo algorithm does not indicate tumor location, so a second algorithm – called Grad-CAM – was employed to localize tumors.

The Mayo algorithm was trained on a population of 5k patients with a cancer prevalence similar to a screening population, then tested in an external test set of 204 patients, finding …

  • No statistically significant difference in performance between the Mayo algorithm and radiologists based on AUC (0.86 vs. 0.84, p=0.68)
  • The highest AUC was with the combination of AI and radiologists (0.89, p<0.001)
  • The Grad-CAM algorithm was accurate in localizing 56 of 58 true-positive exams

An editorial noted that the study employed the Mayo algorithm on multiparametric MRI exams.

  • Prostate cancer imaging is moving from mpMRI toward biparametric MRI (bpMRI) due to its faster scan times and lack of contrast, and if validated on bpMRI, AI’s impact could be even more dramatic.

The Takeaway

The current study illustrates the exciting developments underway to make prostate imaging more accurate and easier to perform. It also supports the technology evolution that could one day make prostate cancer screening a more widely accepted test.

US + Mammo vs. Mammo + AI for Dense Breasts

Artificial intelligence may represent radiology’s future, but for at least one clinical application traditional imaging seems to be the present. In a new study in Radiology, ultrasound was more effective than AI for supplemental imaging of women with dense breast tissue. 

Dense breast tissue has long presented problems for breast imaging specialists. 

  • Women with dense breasts are at higher risk of breast cancer, but traditional screening modalities like X-ray mammography don’t work very well (sensitivity of 30-48%), creating the need for supplemental imaging tools like ultrasound and MRI.

In the new study, researchers from South Korea tested the use of Lunit’s Insight MMG mammography AI algorithm in 5.7k women without symptoms who had breast tissue classified as heterogeneously (63%) or extremely dense (37%). 

  • AI’s performance was compared to both mammography alone as well as to mammography with ultrasound, one of the gold-standard modalities for imaging women with dense breasts. 

All in all, researchers found …

  • Mammography with AI had lower sensitivity than mammography with ultrasound but slightly better than mammography alone (61% vs. 97% vs. 58%)
  • Mammography with AI had a lower cancer detection rate per 1k women than mammography with ultrasound but higher than mammography alone (3.5 vs. 5.6 vs. 3.3)
  • Mammography with AI missed 12 cancers detected with mammography with ultrasound
  • Mammography with AI had the highest specificity (95% vs. 78% vs. 94%)
  • And the lowest abnormal interpretation rate (5% vs. 23% vs. 6%)
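
As a rough consistency check on the figures above, the detection rates per 1k women can be converted into approximate cancer counts. This sketch assumes a cohort of exactly 5,700 women (the article only gives "5.7k"), and the helper name `expected_cancers` is ours, so the results are approximations rather than the study's reported counts.

```python
def expected_cancers(rate_per_1000: float, n_women: int) -> float:
    """Approximate number of cancers implied by a detection rate per 1,000 women."""
    return rate_per_1000 * n_women / 1000


n_women = 5700  # assumption: "5.7k women" taken as exactly 5,700

mammo_ai = expected_cancers(3.5, n_women)     # mammography with AI
mammo_us = expected_cancers(5.6, n_women)     # mammography with ultrasound
mammo_alone = expected_cancers(3.3, n_women)  # mammography alone

gap_vs_ultrasound = mammo_us - mammo_ai
print(f"Mammography + AI:         ~{mammo_ai:.0f} cancers")
print(f"Mammography + ultrasound: ~{mammo_us:.0f} cancers")
print(f"Mammography alone:        ~{mammo_alone:.0f} cancers")
print(f"Gap, ultrasound vs. AI:   ~{gap_vs_ultrasound:.0f} cancers")
```

Under that assumption the gap between ultrasound and AI works out to about 12 cancers, which agrees with the number of cancers the article says mammography with AI missed.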

The results show that while AI can help radiologists interpret screening mammography for most women, at present it can’t compensate for mammography’s low sensitivity in women with dense breast tissue.

In an editorial, breast radiologists Gary Whitman, MD, and Stamatia Destounis, MD, observed that supplemental imaging of women with dense breasts is getting more attention as the FDA prepares to implement breast density notification rules in September. 

  • They recommended follow-up studies with other AI algorithms, more patients, and a longer follow-up period. 

The Takeaway

As with a recent study on AI and teleradiology, the current research is a good step toward real-world evaluation of AI for a specific use case. While AI in this instance didn’t improve mammography’s sensitivity in women with dense breast tissue, it could carve out a role reducing false positives for these women who get mammography and ultrasound.
