It’s no secret that the rapid growth of AI in radiology is being fueled by venture capital firms eager to see a payoff for early investments in startup AI developers. But are there signs that VCs’ appetite for radiology AI is starting to wane?
Maybe. And maybe not. While one new analysis shows that AI investments slowed in 2023 compared to the year before, another predicts that over the long term, VC investing will spur a boom in AI development that is likely to transform radiology.
First up is an update from Signify Research to its ongoing analysis of VC funding. The new numbers show that through Q3 2023, the number of medical imaging AI deals fell compared to the same period of 2022 (24 vs. 40).
Total funding has also fallen for the second straight year, to $501M year-to-date in 2023. That compares to $771M through the third quarter of 2022, and $1.1B through the corresponding quarter of 2021.
On the other hand, the average deal size has grown to an all-time high of $20.9M, up from $15.4M in 2022 and $18M in 2021.
And one company – RapidAI – joined the exclusive club of just 14 AI vendors that have raised over $100M, thanks to a $75M Series C round in July 2023.
In a look forward at AI’s future, a new analysis in JACR by researchers from the ACR Data Science Institute (DSI) directly ties VC funding to healthcare AI software development, predicting that every $1B in funding translates into 11 new product approvals, with a six-year lag between funding and approval.
And the authors forecast long-term growth: In 2022 there were 69 FDA-approved products, but by 2035, funding is expected to reach $31B for the year, resulting in the release of a staggering 350 new AI products that year.
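For readers who want to play with the forecast, here is a minimal sketch of the linear relationship described above (roughly 11 approvals per $1B of funding, arriving six years later). The coefficients are taken at face value from the article's summary of the JACR analysis; the authors' actual model is more detailed.

```python
# Minimal sketch of the funding-to-approvals relationship described above:
# roughly 11 new FDA approvals per $1B of VC funding, arriving six years later.
# The coefficients are illustrative readings of the JACR forecast, not the
# authors' actual model.

APPROVALS_PER_BILLION = 11   # approvals generated per $1B of funding
LAG_YEARS = 6                # delay between funding and product approval

def forecast_approvals(funding_by_year: dict[int, float]) -> dict[int, float]:
    """Map yearly VC funding (in $B) to expected approvals six years later."""
    return {
        year + LAG_YEARS: funding * APPROVALS_PER_BILLION
        for year, funding in funding_by_year.items()
    }

# Example: ~$31B of funding feeding the pipeline implies ~341 approvals,
# in the ballpark of the 350 products the authors project for 2035.
print(forecast_approvals({2029: 31.0}))   # {2035: 341.0}
```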
Further, the ACR DSI authors see a virtuous cycle developing, as increasing AI adoption spurs more investment that creates more products available to help radiologists with their workloads.
The Takeaway
The numbers from Signify and ACR DSI don’t match up exactly, but together they paint a picture of a market segment that continues to enjoy massive VC investment. While the precise numbers may fluctuate year to year, investor interest in medical imaging AI will fuel innovation that promises to transform how radiology is practiced in years to come.
What is autonomous artificial intelligence, and is radiology ready for this new technology? In this paper, we explore one of the most exciting autonomous AI applications, ChestLink from Oxipit.
What is Autonomous AI?
Up to now, most interpretive AI solutions have focused on assisting radiologists with analyzing medical images. In this scenario, AI provides suggestions to radiologists and alerts them to suspicious areas, but the final diagnosis is the physician’s responsibility.
Autonomous AI flips the script by having AI run independently of the radiologist, such as analyzing a large batch of chest X-ray exams for tuberculosis and screening out those that are certain to be normal. This can significantly reduce the workload in primary care, where up to 80% of chest X-rays from preventive health checkups may show no abnormalities.
Autonomous AI frees the radiologist to focus on cases with suspicious pathology – with the potential of delivering a more accurate diagnosis to patients in real need.
One of the first of this new breed of autonomous AI is ChestLink from Oxipit. The solution received the CE Mark in March 2022, and more than a year later it is still the only AI application capable of autonomous performance.
How ChestLink Works
ChestLink produces final chest X-ray reports on healthy patients with no involvement from human radiologists. The application only reports autonomously on chest X-ray studies where it is highly confident that the image does not include abnormalities. These studies are automatically removed from the reporting workflow.
ChestLink enables radiologists to report on studies most likely to have abnormalities. In current clinical deployments, ChestLink automates 10-30% of all chest X-ray workflow. The exact percentage depends on the type of medical institution, with primary care facilities having the most potential for automation.
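Oxipit hasn't published ChestLink's internals, but the general pattern it describes is a confidence-gated triage step: only exams the model is highly confident are normal get auto-reported, and everything else goes to a radiologist. The sketch below illustrates that pattern; the class names, threshold, and report text are hypothetical placeholders, not ChestLink's actual code.

```python
# Minimal sketch of confidence-gated triage as described above. The threshold
# and all names here are hypothetical stand-ins for the general pattern.

from dataclasses import dataclass

AUTONOMY_THRESHOLD = 0.99  # hypothetical "high confidence normal" cutoff

@dataclass
class Study:
    study_id: str
    p_normal: float  # model's probability that the exam shows no abnormality

def triage(studies: list[Study]) -> tuple[list[str], list[Study]]:
    """Split studies into auto-reported normals and a radiologist worklist."""
    auto_reports, worklist = [], []
    for s in studies:
        if s.p_normal >= AUTONOMY_THRESHOLD:
            auto_reports.append(f"{s.study_id}: chest X-ray reported as normal (autonomous)")
        else:
            worklist.append(s)  # anything less certain goes to a radiologist
    return auto_reports, worklist

reports, worklist = triage([Study("CXR-001", 0.998), Study("CXR-002", 0.61)])
print(len(reports), len(worklist))  # 1 auto-reported, 1 sent to the radiologist
```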
ChestLink Clinical Validation
ChestLink was trained on a dataset with over 500k images. In clinical validation studies, ChestLink consistently performed at 99%+ sensitivity.
“The most surprising finding was just how sensitive this AI tool was for all kinds of chest disease. In fact, we could not find a single chest X-ray in our database where the algorithm made a major mistake. Furthermore, the AI tool had a sensitivity overall better than the clinical board-certified radiologists,” said study co-author Louis Lind Plesner, MD, from the Department of Radiology at the Herlev and Gentofte Hospital in Copenhagen, Denmark.
In this study ChestLink autonomously reported on 28% of all normal studies.
In another study at the Oulu University Hospital in Finland, researchers concluded that AI could reliably remove 36.4% of normal chest X-rays from the reporting workflow with a minimal number of false negatives, leading to effectively no compromise on patient safety.
Safe Path to AI Autonomy
Oxipit ChestLink is currently used in healthcare facilities in the Netherlands, Finland, Lithuania, and other European countries, and is in the trial phase for deployment in one of the leading hospitals in England.
ChestLink follows a three-stage framework for clinical deployment.
Retrospective analysis. ChestLink analyzes a couple of years’ worth (100k+) of historical chest X-ray studies at the medical institution. This analysis validates the product on real-world data and provides a realistic estimate of what fraction of the reporting workload can be automated.
Semi-autonomous operations. The application moves into a prospective setting, analyzing images in near-real time. ChestLink produces preliminary reports for healthy patients, which are then approved by a certified clinician.
Autonomous operations. The application autonomously reports on high-confidence healthy patient studies, and its performance is monitored in real time with analytical tools.
Are We There Yet?
ChestLink aims to address the shortage of clinical radiologists worldwide, which has led to a substantial decline in care quality.
In the UK, the NHS currently faces a massive 33% shortfall in its radiology workforce, and 71% of clinical directors of UK radiology departments feel they do not have enough radiologists to deliver safe and effective patient care.
ChestLink offers a safe pathway to autonomous operations by automating a significant and somewhat mundane portion of the radiologist workflow without any negative effect on patient care.
So should we embrace autonomous AI? The real question should be, can we afford not to?
In another blow to radiology AI, the UK’s national technology assessment agency issued an equivocal report on AI for chest X-ray, stating that more research is needed before the technology can enter routine clinical use.
The report came from the National Institute for Health and Care Excellence (NICE), which assesses new health technologies that have the potential to address unmet NHS needs.
The NHS sees AI as a potential solution to its challenge of meeting rising demand for imaging services, a dynamic that’s leading to long wait times for exams.
But at least some corners of the UK health establishment have concerns about whether AI for chest X-ray is ready for prime time.
The NICE report states that – despite the unmet need for quicker chest X-ray reporting – there is insufficient evidence to support the technology, and as such it’s not possible to assess its clinical and cost benefits. And it said there is “no evidence” on the accuracy of AI-assisted clinician review compared to clinicians working alone.
As such, the use of AI for chest X-ray in the NHS should be limited to research, with the following additional recommendations …
Centers already using AI software to review chest X-rays may continue to do so, but only as part of an evaluation framework and alongside clinician review
Purchase of chest X-ray AI software should be made through corporate, research, or non-core NHS funding
More research is needed on AI’s impact on a number of outcomes, such as CT referrals, healthcare costs and resource use, review and reporting time, and diagnostic accuracy when used alongside clinician review
The NICE report listed 14 commercially available chest X-ray algorithms that need more research, and it recommended prospective studies to address gaps in evidence. AI developers will be responsible for performing these studies.
The Takeaway
Taken with last week’s disappointing news on AI for radiology, the NICE report is a wakeup call for what had been one of the most promising clinical use cases for AI. The NHS had been seen as a leader in spearheading clinical adoption of AI; for chest X-ray, clinicians in the UK may have to wait just a bit longer.
How can you predict whether an AI algorithm will fall short for a particular clinical use case, such as detecting cancer? Researchers in Radiology took a crack at this conundrum by developing what they call an “uncertainty quantification” metric to predict when an AI algorithm might be less accurate.
AI is rapidly moving into wider clinical use, with a number of exciting studies published in just the last few months showing how AI can help radiologists interpret screening mammograms or direct which women should get supplemental breast MRI.
But AI isn’t infallible. And unlike a human radiologist who might be less confident in a particular diagnosis, an AI algorithm doesn’t have a built-in hedging mechanism.
So researchers from Denmark and the Netherlands decided to build one. They took publicly available AI algorithms and tweaked their code so they produced “uncertainty quantification” scores with their predictions.
They then tested how well the scores predicted AI performance in a dataset of 13k images for three common tasks covering some of the deadliest types of cancer:
1) detecting pancreatic ductal adenocarcinoma on CT
2) detecting clinically significant prostate cancer on MRI
3) predicting pulmonary nodule malignancy on low-dose CT
Researchers classified the highest 80% of the AI predictions as “certain,” and the remaining 20% as “uncertain,” and compared AI’s accuracy in both groups, finding …
AI led to significant accuracy improvements in the “certain” group for pancreatic cancer (80% vs. 59%), prostate cancer (90% vs. 63%), and pulmonary nodule malignancy prediction (80% vs. 51%)
AI accuracy was comparable to clinicians when its predictions were “certain” (80% vs. 78%, P=0.07), but much worse when “uncertain” (50% vs. 68%, P<0.001)
Using AI to triage “uncertain” cases produced overall accuracy improvements for pancreatic and prostate cancer (+5%) and lung nodule malignancy prediction (+6%) compared to a no-triage scenario
How would uncertainty quantification be used in clinical practice? It could play a triage role, deprioritizing radiologist review of easier cases while helping them focus on more challenging studies. It’s a concept similar to the MASAI study of mammography AI.
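The study’s authors modified publicly available algorithms to emit uncertainty scores alongside their predictions; the exact method varies by model, but the sketch below shows one common approach (predictive entropy of the malignancy probability) together with the paper’s 80%/20% certain/uncertain split. It is an illustration of the concept, not the authors’ code.

```python
# Minimal sketch of the certain/uncertain split described above. Predictive
# entropy stands in as one common uncertainty measure, and the 80th-percentile
# cutoff mirrors the paper's 80%/20% certain/uncertain grouping.

import numpy as np

def predictive_entropy(p_malignant: np.ndarray) -> np.ndarray:
    """Binary entropy of the model's malignancy probability (higher = less certain)."""
    p = np.clip(p_malignant, 1e-7, 1 - 1e-7)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def split_by_certainty(p_malignant: np.ndarray, certain_fraction: float = 0.8) -> np.ndarray:
    """Flag the most certain fraction of cases; the rest are routed to radiologists."""
    uncertainty = predictive_entropy(p_malignant)
    cutoff = np.quantile(uncertainty, certain_fraction)
    return uncertainty <= cutoff  # True = "certain", False = prioritize human review

p = np.array([0.02, 0.97, 0.55, 0.48, 0.91])
print(split_by_certainty(p))  # the most borderline probability (0.48) lands in the "uncertain" group
```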
The Takeaway
Like MASAI, the new findings present exciting new possibilities for AI implementation. They also present a framework within which AI can be implemented more safely by alerting clinicians to cases in which AI’s analysis might fall short – and enabling humans to step in and pick up the slack.
A deep learning algorithm trained to analyze mammography images did a better job than traditional risk models in predicting breast cancer risk. The study shows the AI model could direct the use of supplemental screening breast MRI for women who need it most.
Breast MRI has emerged (along with ultrasound) as one of the most effective imaging modalities to supplement conventional X-ray-based mammography. Breast MRI performs well regardless of breast tissue density, and can even be used for screening younger high-risk women for whom radiation is a concern.
But there are also disadvantages to breast MRI. It’s expensive and time-consuming, and clinicians aren’t always sure which women should get it. As a result, breast MRI is used too often in women at average risk and not often enough in those at high risk.
In the current study in Radiology, researchers from MGH compared the Mirai deep learning algorithm to conventional risk-prediction models. Mirai was developed at MIT to predict five-year breast cancer risk, and the first papers on the model emerged in 2019; previous studies have already demonstrated the algorithm’s prowess for risk prediction.
Mirai was used to analyze mammograms and develop risk scores for 2.2k women who also received 4.2k screening breast MRI exams from 2017 to 2020 at four facilities. Researchers then compared the performance of the algorithm to traditional risk tools like Tyrer-Cuzick and NCI’s Breast Cancer Risk Assessment Tool (BCRAT); a short worked example of these metrics follows the list. They found that …
In women Mirai identified as high risk, the cancer detection rate per 1k on breast MRI was far higher compared to those classified as high risk by Tyrer-Cuzick and BCRAT (20.6 vs. 6.0 & 6.8)
Mirai had a higher PPV for predicting abnormal findings on breast MRI screening (14.6% vs. 5.0% & 5.5%)
Mirai scored higher in PPV of biopsies recommended (32.4% vs. 12.7% & 11.1%) and PPV for biopsies performed (36.4% vs. 13.5% & 12.5%)
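The metrics in the list above are simple ratios, and a toy example makes them concrete. The counts below are hypothetical, chosen only to produce figures in the same range as the study’s headline numbers; they are not the study’s data.

```python
# Small illustration of the metrics compared above; the counts are made up.

def ppv(true_positives: int, flagged: int) -> float:
    """Positive predictive value: confirmed cancers / exams (or biopsies) flagged."""
    return true_positives / flagged

def cancer_detection_rate_per_1k(cancers_found: int, mris_performed: int) -> float:
    """Cancers detected per 1,000 screening breast MRI exams."""
    return 1000 * cancers_found / mris_performed

print(round(cancer_detection_rate_per_1k(9, 437), 1))   # ~20.6 cancers per 1k MRIs
print(round(100 * ppv(12, 82), 1))                      # ~14.6% PPV
```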
The Takeaway
Breast imaging has become one of the AI use cases with the most potential, based on recent studies like PERFORMS and MASAI, and the new study shows Mirai could be useful in directing women to breast MRI screening. Like the previous studies, the current research is pointing to a near-term future in which AI and deep learning can make breast screening more accurate and cost-effective than it’s ever been before.
Better patient care is the main selling point used by AI vendors when marketing neuroimaging algorithms, followed closely by time savings. Farther down the list of benefits are lower costs and increased revenue for providers.
So says a new analysis in JACR that takes a close look at how FDA-cleared neuroimaging AI algorithms are marketed by vendors. It also includes several warning signs for both AI developers and clinicians.
AI is the most exciting technology to arrive in healthcare in decades, but questions percolate about whether AI developers are overhyping the technology. In the new analysis, researchers focused on marketing claims made for 59 AI neuroimaging algorithms cleared by the FDA from 2008 to 2022, analyzing FDA summaries and vendor websites and finding:
For 69% of algorithms, vendors highlighted an improvement in quality of patient care, while time savings for clinicians were touted for 44%. Only 16% of algorithms were promoted as lowering costs, while just 11% were positioned as increasing revenue
50% of cleared neuroimaging algorithms were related to detection or quantification of stroke; of these, 41% were for intracranial hemorrhage, 31% for stroke brain perfusion, and 24% for detection of large vessel occlusion
41% of the algorithms were intended for use with non-contrast CT scans, 36% with MRI, 15% with CT perfusion, 14% with CT angiography, and the rest with MR perfusion and PET
90% of the algorithms studied were cleared in the last five years, and 42% since last year
The researchers further noted two caveats in AI marketing:
There is a lack of publicly available data to support vendor claims about the value of their algorithms. Better transparency is needed to create trust and clinician engagement.
The single-use-case nature of many AI algorithms raises questions about their economic viability. Many different algorithms would have to be implemented at a facility to ensure “a reasonable breadth of triage” for critical findings, and the financial burden of such integration is unclear.
The Takeaway
The new study offers intriguing insights into how AI algorithms are marketed by vendors, and how these efforts could be perceived by clinicians. The researchers note that financial pressure on AI developers may cause them to make “unintentional exaggerated claims” to recoup the cost of development; it is incumbent upon vendors to scrutinize their marketing activities to avoid overhyping AI technology.
One of the most exciting new use cases for medical AI is in generating radiology reports. But how can you tell whether the quality of a report generated by an AI algorithm is comparable to that of a radiologist?
In a new study in Patterns, researchers propose a technical framework for automatically grading the output of AI-generated radiology reports, with the ultimate goal of producing AI-generated reports that are indistinguishable from those of radiologists.
Most radiology AI applications so far have focused on developing algorithms to identify individual pathologies on imaging exams.
While this is useful, helping radiologists streamline the production of their main output – the radiology report – could have a far greater impact on their productivity and efficiency.
But existing tools for measuring the quality of AI-generated narrative reports are limited and don’t match up well with radiologists’ evaluations.
To improve that situation, the researchers applied several existing automated metrics for analyzing report quality and compared them to the scores of radiologists, seeking to better understand AI’s weaknesses.
Not surprisingly, the automated metrics fell short in several ways, including false prediction of findings, omitting findings, and incorrectly locating and predicting the severity of findings.
These shortcomings point out the need for better scoring systems for gauging AI performance.
The researchers therefore proposed a new metric for grading AI-generated report quality, called RadGraph F1, and a new methodology, RadCliQ, to predict how well an AI report would measure up to radiologist scrutiny.
RadGraph F1 and RadCliQ could be used in future research on AI-generated radiology reports, and to that end the researchers have made the code for both metrics available as open source.
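As a rough illustration of what an entity-overlap metric like RadGraph F1 measures, the sketch below scores a generated report against a reference report using sets of extracted findings. The real RadGraph F1 relies on a learned information-extraction model to pull entities and relations from free text; this simplified version assumes that extraction has already happened.

```python
# Simplified illustration of entity-overlap scoring in the spirit of RadGraph F1.
# The extraction step is assumed to have already produced sets of finding strings.

def entity_f1(predicted: set[str], reference: set[str]) -> float:
    """F1 between the findings mentioned in a generated and a reference report."""
    if not predicted and not reference:
        return 1.0
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

generated = {"right lower lobe opacity", "no pleural effusion"}
ground_truth = {"right lower lobe opacity", "small left pleural effusion"}
print(round(entity_f1(generated, ground_truth), 2))  # 0.5: one finding matched
```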
Ultimately, the researchers envision the construction of generalist medical AI models that could perform multiple complex tasks, such as conversing with radiologists and physicians about medical images.
Another use case could be applications that are able to explain imaging findings to patients in everyday language.
The Takeaway
It’s a complex and detailed paper, but the new study is important because it outlines the metrics that can be used to teach machines how to generate better radiology reports. Given the imperative to improve radiologist productivity in the face of rising imaging volume and workforce shortages, this could be one more step on the quest for the Holy Grail of AI in radiology.
Radiologists ignored AI suggestions in a new study because of “automation neglect,” a phenomenon in which humans are less likely to trust algorithmic recommendations. The findings raise questions about whether AI really should be used as a collaborative tool by radiologists.
How radiologists use AI predictions has become a growing area of research as AI moves into the clinical realm. Most use cases see radiologists employing AI in a collaborative role as a decision-making aid when reviewing cases.
But is that really the best way to use AI? In a paper published by the National Bureau of Economic Research, researchers from Harvard Medical School and MIT explored the effectiveness of radiologist performance when assisted by AI, in particular its impact on diagnostic quality.
They ran an experiment in which they manipulated radiologist access to predictions from the CheXpert AI algorithm for 324 chest X-ray cases, and then analyzed the results. They also assessed radiologist performance with and without clinical context. The 180 radiologists participating in the study were recruited from US teleradiology firms, as well as from a health network in Vietnam.
It was expected that AI would boost radiologist performance, but instead accuracy remained unchanged:
AI predictions were more accurate than two-thirds of the radiologists
Yet AI assistance failed to improve the radiologists’ diagnostic accuracy, as readers underweighted AI findings by 30% compared to their own assessments (a simple illustration of this underweighting follows the list)
Radiologists took 4% longer to interpret cases when either AI or clinical context were added
Adding clinical context to cases had a bigger impact on radiologist performance than adding AI interpretations
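To make the “underweighting” idea concrete, the sketch below blends a radiologist’s own probability estimate with the AI’s, with the weight on the AI set 30% below a hypothetical calibrated benchmark. This is a simplified illustration, not the structural model the NBER authors estimated, and all numbers are made up.

```python
# Simplified illustration of "underweighting" AI input. The sketch blends a
# radiologist's own probability with the AI's probability, with the AI weight
# set below what a well-calibrated reader would use. All numbers are hypothetical.

def blended_assessment(p_own: float, p_ai: float, ai_weight: float) -> float:
    """Final probability as a weighted mix of the reader's and the AI's estimates."""
    return (1 - ai_weight) * p_own + ai_weight * p_ai

optimal_weight = 0.5          # hypothetical weight a calibrated reader would place on AI
observed_weight = 0.5 * 0.7   # ~30% underweighting relative to that benchmark

print(blended_assessment(p_own=0.30, p_ai=0.80, ai_weight=optimal_weight))    # 0.55
print(blended_assessment(p_own=0.30, p_ai=0.80, ai_weight=observed_weight))   # ~0.475
```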
The findings show automation neglect can be a “major barrier” to human-AI collaboration. Interestingly, the new article seems to run counter to a previous study finding that radiologists who received incorrect AI results were more likely to follow the algorithm’s suggestions – against their own judgment.
The Takeaway
The authors themselves admit the new findings are “puzzling,” but they do have intriguing ramifications. In particular, the researchers suggest that there may be limitations to the collaborative model in which humans and AI work together to analyze cases. Instead, it may be more effective to assign AI exclusively to certain studies, while radiologists work without AI assistance on other cases.
Can you believe the hype when it comes to marketing claims made for AI software? Not always. A new review in JAMA Network Open suggests that marketing materials for one-fifth of FDA-cleared AI applications don’t agree with the language in their regulatory submissions.
Interest in AI for healthcare has exploded, creating regulatory challenges for the FDA due to the technology’s novelty. This has left many AI developers guessing how they should comply with FDA rules, both before and after products get regulatory clearance.
This creates the possibility for discrepancies between products the FDA has cleared and how AI firms promote them. To investigate further, researchers from NYU Langone Health analyzed content from 510(k) clearance summaries and accompanying marketing materials for 119 AI- and machine learning (ML)-enabled devices cleared from November 2021 to March 2022. Their findings included:
Overall, AI/ML marketing language was consistent with 510(k) summaries for 80.67% of devices
Language was considered “discrepant” for 12.61% and “contentious” for 6.72%
Most of the AI/ML devices surveyed (63.03%) were developed for radiology use; these had a slightly higher rate of consistency (82.67%) than the entire study sample
The authors provided several examples illustrating when AI/ML firms went astray. In one case labeled as “discrepant,” a developer touted the “cutting-edge AI and advanced robotics” in its software for measuring and displaying cerebral blood flow with ultrasound. But the product’s 510(k) summary never discussed AI capabilities, and the algorithm isn’t included on the FDA’s list of AI/ML-enabled devices.
In another case labeled as “contentious,” marketing materials for an ECG mapping software application mention that it includes computational modeling and is a smart device, but require users to request a pamphlet from the developer for more information.
The Takeaway
So, can you believe the AI hype? This study shows that most of the time you can, with a consistency rate of 80.67% – not bad for a field as new as AI (a fact acknowledged in an invited commentary on the paper). But the study’s authors suggest that “any level of discrepancy is important to note for consumer safety.” And for a technology that already has trust issues, it’s probably best that developers not push the envelope when it comes to marketing.
VC investment in the AI medical imaging sector has shifted notably in the last couple years, says a new report from UK market intelligence firm Signify Research. The report offers a fascinating look at an industry where almost $5B has been raised since 2015.
Total Funding Value Drops – Both investors and AI independent software vendors (ISVs) have noticed reduced funding activity, and that’s reflected in the Signify numbers. VC funding of imaging AI firms fell 32% in 2022, to $750.4M, down from a peak of $1.1B in 2021.
Deal Volume Declines – The number of deals getting done has also fallen, to 42 deals in 2022, off 30% compared to 60 in 2021. In imaging AI’s peak year, 2020, 95 funding deals were completed.
VC Appetite Remains Strong – Despite the declines, VCs still have a strong appetite for radiology AI, but funding has shifted from smaller early-stage deals to larger, late-stage investments.
HeartFlow Deal Tips Scales – The average deal size has spiked this year to date, to $27.6M, compared to $17.9M in 2022, $18M in 2021, and $7.9M in 2020. Much of the higher 2023 number is driven by HeartFlow’s huge $215M funding round in April; Signify analyst Sanjay Parekh, PhD, told The Imaging Wire he expects the average deal value to fall to $18M by year’s end.
The Rich Get Richer – Much of the funding has concentrated in a dozen or so AI companies that have raised over $100M. Big winners include HeartFlow (over $650M), and Cleerly, Shukun Technology, and Viz.ai (over $250M). Signify’s $100M club is rounded out by Aidoc, Cathworks, Keya Medical, Deepwise Shenrui, Imagen Technologies, Perspectum, Lunit, and Annalise.ai.
US and China Dominate – On a regional basis, VC funding is going to companies in the US (almost $2B) and China ($1.1B). Following them are Israel ($513M), the UK ($310M), and South Korea ($255M).
The Takeaway
Signify’s report shows the continuation of trends seen in previous years that point to a maturing market for medical imaging AI. As with any such market, winners and losers are emerging, and VCs are clearly being selective about choosing which horses to put their money on.