It’s no secret that the rapid growth of AI in radiology is being fueled by venture capital firms eager to see a payoff for early investments in startup AI developers. But are there signs that VCs’ appetite for radiology AI is starting to wane?
Maybe. And maybe not. While one new analysis shows that AI investments slowed in 2023 compared to the year before, another predicts that over the long term, VC investing will spur a boom in AI development that is likely to transform radiology.
First up is an update by Signify Research to its ongoing analysis of VC funding. The new numbers show that through Q3 2023, the number of medical imaging AI deals has fallen compared to Q3 2022 (24 vs. 40).
Total funding has also fallen for the second straight year, to $501M year-to-date in 2023. That compares to $771M through the third quarter of 2022, and $1.1B through the corresponding quarter of 2021.
On the other hand, the average deal size has grown to an all-time high of $20.9M, up from $15.4M in 2022 and $18M in 2021.
And one company – RapidAI – joined the exclusive club of just 14 AI vendors that have raised over $100M, thanks to a $75M Series C round in July 2023.
In a look forward at AI’s future, a new analysis in JACR by researchers from the ACR Data Science Institute (DSI) directly ties VC funding to healthcare AI software development, predicting that every $1B in funding translates into 11 new product approvals, with a six-year lag between funding and approval.
And the authors forecast long-term growth: In 2022 there were 69 FDA-approved products, but by 2035, funding is expected to reach $31B for the year, resulting in the release of a staggering 350 new AI products that year.
Further, the ACR DSI authors see a virtuous cycle developing, as increasing AI adoption spurs more investment that creates more products available to help radiologists with their workloads.
The Takeaway
The numbers from Signify and ACR DSI don’t match up exactly, but together they paint a picture of a market segment that continues to enjoy massive VC investment. While the precise numbers may fluctuate year to year, investor interest in medical imaging AI will fuel innovation that promises to transform how radiology is practiced in years to come.
What is autonomous artificial intelligence, and is radiology ready for this new technology? In this paper, we explore one of the most exciting autonomous AI applications, ChestLink from Oxipit.
What is Autonomous AI?
Up to now, most interpretive AI solutions have focused on assisting radiologists with analyzing medical images. In this scenario, AI provides suggestions to radiologists and alerts them to suspicious areas, but the final diagnosis is the physician’s responsibility.
Autonomous AI flips the script by having AI run independently of the radiologist – for example, analyzing a large batch of chest X-ray exams for tuberculosis and screening out those certain to be normal. This can significantly reduce workload in primary care, where up to 80% of chest X-rays from preventive health checkups show no abnormalities.
Autonomous AI frees the radiologist to focus on cases with suspicious pathology – with the potential of delivering a more accurate diagnosis to patients in real need.
One of the first of this new breed of autonomous AI is ChestLink from Oxipit. The solution received the CE Mark in March 2022, and more than a year later it is still the only AI application capable of autonomous performance.
How ChestLink Works
ChestLink produces final chest X-ray reports on healthy patients with no involvement from human radiologists. The application only reports autonomously on chest X-ray studies where it is highly confident that the image does not include abnormalities. These studies are automatically removed from the reporting workflow.
ChestLink enables radiologists to report on studies most likely to have abnormalities. In current clinical deployments, ChestLink automates 10-30% of all chest X-ray workflow. The exact percentage depends on the type of medical institution, with primary care facilities having the most potential for automation.
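To make the triage concept concrete, here’s a minimal sketch of confidence-threshold routing in Python. The threshold value and predictor function are hypothetical placeholders, not Oxipit’s actual implementation.

```python
# Hypothetical sketch of confidence-threshold triage for chest X-rays
# (illustrative only; not Oxipit's code or thresholds).

AUTOREPORT_THRESHOLD = 0.01  # hypothetical cutoff: auto-report only near-zero abnormality risk

def triage_study(image, predict_abnormality_probability) -> str:
    """Route a chest X-ray to autonomous reporting or to the radiologist worklist."""
    p_abnormal = predict_abnormality_probability(image)
    if p_abnormal < AUTOREPORT_THRESHOLD:
        return "autonomous_normal_report"  # removed from the radiologist reporting workflow
    return "radiologist_review"            # everything that is not high-confidence normal
```

In a real deployment, the threshold would be set during the retrospective validation stage described below, trading off automation rate against the risk of missed findings.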
ChestLink Clinical Validation
ChestLink was trained on a dataset with over 500k images. In clinical validation studies, ChestLink consistently performed at 99%+ sensitivity.
“The most surprising finding was just how sensitive this AI tool was for all kinds of chest disease. In fact, we could not find a single chest X-ray in our database where the algorithm made a major mistake. Furthermore, the AI tool had a sensitivity overall better than the clinical board-certified radiologists,” said study co-author Louis Lind Plesner, MD, from the Department of Radiology at the Herlev and Gentofte Hospital in Copenhagen, Denmark.
In this study ChestLink autonomously reported on 28% of all normal studies.
In another study at the Oulu University Hospital in Finland, researchers concluded that AI could reliably remove 36.4% of normal chest X-rays from the reporting workflow with a minimal number of false negatives, leading to effectively no compromise on patient safety.
Safe Path to AI Autonomy
Oxipit ChestLink is currently used in healthcare facilities in the Netherlands, Finland, Lithuania, and other European countries, and is in the trial phase for deployment in one of the leading hospitals in England.
ChestLink follows a three-stage framework for clinical deployment.
Retrospective analysis. ChestLink analyzes a couple of years’ worth (100k+ studies) of historical chest X-ray exams at the medical institution. This analysis validates the product on real-world data and provides a realistic estimate of the fraction of the reporting workload that can be automated.
Semi-autonomous operations. The application moves into a prospective setting, analyzing images in near real time. ChestLink produces preliminary reports for healthy patients, which may then be approved by a certified clinician.
Autonomous operations. The application autonomously reports on high-confidence healthy patient studies, with performance monitored in real time using analytical tools.
Are We There Yet?
ChestLink aims to address the shortage of clinical radiologists worldwide, which has led to a substantial decline in care quality.
In the UK, the NHS currently faces a massive 33% shortfall in its radiology workforce. Nearly 71% of clinical directors of UK radiology departments feel that they do not have a sufficient number of radiologists to deliver safe and effective patient care.
ChestLink offers a safe pathway to autonomous operations by automating a significant and somewhat mundane portion of the radiologist workflow without any negative effect on patient care.
So should we embrace autonomous AI? The real question should be, can we afford not to?
There’s no question AI is the future of radiology. But AI’s drive to widespread clinical use is going to hit some speed bumps along the way.
This week is a case in point. Two studies were published showing AI’s limitations and underscoring the challenges faced in making AI an everyday clinical reality.
In the first study, published in Radiology, researchers found that radiologists outperformed four commercially available AI algorithms for analyzing chest X-rays (from Annalise.ai, Milvue, Oxipit, and Siemens Healthineers) in a cohort of 2k patients.
Researchers from Denmark found the AI tools had moderate to high sensitivity for three detection tasks:
airspace disease (72%-91%)
pneumothorax (63%-90%)
pleural effusion (62%-95%).
But the algorithms also had higher false-positive rates than the radiologists, and their performance dropped in cases with smaller pathology and multiple findings. The findings are disappointing, especially since they got such widespread play in the mainstream media.
But this week’s second study also brought worrisome news – this time in Radiology: Artificial Intelligence – about an AI training method called foundation models that many hope holds the key to better algorithms.
Foundation models are designed to address the challenge of finding enough high-quality data for AI training. Most algorithms are trained with actual de-identified clinical data that have been labeled and referenced to ground truth; foundation models are AI neural networks pre-trained with broad, unlabeled data and then fine-tuned with smaller volumes of more detailed data to perform specific tasks.
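To make the pre-train/fine-tune pattern concrete, here’s a minimal PyTorch-style sketch of the general approach. The encoder, feature dimensionality, and label set are placeholders, not the specific models or data used in the study.

```python
# Minimal sketch of the foundation-model pattern: reuse a large pre-trained encoder
# and fine-tune a small task head on labeled chest X-rays (placeholder names throughout).
import torch
import torch.nn as nn

def build_finetune_model(pretrained_encoder: nn.Module, feature_dim: int, num_labels: int) -> nn.Module:
    # Freeze the pre-trained backbone; only the small task-specific head is trained.
    for param in pretrained_encoder.parameters():
        param.requires_grad = False
    head = nn.Linear(feature_dim, num_labels)  # e.g. no finding, effusion, cardiomegaly, pneumothorax
    return nn.Sequential(pretrained_encoder, head)

def finetune(model: nn.Module, labeled_loader, epochs: int = 5) -> None:
    criterion = nn.BCEWithLogitsLoss()  # multi-label chest X-ray findings
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4
    )
    for _ in range(epochs):
        for images, labels in labeled_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels.float())
            loss.backward()
            optimizer.step()
```

The appeal is that the expensive pre-training step is done once on broad data, while each downstream task needs only a modest labeled dataset for fine-tuning.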
Researchers in the new study found that a chest X-ray algorithm trained on a foundation model with 800k images had lower performance than an algorithm trained with the CheXpert reference model in a group of 42.9k patients. The foundation model’s performance lagged for four possible results – no finding, pleural effusion, cardiomegaly, and pneumothorax – as follows…
Lower by 6.8-7.7% in females for the “no finding” result
Down by 10.7-11.6% in Black patients in detecting pleural effusion
Lower performance across all groups for classifying cardiomegaly
The performance decline in female and Black patients is particularly concerning given recent studies on bias and lack of generalizability in AI.
The Takeaway
This week’s studies show that there’s not always going to be a clear road ahead for AI in its drive to routine clinical use. The study on foundation models in particular could have ramifications for AI developers looking for a shortcut to faster algorithm development. They may want to slow their roll.
How can you predict whether an AI algorithm will fall short for a particular clinical use case such as detecting cancer? Researchers in Radiology took a crack at this conundrum by developing what they call an “uncertainty quantification” metric to predict when an AI algorithm might be less accurate.
AI is rapidly moving into wider clinical use, with a number of exciting studies published in just the last few months showing how AI can help radiologists interpret screening mammograms or direct which women should get supplemental breast MRI.
But AI isn’t infallible. And unlike a human radiologist who might be less confident in a particular diagnosis, an AI algorithm doesn’t have a built-in hedging mechanism.
So researchers from Denmark and the Netherlands decided to build one. They took publicly available AI algorithms and tweaked their code so they produced “uncertainty quantification” scores with their predictions.
They then tested how well the scores predicted AI performance in a dataset of 13k images for three common tasks covering some of the deadliest types of cancer:
1) detecting pancreatic ductal adenocarcinoma on CT
2) detecting clinically significant prostate cancer on MRI
3) predicting pulmonary nodule malignancy on low-dose CT
Researchers classified the 80% of AI predictions with the highest certainty as “certain” and the remaining 20% as “uncertain,” then compared AI’s accuracy in the two groups (a simple sketch of this split follows the results below), finding …
AI was significantly more accurate on “certain” predictions than “uncertain” ones for pancreatic cancer (80% vs. 59%), prostate cancer (90% vs. 63%), and pulmonary nodule malignancy prediction (80% vs. 51%)
AI accuracy was comparable to clinicians when its predictions were “certain” (80% vs. 78%, P=0.07), but much worse when “uncertain” (50% vs. 68%, P<0.001)
Using AI to triage “uncertain” cases produced overall accuracy improvements for pancreatic and prostate cancer (+5%) and lung nodule malignancy prediction (+6%) compared to a no-triage scenario
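As a simple illustration of how the 80/20 split works (not the study authors’ code; it assumes each AI prediction comes with a scalar uncertainty score, where higher means less certain):

```python
# Illustrative sketch of splitting AI predictions into "certain" and "uncertain" groups.
import numpy as np

def split_by_uncertainty(uncertainty_scores: np.ndarray, certain_fraction: float = 0.8):
    """Return boolean masks for the lowest-uncertainty 80% ("certain") and the rest ("uncertain")."""
    cutoff = np.quantile(uncertainty_scores, certain_fraction)
    certain = uncertainty_scores <= cutoff
    return certain, ~certain

# Usage: keep AI output for "certain" cases, route "uncertain" ones to a clinician.
scores = np.random.rand(1000)  # placeholder uncertainty scores
certain_mask, uncertain_mask = split_by_uncertainty(scores)
```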
How would uncertainty quantification be used in clinical practice? It could play a triage role, deprioritizing radiologist review of easier cases while helping them focus on more challenging studies. It’s a concept similar to the MASAI study of mammography AI.
The Takeaway
Like MASAI, the new findings present exciting new possibilities for AI implementation. They also present a framework within which AI can be implemented more safely by alerting clinicians to cases in which AI’s analysis might fall short – and enabling humans to step in and pick up the slack.
Have we reached a tipping point when it comes to AI for breast screening? This week another study was published – this one in Radiology – demonstrating the value of AI for interpreting screening mammograms.
Of all the medical imaging exams, breast screening probably could use the most help. Reading mammograms has been compared to looking for a needle in a haystack, with radiologists reviewing thousands of images before finding a single cancer.
AI could help in multiple ways, either at the radiologist’s side during interpretation or by reviewing mammograms in advance, triaging the ones most likely to be normal while reserving suspicious exams for closer attention by radiologists (indeed, that was the approach used in the MASAI study in Sweden in August).
In the new study, UK researchers in the PERFORMS trial compared the performance of Lunit’s INSIGHT MMG AI algorithm to that of 552 radiologists in 240 test mammogram cases, finding that …
AI was comparable to radiologists for sensitivity (91% vs. 90%, P=0.26) and specificity (77% vs. 76%, P=0.85).
There was no statistically significant difference in AUC (0.93 vs. 0.88, P=0.15)
AI and radiologists were comparable or no different with other metrics
Like the MASAI trial, the PERFORMS results show that AI could play an important role in breast screening. To that end, a new paper in European Journal of Radiology proposes a roadmap for implementing mammography AI as part of single-reader breast screening programs, offering suggestions on prospective clinical trials that should take place to prove breast AI is ready for widespread use in the NHS – and beyond.
The Takeaway
It certainly does seem that AI for breast screening has reached a tipping point. Taken together, PERFORMS and MASAI show that mammography AI works well enough that “the days of double reading are numbered,” at least where it is practiced in Europe, as noted in an editorial by Liane Philpotts, MD.
While double-reading isn’t practiced in the US, the PERFORMS protocol could be used to supplement non-specialized radiologists who don’t see that many mammograms, Philpotts notes. Either way, AI looks poised to make a major impact in breast screening on both sides of the Atlantic.
New research on the cancer risk of low-dose ionizing radiation could have disturbing implications for those who are exposed to radiation on the job – including medical professionals. In a new study in BMJ, researchers found that nuclear workers exposed to occupational levels of radiation had a cancer mortality risk that was higher than previously estimated.
The link between low-dose radiation and cancer has long been controversial. Most studies on the radiation-cancer connection are based on Japanese atomic bomb survivors, many of whom were exposed to far higher levels of radiation than most people receive over their lifetimes – even those who work with ionizing radiation.
The question is whether that data can be extrapolated to people exposed to much lower levels of radiation, such as nuclear workers, medical professionals, or even patients. To that end, researchers in the International Nuclear Workers Study (INWORKS) have been tracking low-dose radiation exposure and its connection to mortality in nearly 310k people in France, the UK, and the US who worked in the nuclear industry from 1944 to 2016.
INWORKS researchers previously published studies showing low-dose radiation exposure to be carcinogenic, but the new findings in BMJ offer an even stronger link. For the study, researchers tracked radiation exposure from dosimetry badges worn by the workers along with rates of cancer mortality, then calculated rates of death from solid cancer based on exposure levels, finding:
Solid cancer mortality risk rose by 52% per Gy of cumulative exposure
Individuals who received the occupational radiation limit of 20 mSv per year would have a 5.2% increase in solid cancer mortality over five years (see the back-of-the-envelope arithmetic after this list)
There was a linear association between low-dose radiation exposure and cancer mortality, meaning excess mortality risk was present even at lower levels of exposure
The dose-response association seen in the study was even higher than in studies of atomic bomb survivors (52% vs. 32%)
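For a rough sense of where the 5.2% figure comes from, here is the back-of-the-envelope arithmetic, assuming the 20 mSv annual limit is received every year for five years and that 1 mSv of this exposure corresponds to roughly 1 mGy (illustrative only):

```python
# Back-of-the-envelope arithmetic behind the 5.2% figure (illustrative only).
excess_risk_per_gy = 0.52        # 52% excess solid cancer mortality risk per Gy
annual_limit_msv = 20            # occupational dose limit, mSv per year
years = 5

cumulative_dose_gy = annual_limit_msv * years / 1000   # 100 mSv ≈ 0.1 Gy (assuming ~1 mGy per mSv)
excess_risk = excess_risk_per_gy * cumulative_dose_gy  # 0.052, i.e. 5.2%
print(f"Estimated excess solid cancer mortality risk: {excess_risk:.1%}")
```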
The Takeaway
Even though the INWORKS study was conducted on nuclear workers rather than medical professionals, the findings could have implications for those who might be exposed to medical radiation, such as interventional radiologists and radiologic technologists. The study will undoubtedly be examined by radiation protection organizations and government regulators; the question is whether it leads to any changes in rules on occupational radiation exposure.
Better patient care is the main selling point used by AI vendors when marketing neuroimaging algorithms, followed closely by time savings. Farther down the list of benefits are lower costs and increased revenue for providers.
So says a new analysis in JACR that takes a close look at how FDA-cleared neuroimaging AI algorithms are marketed by vendors. It also includes several warning signs for both AI developers and clinicians.
AI is the most exciting technology to arrive in healthcare in decades, but questions percolate on whether AI developers are overhyping the technology. In the new analysis, researchers focused on marketing claims made for 59 AI neuroimaging algorithms cleared by the FDA from 2008 to 2022. Researchers analyzed FDA summaries and vendor websites, finding:
For 69% of algorithms, vendors highlighted an improvement in quality of patient care, while time savings for clinicians were touted for 44%. Only 16% of algorithms were promoted as lowering costs, while just 11% were positioned as increasing revenue
50% of cleared neuroimaging algorithms were related to detection or quantification of stroke; of these, 41% were for intracranial hemorrhage, 31% for stroke brain perfusion, and 24% for detection of large vessel occlusion
41% of the algorithms were intended for use with non-contrast CT scans, 36% with MRI, 15% with CT perfusion, 14% with CT angiography, and the rest with MR perfusion and PET
90% of the algorithms studied were cleared in the last five years, and 42% since last year
The researchers further noted two caveats in AI marketing:
There is a lack of publicly available data to support vendor claims about the value of their algorithms. Better transparency is needed to create trust and clinician engagement.
The single-use-case nature of many AI algorithms raises questions about their economic viability. Many different algorithms would have to be implemented at a facility to ensure “a reasonable breadth of triage” for critical findings, and the financial burden of such integration is unclear.
The Takeaway
The new study offers intriguing insights into how AI algorithms are marketed by vendors, and how these efforts could be perceived by clinicians. The researchers note that financial pressure on AI developers may cause them to make “unintentional exaggerated claims” to recoup the cost of development; it is incumbent upon vendors to scrutinize their marketing activities to avoid overhyping AI technology.
A new study out of Sweden offers a resounding vote of confidence in the use of AI for analyzing screening mammograms. Published in The Lancet Oncology, researchers found that AI cut radiologist workload almost by half without affecting cancer detection or recall rates.
AI has been promoted as the technology that could save radiology from rising imaging volumes, growing burnout, and pressure to perform at a higher level with fewer resources. But many radiology professionals remember similar promises made in the 1990s around computer-aided detection (CAD), which failed to live up to the hype.
Breast screening presents a particular challenge in Europe, where clinical guidelines call for all screening exams to be double-read by two radiologists – leading to better sensitivity but also imposing a higher workload. AI could help by working as a triage tool, enabling radiologists to only double-read those cases most likely to have cancer.
In the MASAI study, researchers are assessing AI for breast screening in 100k women in a population-based screening program in Sweden, with mammograms being analyzed by ScreenPoint’s Transpara version 1.7.0 software. In an in-progress analysis, researchers looked at results for 80k mammography-eligible women ages 40-80.
The Transpara software assigns each mammogram a score from 1 to 10; in MASAI, exams scored 1-9 are read by a single radiologist, while those scored 10 are read by two breast radiologists. This triage approach was compared with standard double reading, finding that:
AI reduced the mammography reading workload by almost 37k screening mammograms, or 44%
AI had a higher cancer detection rate per 1k screened participants (6.1 vs. 5.1) although the difference was not statistically significant (P=0.052)
Recall rates were comparable (2.2% vs. 2.0%)
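The triage rule itself is simple enough to sketch (an illustration of the routing described above, not ScreenPoint’s software):

```python
# Illustrative sketch of the MASAI triage rule (not ScreenPoint's code).

def assign_reading_workflow(transpara_score: int) -> str:
    """Route a screening mammogram based on its 1-10 Transpara risk score."""
    if transpara_score == 10:
        return "double_read"   # highest-risk exams go to two breast radiologists
    return "single_read"       # scores 1-9 are read by a single radiologist
```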
The results demonstrate the safety of using AI as a triage tool, and the MASAI researchers plan to continue the study until it reaches 100k participants so they can measure the impact of AI on detection of interval cancers – cancers that appear between screening rounds.
The Takeaway
It’s hard to overestimate the MASAI study’s significance. The findings strongly support what AI proponents have been saying all along – that AI can save radiologists time while maintaining diagnostic performance. The question is the extent to which the MASAI results will apply outside of the double-reading environment, or to other clinical use cases.
Radiologists ignored AI suggestions in a new study because of “automation neglect,” a phenomenon in which humans are less likely to trust algorithmic recommendations. The findings raise questions about whether AI really should be used as a collaborative tool by radiologists.
How radiologists use AI predictions has become a growing area of research as AI moves into the clinical realm. Most use cases see radiologists employing AI in a collaborative role as a decision-making aid when reviewing cases.
But is that really the best way to use AI? In a paper published by the National Bureau of Economic Research, researchers from Harvard Medical School and MIT explored the effectiveness of radiologist performance when assisted by AI, in particular its impact on diagnostic quality.
They ran an experiment in which they manipulated radiologist access to predictions from the CheXpert AI algorithm for 324 chest X-ray cases, and then analyzed the results. They also assessed radiologist performance with and without clinical context. The 180 radiologists participating in the study were recruited from US teleradiology firms, as well as from a health network in Vietnam.
It was expected that AI would boost radiologist performance, but instead accuracy remained unchanged:
AI predictions were more accurate than two-thirds of the radiologists
Yet, AI assistance failed to improve the radiologists’ diagnostic accuracy, as readers underweighted AI findings by 30% compared to their own assessments
Radiologists took 4% longer to interpret cases when either AI results or clinical context was added
Adding clinical context to cases had a bigger impact on radiologist performance than adding AI interpretations
The findings show automation neglect can be a “major barrier” to human-AI collaboration. Interestingly, the new article seems to run counter to a previous study finding that radiologists who received incorrect AI results were more likely to follow the algorithm’s suggestions – against their own judgment.
The Takeaway
The authors themselves admit the new findings are “puzzling,” but they do have intriguing ramifications. In particular, the researchers suggest that there may be limitations to the collaborative model in which humans and AI work together to analyze cases. Instead, it may be more effective to assign AI exclusively to certain studies, while radiologists work without AI assistance on other cases.
SAN DIEGO – What’s behind the slow clinical adoption of artificial intelligence? That question permeated the discussion at this week’s AIMed Global Summit, an up-and-coming conference dedicated to AI in healthcare.
Running June 4-7, this week’s meeting saw hundreds of healthcare professionals gather in San Diego. Radiology figured prominently as the medical specialty with the lion’s share of the over 500 FDA-cleared AI algorithms available for clinical use.
But being available for use and actually being used are two different things. A common refrain at AIMed 2023 was slow clinical uptake of AI, a problem widely attributed to difficulties in deploying and implementing the technology. One speaker noted that less than 5% of practices are using AI today.
One way to spur AI adoption is the platform approach, in which AI apps are vetted by a single entity for inclusion in a marketplace from which clinicians can pick and choose what they want.
The platform approach is gaining steam in radiology, but Mayo Clinic is rolling the platform concept out across its entire healthcare enterprise. First launched in 2019, Mayo Clinic Platform aims to help clinicians enjoy the benefits of AI without the implementation headache, according to Halim Abbas, senior director of AI at Mayo, who discussed Mayo’s progress on the platform at AIMed.
The Mayo Clinic Platform has several main features:
Each medical specialty maintains its own internal AI R&D team with access to its own AI applications
At the same time, Mayo operates a centralized AI operation that provides tools and services accessible across departments, such as data de-identification and harmonization, augmented data curation, and validation benchmarks
Clinical data is made available outside the -ologies, but the data is anonymized and secured, an approach Mayo calls “data behind glass”
Mayo Clinic Platform gives different -ologies some ownership of AI, but centralizes key functions and services to improve AI efficiency and smooth implementation.
The Takeaway
Mayo Clinic Platform offers an intriguing model for AI deployment. By removing AI’s implementation pain points, Mayo hopes to ramp up clinical utilization, and Mayo has the organizational heft and technical expertise to make it work (see below for news on Mayo’s new generative AI deal with Google Cloud).
But can Mayo’s AI model be duplicated at smaller health systems and community providers that don’t have its IT resources? Maybe we’ll find out at AIMed 2024.