Mayo’s AI Model

SAN DIEGO – What’s behind the slow clinical adoption of artificial intelligence? That question permeated the discussion at this week’s AIMed Global Summit, an up-and-coming conference dedicated to AI in healthcare.

Running June 4-7, this week’s meeting saw hundreds of healthcare professionals gather in San Diego. Radiology figured prominently as the medical specialty with the lion’s share of the over 500 FDA-cleared AI algorithms available for clinical use.

But being available for use and actually being used are two different things. A common refrain at AIMed 2023 was the slow clinical uptake of AI, a problem widely attributed to difficulties in deploying and implementing the technology. One speaker noted that fewer than 5% of practices are using AI today.

One way to spur AI adoption is the platform approach, in which AI apps are vetted by a single entity for inclusion in a marketplace from which clinicians can pick and choose what they want. 

The platform approach is gaining steam in radiology, but Mayo Clinic is rolling the platform concept out across its entire healthcare enterprise. First launched in 2019, Mayo Clinic Platform aims to help clinicians enjoy the benefits of AI without the implementation headache, according to Halim Abbas, senior director of AI at Mayo, who discussed Mayo’s progress on the platform at AIMed. 

The Mayo Clinic Platform has several main features:

  • Each medical specialty maintains its own internal AI R&D team with access to its own AI applications 
  • At the same time, Mayo operates a centralized AI operation that provides tools and services accessible across departments, such as data de-identification and harmonization, augmented data curation, and validation benchmarks
  • Clinical data is made available outside the -ologies, but the data is anonymized and secured, an approach Mayo calls “data behind glass”

Mayo Clinic Platform gives different -ologies some ownership of AI, but centralizes key functions and services to improve AI efficiency and smooth implementation. 

The Takeaway 

Mayo Clinic Platform offers an intriguing model for AI deployment. By removing AI’s implementation pain points, Mayo hopes to ramp up clinical utilization, and Mayo has the organizational heft and technical expertise to make it work (see below for news on Mayo’s new generative AI deal with Google Cloud). 

But can Mayo’s AI model be duplicated at smaller health systems and community providers that don’t have its IT resources? Maybe we’ll find out at AIMed 2024.

When AI Goes Wrong

What impact do incorrect AI results have on radiologist performance? That question was the focus of a new study in European Radiology, which found that radiologists who received incorrect AI results were more likely to make wrong decisions on patient follow-up – even though they would have decided correctly without AI’s input.

The accuracy of AI has become a major concern as large language models like ChatGPT become more powerful and come closer to routine use. There’s even a term – “hallucination” – for when AI models veer off script to produce text that sounds plausible but is in fact incorrect.

While AI hallucinations may not be an issue in healthcare – yet – there is still concern about the impact that AI algorithms are having on clinicians, both in terms of diagnostic performance and workflow. 

To see what happens when AI goes wrong, researchers from Brown University sent 90 chest radiographs with “sham” AI results to six radiologists, with 50% of the studies positive for lung cancer. They employed different strategies for AI use, ranging from keeping the AI recommendations in the patient’s record to deleting them after the interpretation was made. Findings included:

  • When AI falsely called a true-pathology case “normal,” radiologists’ false-negative rates rose compared to when they didn’t use AI (20.7-33.0% depending on AI use strategy vs. 2.7%)
  • AI calling a negative case “abnormal” boosted radiologists’ false-positive rates compared to without AI (80.5-86.0% vs. 51.4%)
  • Not surprisingly, when AI calls were correct, radiologists were more accurate with AI than without, with increases in both true-positive rates (94.7-97.8% vs. 88.3%) and true-negative rates (89.7-90.7% vs. 77.3%)
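
The rates above are standard confusion-matrix metrics. As a quick grounding, here’s a minimal Python sketch of how such reader metrics are computed – the case labels and reader calls below are invented for illustration, not drawn from the study’s data.

```python
# Hypothetical illustration of the metrics reported above. The case labels
# and reader calls are invented; the study's raw reads aren't reproduced here.

def reader_metrics(truth, calls):
    """Rates for binary reads, where 1 = positive for lung cancer."""
    tp = sum(t == 1 and c == 1 for t, c in zip(truth, calls))
    fn = sum(t == 1 and c == 0 for t, c in zip(truth, calls))
    fp = sum(t == 0 and c == 1 for t, c in zip(truth, calls))
    tn = sum(t == 0 and c == 0 for t, c in zip(truth, calls))
    return {
        "true-positive rate": tp / (tp + fn),
        "false-negative rate": fn / (tp + fn),  # missed cancers
        "false-positive rate": fp / (fp + tn),  # false alarms
        "true-negative rate": tn / (fp + tn),
    }

# Toy example: 10 cases, half positive (as in the study), one miss, two false alarms
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
print(reader_metrics(truth, calls))
```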

Fortunately, the researchers offered suggestions on how to mitigate the impact of incorrect AI. Radiologists had fewer false negatives when AI provided a box around the region of suspicion, a phenomenon the researchers said could be related to AI helping radiologists focus. 

Also, radiologists’ false-positive rates were higher when AI results were retained in the patient record than when they were deleted. The researchers said this suggests radiologists were less likely to disagree with AI when a record of that disagreement would be preserved.

The Takeaway

As AI becomes more widespread clinically, studies like this will become increasingly important in shaping how the technology is used in the real world, adding to previous research on AI’s impact. Awareness that AI is imperfect – and strategies that take that awareness into account – will be key to any AI implementation.

AI Investment Shift

VC investment in the AI medical imaging sector has shifted notably in the last couple of years, with money moving to later-stage companies, according to a new report from UK market intelligence firm Signify Research. The report offers a fascinating look at an industry that has raised almost $5B since 2015.

Total Funding Value Drops – Both investors and AI independent software vendors (ISVs) have noticed reduced funding activity, and that’s reflected in the Signify numbers. VC funding of imaging AI firms fell 32% in 2022, to $750.4M, down from a peak of $1.1B in 2021.

Deal Volume Declines – The number of deals getting done has also fallen, to 42 deals in 2022, off 30% compared to 60 in 2021. In imaging AI’s peak year, 2020, 95 funding deals were completed. 

VC Appetite Remains Strong – Despite the declines, VCs still have a strong appetite for radiology AI, but funding has shifted from smaller early-stage deals to larger, late-stage investments. 

HeartFlow Deal Tips Scales – The average deal size has spiked this year to date, to $27.6M, compared to $17.9M in 2022, $18M in 2021, and $7.9M in 2020. Much of the higher 2023 number is driven by HeartFlow’s huge $215M funding round in April; Signify analyst Sanjay Parekh, PhD, told The Imaging Wire he expects the average deal value to fall to $18M by year’s end.

The Rich Get Richer – Much of the funding has concentrated in a dozen or so AI companies that have each raised over $100M. The biggest winners include HeartFlow (over $650M), plus Cleerly and Shukun Technology (over $250M each). Signify’s $100M club is rounded out by Aidoc, Cathworks, Keya Medical, Deepwise Shenrui, Imagen Technologies, Perspectum, and Lunit.

US and China Dominate – On a regional basis, VC funding is going to companies in the US (almost $2B) and China ($1.1B). Following them are Israel ($513M), the UK ($310M), and South Korea ($255M).  

The Takeaway 

Signify’s report shows the continuation of trends seen in previous years that point to a maturing market for medical imaging AI. As with any such market, winners and losers are emerging, and VCs are clearly being selective about which horses they back.

Radiology Puts ChatGPT to Work

ChatGPT has taken the world by storm since the AI technology was first introduced in November 2022. In medicine, radiology is taking the lead in putting ChatGPT to work to address the specialty’s many efficiency and workflow challenges. 

Both ChatGPT and GPT-4, its newest underlying model, are forms of AI known as large language models – essentially neural networks that are trained on massive volumes of unlabeled text and learn on their own to predict the structure and syntax of human language.
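
To make the “predict the next token” idea concrete, here’s a toy Python sketch using simple bigram counts. Real LLMs replace the counting with deep neural networks trained on billions of tokens, but the training objective is the same; the mini-corpus is invented.

```python
from collections import Counter, defaultdict

# Toy illustration of next-token prediction, the objective LLMs are trained
# on. Real models use deep neural networks over billions of tokens; this
# bigram counter just makes the idea concrete. The "corpus" is invented.
corpus = "the lung is clear the lung is normal the heart is enlarged".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(token):
    """Return a probability distribution over the next token."""
    counts = following[token]
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.most_common()}

print(predict_next("is"))  # {'clear': 0.33.., 'normal': 0.33.., 'enlarged': 0.33..}
```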

A flood of papers has appeared in just the last week or so investigating ChatGPT’s potential:

  • ChatGPT could be used to improve patient engagement with radiology providers, such as by creating layperson reports that are more understandable, or by answering patient questions in a chatbot function, says an American Journal of Roentgenology article.
  • ChatGPT offered up accurate information about breast cancer prevention and screening to patients in a study in Radiology. But ChatGPT also gave some inappropriate and inconsistent recommendations – perhaps no surprise given that many experts themselves often disagree on breast screening guidelines.
  • ChatGPT was able to produce a report on a PET/CT scan of a patient – including technical terms like SUVmax and TNM stage – without special training, found researchers writing in the Journal of Nuclear Medicine.
  • GPT-4 translated free-text radiology reports into structured reports that better lend themselves to standardization and data extraction for research in another paper published in Radiology. Best of all, the service cost 10 cents a report.
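
For a sense of how simple such a structured-reporting pipeline can be, here’s a hedged sketch using the 0.x-era OpenAI Python client. The prompt, section headings, and report text are our own illustration, not the Radiology paper’s actual code.

```python
import openai  # the 0.x-era OpenAI Python client

openai.api_key = "YOUR_API_KEY"  # placeholder

free_text_report = (
    "Chest CT shows a 9 mm spiculated nodule in the right upper lobe. "
    "No pleural effusion. Mediastinal nodes unremarkable."
)  # invented example report

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Convert the following radiology report into a "
                    "structured report with Findings, Measurements, "
                    "and Impression sections."},
        {"role": "user", "content": free_text_report},
    ],
    temperature=0,  # deterministic output suits standardization tasks
)

print(response["choices"][0]["message"]["content"])
```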

Where is all this headed? A review article on AI in medicine in the New England Journal of Medicine offered the opinion – often stated in radiology – that AI has the potential to take over mundane tasks and give health professionals more time for human-to-human interactions.

The authors compared the arrival of ChatGPT to the onset of digital imaging in radiology in the 1990s, and offered a tantalizing future in which chatbots like ChatGPT and GPT-4 replace outdated technologies like x-ray file rooms and lost images – remember those?

The Takeaway

Radiology’s embrace of ChatGPT and GPT-4 is heartening given the specialty’s initial skeptical response to AI in years past. As the most technologically advanced medical specialty, it’s only fitting that radiology takes the lead in putting this transformative technology to work – as it did with digital imaging.

RadNet’s Path to AI Profit

There are plenty of bold forecasts about imaging AI’s long-term potential, but short-term projections of when AI startups will reach profitability are rarely disclosed and almost never bold. That’s why RadNet’s quarterly investor calls are proving to be such a valuable bellwether for the business of AI, and its latest briefing was no exception.

RadNet entered the AI arena with its 2020 acquisition of DeepHealth (~$20M) and solidified its AI presence in early 2022 by acquiring Aidence and Quantib (~$85M), but its AI business generated just $4.4M in revenue and booked a $24.9M pre-tax loss in 2022.

Those numbers are likely typical for similar-sized AI companies. However, RadNet’s path towards AI revenue growth and breakeven operations might outpace most of its peers.

  • Looking into 2023, RadNet forecasts that its AI revenue will quadruple to between $16M and $18M, while its adjusted EBITDA loss narrows to between $9M and $11M.
  • By 2024, RadNet expects its AI division to generate at least $25M to $30M in revenue, allowing it to achieve AI profitability for the first time.

So how exactly is RadNet going to achieve 6x AI revenue growth and reach profitability within just two years? Patients are going to pay for it. 

RadNet expects its new direct-to-patient Enhanced Breast Cancer Detection (EBCD) service to generate between $11M and $13M in 2023 revenue, representing up to 72% of RadNet’s overall AI revenue and driving much of its AI profitability improvements. And EBCD’s nationwide rollout won’t be complete until Q3.

RadNet’s 2024 AI revenue and profit improvements will again rely on “substantial” EBCD growth, with some help from its Aidence and Quantib operations. Those improvements would offset delayed AI efficiency benefits that RadNet has “yet to really realize” due in part to slow radiologist adoption.


The Takeaway

The fact that RadNet expects to become one of imaging’s largest and most profitable AI companies within the next two years might not be surprising. However, RadNet’s reliance on patient payments to drive that growth is astounding, and it’s something to keep an eye on as AI vendors and radiology groups work on their own AI monetization strategies.

Radiology NLP’s Efficiency and Accuracy Potential

The last week brought two high-profile studies underscoring radiology NLP’s potential to improve efficiency and accuracy, showing how the language-based technology can give radiologists a head start on reporting and let them enjoy the benefits of AI detection without the disruptions.

AI + NLP for Nodule QA – A new JACR study detailed how Yale New Haven Hospital combined AI and NLP to catch and report more incidental lung nodules in emergency CT scans, without impacting in-shift radiologists. The quality assurance program used a CT AI algorithm to detect suspicious nodules and an NLP tool to analyze radiology reports, flagging only the cases that AI marked as suspicious but the NLP tool marked as negative.
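
The flagging rule itself is simple discordance logic. Here’s a minimal sketch of that rule, with invented exam IDs and results; the study’s actual implementation may differ.

```python
# Minimal sketch of the QA program's discordance check: flag only exams
# where the imaging AI found a suspicious nodule but the report NLP read
# the radiologist's report as negative. Exam IDs and results are invented.

def flag_for_review(ai_results, nlp_results):
    """ai_results / nlp_results: dicts mapping exam_id -> bool (suspicious)."""
    return [
        exam_id
        for exam_id, ai_suspicious in ai_results.items()
        if ai_suspicious and not nlp_results.get(exam_id, False)
    ]

ai_results  = {"exam1": True,  "exam2": True,  "exam3": False}
nlp_results = {"exam1": True,  "exam2": False, "exam3": False}

print(flag_for_review(ai_results, nlp_results))  # ['exam2']
```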

  • The AI/NLP program processed 19.2k CT exams over an 8-month period, flagging just 50 cases (0.26%) for a second review.
  • Those flagged cases led to 34 reporting changes and 20 patients receiving follow-up imaging recommendations. 
  • Just as notably, this semi-autonomous process helped rads avoid “thousands of unnecessary notifications” for non-emergent nodules.

NLP Auto-Captions – JAMA highlighted an NLP model that automatically generates free-text captions describing CXR images, streamlining the radiology report writing process. A Shanghai-based team trained the model using 74k unstructured CXR reports labeled for 23 different abnormalities, and tested with 5,091 external CXRs alongside two other caption-generating models.

  • The NLP captions reduced radiology residents’ reporting times compared to when they used a normal captioning template or a rule-based captioning model (283 vs. 347 & 296 seconds), especially with abnormal exams (456 vs. 631 & 531 seconds). 
  • The NLP-generated captions also proved to be most similar to radiologists’ final reports (mean BLEU scores: 0.69 vs. 0.37 & 0.57; on 0-1 scale).
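
For context on the metric: BLEU scores n-gram overlap between a candidate text and a reference, from 0 (no overlap) to 1 (identical). Here’s a quick sketch using NLTK, with invented report snippets rather than the study’s data.

```python
# BLEU measures n-gram overlap between a generated caption and a reference
# report (0 = no overlap, 1 = identical). Example sentences are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "no focal consolidation pleural effusion or pneumothorax".split()
candidate = "no focal consolidation or pleural effusion".split()

score = sentence_bleu(
    [reference], candidate,
    smoothing_function=SmoothingFunction().method1,  # avoids zero scores on short texts
)
print(f"BLEU: {score:.2f}")
```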

The Takeaway

These are far from the first radiology NLP studies, but the fact that these implementations improved efficiency (without sacrificing accuracy) or improved accuracy (without sacrificing efficiency) deserves extra attention at a time when trade-offs are often expected. Also, considering that everyone just spent the last month marveling at what ChatGPT can do, it might be a safe bet that even more impressive language and text-based radiology solutions are on the way.

Understanding AI’s Physician Influence

We spend a lot of time exploring the technical aspects of imaging AI performance, but little is known about how physicians are actually influenced by the AI findings they receive. A new Scientific Reports study addresses that knowledge gap, perhaps more directly than any other research to date. 

The researchers provided 233 physicians – radiologists (experts) and internal and emergency medicine physicians (non-experts) – with eight chest X-ray cases each. The CXR cases featured correct diagnostic advice, but were manipulated to show different advice sources (generated by AI vs. by expert rads) and different levels of advice explanation (advice only vs. advice w/ annotated visual explanations). Here’s what they found…

  • Explanations Improve Accuracy – When the diagnostic advice included annotated explanations, both the IM/EM physicians and radiologists’ accuracy improved (+5.66% & +3.41%).
  • Non-Rads with Explainable Advice Rival Rads – Although the IM/EM physicians performed far worse than rads when given advice without explanations, they were “on par with” radiologists when their advice included explainable annotations (see the study’s Figure 3).
  • Explanations Help Radiologists with Tough Cases – Radiologists gained “limited benefit” from advice explanations with most of the X-ray cases, but the explanations significantly improved their performance with the single most difficult case.
  • Presumed AI Use Improves Accuracy – When advice was labeled as AI-generated (vs. rad-generated), accuracy improved for both the IM/EM physicians and radiologists (+4.22% & +3.15%).
  • Presumed AI Use Improves Expert Confidence – When advice was labeled as AI-generated (vs. rad-generated), radiologists were more confident in their diagnosis.

The Takeaway

This study provides solid evidence supporting the use of visual explanations, and bolsters the increasingly popular theory that AI can have the greatest impact on non-experts. It also revealed that physicians trust AI more than some might have expected, to the point where physicians who believed they were using AI made more accurate diagnoses than they would have if told the same advice came from a human expert.

However, more than anything else, this study seems to highlight the underappreciated impact of product design on AI’s clinical performance.

Acute Chest Pain CXR AI

Patients who arrive at the ED with acute chest pain (ACP) syndrome end up receiving a series of often-negative tests, but a new MGB-led study suggests that CXR AI might make ACP triage more accurate and efficient.

The researchers trained three ACP triage models using data from 23k MGH patients to predict acute coronary syndrome, pulmonary embolism, aortic dissection, and all-cause mortality within 30 days. 

  • Model 1: Patient age and sex
  • Model 2: Patient age, sex, and troponin or D-dimer positivity
  • Model 3: CXR AI predictions plus Model 2
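
The study’s exact modeling choices aren’t spelled out here, so what follows is a minimal sketch – assuming logistic regression and fully synthetic data – of how nested triage models like these can be compared, including how an operating threshold is set at 99% sensitivity, as in the analysis below.

```python
# Minimal sketch of comparing nested triage models, assuming logistic
# regression and synthetic data (the study's actual features, model type,
# and cohort are not reproduced here).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
age = rng.normal(60, 15, n)
sex = rng.integers(0, 2, n)
biomarker = rng.integers(0, 2, n)   # troponin or D-dimer positivity
ai_score = rng.uniform(0, 1, n)     # stand-in for the CXR AI prediction

# Synthetic 30-day outcome loosely driven by all features
logit = 0.03 * (age - 60) + 0.3 * sex + 1.2 * biomarker + 2.0 * ai_score - 2.5
y = rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))

feature_sets = {
    "Model 1": np.column_stack([age, sex]),
    "Model 2": np.column_stack([age, sex, biomarker]),
    "Model 3": np.column_stack([age, sex, biomarker, ai_score]),
}

for name, X in feature_sets.items():
    p = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]
    print(name, f"AUC: {roc_auc_score(y, p):.2f}")

# Operating point at 99% sensitivity: pick the highest threshold that still
# catches 99% of positive outcomes; patients scoring below it could skip
# further cardiovascular or pulmonary testing.
p = LogisticRegression().fit(feature_sets["Model 3"], y).predict_proba(
    feature_sets["Model 3"])[:, 1]
threshold = np.quantile(p[y], 0.01)  # 99% of positives score above this
print(f"Patients deferrable: {(p < threshold).mean():.1%}")
```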

In internal testing with 5.7k MGH patients, Model 3 predicted which patients would experience any of the ACP outcomes far more accurately than Models 2 and 1 (AUCs: 0.85 vs. 0.76 vs. 0.62), while maintaining performance across patient demographic groups.

  • At a 99% sensitivity threshold, Model 3 would have allowed 14% of the patients to skip additional cardiovascular or pulmonary testing (vs. Model 2’s 2%).

In external validation with 22.8k Brigham and Women’s patients, poor AI generalizability caused Model 3’s performance to drop dramatically, while Models 2 and 1 maintained their performance (AUCs: 0.77 vs. 0.76 vs. 0.64). However, fine-tuning with BWH’s own images significantly improved the performance of the CXR AI model (AUC: 0.67 to 0.74) and Model 3 (AUC: 0.77 to 0.81).

  • At a 99% sensitivity threshold, the fine-tuned Model 3 would have allowed 8% of BWH patients to skip additional cardiovascular or pulmonary testing (vs. Model 2’s 2%).

The Takeaway

Acute chest pain is among the most common reasons for ED visits, but it’s also a major driver of wasted ED time and resources. Considering that most ACP patients undergo CXR exams early in the triage process, this proof-of-concept study suggests that adding CXR AI could improve ACP diagnosis and significantly reduce downstream testing.

Bayer Establishes AI Platform Leadership with Blackford Acquisition

Six months after becoming radiology’s newest AI platform vendor, Bayer accelerated its path towards AI leadership with its acquisition of Blackford Analysis.

The acquisition might prove to be among the most significant in imaging AI’s short history, combining Blackford’s many AI advantages (tech, expertise, relationships) with Bayer’s massive radiology presence and AI ambitions. 

After closing later this year, Blackford will operate independently through Bayer’s well-established “arm’s length” model, allowing Blackford to preserve its entrepreneurial culture, while leveraging Bayer’s “experience, infrastructure and reach” to drive further expansion.

Bayer’s Calantic platform and team will operate separately from Blackford, providing Bayer customers with two distinct AI platforms to choose from, while giving Bayer two ways to drive its AI business forward. 

Although few would have predicted this acquisition, it makes sense given Bayer and Blackford’s relatively long history together and their complementary situations. 

  • Blackford was part of Bayer’s 2019 G4A digital health accelerator class
  • The companies have been working together to develop Calantic since 2020
  • Bayer has big AI goals, but its AI customer base and reputation were unestablished
  • Blackford’s AI customer base and reputation are solid, but it needed a new way to scale and a positive exit for its shareholders

Even fewer would have predicted that imaging contrast vendors would be the driving force behind AI’s next consolidation wave, especially given that Guerbet invested in Intrasense just last week. However, imaging contrast and imaging AI could serve increasingly interrelated (or alternative) roles in the diagnostic process, and there are surely advantages for Bayer and Guerbet in leading both areas.

Speaking of AI consolidation, it appears that all those 2023 AI consolidation forecasts are proving to be correct, while bringing some of radiology’s largest companies into an AI segment that’s historically been dominated by startups. It wouldn’t be surprising if that trend continued.

The Takeaway

Bayer and Blackford have been working on their AI strategies for years, and this acquisition appears to give both companies a much better chance of achieving long-term AI leadership. Considering that AI is still in its infancy and could eventually play a dominant role in radiology (and across healthcare), AI leadership might be a far more significant market position in the future than many can imagine today.

CXR AI’s Screening Generalizability Gap

A new European Radiology study detailed a commercial CXR AI tool’s challenges when used for screening patients with low disease prevalence, bringing more attention to the mismatch between how some AI tools are trained and how they’re applied in the real world.

The researchers used an unnamed commercial AI tool to detect abnormalities in 3k screening CXRs sourced from two healthcare centers (2.2% w/ clinically significant lesions), and had four radiology residents read the same CXRs with and without AI assistance, finding that the AI:

  • Produced a far lower AUROC than in its other studies (0.648 vs. 0.77–0.99)
  • Achieved 94.2% specificity, but just 35.3% sensitivity
  • Detected 12 of 41 pneumonia cases, 3 of 5 tuberculosis cases, and 9 of 22 tumors
  • Only “modestly” improved the residents’ AUROCs (0.571–0.688 vs. 0.534–0.676)
  • Added 2.96 to 10.27 seconds to the residents’ average CXR reading times

The researchers attributed the AI tool’s “poorer than expected” performance to differences between the data used in its initial training and validation (high disease prevalence) and the study’s clinical setting (high-volume, low-prevalence, screening).

  • More notably, the authors pointed to these results as evidence that many commercial AI products “may not directly translate to real-world practice,” urging providers facing this kind of training mismatch to retrain their AI or change their thresholds, and calling for more rigorous AI testing and trials.
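
A quick back-of-the-envelope calculation shows why that training mismatch stings: plugging the study’s reported sensitivity, specificity, and prevalence into Bayes’ rule, the tool’s positive predictive value collapses in a screening population.

```python
# Why prevalence mismatch hurts: positive predictive value (PPV) at the
# study's reported operating point, computed with Bayes' rule.
sensitivity = 0.353
specificity = 0.942
prevalence = 0.022  # the study's screening population

tp = sensitivity * prevalence
fp = (1 - specificity) * (1 - prevalence)
print(f"PPV at {prevalence:.1%} prevalence: {tp / (tp + fp):.1%}")  # ~12.0%

# Same operating point in a hypothetical high-prevalence development setting:
prevalence = 0.5
tp = sensitivity * prevalence
fp = (1 - specificity) * (1 - prevalence)
print(f"PPV at {prevalence:.0%} prevalence: {tp / (tp + fp):.1%}")  # ~85.9%
```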

These results also inspired lively online discussions. Some commenters cited the study as proof of the problems caused by training AI with augmented datasets, while others contended that the AI tool’s AUROC still rivaled the residents and its “decent” specificity is promising for screening use.

The Takeaway

We cover plenty of studies about AI generalizability, but most have explored bias due to patient geography and demographics, rather than disease prevalence mismatches. Even if AI vendors and researchers are already aware of this issue, AI users and study authors might not be, placing more emphasis on how vendors position their AI products for different use cases (or how they train them).


-- The Imaging Wire team