Imaging AI’s Unseen Potential

Amid the dozens of imaging AI papers and presentations that came out over the last few weeks were three compelling new studies highlighting how much “unseen” information AI can extract from medical images, and the massive impact this information could have. 

Imaging-Led Population Health – An excellent presentation from Ayis Pyrros, MD placed radiology at the center of healthcare’s transition to value-based care and population health, highlighting the AI training opportunities that will come with more value-based care HCC codes and imaging AI’s untapped potential for early disease detection and management. Dr. Pyrros specifically emphasized chest X-ray’s potential given the exam’s ubiquity (26M Medicare CXRs in 2021), CXR AI’s ability to predict outcomes (e.g. mortality, comorbidities, hospital stays), and how opportunistic AI screening can/should support proactive care that benefits both patients and health systems.

  • Healthcare’s value-based overhaul has traditionally been seen as a threat to radiology’s fee-for-service foundations. Even if that might still be true from a business model perspective, Dr. Pyrros makes it quite clear that the shift to value-based care could make radiology even more important — and importance is always good for business.

AI Race Detection – The final peer-reviewed version of the landmark study showing that AI models can accurately predict patient race was officially published, further confirming that AI can detect patients’ self-reported race by analyzing medical image features. The new paper showed that AI very accurately detects patient race across modalities and anatomical regions (AUCs: CXRs 0.91 – 0.99, chest CT 0.89 – 0.96, mammography 0.81), without relying on proxies or imaging-related confounding features (BMI, disease distribution, and breast density all had ≤0.61 AUCs).

  • If imaging AI models intended for clinical tasks can identify patients’ races, they could be applying the same racial biomarkers to diagnosis, thus reproducing or exacerbating healthcare’s existing racial disparities. That’s an important takeaway whether you’re developing or adopting AI.

CXR Cost Predictions – The smart folks at the UCSF Center for Intelligent Imaging developed a series of CXR-based deep learning models that can predict patients’ future healthcare costs. Developed with 21,872 frontal CXRs from 19,524 patients, the best performing models identified which patients would have top-50% personal healthcare costs after one, three, and five years with reasonable accuracy (AUCs: 0.806, 0.771, 0.729). 

  • Although predicting which patients will have higher costs could be useful on its own, these findings also suggest that similar CXR-based DL models could be used to flag patients who may deteriorate, initiate proactive care, or support healthcare cost analysis and policies.

AI-Assisted Radiographers

A new European Radiology study provided what might be the first insights into whether AI can allow radiographers to independently read lung cancer screening exams, while alleviating the resource challenges that have slowed LDCT screening program rollouts.

This is the type of study that makes some radiologists uncomfortable, but its results suggest that rads’ role in lung cancer screening remains very secure.

The researchers had two trained UK-based radiographers read 716 LDCT exams using a computer-assisted detection AI solution (158 w/ significant pulmonary nodules), and compared them with interpretations from radiologists who didn’t have CADe assistance.

The radiographers had significantly lower sensitivity than the radiologists (68% & 73.7%; p < 0.001), leading to 61 false negative exams. However, the two CADe-assisted radiographers did achieve:

  • Good sensitivity with cancers confirmed from baseline scans – 83.3% & 100%
  • Relatively high specificity – 92.1% & 92.7%
  • Low false-positive rates – 7.9% & 7.3%
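These metrics are mechanically related: a false-positive rate is one minus specificity, which is why the 92.1% & 92.7% specificities pair exactly with the 7.9% & 7.3% false-positive rates. A minimal sketch with hypothetical reader counts (illustrative numbers, not figures from the study) makes the relationship explicit:

```python
# Illustrative sketch with made-up counts (NOT from the study): how
# sensitivity, specificity, and false-positive rate are computed, and
# why specificity and FP rate always sum to 1.

def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, false-positive rate) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    fp_rate = fp / (tn + fp)  # always equals 1 - specificity
    return sensitivity, specificity, fp_rate

# Hypothetical reader: 100 nodule-positive and 1,000 nodule-negative exams
sens, spec, fpr = screening_metrics(tp=80, fn=20, tn=921, fp=79)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}, FP rate={fpr:.1%}")
```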

The CADe AI solution might have both helped and hurt the radiographers’ performance, as CADe missed 20 of the radiographers’ 40 false negative nodules, and four of their seven false negative malignant nodules. 

Even as LDCT CADe tools become far more accurate, they might not be able to fill in radiographers’ incidental findings knowledge gap. The radiographers achieved either “good” or “fair” interobserver agreement rates with radiologists for emphysema and CAC findings, but the variety of other incidental pathologies was “too broad to reasonably expect radiographers to detect and interpret.”

The Takeaway
Although CADe-assisted radiographer studies might concern some radiologists, this seems like an important aspect of AI to understand given the workload demands that come with lung cancer screening programs, and the need to better understand how clinicians and AI can work together. 

The good news for any concerned radiologists is that this study shows that LDCT reporting is too complex and current CADe solutions are too limited for CADe-equipped radiographers to independently read LDCTs… “at least for the foreseeable future.”

Who Owns LVO AI?

The FDA’s public “reminder” that studies analyzed by AI-based LVO detection tools (CADt) still require radiologist interpretation became one of the hottest stories in radiology last week, and although many of us didn’t realize it at the time, it made a big statement about how AI-based care coordination is changing the way care teams and radiologists work together.

The FDA decided to issue this clarification after finding that some providers were using LVO AI tools to guide their stroke treatment decisions and “might not be aware” that they need to base those decisions on radiologist interpretations. The agency reiterated that these tools are only intended to flag suspicious exams and support diagnostic prioritization, revealing plans to work with LVO AI vendors to make sure everyone understands radiologists’ role in these workflows. 

This story was covered in all the major radiology publications and sparked a number of social media discussions with some interesting theories:

  • One social post suggested that the FDA is preemptively taking a stand against autonomous AI
  • Some posts and articles wondered if AI might be overly influencing radiologists’ diagnoses
  • The Imaging Wire didn’t even mention care coordination until a reader emailed with a clarification and we went back and edited our initial story

That reader had a point. It does seem like this is more of a care coordination issue than an AI diagnostics issue, considering that:

  • These tools send notifications and images to interventionalists/surgeons before radiologists are able to read the same cases
  • Two of the three leading LVO AI care coordination tools are marketed to everyone on the stroke team except radiologists (go check their sites)
  • Before AI care coordination came along, it would have been hard to believe that stroke team members “might not be aware” that they needed to check radiologist interpretations before making care decisions

The Takeaway

LVO AI care coordination tools have been a huge commercial and clinical success, and care coordination platforms are quickly expanding to new use cases.

That seems like good news for emergency patients and care teams, but as the FDA reminded us last week, it also means that we’re going to need more safeguards to ensure that care decisions are based on radiologists’ diagnoses — even if the AI tool already informed care teams what the diagnosis might be.

Us2.ai Automates Globally

One of imaging AI’s hottest segments just got even hotter with the completion of Us2.ai’s $15M Series A round and the global launch of its flagship echocardiography AI solution. It’s been at least a year since we led off a newsletter with a funding announcement, but Us2.ai’s unique foundation and the echo AI segment’s rapid evolution make this a story worth telling…

Us2.ai has already secured FDA clearance, built a growing list of clinical evidence, and formed key hardware and pharma alliances (EchoNous & AstraZeneca). 

  • The Singapore-based startup also has a unique level of credibility, including co-founders with a history of clinical and business success, and VC support from IHH Healthcare (the world’s second largest health system).
  • With its funding secured, Us2.ai will accelerate its commercial and regulatory expansion, with a focus on driving global clinical adoption (US, Europe, and Asia) and developing new alliances (hardware vendors, healthcare providers, researchers, pharma).

Us2.ai joins a crowded echo AI arena, which might have more commercial-stage vendors than all other ultrasound AI segments combined. In fact, the major echo guidance (Caption Health, UltraSight) and echo reporting (DiA Imaging, Ultromics, Us2.ai) AI startups have already generated more than $180M in combined VC funding and forged a number of major hardware and PACS partnerships.

  • This influx of echo AI startups might be warranted, given echocardiography’s workforce, efficiency, and variability challenges. It might even prove to be visionary if handheld ultrasounds continue their rapid expansion to new users and settings (including primary and at-home care).
  • Us2.ai will have to rely on its reporting advantages to stand out against its well-established competitors, as it is the only vendor to completely automate echo reporting (complete editable/explainable reports in 2 minutes) and analyze every chamber of the heart (vs. just left ventricle with some vendors). 
  • That said, the incumbent echo AI players have big head starts and the impact of Us2.ai’s automation advantage will rely on ultrasound OEMs’ openness to new alliances and (of course) the rate that providers embrace AI for echo reporting.

The Takeaway

Even if many cardiologists and sonographers would have a hard time differentiating the various echo AI solutions, this is a segment that’s showing a high level of product-market fit. That’s more than you can say for most imaging AI segments, and product advancements like Us2.ai’s should improve this alignment. It might even help drive widespread adoption.

The Case for Algorithmic Audits

A new Lancet Digital Health study could have become one of the many “AI rivals radiologists” papers that we see each week, but it instead served as an important lesson that traditional performance tests might not prove that AI models are actually safe for clinical use.

The Model – The team developed their proximal femoral fracture detection DL model using 45.7k frontal X-rays performed at Australia’s Royal Adelaide Hospital (w/ 4,861 fractures).

The Validation – They then tested it against a 4,577-exam internal set (w/ 640 fractures), 400 of which were also interpreted by five radiologists (w/ 200 fractures), and against an 81-image external validation set from Stanford.

The Results – All three tests produced results that a typical study might have viewed as evidence of high-performance: 

  • The model outperformed the five radiologists (0.994 vs. 0.969 AUCs)
  • It beat the best performing radiologist’s sensitivity (95.5% vs. 94.5%) and specificity (99.5% vs 97.5%)
  • It generalized well with the external Stanford data (0.980 AUC)

The Audit – Despite the strong results, a follow-up audit revealed that the model might make some predictions for the wrong reasons, suggesting that it is unsafe for clinical deployment:

  • One false negative X-ray included an extremely displaced fracture that human radiologists would catch
  • X-rays featuring abnormal bones or joints had a 50% false negative rate, far higher than the reader set’s overall false negative rate (2.5%)
  • Saliency maps showed that AI decisions were almost never based on the outer region of the femoral neck, even with images where that region was clinically relevant (but it still often made the right diagnosis)
  • The model scored a high AUC with the Stanford data, but showed a substantial model operating point shift

The Case for Auditing – Although the study might not have started with this goal, it ended up becoming an argument for more sophisticated preclinical auditing. It even led to a separate paper outlining the team’s algorithmic auditing process, which among other things suggested that AI users and developers should co-own audits.

The Takeaway

Auditing generally isn’t the most exciting topic in any field, but this study shows that it’s exceptionally important for imaging AI. It also suggests that audits might be necessary for achieving the most exciting parts of AI, like improving outcomes and efficiency, earning clinician trust, and increasing adoption.

Imaging AI’s Big 2021

Signify Research’s latest imaging AI VC funding report revealed an unexpected surge in 2021, along with major funding shifts that might explain why many of us didn’t see it coming. Here’s some of Signify’s big takeaways and here’s where to get the full report.

AI’s Path to $3.47B – Imaging AI startups have raised $3.47B in venture funding since 2015, helped by a record-high $815M in 2021 after several years of falling investments (vs. 2020’s $592M, 2019’s $450M, 2018’s $790M).

Big Get Bigger – That $3.47B funding total came from over 200 companies and 290 deals, although the 25 highest-funded companies were responsible for 80% of all capital raised. VCs increased their focus on established AI companies in 2021, resulting in record-high late-stage funding (~$723.5M), record-low Pre-Seed/Seed funding (~$7M), and a major increase in average deal size (~$33M vs. ~$12M in 2020). 

Made in China – If you’re surprised that 2021 was a record AI funding year, that’s probably because most of the surge went to Chinese companies (~$260M vs. the US’ ~$150M), continuing a recent trend (China’s AI VC share was 45% in 2020, 26% in 2019). We’re also seeing major funding go to South Korea and Australia’s top startups, adding to APAC AI vendors’ funding leadership.

Health VC Context – Although imaging AI’s $815M 2021 funding total seems big for a category that’s figuring out its path towards full adoption, the amount VC firms are investing in other areas of healthcare makes it seem pretty reasonable. Our two previous Digital Health Wire issues featured seven digital health startup funding rounds with a total value of $267M (and that’s from just one week).

The Takeaway

Signify correctly points out that imaging AI funding remains strong despite a list of headwinds (COVID, regulatory hurdles, lacking reimbursements), while showing more signs of AI market maturation (larger funding rounds to fewer players) and suggesting that consolidation is on the way. Those factors will likely continue in 2022. However, more innovation is surely on the way too and quite a few regional AI powerhouses still haven’t expanded globally, suggesting that the next steps in AI’s evolution won’t be as straightforward as some might think.

Autonomous AI Milestone

Just as the debate over whether AI might replace radiologists is starting to fade away, Oxipit’s ChestLink solution became the first regulatory-approved imaging AI product intended to perform diagnoses without involving radiologists (*please see editor’s note below regarding Behold.ai). That’s a big and potentially controversial milestone in the evolution of imaging AI and it’s worth a deeper look.

About ChestLink – ChestLink autonomously identifies CXRs without abnormalities and produces final reports for each of these “normal” exams, automating 15% to 40% of reporting workflows.

Automation Evidence – Oxipit has already piloted ChestLink in supervised settings for over a year, processing over 500k real-world CXRs with 99% sensitivity and no clinically relevant errors.

The Rollout – With its CE Class IIb Mark finalized, Oxipit is now planning to roll out ChestLink across Europe and begin “fully autonomous” operation by early 2023. Oxipit specifically mentioned primary care settings (many normal CXRs) and large-scale screening projects (high volumes, many normal scans) in its announcement, but ChestLink doesn’t appear limited to those use cases.

ChestLink’s ability to address radiologist shortages and reduce labor costs seem like strong and unique advantages. However, radiology’s first regulatory approved autonomous AI solution might face even stronger challenges:

  • ChestLink’s CE Mark doesn’t account for country-specific regulations around autonomous diagnostic reporting (e.g. the UK requires “appropriate reporting” with ionizing radiation-based exams)
  • Radiologist societies historically push back against anything that might undermine radiologists’ clinical roles, earning potential, and future career stability
  • Health systems’ evidence requirements for any autonomous AI tools would likely be extremely high, and they might expect similarly high economic ROI in order to justify the associated diagnostic or reputational risks
  • Even the comments in Oxipit’s LinkedIn announcement had a much more skeptical tone than we typically see with regulatory approval announcements

The Takeaway

Autonomous AI products like ChestLink could address some of radiology’s greatest problems (radiologist overwork, staffing shortages, volume growth, low access in developing countries) and their economic value proposition is far stronger than most other diagnostic AI products.

However, autonomous AI solutions could also face more obstacles than any other imaging AI products we’ve seen so far, suggesting that it would take a combination of excellent clinical performance and major changes in healthcare policies/philosophies in order for autonomous AI to reach mainstream adoption.

*Editor’s Note – April 21, 2022: Behold.ai insists that it is the first imaging AI company to receive regulatory approval for autonomous AI. Its product is used with radiologist involvement and local UK guidelines require that radiologists read exams that use ionizing radiation. All above analysis regarding the possibilities and challenges of autonomous AI applies to any autonomous AI vendor in the current AI environment, including both Oxipit and Behold.ai.

Complementary PE AI

A new European Radiology study out of France highlighted how Aidoc’s pulmonary embolism AI solution can serve as a valuable emergency radiology safety net, catching PE cases that otherwise might have been missed and increasing radiologists’ confidence. 

Even if that’s technically what PE AI products are supposed to do, studies using commercially available products and focusing on how AI complements radiologists (vs. comparing AI and rad accuracy) are still rare and worth a closer look.

The Diagnostic Study – A team from the French teleradiology provider IMADIS analyzed AI and radiologist CTPA interpretations from patients with suspected PE (n = 1,202 patients), finding that:

  • Aidoc PE achieved higher sensitivity (0.926 vs. 0.900) and negative predictive value (0.986 vs. 0.981)
  • Radiologists achieved higher specificity (0.991 vs. 0.958), positive predictive value (0.950 vs. 0.804), and accuracy (0.977 vs. 0.953)
  • The AI tool flagged 219 suspicious PEs, with 176 true positives, including 19 cases that were missed by radiologists
  • The radiologists detected 180 suspicious PEs, with 171 true positives, including 14 cases that were missed by AI
  • Aidoc PE would have helped IMADIS catch 285 misdiagnosed PE cases in 2020 based on the above AI-only PE detection ratio (19 per 1,202 patients)  
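The reported positive predictive values fall straight out of the flag counts above, which makes for a quick sanity check (all numbers below come from the study’s bullets):

```python
# PPV = true positives / all flagged cases, using the study's counts.
ai_flagged, ai_true_pos = 219, 176
rad_flagged, rad_true_pos = 180, 171

ai_ppv = ai_true_pos / ai_flagged    # matches the reported 0.804
rad_ppv = rad_true_pos / rad_flagged # matches the reported 0.950

print(f"AI PPV: {ai_ppv:.3f}, radiologist PPV: {rad_ppv:.3f}")
```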

The Radiologist Survey – Nine months after IMADIS implemented Aidoc PE, a survey of its radiologists (n = 79) and a comparison versus its pre-implementation PE CTPAs revealed that:

  • 72% of radiologists believed Aidoc PE improved their diagnostic confidence and comfort 
  • 52% of radiologists said the AI solution didn’t impact their interpretation times
  • 14% indicated that Aidoc PE reduced interpretation times
  • 34% of radiologists believed the AI tool added time to their workflow
  • The solution actually increased interpretation times by an average of 7.2% (+1:03 minutes) 

The Takeaway

Now that we’re getting better at not obsessing over AI replacing humans, this is a solid example of how AI can complement radiologists by helping them catch more PE cases and make more confident diagnoses. Some radiologists might be concerned with false positives and added interpretation times, but the authors noted that AI’s PE detection advantages (and the risks of missed PEs) outweigh these potential tradeoffs.

The Case for Operational AI

A trio of radiologists from Mount Sinai and East River Medical Imaging starred in a recent Aunt Minnie webinar, discussing their paths towards operational AI adoption, and sharing some very relevant takeaways for radiology groups and AI vendors.

The Cast – The Subtle Medical-sponsored webinar featured Mount Sinai’s Amish H. Doshi, MD and Idoia Corcuera-Solano, MD (neuro and MSK subspecialists) and East River Medical Imaging’s Timothy Deyer, MD (CMIO and MSK IR), all of whom were involved in evaluating and adopting Subtle Medical’s SubtleMR deep learning reconstruction solution.

Make it Easy – When discussing their AI evaluation criteria, the panelists placed a major emphasis on ease-of-evaluation and implementation, with one noting that “before even having a conversation” he’d have to be certain these early processes won’t be costly or cumbersome (clear process, no new hardware, minimal IT work, no up-front purchases, etc.). 

Why Operational AI – Much of the discussion focused on why the panelists support operational AI, noting that scan-shortening DLIR solutions like SubtleMR:

  • Allow more revenue-generating scans per day
  • Alleviate technologist burnout and staffing challenges
  • Improve the patient experience (especially pediatric)
  • Eliminate re-scans by reducing movement artifacts that occur in long exams
  • Don’t require changes to radiologist workflows
  • Maintain diagnostic image quality
  • Receive less pushback from admins and physicians than diagnostic AI

Evaluating SubtleMR for MSK – Mount Sinai’s MSK SubtleMR evaluation process included comparing standard of care and SubtleMR-enhanced abbreviated MRI exams from 50 consecutive knee MRI patients. They found that SubtleMR cut scan times by 50% (13:27 to 6:45), while achieving comparable image quality, artifact levels, and diagnostic performance.
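The ~50% figure checks out once the mm:ss scan times are converted to seconds; a small sketch (the helper function is ours, not from the webinar):

```python
# Convert mm:ss scan times to seconds and compute the relative reduction.
def to_seconds(mmss):
    m, s = mmss.split(":")
    return int(m) * 60 + int(s)

before, after = to_seconds("13:27"), to_seconds("6:45")  # 807s, 405s
reduction = 1 - after / before
print(f"{reduction:.1%} reduction")  # ≈ 49.8%, i.e. roughly half
```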

Evaluating SubtleMR for Neuro – Mount Sinai’s neuro evaluation process involved comparing SubtleMR and conventional MRI with 10-15 patients for each potential MR sequence. They then reviewed the scans with key stakeholders, worked with the Subtle Medical team to make requested imaging adjustments, and implemented the solution.

SubtleMR Results – SubtleMR’s list of benefits (scan speed, patient experience, patient throughput, revenue) earned it approval from all key stakeholders. Although one panelist noted that some of their radiologists critiqued the enhanced images, the radiologist pushback wasn’t nearly as strong as what they’ve seen in response to diagnostic AI products.

The Takeaway

We cover plenty of editorials about what it takes to drive AI adoption, but feedback from real-world AI adopters is still rare, making this webinar particularly useful for AI vendors and adopters. The webinar also makes a solid case for SubtleMR and other deep learning reconstruction solutions, even for groups who might not be ready to adopt the kind of “AI” that we usually focus on.

MGH’s Multimodal Thyroid Ultrasound AI

An MGH and Harvard Medical team developed a multimodal ultrasound AI platform that applies an interesting mix of AI techniques to accurately detect and stage thyroid cancer, potentially improving diagnosis and treatment planning.

The Platform – The platform combines radiomics, topological data analysis (TDA), ML-based TI-RADS assessments, and deep learning, allowing them to capture more data, minimize noise, and improve prediction accuracy.

The Study – Starting with 1,346 ultrasound images from 784 patients, the researchers trained the multimodal AI platform with 362 nodules (103 malignant) and validated it against a pair of internal (51 malignant, 98 benign) and external (270 malignant, 50 benign) datasets, finding that:

  • The platform predicted 98.7% of internal dataset malignancies (0.99 AUC)
  • The platform predicted 91.4% of external dataset malignancies (0.94 AUC)
  • The individual AI methods were far less accurate (80% to 89% w/ internal)
  • A version of the platform accurately predicted nodal pathological stages (93% for T-stage, 89% for N-stage, 98% for extrathyroidal extension)
  • The platform predicted BRAF mutations with 96% accuracy

Next Steps – The researchers plan to validate their multimodal platform in prospective multicenter clinical trials, including in low-resource countries where it might be particularly helpful.


The Takeaway

We cover plenty of ultrasound AI and thyroid cancer imaging studies, but this team’s multi-AI approach is unique and appears promising. A multimodal AI platform like this might make thyroid cancer diagnosis more efficient and less subjective, avoid unnecessary biopsies, allow non-invasive staging and mutation assessment, and lead to more personalized treatments. That would be a major accomplishment, and might suggest that similar multimodal AI platforms could be developed for other cancers and imaging modalities.
