Opponents of population-based cancer screening aren’t going away anytime soon. Just weeks after publication of a landmark study claiming that cancer screening has generated at least $7.5T in value over 25 years, screening foes published a counterattack in JAMA Internal Medicine casting doubt on whether screening has any value at all.
Population-based cancer screening has been controversial since the first programs were launched decades ago.
- A vocal minority of skeptics continues to raise concerns about screening, despite the fact that mortality rates have dropped and survival rates have increased for the four cancers targeted by population screening.
This week’s JAMA Internal Medicine featured a series of articles that cast doubt on screening. In the main study, researchers performed a meta-analysis of 18 randomized clinical trials (RCTs) covering 2.1M people for six major screening tests, including mammography, CT lung cancer screening, and colon and PSA tests.
- The authors, led by Norwegian gastroenterologist Michael Bretthauer, MD, PhD, concluded that only flexible sigmoidoscopy for colon cancer produced a gain in lifetime. They also concluded that RCTs to date haven’t included enough patients, followed over enough years, to show that screening has an effect on all-cause mortality.
But a deeper dive into the study produces interesting revelations. For CT lung cancer screening, Bretthauer et al didn’t include the landmark National Lung Screening Trial, an RCT that showed a 20% mortality reduction from screening.
- With respect to breast imaging, the researchers only included three studies, even though there have been eight major mammography RCTs performed. And one of the three included was the controversial Canadian National Breast Screening Study, originally conducted in the 1980s.
When it comes to colon screening, Bretthauer included his own controversial 2022 NordICC study in his meta-analysis.
- The NordICC study found that if a person is invited to colon screening but doesn’t follow through, they don’t experience a mortality benefit. But those who actually got colon screening saw a 50% mortality reduction.
Other articles in this week’s JAMA Internal Medicine series were penned by researchers well known for their opposition to population-based screening, including Gilbert Welch, MD, and Rita Redberg, MD.
There’s an old saying in statistics: “If you torture the data long enough, it will confess to anything.” Among major academic journals, JAMA Internal Medicine – which Redberg guided for 14 years as editor until she stepped down in June – has consistently been the most hostile toward screening and new medical technology.
In the end, the arguments being made by screening’s foes would carry more weight if they were coming from researchers and journals that haven’t already demonstrated a longstanding, ingrained bias against population-based cancer screening.
A new article in JACR highlights the economic barriers that are limiting wider adoption of AI in healthcare in the US. The study paints a picture of how the complex nature of Medicare reimbursement puts the country at risk of falling behind other nations in the quest to implement healthcare AI on a national scale.
The success of any new medical technology in the US has always been linked to whether physicians can get reimbursed for using it. But there are a variety of paths to reimbursement in the Medicare system, each one with its own rules and idiosyncrasies.
The establishment of the NTAP program was thought to be a milestone in paying for AI for inpatients, for example, but the JACR authors note that NTAP payments are limited to no more than three years. A variety of other factors are limiting AI reimbursement, including …
- All of the AI payments approved under the NTAP program have expired, and as such no AI algorithm is being reimbursed under NTAP
- Budget-neutral requirements in the Medicare Physician Fee Schedule mean that AI reimbursement is often a zero-sum game. Payments made for one service (such as AI) must be offset by reductions for something else
- Only one imaging AI algorithm has successfully navigated CMS to achieve Category I reimbursement in the Physician Fee Schedule, starting in 2024 for fractional flow reserve (FFR) analysis
Standing in stark contrast to the Medicare system is the NHS in the UK, where regulators see AI as an invaluable tool to address chronic workforce shortages in radiology and are taking aggressive action to promote its adoption. Not only has the NHS announced a £21M fund to fuel AI adoption, but it is mulling the implementation of a national platform to enable AI algorithms to be accessed within standard radiology workflow.
The JACR article illustrates how Medicare’s Byzantine reimbursement structure puts barriers in the path of wider AI adoption. Although there have been some reimbursement victories such as NTAP, these have been temporary, and the fact that only one radiology AI algorithm has achieved a Category I CPT code must be a sobering thought to AI proponents.
CT has established itself as an excellent cardiac imaging modality. But there can still be some fine-tuning in terms of exactly how and when to use it, especially for assessing people presenting with chest pain.
Two studies in JAMA Cardiology tackle this head-on, presenting new evidence that supports a more conservative – and precise – approach to determining which patients get follow-up testing. The studies also address concerns that using coronary CT angiography (CCTA) as an initial test before invasive catheterization could lead to unnecessary testing.
In the PRECISE study, researchers analyzed 2.1k patients from 2018 to 2021 who had stable symptoms of suspected coronary artery disease (CAD). Patients were randomized to a usual testing strategy (such as cardiac SPECT or stress echo), or a precision strategy that employed CCTA with selected fractional flow reserve CT (FFR-CT).
The precision strategy group was further subdivided into a subgroup of those at minimal risk of cardiac events (20%) for whom testing was deferred to see if utilization could be reduced even further. In the precision strategy group …
- Rates of invasive catheterization without coronary obstruction were lower (4% vs. 11%)
- Testing was lower versus the usual testing group (84% vs. 94%)
- Positive tests were more common (18% vs. 13%)
- 64% of the deferred-testing subgroup got no testing at all
- Adverse events were higher, but the difference was not statistically significant
To expand on the analysis, JAMA Cardiology published a related study that further investigated the safety of the deferred-testing strategy at one-year follow-up. Researchers compared adverse events in the deferred testing group to those who got the usual testing strategy, finding that the deferred testing group had…
- A lower incidence rate of adverse events (0.9 vs. 5.9)
- A lower rate of invasive cardiac cath without obstructive CAD per 100 patient years (1.0 vs. 6.5)
The results from both studies show that a strategy of deferring testing for low-risk CAD patients while sending higher-risk patients to CCTA and FFR-CT is clinically effective with no adverse impact on patient safety.
The new findings don’t take any of the luster off cardiac CT; they simply add to the body of knowledge demonstrating when to use – and not to use – this incredibly powerful tool for directing patient care. And in the emerging era of precision medicine, that’s what it’s all about.
A new study claims that medical screening for diseases like breast and cervical cancer has saved lives and generated value of at least $7.5T (yes, trillion) over the last 25 years. The findings, published in BMC Health Services Research, are a stunning rebuke to critics of screening exams.
While the vast majority of doctors and public health officials support evidence-based screening, a vocal minority of skeptics continues to raise questions about screening’s efficacy. These critics emphasize the “harms” of screening, such as overdiagnosis and patient anxiety – an accusation often levied against breast screening.
Screening’s critics also target the downstream costs of medical tests intended to confirm suspicious findings. They argue that a single screen-detected finding can lead to a cascade of additional healthcare spending that drives up medical costs.
But the new study offers a counter-argument, putting a dollar figure on how much screening exams have saved by detecting disease earlier, when it can be treated more effectively.
The research focused on the four main cancer screening tests – breast, cervical, colon, and lung cancer – analyzing the impact of preventive screening on life-years saved and its economic impact from 1996 to 2020, finding …
- Americans enjoyed at least 12M more years of life thanks to cancer screening
- The economic value of these life-years added up to at least $7.5T
- If everyone who qualified for screening exams got them, it would save at least another 3.3M life-years and $1.7T in economic impact
- Cervical cancer screening had by far the biggest economic impact ($5.2T-$5.7T), followed by breast ($0.8T-$1.9T), colorectal ($0.4T-$1T), and finally lung ($40B).
Lung cancer’s paltry value was due to a small eligible population and low screening adherence rates. This finding is underscored by a new article in STAT that ponders why CT lung cancer screening rates are so low, with one observer calling it the “redheaded stepchild” of screening tests.
Screening skeptics have been taking it on the chin lately (witness the USPSTF’s U-turn on mammography for younger women) and the new findings will be another blow. We may continue to see a dribble of papers on the “harms” of overdiagnosis, but the momentum is definitely shifting in screening’s favor – to the benefit of patients.
New research on the cancer risk of low-dose ionizing radiation could have disturbing implications for those who are exposed to radiation on the job – including medical professionals. In a new study in BMJ, researchers found that nuclear workers exposed to occupational levels of radiation had a cancer mortality risk that was higher than previously estimated.
The link between low-dose radiation and cancer has long been controversial. Most studies on the radiation-cancer connection are based on Japanese atomic bomb survivors, many of whom were exposed to far higher levels of radiation than most people receive over their lifetimes – even those who work with ionizing radiation.
The question is whether that data can be extrapolated to people exposed to much lower levels of radiation, such as nuclear workers, medical professionals, or even patients. To that end, researchers in the International Nuclear Workers Study (INWORKS) have been tracking low-dose radiation exposure and its connection to mortality in nearly 310k people in France, the UK, and the US who worked in the nuclear industry from 1944 to 2016.
INWORKS researchers previously published studies showing low-dose radiation exposure to be carcinogenic, but the new findings in BMJ offer an even stronger link. For the study, researchers used dosimetry badges worn by the workers to track radiation exposure, then calculated rates of death from solid cancer based on exposure levels, finding:
- Mortality risk was higher for solid cancers, at 52% per 1 Gy of exposure
- Individuals who received the occupational radiation limit of 20 mSv per year would have a 5.2% increased solid cancer mortality rate over five years
- There was a linear association between low-dose radiation exposure and cancer mortality, meaning that cancer mortality risk was also found at lower levels of exposure
- The dose-response association seen in the study was even higher than in studies of atomic bomb survivors (52% vs. 32%)
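The 5.2% figure above follows directly from the linear dose-response association. Here is a minimal sketch of that arithmetic, assuming that 1 Sv is equivalent to 1 Gy (a radiation weighting factor of 1, which the article does not state) and that the excess scales linearly with cumulative dose. The function and constant names are illustrative only.

```python
# Excess relative rate of solid cancer mortality per 1 Gy, as reported in the study.
ERR_PER_GY = 0.52

# Occupational limit cited in the article, in mSv per year.
ANNUAL_LIMIT_MSV = 20.0


def excess_relative_rate(annual_dose_msv, years, err_per_gy=ERR_PER_GY):
    """Linear estimate of the excess relative mortality rate.

    Assumption (not from the article): 1 Sv is treated as 1 Gy,
    so cumulative mSv / 1000 gives cumulative dose in Gy.
    """
    cumulative_gy = annual_dose_msv * years / 1000.0
    return err_per_gy * cumulative_gy


# 20 mSv/year for 5 years = 100 mSv = 0.1 Gy; 0.52 * 0.1 = 0.052
five_year_excess = excess_relative_rate(ANNUAL_LIMIT_MSV, 5)
print(f"{five_year_excess:.1%}")  # prints "5.2%"
```

Under the same linear assumption, halving the annual dose or the number of years halves the estimated excess.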
Even though the INWORKS study was conducted on nuclear workers rather than medical professionals, the findings could have implications for those who might be exposed to medical radiation, such as interventional radiologists and radiologic technologists. The study will undoubtedly be examined by radiation protection organizations and government regulators; the question is whether it leads to any changes in rules on occupational radiation exposure.
Better patient care is the main selling point used by AI vendors when marketing neuroimaging algorithms, followed closely by time savings. Farther down the list of benefits are lower costs and increased revenue for providers.
So says a new analysis in JACR that takes a close look at how FDA-cleared neuroimaging AI algorithms are marketed by vendors. It also includes several warning signs for both AI developers and clinicians.
AI is the most exciting technology to arrive in healthcare in decades, but questions percolate on whether AI developers are overhyping the technology. In the new analysis, researchers focused on marketing claims made for 59 AI neuroimaging algorithms cleared by the FDA from 2008 to 2022. Researchers analyzed FDA summaries and vendor websites, finding:
- For 69% of algorithms, vendors highlighted an improvement in quality of patient care, while time savings for clinicians were touted for 44%. Only 16% of algorithms were promoted as lowering costs, while just 11% were positioned as increasing revenue
- 50% of cleared neuroimaging algorithms were related to detection or quantification of stroke; of these, 41% were for intracranial hemorrhage, 31% for stroke brain perfusion, and 24% for detection of large vessel occlusion
- 41% of the algorithms were intended for use with non-contrast CT scans, 36% with MRI, 15% with CT perfusion, 14% with CT angiography, and the rest with MR perfusion and PET
- 90% of the algorithms studied were cleared in the last five years, and 42% since last year
The researchers further noted two caveats in AI marketing:
- There is a lack of publicly available data to support vendor claims about the value of their algorithms. Better transparency is needed to create trust and clinician engagement.
- The single-use-case nature of many AI algorithms raises questions about their economic viability. Many different algorithms would have to be implemented at a facility to ensure “a reasonable breadth of triage” for critical findings, and the financial burden of such integration is unclear.
The new study offers intriguing insights into how AI algorithms are marketed by vendors, and how these efforts could be perceived by clinicians. The researchers note that financial pressure on AI developers may cause them to make “unintentional exaggerated claims” to recoup the cost of development; it is incumbent upon vendors to scrutinize their marketing activities to avoid overhyping AI technology.
A new study on physician salaries is raising pointed questions about pay for US physicians and whether it contributes to rising healthcare costs – that is, if you believe the numbers are accurate.
The study was released in July by the National Bureau of Economic Research (NBER), which produces in-depth reports on a variety of topics.
The current paper is highly technical and may have languished in obscurity were it not for an August 4 article in The Washington Post that examined the findings with the claim that “doctors make more than anyone thought.”
It is indeed true that the NBER’s estimate of physician salaries seems high. The study claims US physicians made an average of $350k in 2017, the year that the researchers focused on by analyzing federal tax records.
- The NBER estimate is far higher than $294k in Medscape’s 2017 report on physician compensation – a 19% difference.
The variation is even greater for diagnostic radiologists. The NBER data claim radiologists had a median annual salary in 2017 of $546k – 38% higher than the $396k average salary listed in Medscape’s 2017 report.
- The NBER numbers from six years ago are even higher than 2022/2023 numbers for radiologist salaries in several recent reports, by Medscape ($483k), Doximity ($504k), and Radiology Business ($482k).
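The percentage gaps quoted above can be checked with a quick calculation. This is only a sketch of the arithmetic using the salary figures given in the article (in thousands of dollars); the helper name is hypothetical. Note that the radiologist comparison sets an NBER median against a Medscape average, so the two figures are not strictly like-for-like.

```python
def percent_higher(value, baseline):
    """Return how much higher `value` is than `baseline`, in percent."""
    return (value - baseline) / baseline * 100


# All physicians, 2017: NBER $350k vs. Medscape $294k
all_physicians_gap = percent_higher(350, 294)

# Diagnostic radiologists, 2017: NBER $546k (median) vs. Medscape $396k (average)
radiologist_gap = percent_higher(546, 396)

print(round(all_physicians_gap), round(radiologist_gap))  # prints "19 38"
```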
But the NBER researchers claim that by analyzing tax data rather than relying on self-reported earnings, their data are more accurate than previous studies, which they believe underestimate physician salaries by as much as 25%.
- They also estimate that physician salaries make up about 9% of total US healthcare costs.
Why does it matter how much physicians make? The WaPo story sparked a debate with 6.1k comments so far, with many readers accusing doctors of contributing to runaway healthcare costs in the US.
- Meanwhile, posters in an AuntMinnie forum thread debated whether the NBER numbers were accurate, with some warning that the figures could lead to additional cuts in Medicare payments for radiologists.
Lost in the debate over the NBER report is its finding that physician pay makes up only 9% of US healthcare costs. In a medical system that’s rife with overutilization, administrative costs, and duplicated effort across fragmented healthcare networks, physician salaries should be the last target for those who actually want to cut healthcare spending.
One of the most exciting new use cases for medical AI is in generating radiology reports. But how can you tell whether the quality of a report generated by an AI algorithm is comparable to that of a radiologist?
In a new study in Patterns, researchers propose a technical framework for automatically grading the output of AI-generated radiology reports, with the ultimate goal of producing AI-generated reports that are indistinguishable from those of radiologists.
Most radiology AI applications so far have focused on developing algorithms to identify individual pathologies on imaging exams.
- While this is useful, helping radiologists streamline the production of their main output – the radiology report – could have a far greater impact on their productivity and efficiency.
But existing tools for measuring the quality of AI-generated narrative reports are limited and don’t match up well with radiologists’ evaluations.
- To improve that situation, the researchers applied several existing automated metrics for analyzing report quality and compared them to the scores of radiologists, seeking to better understand AI’s weaknesses.
Not surprisingly, the automated metrics fell short in several ways, including false prediction of findings, omitting findings, and incorrectly locating and predicting the severity of findings.
- These shortcomings point out the need for better scoring systems for gauging AI performance.
The researchers therefore proposed a new metric for grading AI-generated report quality, called RadGraph F1, and a new methodology, RadCliQ, to predict how well an AI report would measure up to radiologist scrutiny.
- RadGraph F1 and RadCliQ could be used in future research on AI-generated radiology reports, and to that end the researchers have made the code for both metrics available as open source.
Ultimately, the researchers envision the construction of generalist medical AI models that could perform multiple complex tasks, such as conversing with radiologists and physicians about medical images.
- Another use case could be applications that are able to explain imaging findings to patients in everyday language.
It’s a complex and detailed paper, but the new study is important because it outlines the metrics that can be used to teach machines how to generate better radiology reports. Given the imperative to improve radiologist productivity in the face of rising imaging volume and workforce shortages, this could be one more step on the quest for the Holy Grail of AI in radiology.
A new study out of Sweden offers a resounding vote of confidence in the use of AI for analyzing screening mammograms. In findings published in The Lancet Oncology, researchers reported that AI cut radiologist workload almost by half without affecting cancer detection or recall rates.
AI has been promoted as the technology that could save radiology from rising imaging volumes, growing burnout, and pressure to perform at a higher level with fewer resources. But many radiology professionals remember similar promises made in the 1990s around computer-aided detection (CAD), which failed to live up to the hype.
Breast screening presents a particular challenge in Europe, where clinical guidelines call for all screening exams to be read by two radiologists – leading to better sensitivity but also imposing a higher workload. AI could help by working as a triage tool, enabling radiologists to double-read only those cases most likely to have cancer.
In the MASAI study, researchers are assessing AI for breast screening in 100k women in a population-based screening program in Sweden, with mammograms being analyzed by ScreenPoint’s Transpara version 1.7.0 software. In an in-progress analysis, researchers looked at results for 80k mammography-eligible women ages 40-80.
The Transpara software applies a 10-point score to mammograms; in MASAI those scored 1-9 are read by a single radiologist, while those scored 10 are read by two breast radiologists. This technique was compared to double-reading, finding that:
- AI reduced the mammography reading workload by almost 37k screening mammograms, or 44%
- AI had a higher cancer detection rate per 1k screened participants (6.1 vs. 5.1) although the difference was not statistically significant (P=0.052)
- Recall rates were comparable (2.2% vs. 2.0%)
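The triage rule described above is simple enough to sketch in a few lines. This is an illustration only, not the MASAI protocol code or the Transpara software: the function name is hypothetical and the scores below are invented for the example, not study data.

```python
def reads_required(ai_score: int) -> int:
    """Number of radiologist reads under the triage rule described above.

    Scores 1-9 go to a single reader; a score of 10 goes to two readers.
    """
    if not 1 <= ai_score <= 10:
        raise ValueError("score must be between 1 and 10")
    return 2 if ai_score == 10 else 1


# Hypothetical scores for eight screening exams (made up for illustration).
hypothetical_scores = [1, 3, 10, 7, 9, 10, 2, 5]

ai_reads = sum(reads_required(s) for s in hypothetical_scores)  # 6 single + 2 double = 10
double_reading_reads = 2 * len(hypothetical_scores)             # every exam read twice = 16

workload_reduction = 1 - ai_reads / double_reading_reads
print(f"{workload_reduction:.1%}")  # prints "37.5%"
```

The size of the reduction depends entirely on how many exams receive the top score, which is why the real-world figure has to come from the trial itself rather than from a toy distribution like this one.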
The results demonstrate the safety of using AI as a triage tool, and the MASAI researchers plan to continue the study until it reaches 100k participants so they can measure the impact of AI on detection of interval cancers – cancers that appear between screening rounds.
It’s hard to overestimate the MASAI study’s significance. The findings strongly support what AI proponents have been saying all along – that AI can save radiologists time while maintaining diagnostic performance. The question is the extent to which the MASAI results will apply outside of the double-reading environment, or to other clinical use cases.