Background: Breast cancer screening in Australia involves two radiologists reviewing each mammogram, with discordant results arbitrated by a third radiologist. Automated reading of mammograms by artificial intelligence (AI) algorithms has been proposed to address screen-reading resourcing challenges faced by breast screening programs by reducing overall screen-reading volume.
Aim: To estimate the impact on screen-reading volume for a simulated AI-radiologist double-reading strategy, where AI replaces one of two initial radiologist reads and disagreement is arbitrated by an additional radiologist read.
Methods: We undertook an external validation study of a commercially available AI algorithm in a retrospective cohort of 108,970 consecutive mammograms (November 2015 to December 2016) from BreastScreen WA (BSWA). A prospectively defined AI positivity threshold was applied. The integration of AI into double-reading was simulated by analytically pairing the first radiologist's read for each screen with the AI result. The number of initial reads and arbitrating reads (where results were discordant) for integrated AI-radiologist double-reading was compared with the number of radiologist reads observed in BSWA practice. Costs for mammographic reading were estimated from 2023 piece rates for radiologists (screening read and arbitrating read).
Results: The total number of radiologist reads in BSWA practice was 223,355, including 108,970 first reads, 108,970 second reads, and 5,415 arbitrating reads. Integration of AI into double-reading reduced the total radiologist screen-reading volume by 41.4% to 130,930 reads (108,970 first reads, 0 second reads, 21,960 arbitrating reads). Disagreement between the first radiologist and AI increased arbitration by 305.5% (from 5,415 to 21,960 arbitrating reads), reflecting lower specificity of the AI algorithm. However, the increase in arbitration was more than offset by the reduction in reading volume from substituting AI for the second radiologist read. Costs of radiologist reading were 33.3% lower for simulated AI-radiologist double-reading ($5.81 per screen) compared with double-reading by radiologists ($8.71 per screen).
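The reported percentages follow directly from the read counts and per-screen costs above. The arithmetic can be reproduced with a minimal sketch such as the following; the function and variable names are illustrative assumptions and are not taken from the study's analysis code.

```python
# Illustrative check of the reading-volume arithmetic reported in the Results.
# All names here are hypothetical; only the numbers come from the abstract.

def total_reads(screens: int, initial_reads_per_screen: int, arbitrating_reads: int) -> int:
    """Total radiologist reads = initial radiologist reads + arbitrating reads."""
    return screens * initial_reads_per_screen + arbitrating_reads


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a reduction)."""
    return 100.0 * (new - old) / old


SCREENS = 108_970
ARBITRATIONS_STANDARD = 5_415   # radiologist double-reading (observed BSWA practice)
ARBITRATIONS_AI = 21_960        # simulated AI-radiologist double-reading

# Two radiologist reads per screen in standard practice, one when AI replaces the second read.
reads_standard = total_reads(SCREENS, 2, ARBITRATIONS_STANDARD)
reads_ai = total_reads(SCREENS, 1, ARBITRATIONS_AI)

volume_change = percent_change(reads_standard, reads_ai)
arbitration_change = percent_change(ARBITRATIONS_STANDARD, ARBITRATIONS_AI)
cost_change = percent_change(8.71, 5.81)  # reported radiologist reading cost per screen, in dollars

print(f"Reads: {reads_standard} vs {reads_ai}")
print(f"Reading volume change: {volume_change:.1f}%")
print(f"Arbitration change: {arbitration_change:.1f}%")
print(f"Cost per screen change: {cost_change:.1f}%")
```

The sketch only re-derives the summary figures; it does not model the pairing of individual radiologist and AI results that produced the arbitration counts.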
Conclusions: Simulation of integrating AI in double-reading (with arbitration) showed an increase in arbitration that was offset by substituting AI for the second radiologist read. Overall screen-reading volume was 41.4% lower in the AI strategy. Improvements in AI specificity may achieve greater reductions in screen-reading volume for screening programs. Radiologist reading costs per screen were lower by one-third; the cost reduction was smaller than the volume reduction because arbitrating reads are reimbursed at a higher rate. Other costs to screening programs (e.g. assessment) were not included in these estimates. Cost-effectiveness studies are required to estimate the economic impact of AI relative to screening outcomes.