Comparative Study of Radiologists’ Diagnostic Performance With and Without Artificial Intelligence Support for Breast Lesion Detection in 2D Mammography
Author Block: L. Lassalle1, J. Ventre2, V. Marty2, L. Clovis2, N. Nitche2, J. Hadchiti3, N-E. Regnard1, A-L. Hermann4, E. Kotter5; 1Lieusaint/FR, 2Paris/FR, 3Villejuif/FR, 4Lyon/FR, 5Freiburg Im Breisgau/DE
Purpose: Mammography is a key domain where AI may improve early breast cancer detection. The study compared the diagnostic performance of radiologists with and without support from a new AI system (BreastView, Gleamer) in detecting breast lesions.
Methods or Background: We retrospectively collected mammograms from three imaging centers across France (2018-2023), acquired on systems from three manufacturers (Hologic, Siemens, GE). Eligible patients were women over 18 who underwent 2D mammography with CC and MLO views and had either a biopsy or 18-month follow-up. Poor-quality exams were excluded.
Ground truth was determined by an experienced breast radiologist who had access to the entire patient file, including prior and subsequent mammograms and, when available, digital breast tomosynthesis, MRI, ultrasound, clinical reports, and biopsy reports for cancer cases.
Nine radiologists participated: five “non-subspecialists” (between 250 and 750 mammograms/year) and four “subspecialists” (>1000 mammograms/year). They annotated all visible lesions and assigned each a malignancy score from 0 to 100. Each reader completed two reading sessions, one unaided and one with AI support, separated by a 12-month washout.
Results or Findings: The dataset included 319 patients (age: 58 ± 13 years): 159 with a malignant biopsy-proven lesion, 39 with only benign lesions confirmed by biopsy or 18-month follow-up, and 121 with no lesion confirmed by 18-month follow-up.
The stand-alone AI achieved an AUC of 0.911 [0.881–0.942] for malignant lesion detection, outperforming the mean radiologist AUC of 0.801 [0.769–0.832]. With AI assistance, radiologists significantly improved their AUC (0.880 [0.865–0.896]), sensitivity (+18.7 points, p<.001), and specificity (+2.6 points, p=.042). Gains were lower for subspecialists but still significant.
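For readers less familiar with the metric, the sketch below shows one common way to turn malignancy scores on a 0 to 100 scale into an AUC, using the empirical (Mann-Whitney) estimator. It is an illustration only: the abstract does not state which estimator, unit of analysis (patient or lesion), or confidence-interval method was used, and the function name `auc_from_scores` and the toy scores are hypothetical, not study data.

```python
def auc_from_scores(scores, labels):
    """Empirical AUC: the probability that a randomly chosen malignant case
    receives a higher score than a randomly chosen non-malignant case,
    with ties counted as one half (Mann-Whitney U divided by n_pos * n_neg)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented purely for illustration (NOT from the study):
# one malignancy score (0-100) per case, label 1 = malignant, 0 = not malignant.
toy_scores = [90, 70, 40, 60, 20, 10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

In a multi-reader study, a per-reader AUC of this kind would typically be computed for each reading session and then averaged or modeled across readers; the abstract does not specify which approach was taken.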
Conclusion: The AI system showed robust performance, enhancing radiologists’ diagnostic accuracy and supporting its potential as a clinical decision-support tool.
Limitations: The study was retrospective, with an enriched cancer dataset, and readers had no access to clinical information.
Funding for this study: Gleamer
Has your study been approved by an ethics committee? Not applicable