Evaluation of an AI System for Cancer Detection in Abbreviated Breast MRI
Author Block: K. Eppenhof1, A. Rodriguez Ruiz1, W. B. Veldhuis2, C. Van Gils2, A. M. Rosanò3, R. Yang4, D. E. Lehrer5, L. Çelik6, R. Mann1; 1Nijmegen/NL, 2Utrecht/NL, 3Sion/CH, 4East Brunswick, NJ/US, 5Buenos Aires/AR, 6Istanbul/TR
Purpose: To investigate the performance of an AI system for breast cancer detection in abbreviated DCE-MRI.
Methods or Background: A combination of high-risk screening and diagnostic DCE-MRI exams was collected from five hospital groups and a public dataset (Duke-Breast-Cancer-MRI). Each exam was processed by an AI system that takes as input the pre-contrast image and a single post-contrast T1 image (abbreviated breast MRI), detects suspicious regions, and outputs a malignancy score between 1 and 10 per breast. Additionally, the AI system was evaluated on an enriched screening dataset from the DENSE trial.
Results or Findings: The area under the receiver operating characteristic curve (AUROC) was computed for exam-level malignancy classification in exams from four hospital groups located in Argentina (41 of 780 exams containing biopsy-proven cancer, AUROC 0.891 (95% CI 0.828-0.944)), Switzerland (98/3499, 0.863 (0.824-0.896)), Turkey (33/164, 0.955 (0.898-0.998)), and the US (153/1096, 0.904 (0.877-0.929)). The consistency in AUROCs suggests robustness across populations, protocols, and scanners.
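The abstract does not state how the AUROC or its confidence interval was computed. The following is a minimal sketch of one common approach, assuming a rank-based (Mann-Whitney) AUROC and a percentile bootstrap over exams. The function names, the resampling scheme, and the toy labels and scores are illustrative assumptions, not the study's method or data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win, which matters for an integer score from 1 to 10.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    auroc(labels, scores)  # fails early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples that contain both classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy data only: 1 = biopsy-proven cancer, scores on the 1-10 scale.
    toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
    toy_scores = [1, 2, 2, 3, 5, 9, 8, 10, 9, 4]
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("95% CI:", bootstrap_ci(toy_labels, toy_scores))
```

The pairwise comparison is quadratic in the number of cases, which is adequate for a sketch; a rank-based implementation would be preferable for datasets of the size reported here.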
Because all Duke-Breast-Cancer-MRI exams contain cancer, a breast-level analysis was performed in which breasts without cancer served as the negative class (904 of 1808 breasts containing cancer). The resulting AUROC of 0.965 (0.957-0.972) is similar to that of a previously published AI system that used two post-contrast images.
For exams with a BI-RADS assessment, agreement between the AI (score >= 9) and the radiologist's interpretation (BI-RADS 1 or 2 vs. 4 or 5) was moderate (Cohen's kappa 0.502 (0.449-0.555)).
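The agreement analysis above can be sketched as follows. This is an illustration under stated assumptions: the AI is treated as positive at a score of 9 or higher and the radiologist as positive at BI-RADS 4 or 5, as in the text, while exams with any other BI-RADS category are assumed to be left out of the comparison. The helper names and the toy data are hypothetical.

```python
def ai_positive(score, threshold=9):
    """Binarize the AI malignancy score (1-10) at the stated threshold."""
    return 1 if score >= threshold else 0


def radiologist_positive(birads):
    """Map BI-RADS to a binary call; None means the exam is not compared."""
    if birads in (4, 5):
        return 1
    if birads in (1, 2):
        return 0
    return None  # assumption: other categories are excluded


def cohen_kappa(a, b):
    """Cohen's kappa for two binary raters given as equal-length 0/1 lists."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters are constant")
    return (observed - expected) / (1 - expected)


def agreement(exams):
    """exams: iterable of (ai_score, birads) pairs; returns Cohen's kappa."""
    ai_calls, reader_calls = [], []
    for score, birads in exams:
        reader = radiologist_positive(birads)
        if reader is None:
            continue
        ai_calls.append(ai_positive(score))
        reader_calls.append(reader)
    return cohen_kappa(ai_calls, reader_calls)


if __name__ == "__main__":
    # Toy (ai_score, birads) pairs, not study data.
    toy = [(10, 5), (9, 4), (3, 1), (9, 2), (2, 2), (8, 4), (1, 1), (5, 3)]
    print("Cohen's kappa:", round(agreement(toy), 3))
```

A confidence interval such as the one reported could be obtained by resampling exams and recomputing kappa, although the abstract does not say which method was used.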
Performance on screening-only data was measured in exams from the fifth hospital group, located in the Netherlands (66/2920 exams containing cancer, AUROC 0.812 (0.753-0.868)), and in exams from the DENSE trial (83/517, AUROC 0.803 (0.747-0.856)).
Conclusion: A first evaluation of an AI system for abbreviated DCE-MRI shows potential for decision support in detecting breast cancer.
Limitations: The study has a retrospective design.
Funding for this study: Not applicable
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: Not applicable