Research Presentation Session: Artificial Intelligence and Imaging Informatics

RPS 2305 - How AI is redefining prostate cancer detection, risk stratification and prognosis: from promising research to clinical impact

March 8, 09:30 - 11:00 CET

6 min
Evaluating AI assistance in clinically significant prostate cancer diagnosis using MRI
Ying Hou, Nanjing / China
Author Block: Y. HOU; Nanjing/CN
Purpose: Artificial intelligence (AI) has potential as an assistant supporting humans in diagnosing clinically significant prostate cancer (csPC) with MRI. We aimed to test the noninferiority and superiority of human-AI collaboration versus stand-alone human review in csPC diagnosis with MRI.
Methods or Background: This observer study was conducted in four randomized cohorts (n = 1,305) of men from two medical centers who initially underwent MRI for suspected csPC. Twenty-one readers performed stand-alone double reading with or without assistance from a clinically available AI system, which flagged cases for further arbitration review among MRI-screened men with suspected csPC. The secondary outcome was clinically insignificant prostate cancer (ciPC) diagnosis.
Results or Findings: In all four observer groups, human-AI collaboration was superior to human-alone review in patient-level csPC diagnosis regardless of experience (AUROC, 0.86–0.94 vs 0.70–0.86; all P < .001); the exception was junior readers in center 1, for whom collaboration was equivalent to human-alone review (AUROC, 0.87 vs 0.82; P = .069). Using PI-RADS ≥ 3 as the indication for diagnosis, human-AI collaboration achieved higher sensitivity (92.9% [688/740] vs 87.6% [617/704]; odds ratio [OR], 1.86 [1.30–2.67]; P = .0007) and higher specificity (66.2% [408/616] vs 46.7% [257/550]; OR, 2.23 [1.76–2.83]; P < .0001) than human-alone review for csPC diagnosis. For ciPC, human-AI collaboration did not increase the overdetection rate (6.2% [38/616] vs 4.7% [26/550]; OR, 1.30 [0.80–2.21]; P = .276), while achieving higher specificity (68.9% [376/546] vs 46.4% [231/498]; OR, 2.56 [1.99–3.29]; P < .0001) than human-alone review.
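The odds ratios quoted above can be reproduced from the reported counts. The sketch below computes a Wald odds ratio with a 95% CI from the two sensitivity proportions (688/740 AI-assisted vs 617/704 unassisted), treating the readings as independent 2×2 counts; the study's exact method (which would properly account for paired readings) may differ slightly in the last digit.

```python
import math

def odds_ratio_ci(a, n1, b, n2, z=1.96):
    """Wald odds ratio and 95% CI comparing two proportions a/n1 vs b/n2."""
    c, d = n1 - a, n2 - b                   # complements of each proportion
    or_ = (a / c) / (b / d)                 # cross-product odds ratio
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)
    lo, hi = (math.exp(math.log(or_) + s * z * se) for s in (-1, 1))
    return or_, lo, hi

# Sensitivity for csPC: AI-assisted 688/740 vs unassisted 617/704
or_, lo, hi = odds_ratio_ci(688, 740, 617, 704)
print(f"OR {or_:.2f} [{lo:.2f}-{hi:.2f}]")  # ≈ 1.87 [1.30-2.68]
```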
Conclusion: Human-AI collaboration is noninferior and superior to human-alone review, improving accuracy in csPC diagnosis at MRI without increasing the risk of ciPC overdiagnosis.
Limitations: Further prospective validation is needed.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: grant no. 2019-SR-396
6 min
Automated MRI-based segmentation of the prostate, prostatic zones, periprostatic space, urethra and seminal vesicles using Convolutional Neural Networks trained on expert-annotated data
Szilvia Tótin, Szeged / Hungary
Author Block: S. Tótin1, W. Holmlund2, A. T. Simkó2, T. Z. Kincses1, E. Koós1, P. Palásti1, Z. Fejes1; 1Szeged/HU, 2Umeå/SE
Purpose: Accurate delineation of prostate zones and adjacent structures is essential for diagnosis, radiotherapy planning and surgical decision-making in prostate cancer. Manual contouring is time-consuming and prone to variability. This study aimed to develop and validate convolutional neural networks (CNNs) for automated segmentation of the prostate zones, urethra, periprostatic neurovascular bundle and seminal vesicles on MRI, using a large expert-annotated dataset.
Methods or Background: We used T2-weighted multiparametric MRI scans from 200 PROSTATEx patients. Manual segmentations of the peripheral, central and transitional zones and the anterior fibromuscular stroma were performed in 3D Slicer following PI-RADS v2.1, supplemented by the urethra, seminal vesicles and periprostatic bundle. Forty cases were independently annotated by two radiologists to estimate inter-reader variability. The CNNs were trained on 160 cases and validated on 40 test cases. Performance was assessed using the Dice Similarity Coefficient (DSC), Surface Dice Measurement (SDM) at multiple tolerance levels and Center Line Distance (CLD) for urethral evaluation, using Hero Imaging software.
Results or Findings: The CNNs achieved segmentation accuracy comparable to expert readers. For the prostate, the mean DSC value exceeded 0.90, matching inter-reader variability (0.913±0.0027). Urethral delineation achieved CLD values of 3.0 mm, similar to radiologist agreement (3.6 mm). The CNNs extended reliable segmentations to the central, peripheral and transitional zones, anterior fibromuscular stroma, periprostatic bundle and seminal vesicles, reaching agreement levels comparable to experienced radiologists.
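The Dice Similarity Coefficient reported above has a simple closed form. A minimal numpy sketch for binary masks (an illustrative reimplementation, not the study's Hero Imaging code):

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0

# Toy 2D example: two overlapping square "zones"
m1 = np.zeros((10, 10), dtype=bool); m1[2:8, 2:8] = True  # 36 voxels
m2 = np.zeros((10, 10), dtype=bool); m2[3:9, 3:9] = True  # 36 voxels
print(round(dice(m1, m2), 3))  # overlap of 25 voxels → 2*25/72 ≈ 0.694
```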
Conclusion: Our CNN framework enables reliable, automated MRI segmentation of the prostate zones, urethra, periprostatic space and seminal vesicles. By reducing workload, it has the potential to support standardized diagnosis, precise radiotherapy with urethral and periprostatic sparing, and informed surgical planning.
Limitations: Partial external validation (performed on the prostate zones only).
Some regions showed high inter-reader variability, which may indicate the need to harmonize image interpretation.
Funding for this study: No funding was provided for this study.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Integrating Automated Neurovascular Bundle Segmentation and Radiomics Biomarkers for Diagnostic and Prognostic Modeling in Prostate Cancer
Gemma Urbanos, Madrid / Spain
Author Block: G. Urbanos, A. Jimenez-Pastor, A. Nogue, G. Ribas, F. J. E. Higa, C. Fontenla Martínez, V. Belloch Ripollés, L. Marti-Bonmati, A. Alberich-Bayarri; Valencia/ES
Purpose: Accurate evaluation of the prostatic neurovascular bundles (NVBs) in prostate cancer (PCa) informs staging and treatment but remains difficult to standardize. This study developed an automated pipeline for NVB segmentation, tumor–NVB invasion risk stratification, and radiomics-based prediction of biochemical recurrence (BCR), perineural invasion (PNI), and extraprostatic extension (EPE).
Methods or Background: We collected 807 real-world PCa MRI exams from three sources. Apparent diffusion coefficient (ADC) maps and prostate/lesion segmentations were obtained with QP-Prostate®. Experts manually annotated NVBs on 470 T2-weighted (T2w) series, which were used to train a nnU-Net 3D full-resolution model. Data were split into 80% training/validation and 20% testing. In patients with peripheral zone (PZ) lesions (N=65), the minimum lesion–NVB distance was computed and categorized as high (<2 mm), intermediate (2–5 mm), or low (>5 mm) risk.

The trained model was applied to 280 diagnostic cases with PZ lesions and clinical data. Radiomic features from lesions and NVBs (ADC and T2w), extended with PSA and age, were used to predict PNI (185 negative, 91 positive), EPE (131 negative, 64 positive), and BCR (35 negative, 43 positive). Pipelines included normalization, feature selection, outlier removal, and class balancing. Different classification models were trained with 5-fold cross-validation.
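The lesion–NVB distance stratification described above (<2 mm high, 2–5 mm intermediate, >5 mm low risk) can be sketched with a Euclidean distance transform. This is an illustrative reimplementation, not the authors' pipeline, and assumes both masks sit on a common voxel grid with known spacing:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def lesion_nvb_risk(lesion: np.ndarray, nvb: np.ndarray, spacing_mm):
    """Minimum lesion-NVB distance (mm) and the corresponding risk category.
    The distance map measures, at every voxel, the distance to the NVB mask."""
    dist_to_nvb = distance_transform_edt(~nvb.astype(bool), sampling=spacing_mm)
    d_min = float(dist_to_nvb[lesion.astype(bool)].min())
    if d_min < 2:
        cat = "high"
    elif d_min <= 5:
        cat = "intermediate"
    else:
        cat = "low"
    return d_min, cat

# Toy volume, 1 mm isotropic: lesion block 4 voxels away from the NVB block
vol = (8, 32, 32)
lesion = np.zeros(vol, bool); lesion[4, 10:13, 10:13] = True
nvb = np.zeros(vol, bool); nvb[4, 10:13, 16:19] = True
print(lesion_nvb_risk(lesion, nvb, (1.0, 1.0, 1.0)))  # (4.0, 'intermediate')
```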
Results or Findings: The results were assessed on the test set. NVB segmentation showed a mean surface distance of 1.02 (0.58–1.93) mm and volume difference of 0.41 (0.23–0.76) cc. Tumor–NVB invasion risk classification achieved accuracy of 0.89. Prediction AUCs were 0.73±0.07 for BCR, 0.80±0.05 for PNI, and 0.80±0.05 for EPE. Models combining NVB and lesion radiomics outperformed either region alone.
Conclusion: Automated NVB segmentation enables accurate invasion risk classification and improves prediction of BCR, PNI, and EPE, supporting integration into clinical risk stratification.
Limitations: Retrospective design, moderate segmentation overlap, and limited BCR prediction performance require external validation.
Funding for this study: Funding was provided by the project ProCanAid (PLEC2021-007709)
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: CEIM-Hopital Universitario y Politécnico la Fe (ProcanAid) – Nº de registro 2021-471-1.
6 min
DeepSector-PI: Automated PI-RADS-Compliant Lesion-to-Sector Mapping
Rafał Janusz Jóźwiak, Warsaw / Poland
Author Block: R. J. Jóźwiak1, J. Mycka1, M. Gonet1, I. Mykhalevych1, T. Lorenc1, A. Zacharzewska-Gondek2, J. Dołowy2, K. Tupikowski2; 1Warsaw/PL, 2Wrocław/PL
Purpose: The PI-RADS 2.1 sector map is a standardized 39-sector grid of the prostate that enables unambiguous lesion localization on mpMRI. We aimed to train a DL-based model for automatic prostate sector mapping and to evaluate its performance on real-world data drawn from an expert-curated mpMRI reference dataset.
Methods or Background: DeepSector-PI is built on a DenseNet-based classification network. To evaluate its performance, we assembled data from 321 mpMRI cases with identified suspicious lesions that were independently reported by three expert radiologists, who annotated all lesions and completed structured reports (SR) including the PI-RADS 2.1 sector map. In total, 845 lesion–sector pairs were used for training, and 106 pairs were reserved for validation. Model performance was stratified by zonal location (PZ/TZ/both), anatomical level (base/mid/apex), and lesion size quantified by the number of sectors involved.
Results or Findings: F1 scores and balanced accuracy were computed across all stratifications. By lesion size, F1 was 0.65 / 0.78 / 0.86 for lesions involving <4, 4–6, and ≥7 sectors, respectively; by zonal location, F1 was 0.77 / 0.73 / 0.67 for lesions confined to PZ, TZ, and both zones, respectively. Balanced accuracy for the same stratifications was 0.90 / 0.87 / 0.91 (size) and 0.92 / 0.91 / 0.84 (zone). We also remapped predictions to a simplified 24-sector scheme, yielding F1 of 0.72 / 0.84 / 0.86 (size) and 0.83 / 0.75 / 0.77 (zone), with corresponding balanced accuracy of 0.91 / 0.90 / 0.90 (size) and 0.94 / 0.92 / 0.86 (zone).
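The F1 and balanced-accuracy figures above follow standard definitions. The sketch below is a generic reimplementation (balanced accuracy as the mean per-class recall, macro F1 as the unweighted mean of per-class F1), not the study's evaluation code; sector labels are toy values:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

def macro_f1(y_true, y_pred):
    """Unweighted mean F1 over classes present in either label vector."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s = []
    for c in np.unique(np.concatenate([y_true, y_pred])):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return float(np.mean(f1s))

# Toy sector indices standing in for PI-RADS sector labels
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
print(balanced_accuracy(y_true, y_pred), round(macro_f1(y_true, y_pred), 3))
```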
Conclusion: DeepSector-PI provides robust classification of lesion-involved prostate sectors, supporting standardized reporting and targeted biopsy planning. DeepSector-PI performance depends on lesion extent and zonal/anatomical level.
Limitations: Classification requires sector-labeled data from experts or another AI model.
Funding for this study: This work has been funded by the Polish National Centre for Research and Development under the program INFOSTRATEG I, project INFOSTRATEG-I/0036/2021 “AI-augmented radiology - detection, reporting and clinical decision making in prostate cancer diagnosis”.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Intralesional and perilesional radiomics strategy based on different machine learning models for the prediction of International Society of Urological Pathology grade group in prostate cancer
Yongsheng Zhang, Hangzhou / China
Author Block: Y. Zhang, Z. Li; Hangzhou/CN
Purpose: To develop and evaluate an intralesional and perilesional radiomics strategy, based on different machine learning models, to differentiate International Society of Urological Pathology (ISUP) grade group > 2 from ISUP ≤ 2 prostate cancers (PCa).
Methods or Background: 340 PCa patients confirmed by radical prostatectomy pathology were obtained from two hospitals. The patients were divided into training, internal validation, and external validation groups. Radiomic features were extracted from T2-weighted imaging, and four distinct radiomic feature models were constructed: intralesional, perilesional, combined intralesional and perilesional, and intralesional and perilesional image fusion. Four machine learning classifiers, logistic regression (LR), random forest (RF), extra trees (ET), and multilayer perceptron (MLP), were employed for model training and evaluation to select the optimal model.
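As an illustration of the model-selection step, the sketch below trains the four named classifiers on a synthetic stand-in for the radiomics matrix and keeps the one with the best 5-fold cross-validated AUC. Data, hyperparameters, and the 340×30 feature matrix are placeholders, not the study's:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in radiomics matrix: 340 cases, 30 features, binary ISUP label
X, y = make_classification(n_samples=340, n_features=30, n_informative=8,
                           random_state=0)
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "ET": ExtraTreesClassifier(n_estimators=200, random_state=0),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0)),
}
# Mean cross-validated AUC per classifier; the best one becomes the final model
aucs = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
        for name, m in models.items()}
best = max(aucs, key=aucs.get)
print(best, round(aucs[best], 3))
```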
Results or Findings: The AUCs for the RF classifier were higher than those for LR, ET, and MLP, so RF was selected as the final radiomic model. The nomogram model integrating the perilesional, combined intralesional and perilesional, and intralesional and perilesional image fusion models had AUCs of 0.929, 0.734, and 0.743 for the training, internal validation, and external validation cohorts, respectively, higher than those of the individual intralesional, perilesional, combined intralesional and perilesional, and intralesional and perilesional image fusion models.
Conclusion: The proposed nomogram, built from the perilesional, combined intralesional and perilesional, and intralesional and perilesional image fusion radiomics, has the potential to predict the ISUP grade group of PCa patients.
Limitations: Its retrospective design and limited case number may introduce selection bias. Notably, only patients who underwent radical prostatectomy were included, potentially excluding patients with the most aggressive cancers not suitable for surgery, thus limiting generalizability.
Funding for this study: This study was supported by Zhejiang Traditional Medicine and Technology Program (2024ZL688, 2024ZL668).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This retrospective study has been approved by the local institutional review boards. It was determined that written informed consent was not required for this retrospective research (No. 2022KY042).
6 min
From Imaging to Outcomes: A PI-RADS–Driven Radiomics and Clinical Machine Learning Model for Detecting Clinically Significant Prostate Cancer
Dimitrios Samaras, Larissa / Greece
Author Block: D. Samaras1, G. Agrotis2, M. Vakalopoulou3, M. Vlychou1, I. Tsougos1; 1Larissa/GR, 2Amsterdam/NL, 3Paris/FR
Purpose: This study aimed to develop and evaluate a machine-learning (ML) framework based on the PI-RADS protocol for detecting clinically significant prostate cancer (csPCa) using multiparametric MRI (mpMRI), simulating radiologists’ decision-making.
Methods or Background: The publicly available PI-CAI (Prostate Imaging Cancer AI) dataset was employed, comprising 1,500 cases from 1,476 patients across 11 centers using seven MRI scanners. Among these, 1,075 cases were benign or clinically insignificant prostate cancer (cinsPCa), while 425 represented csPCa, defined as Gleason score (GS) ≥ 3+4. Ground truth labels were derived from biopsy results conducted by urologists, radiologists, or trained medical personnel under supervision. The ML framework followed a two-branch architecture: T2-weighted (T2W) images for the transition zone, and diffusion-weighted imaging (DWI) with apparent diffusion coefficient (ADC) maps for the peripheral zone. In addition to a radiomics-only model, a combined radiomics + clinical model was developed incorporating PSA, age, and prostate volume. Feature extraction was performed using Pyradiomics, including shape and texture features from original and filtered images. Feature space dimensionality was progressively reduced through a multi-stage pipeline: low-variance filtering (threshold 0.01), Pearson correlation pruning (ρ≥0.85), and Wilcoxon rank-sum testing (p≤0.1), followed by supervised feature selection restricted to training folds. The dataset was split into 80% training/validation and 20% testing, with five-fold cross-validation. Performance metrics included AUC, sensitivity, specificity, accuracy, balanced accuracy, and F1-score.
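The multi-stage feature-reduction pipeline above (variance threshold 0.01, correlation pruning at ρ≥0.85, Wilcoxon rank-sum at p≤0.1) can be sketched as follows. This is an illustrative reimplementation on synthetic data; the study's exact ordering, tie-breaking, and fold handling may differ:

```python
import numpy as np
from scipy.stats import ranksums

def reduce_features(X, y, var_thr=0.01, corr_thr=0.85, p_thr=0.1):
    """Sequential filter: variance -> pairwise correlation -> Wilcoxon rank-sum."""
    keep = np.where(X.var(axis=0) > var_thr)[0]       # drop near-constant features
    X = X[:, keep]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    drop = set()
    for i in range(corr.shape[0]):                     # prune |rho| >= corr_thr pairs,
        for j in range(i + 1, corr.shape[0]):          # keeping the first of each pair
            if i not in drop and j not in drop and corr[i, j] >= corr_thr:
                drop.add(j)
    X = X[:, [i for i in range(X.shape[1]) if i not in drop]]
    pvals = np.array([ranksums(X[y == 0, k], X[y == 1, k]).pvalue
                      for k in range(X.shape[1])])
    return X[:, pvals <= p_thr]                        # keep class-associated features

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
X = rng.normal(size=(100, 20))
X[:, 0] += y            # informative feature, survives all three stages
X[:, 1] = X[:, 0]       # duplicate, pruned by the correlation step
X[:, 2] = 0.001         # near-constant, pruned by the variance step
print(reduce_features(X, y).shape)
```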
Results or Findings: The combined model (radiomics+clinical) outperformed the radiomics-only model, achieving higher AUC in both training (0.79±0.02 vs. 0.76±0.02) and testing set (0.78 vs. 0.73).
Conclusion: Our approach demonstrates strong potential for improving csPCa detection, supporting biopsy decisions, and enhancing patient outcomes.
Limitations: The modest external test set and absence of deep learning benchmarks limit generalizability. Validation on larger, multicenter cohorts and integration into clinical workflows are warranted.
Funding for this study: This work has been partially supported by project MIS 5154714 of the National Recovery and Resilience Plan Greece 2.0 funded by the European Union under the NextGenerationEU Program.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Cross-Sequence Consistency Deep Learning for Detecting Clinically Significant Prostate Cancer in PI-RADS 1–3 Lesions at MRI
Lu Bai, Xi'an / China
Author Block: L. Bai, L. Luo, H. Han, W. Wang, J. Yang; Xi'an/CN
Purpose: To develop and validate a deep learning model with a cross-sequence consistency module (CSCM) for improving csPCa detection in bpMRI among PI-RADS 1–3 lesions.
Methods or Background: This retrospective multicenter study included patients who underwent biparametric MRI (bpMRI) including T2WI, DWI, and ADC sequences. A deep learning framework integrating three 3D ResNet-18 streams for feature extraction and a transformer-based fusion module was developed, with an additional CSCM to enhance feature alignment across sequences. The model was tested on two external test sets. Model performance was evaluated using AUC, accuracy, sensitivity, and specificity, with histopathologic outcomes as the reference standard.
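The abstract does not specify how the CSCM enforces cross-sequence consistency; one common formulation is a pairwise cosine-consistency penalty between per-sequence feature embeddings. The numpy sketch below is purely illustrative of that idea and is not the authors' module:

```python
import numpy as np

def consistency_loss(feats):
    """Mean pairwise (1 - cosine similarity) across per-sequence embeddings.
    feats: list of (batch, dim) feature arrays, one per MRI sequence."""
    normed = [f / np.linalg.norm(f, axis=1, keepdims=True) for f in feats]
    loss, pairs = 0.0, 0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            loss += np.mean(1.0 - np.sum(normed[i] * normed[j], axis=1))
            pairs += 1
    return loss / pairs

rng = np.random.default_rng(0)
t2w = rng.normal(size=(4, 128))                   # toy per-stream embeddings
dwi = t2w + 0.01 * rng.normal(size=(4, 128))      # nearly aligned with T2w
adc = rng.normal(size=(4, 128))                   # unaligned
print(consistency_loss([t2w, t2w, t2w]))          # ≈ 0 for identical features
```

Aligned streams (t2w vs dwi) give a near-zero penalty, while the unaligned adc stream yields a much larger one, which is the gradient signal such a module would exploit during training.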
Results or Findings: A total of 1050 patients with PI-RADS 1–3 lesions were divided into a training set (n=332), internal test set (n=83), external test set one (n=281) and external test set two (n=354). At histopathologic analysis, 22% (230/1050) of patients had csPCa lesions. The DL framework showed AUCs of 0.825, 0.851 and 0.821 for the internal test set and external test sets one and two, with corresponding accuracies of 0.907, 0.786 and 0.771, respectively.
Conclusion: A CSCM-based deep learning model improved identification of csPCa in PI-RADS 1–3 lesions using bpMRI, demonstrating strong generalizability and potential to reduce diagnostic variability and unnecessary biopsies.
Limitations: As a retrospective study, it carries inherent selection bias, and the analysis did not include dynamic contrast-enhanced (DCE) sequences.
Funding for this study: None.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Ethic committee of The First Affiliated Hospital of Xi'an Jiaotong University
6 min
AI-Assisted Prostate Cancer Detection with MRI: A Clinical Routine Simulation Study
Lorenzo Pinto, Naples / Italy
Author Block: L. Pinto1, G. Di Costanzo1, C. Riccio2, A. G. Tucci1, L. Palumbo1, R. Cuocolo3, A. R. R. Padhani4, M. Imbriaco2, A. Ponsiglione2; 1Pozzuoli/IT, 2Napoli/IT, 3Salerno/IT, 4Northwood/UK
Purpose: Interest in AI-driven detection of clinically significant prostate cancer (PCa) on MRI is growing. We evaluated a commercial AI system as a concurrent decision-support tool, assessing its impact on radiologists of varying expertise.
Methods or Background: In our retrospective study, consecutive patients underwent multiparametric MRI (mpMRI) for clinical suspicion of PCa. Scans were reviewed by six readers of differing expertise, with and without AI assistance. Intra- and inter-reader agreement and the impact of AI assistance on patient-level csPCa scores were assessed. Diagnostic performance metrics at the patient level and benefit-to-harm ratios were evaluated.
Results or Findings: The study included 100 patients (26% with csPCa). There was no improvement in inter-reader agreement with AI readings (0.584 vs 0.573). Residents were most likely to change PI-RADS scores with AI assistance compared to basic and expert readers (19, 9, and 7 changes, respectively). Overall, there was no significant difference in AUROC between AI-assisted and unassisted readings (0.87 vs 0.86; p = 0.734). At a PI-RADS ≥3 threshold, sensitivity was slightly lower with AI (0.87 vs 0.89), while specificity (0.73), PPV (0.53–0.54), and NPV (0.94–0.95) remained similar. Subgroup analyses showed no significant differences in diagnostic performance. A slight increase in grade selectivity and selective biopsy avoidance was observed among experts and residents, respectively, with AI-assisted readings when applying a PI-RADS cut-off of 3 or PSA density ≥0.15.
Conclusion: AI decision support does not significantly improve diagnostic accuracy for csPCa detection across readers of varying expertise, with minor impacts on benefit-to-harm ratios.
Limitations: We did not fully account for MRI-negative patients who avoided biopsies, so accuracy metrics should be interpreted relative to the evaluated cohort. Lack of per-lesion analysis.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Local IRB
6 min
Artificial Intelligence as a Gatekeeper in Prostate Cancer Imaging: Can We Avoid Unnecessary Biopsies?
Karen Andreina Bolivar Gil, Buenos Aires / Argentina
Author Block: K. A. Bolivar Gil, F. D. Losada Lopez, J. Camean, L. A. MIQUELINI; Buenos Aires/AR
Purpose: Detecting suspicious lesions suggestive of clinically significant cancer, which require biopsy for diagnosis, remains an imaging challenge. The aim of this study was to evaluate the added value of artificial intelligence in radiology reports for avoiding unnecessary biopsies.
Methods or Background: This retrospective study analysed a cohort of 141 patients who underwent prostate mpMRI between 2022 and 2025 with the assistance of an AI-based diagnostic tool. All examinations were initially interpreted by an experienced radiologist. Subsequently, patients underwent transperineal cognitive fusion biopsy (MRI–US) targeting between one and four lesions per patient, resulting in a total of 163 lesions. Histopathological evaluation of the biopsy specimens served as the reference standard.
Results or Findings: Diagnostic concordance between expert radiologists and the AI program was observed in 85 cases (60.3%); of these, 68 (80.0%) were positive for clinically significant prostate cancer (Gleason score ≥7) and 17 (20.0%) negative.
On independent analysis, expert radiologists reported 81/141 positive cases (57.4%) and 60/141 negative cases (42.6%). The AI, when limited to concordant cases, identified 68 positives (80.0%) and 17 negatives (20.0%).
AI detected suspicious lesions in 104/141 patients (73.8%). Among the 37 patients without AI-reported lesions (26.2%), histopathology was also negative in 30 (81.1%) and positive for clinically significant prostate cancer in 7 (18.9%). The AI system achieved a sensitivity of 90.7% and specificity of 45.5% for detecting clinically significant prostate cancer, with a PPV of 65.4%, NPV of 81.1%, and overall accuracy of 69.5%.
AI improved the NPV to 85% for low- to intermediate-risk lesions.
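The reported patient-level metrics are mutually consistent with a 2×2 table of 68 TP, 36 FP, 7 FN, and 30 TN; these counts are inferred here from the quoted percentages and are not stated explicitly in the abstract. A minimal sketch:

```python
def diagnostics(tp, fp, fn, tn):
    """Patient-level performance metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Counts inferred from the reported figures (68 TP, 36 FP, 7 FN, 30 TN)
m = diagnostics(tp=68, fp=36, fn=7, tn=30)
print({k: f"{v:.1%}" for k, v in m.items()})
# values match the reported 90.7% / 45.5% / 65.4% / 81.1% / 69.5%
```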
Conclusion: The AI system demonstrated a high NPV, enhancing triage of low- to intermediate-risk lesions and potentially helping to avoid unnecessary prostate biopsies.
Limitations: Due to the retrospective design, not all AI-reported positive cases underwent biopsy.
Funding for this study: No funding was provided for this study.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Adapting Generative Cross-Modality Image Translation Models for Prostate Magnetic Resonance Imaging (MRI) Denoising: A Large-Scale Study on T2-Weighted Imaging
Rafael Moreno Calatayud, Silla / Spain
Author Block: R. Moreno Calatayud, P. Rodríguez Belenguer, J. Gómez-Martínez, J. Aquerreta-Escribano, G. Ribas, A. Galiana-Bordera, L. Cerda Alberich, L. Marti-Bonmati; Valencia/ES
Purpose: Image quality is critical in prostate MRI, as noise and artifacts can obscure anatomical structures, reduce diagnostic confidence, and compromise the performance of downstream automated algorithms. This study evaluated translation-inspired generative models for denoising prostate T2-weighted MRI, aiming to improve image quality while preserving anatomical fidelity.
Methods or Background: This retrospective study included a cohort of 805 patients (>18 years old) with a pathology-confirmed diagnosis of prostate cancer who underwent T2-weighted MRI at Hospital Universitari i Politècnic La Fe from January 2015 to December 2022, divided into a training set of 700 patients and a validation set of 105 patients. Synthetic low-quality images were generated by adding Gaussian noise, bias field inhomogeneities, blurring, and ghosting, while the original scans were used as high-quality ground truths. Two state-of-the-art generative models were adapted to this framework: the edge-aware GAN (Ea-GAN) of Yu et al. and the adaptive latent diffusion model (ALDM VQGAN) of Kim et al. These models were originally developed for brain MRI, to translate T1-weighted images into T1ce, T2, and FLAIR sequences, but were repurposed here to denoise prostate T2-weighted MRI.
Results or Findings: Quantitative image quality metrics calculated on the model output against the high-quality ground-truth images yielded a peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) and normalized mean square error (NMSE) of 29.5, 0.86 and 0.04, respectively. These values indicate a strong resemblance between the reconstructed and reference images, highlighting the model's ability to reduce noise while preserving underlying anatomical structures.
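The PSNR and NMSE metrics above can be computed directly from image arrays. The sketch below is a generic numpy implementation on a synthetic Gaussian-degraded image; conventions (peak value, normalization) may differ from the study's, and SSIM is omitted since it needs a windowed implementation such as skimage's:

```python
import numpy as np

def psnr(ref, img):
    """Peak signal-to-noise ratio (dB) of img against a reference image."""
    mse = np.mean((ref - img) ** 2)
    return float(20 * np.log10(ref.max() / np.sqrt(mse)))

def nmse(ref, img):
    """Squared error normalized by the reference image energy."""
    return float(np.sum((ref - img) ** 2) / np.sum(ref ** 2))

rng = np.random.default_rng(0)
ref = rng.uniform(0, 1, size=(64, 64))               # stand-in "clean" T2w slice
noisy = ref + rng.normal(0, 0.05, size=ref.shape)    # Gaussian-degraded copy
print(f"PSNR {psnr(ref, noisy):.1f} dB, NMSE {nmse(ref, noisy):.3f}")
```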
Conclusion: This pilot study showed that generative models can improve the image quality of prostate T2-weighted MRI, highlighting potential clinical utility in the early diagnosis of prostate cancer when dealing with noisy acquisitions. Further research is required to validate the model in larger cohorts.
Limitations: No limitations were identified.
Funding for this study: Funding was provided by the Instituto de Salud Carlos III call for Research, Development, and Innovation (R&D&i) projects related to Personalized Medicine and Advanced Therapies (Transmissions Initiative), co-financed by the European Union NextGenerationEU / Recovery, Transformation and Resilience Plan (RPTR).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This study was approved by an Ethics Committee with the following reference number 2023-1107-1.
6 min
Integrating Radiomics and CNN-Based Approaches for Automated PI-RADS 3–5 Classification in Prostate MRI
Saman Fouladi, Milan / Italy
Author Block: S. Fouladi, F. Darvizeh, R. Di Meo, L. Di Palma, A. Maiocchi, E. Damiani, M. Alì, D. Fazzini, G. Gianini; Milan/IT
Purpose: Accurate classification of clinically significant prostate cancer remains challenging, particularly in distinguishing aggressive tumors from indolent ones. Although multiparametric MRI (mpMRI) has enhanced lesion detection, effective categorization using the Prostate Imaging Reporting and Data System (PI-RADS) remains complex. This study aims to develop and evaluate complementary automated approaches for PI-RADS classification, focusing on categories 3, 4, and 5, using ADC, DWI, and T2W sequences.
Methods or Background: Three approaches were investigated. First, hand-crafted radiomic features were extracted from manually segmented lesions using the PyRadiomics library. Second, we extended this approach by incorporating fully automated lesion and zonal segmentation to simulate a practical, manual-free pipeline. Third, a custom convolutional neural network (CNN) was trained on ADC images and lesion masks to learn high-level features directly. These features were subsequently used to train multiple machine learning models for multi-class PI-RADS classification.
Results or Findings: Features derived from ADC consistently achieved superior performance, with one ensemble model reaching an accuracy of 0.77 and an AUC of 0.83. Combining features from all sequences further improved robustness (accuracy = 0.73, AUC = 0.84). PI-RADS 5 classification proved most reliable (AUC ≥ 0.94), whereas PI-RADS 3 remained the most challenging to distinguish.
Conclusion: ADC-derived features are highly effective for PI-RADS classification, and integrating automated radiomic extraction with deep learning enhances robustness and practical applicability. Combining multi-sequence information and learning-based approaches offers a promising pathway for automated risk stratification in prostate cancer.
Limitations: The study is limited by the dataset size and the reliance on manually segmented lesions in some approaches, which is time-consuming. Nevertheless, the results are promising, and performance is expected to improve further with larger datasets and fully automated pipelines.
Funding for this study: The work was partially supported by the MUSA (Multilayered Urban Sustainability Action) project, funded by the European Union NextGenerationEU under the National Recovery and Resilience Plan (NRRP) Mission 4 Component 2 Investment Line 1.5: Strengthening of research structures and creation of R&D "innovation ecosystems", set up of "territorial leaders in R&D" (CUP G43C22001370007, Code ECS00000037); the "piano sostegno alla ricerca" (PSR) program and PSR-GSA-Linea 6; and the project ReGAInS (code 2023-NAZ-0207/DIP-ECC-DISCO-23), funded by the Italian University and Research Ministry within the Excellence Departments program 2023-2027 (law 232/2016).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Approval on September 11, 2024 by CET Lombardia 3 Ethical Committee (Study ID: 5105)
6 min
Diagnostic Pitfalls of AI-Assisted Multiparametric MRI in Prostate Cancer Detection: When AI Gets It Wrong
Antonella Borrelli, Rome / Italy
Author Block: A. Borrelli, L. Laschena, M. Pecoraro, S. Lucciola, E. Messina, V. Panebianco; Rome/IT
Purpose: To evaluate the diagnostic pitfalls of AI-assisted mpMRI in prostate cancer detection.
Methods or Background: Artificial intelligence (AI) applied to mpMRI is increasingly integrated into prostate cancer (PCa) workflows. Its potential in PCa detection has been investigated, but its limitations remain underexplored. MRI pitfalls are a major source of false positives: prostate mpMRI, although accurate, is influenced by anatomical, clinical, and technical factors that can mimic cancer. In this retrospective single-center study, 458 mpMRI scans were reviewed: 150 biopsy-proven PCa, 156 true negatives, and 152 scans previously identified as pitfalls. After exclusions, 362 cases were processed with the AI software. Diagnostic performance was compared with expert radiologist reports and with histopathology when available. Errors were classified as false positives, and recurring misclassification patterns were identified.
Results or Findings: In the overall cohort of 458 patients (150 cancers, 152 pitfalls, 156 negatives), AI demonstrated a sensitivity of 92.0%, but specificity was only 70.5%, leading to an accuracy of 77.5% and an AUC of 0.66. In the restricted cohort of 306 patients (150 cancers and 156 negatives, excluding pitfalls), sensitivity remained stable at 92.0%, while specificity improved to 93.6%, yielding an accuracy of 92.8% and an AUC of 0.93, in line with the literature.
BPH caused the largest number of AI errors, while DWI artifacts and ectopic BPH showed the highest misclassification rates, resembling novice-reader challenges and confirming that AI currently mirrors known interpretative limitations.
Conclusion: These results highlight the current limitations of AI tools in prostate imaging, particularly in differentiating cancer from benign mimics, reinforcing the need for expert radiologist oversight in clinical practice.
Limitations: Single-center retrospective study.
The tested AI software may not represent other platforms.
Our cohort was intentionally enriched with pitfalls and challenging cases, making it less comparable to everyday practice.
Funding for this study: None
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: