EIBIR Poster Session

EIBIR 3 - EIBIR Stage bonus session 3

March 5, 15:00 - 16:00 CET

9 min
Deep learning for differentiating Progressive Supranuclear Palsy from Corticobasal Degeneration using T1w-MRI
Radhika Juglan, Dresden / Germany
Author Block: R. Juglan1, A. Robasco1, Z. I. Carrero1, H. H. Kitzler1, D. Truhn2, J. Kather1; 1Dresden/DE, 2Aachen/DE
Purpose: Progressive Supranuclear Palsy (PSP) and Corticobasal Degeneration (CBD) are rare neurodegenerative disorders that present with overlapping clinical phenotypes, yet differ in underlying neuropathology. Accurate differentiation remains challenging with conventional MRI assessment. We investigated whether a brain MRI foundation model can enable automated and interpretable classification of PSP versus CBD.
Methods or Background: A self-supervised foundation model pre-trained on 42,000 UK Biobank T1-weighted MRIs was used as a feature extractor. A linear classification layer was trained on the 4RTNI cohort to separate PSP and CBD. Model performance was evaluated on a held-out test set with independent subjects using AUROC, AUPRC, and threshold-based diagnostic metrics. Interpretability was assessed with Grad-CAM heatmaps and atlas-based regional quantification. Longitudinal analyses examined prediction score trajectories and t-SNE embeddings across baseline, 6-month, and 12-month follow-up scans.
Results or Findings: In classifying PSP from CBD, the model achieved an AUROC of 0.78 (95% CI: 0.67–0.88) and AUPRC of 0.73 (95% CI: 0.58–0.87). At the optimal threshold determined by Youden’s J (0.53), the model achieved an accuracy of 0.75 with sensitivity of 0.78, specificity of 0.72, and F1 score of 0.75. With time progression, discrimination between the two diseases improved with AUROC increasing from 0.68 at baseline to 0.81 at 1-year follow-up, along with greater divergence in the embedding space. Grad-CAM localized highest attention to atlas-derived midbrain and thalamic structures, consistent with PSP pathology.
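The threshold selection by Youden's J mentioned above can be illustrated with a minimal sketch. It assumes a case is called positive when its score is at or above the threshold; the function name and the toy scores and labels are hypothetical and are not data from the study.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    scores: predicted probabilities; labels: 1 = positive class, 0 = negative.
    A case is called positive when score >= threshold (an assumption).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_t, best_j = None, -1.0
    # Every observed score is a candidate threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy values for illustration only.
toy_scores = [0.10, 0.30, 0.45, 0.55, 0.60, 0.80, 0.90, 0.40]
toy_labels = [0, 0, 0, 1, 1, 1, 1, 1]
threshold, j_stat = youden_threshold(toy_scores, toy_labels)
```

On the toy data the selected threshold is 0.55, where sensitivity is 4/5 and specificity is 3/3.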
Conclusion: A lightweight linear classifier built on a foundation model distinguished PSP from CBD with good accuracy. Model-derived attention maps aligned with known disease-specific neuroanatomical patterns, supporting the potential of MRI foundation models to aid stratification in rare neurodegenerative syndromes.
Limitations: The study was restricted to a single cohort.
Funding for this study: Funding was provided by the European Union EU’s Horizon Europe research and innovation programme (ODELIA, 101057091; GENIAL, 101096312), German Cancer Aid DKH (DECADE, 70115166), the German Federal Ministry of Research, Technology and Space BMFTR (PEARL, 01KD2104C; CAMINO, 01EO2101; TRANSFORM LIVER, 031L0312A; TANGERINE, 01KT2302 through ERA-NET Transcan; Come2Data, 16DKZ2044A; DEEP-HCC, 031L0315A; DECIPHER-M, 01KD2420A; NextBIG, 01ZU2402A), the German Research Foundation DFG (CRC/TR 412, 535081457; SFB 1709/1 2025, 533056198), the German Academic Exchange Service DAAD (SECAI, 57616814), the German Federal Joint Committee G-BA (TransplantKI, 01VSF21048), the European Research Council ERC (NADIR, 101114631), the National Institutes of Health NIH (EPICO, R01 CA263318) and the National Institute for Health and Care Research NIHR (Leeds Biomedical Research Centre, NIHR203331).
This work is partly supported by BMBF (Federal Ministry of Education and Research) in DAAD project 57616814 (SECAI, School of Embedded Composite AI, https://secai.org/) as part of the program Konrad Zuse Schools of Excellence in Artificial Intelligence.

This research has been conducted using the UK Biobank Resource under Application Number 92261. Data used in the preparation of this abstract were obtained from the 4-Repeat Neuroimaging Initiative (4RTNI) database and the Frontotemporal Lobar Degeneration Neuroimaging Initiative (FTLDNI) (http://4rtni-ftldni.ini.usc.edu/).
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: The overall analysis was approved by the Ethics board at University Hospital Carl Gustav Carus, Dresden, Germany. This study adhered to the tenets of the Declaration of Helsinki.
9 min
Can radiomic features derived from T1- and T2-weighted MRI improve differentiation of benign and malignant soft-tissue tumours?
Matthew Marzetti, Leeds / United Kingdom
Author Block: M. Marzetti1, M. P. A. Starmans2, P. Robinson1, D. L. Buckley1, A. Scarsbrook1, S. Klein2; 1Leeds/UK, 2Rotterdam/NL
Purpose: Soft-tissue sarcomas (STSs) are rare malignant tumours, while benign soft-tissue tumours (STTs) are common. Differentiation by imaging is challenging, often requiring invasive biopsy or resection. This study evaluated whether radiomics applied to MRI can reliably distinguish STSs from benign STTs, potentially accelerating diagnosis, reducing patient anxiety and diagnostic workload.
Methods or Background: A large retrospective dataset of 951 patients referred to a sarcoma multidisciplinary team (2007–2023) was selected. Tumours were automatically segmented using a deep-learning model on T1-weighted and T2-weighted fat-suppressed MRI, with manual corrections applied when necessary, before radiomic feature extraction. Nested cross-validation was used to train and test a logistic regression classifier. Performance was measured using area under the receiver operating characteristic curve (AUC). To reduce false negatives, a classification threshold ensuring ≥95% sensitivity was selected using the training dataset in the inner cross-validation. Final models from each outer cross-validation fold were ensembled and tested on two independent datasets:
1. Prospectively acquired data from the local centre (n=154).
2. External data from open-access sources and collaborators (n=155).
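The sensitivity-constrained threshold described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes a lesion is called malignant when its score is at or above the threshold, and the function names and toy scores are hypothetical.

```python
import math


def threshold_for_sensitivity(scores, labels, min_sensitivity=0.95):
    """Highest threshold whose sensitivity on this data is >= min_sensitivity.

    labels: 1 = malignant, 0 = benign; positive call when score >= threshold.
    """
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("no positive cases to calibrate on")
    # Number of positives that must be detected (epsilon guards float error).
    needed = max(1, math.ceil(min_sensitivity * len(pos) - 1e-9))
    return pos[needed - 1]


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity at a fixed threshold (both classes present)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


# Toy training data for illustration only: 20 malignant, 5 benign.
train_scores = [i / 20 for i in range(1, 21)] + [0.02, 0.08, 0.12, 0.2, 0.5]
train_labels = [1] * 20 + [0] * 5
thr = threshold_for_sensitivity(train_scores, train_labels)
sens, spec = sensitivity_specificity(train_scores, train_labels, thr)
```

The threshold is fixed on training data and then applied unchanged to held-out data, which is why sensitivity on an independent dataset can fall below the training target.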
Results or Findings: The model achieved a mean AUC of 0.88 (range: 0.86-0.90) across the outer folds of the nested cross-validation. A threshold was calculated that provided a sensitivity of 95% and specificity of 57%. The model performed well on the external dataset (AUC=0.84, sensitivity=94%, specificity=45%) and the prospective dataset (AUC=0.80, sensitivity=91%, specificity=49%). Further analysis demonstrated variation in model performance across STT subtypes, which was investigated.
Conclusion: Radiomics can identify a significant proportion of benign lesions while maintaining high sensitivity for malignancy (≥91%), supporting its potential to reduce diagnostic workload with minimal risk to patient safety.
Limitations: The model was developed using data from a single centre, although tested on external data. Benchmarking against radiologist performance is still required.
Funding for this study: This study/project is funded by the NIHR Doctoral Clinical and Practitioner Academic Fellowship (NIHR302901).
The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The retrospective training and test dataset received institutional and Caldicott guardian approval; formal ethics committee approval was not required because the work was considered a service evaluation project. The independent prospective dataset used for model validation was approved by Yorkshire & The Humber - South Yorkshire Research Ethics Committee (Ref 23/YH/0151).
9 min
DigitalTwin for Breast Cancer Risk Monitoring from Mammography
Alberto Mosconi, Milano / Italy
Author Block: F. Darvizeh1, A. Mosconi2, M. Interlenghi2, A. Venturi2, C. Salvatore2, M. Alì1, S. Papa1, I. Castiglioni1, D. Fazzini1; 1Milan/IT, 2Milano/IT
Purpose: To validate a Digital Twin (DT) platform for the automatic monitoring of Breast Cancer (BC) risk in patients subjected to mammography.
Methods or Background: Female patients undergoing mammography for assessing BC risk, between November 2024 and May 2025, were included. The vendor-neutral DT platform (Trace4DigitalTwin™) was integrated in RIS-PACS of 3 centers (CDI-Centro Diagnostico Italiano, SME Varese, and Bionics; 2 mammography systems from 2 vendors). The platform includes a deep-learning model for automatic ACR breast density prediction. DT-predicted breast density was compared with radiologist classification.
Results or Findings: The DT automatically monitored 14,736 patients, for a total of 29,472 images (Medio-Lateral-Oblique projections). Agreement with radiologists’ classification was 81.2% for DT-predicted breast density in the ACR four classes A, B, C, and D (Cohen's kappa 62.6%). ACR BI-RADS 1 was 79.7% in DT-predicted class A, 67.9% in B, 65.8% in C, 66.1% in D. BI-RADS 2 was 18.3% in DT-predicted class A, 28.6% in B, 31.0% in C, and 30.0% in D. No BI-RADS 5 was found in DT-predicted class A. The DT-predicted D-to-B ratio of BI-RADS 5 was more than 200%.
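For readers unfamiliar with the agreement statistic, a minimal sketch of unweighted Cohen's kappa from a confusion matrix follows. The abstract does not state whether a weighted variant was used, so the unweighted form is an assumption, and the 4x4 matrix is toy data, not the study's counts.

```python
def cohen_kappa(confusion):
    """Unweighted Cohen's kappa for a square confusion matrix.

    confusion[i][j] = number of cases rated class i by rater 1
    and class j by rater 2.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Toy counts for density classes A-D (rows: model, columns: radiologist).
toy_matrix = [
    [20, 5, 0, 0],
    [5, 25, 5, 0],
    [0, 5, 20, 5],
    [0, 0, 5, 5],
]
kappa = cohen_kappa(toy_matrix)
```

In the toy matrix the raw agreement is 70% while kappa is about 0.58, which shows why kappa is lower than percentage agreement when class frequencies are uneven.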
Conclusion: We demonstrated that a RIS-PACS-integrated DT platform, including a deep-learning model for automatic breast density and ACR BI-RADS reporting, is feasible and supports efficient and standardized breast patient monitoring and breast cancer risk assessment.
Limitations: Risk stratification considering other risk factors should be included.
Funding for this study: No funding was received for this study.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: N/A
9 min
CNNs for Automated Detection of Lung Emphysema in Chest CT studies
Matteo Interlenghi, Milan / Italy
Author Block: F. Darvizeh, E. Schiavon, M. Interlenghi, A. Mosconi, A. Lad, A. Venturi, C. Salvatore, D. Fazzini, I. Castiglioni; Milan/IT
Purpose: Development and evaluation of Convolutional Neural Networks (CNNs) for automatic detection of pulmonary emphysema from chest CT studies.
Methods or Background: Chest CT studies from two radiology departments (Centro Diagnostico Italiano-CDI Milano n=320, Centro SME Varese n=90) were retrospectively collected (StudyID 1944 approved 8/2/2021). Patients signed informed consent.
A pipeline was developed including automatic image preprocessing and classification via CNNs mimicking visual assessment of radiologists for emphysema or non-emphysema detection.
Four ResNet-18 models were trained, each with a different image input: (1) 2D lung-density maps of segmented lungs (Otsu thresholding) with color-windowing in three density zones, low (<−950 HU), normal (−950 to −810 HU), and high (>−810 HU), and automatic selection of the most representative coronal slice (ResNet2DM); (2) full 3D lung-density maps (ResNet3DM); (3) full 3D-CT volumes without lung segmentation (ResNet3D); and (4) full 3D-CT volumes with lung segmentation (ResNet3DSL). A preprocessing window (1500 HU width, −600 HU level) was applied for (3) and (4).
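The density zones and the intensity window described above can be sketched per voxel as follows. This is an illustrative sketch only: the function names are hypothetical, and rescaling the windowed value to the range 0 to 1 is an assumption, since the abstract gives only the window width and level.

```python
def density_zone(hu):
    """Map a Hounsfield unit value to one of the three density zones."""
    if hu < -950:
        return "low"
    if hu <= -810:
        return "normal"
    return "high"


def apply_window(hu, level=-600.0, width=1500.0):
    """Clip to the window [level - width/2, level + width/2], rescale to [0, 1].

    The rescaling to [0, 1] is an assumption made for this sketch.
    """
    lo = level - width / 2.0  # -1350 HU with the default window
    hi = level + width / 2.0  # +150 HU with the default window
    clipped = min(max(hu, lo), hi)
    return (clipped - lo) / (hi - lo)


zones = [density_zone(v) for v in (-960, -950, -810, -800)]
windowed = [apply_window(v) for v in (-2000, -1350, -600, 150)]
```

Values at the zone boundaries (−950 HU and −810 HU) fall in the normal zone, matching the closed interval given in the text.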
Cases were divided into training (146-141, emphysema and non-emphysema, respectively), validation (41-40), and external-testing (22-21) sets, with ground truths assigned by board-certified radiologists and a trained engineer, based on diagnostic reports. McNemar’s test was used to compare the performance (sensitivity, specificity) of the CNNs.
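As an illustration of how two classifiers can be compared on the same cases with McNemar's test, a minimal sketch of the exact (binomial) form follows. The abstract does not state which variant of the test was used, so the exact form is an assumption, and the function names and toy predictions are hypothetical.

```python
from math import comb


def discordant_counts(truth, pred_a, pred_b):
    """Count cases where exactly one of the two models is correct."""
    b = sum(1 for t, a, m in zip(truth, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(truth, pred_a, pred_b) if a != t and m == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree in correctness
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Toy labels and predictions for illustration only.
toy_truth = [1, 1, 1, 0, 0, 0]
toy_a = [1, 1, 0, 0, 0, 1]
toy_b = [1, 0, 0, 0, 1, 1]
b_count, c_count = discordant_counts(toy_truth, toy_a, toy_b)
p_value = mcnemar_exact(1, 9)
```

Only the discordant cases enter the test; cases that both models classify correctly, or both incorrectly, carry no information about which model is better.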
Results or Findings: The performance achieved was: ResNet2DM 0.77 sensitivity, 0.86 specificity; ResNet3DM 0.82 and 0.81; ResNet3D 0.86 and 0.81; ResNet3DSL 0.77 and 0.76.
No significant differences in performance were found between ResNet3D and ResNet3DM (best models). The performance of ResNet2DM and ResNet3DSL was significantly inferior. Radiologists preferred ResNet3DM for its higher explainability.
Conclusion: Lung emphysema can be automatically detected with 3D CNNs on chest CTs. 3D ResNet-18 is a valid solution, with high explainability when used with lung-density coloured-maps.
Limitations: Increasing the sample size with data from multiple centers is recommended for further validation.
Funding for this study: No funding was received for this study.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: N/A
9 min
Development of an auto-segmentation model for dose accumulation in repeated liver brachytherapy
Anna Sophie Duque, Munich / Germany
Author Block: A. S. Duque, M. Rottler, P. Rogowski, F. Fuchs, C. C. Cyran, J. Ricke, M. Seidensticker, C. Kurz, S. Corradini; Munich/DE
Purpose: In order to assess clinical outcome of repeated CT-guided high-dose-rate liver brachytherapy administered over several years, accurate calculation of accumulated dose is needed. To drive image registration, an AI model was developed to facilitate liver segmentation while minimizing distortions by brachytherapy catheters.
Methods or Background: For 35 patients with multiple liver brachytherapy sessions, first-session planning CTs were segmented using commercial AI algorithms for radiotherapy organs-at-risk delineation (algorithms A, B). Resulting liver contours were corrected by experienced radiation oncologists, serving as ground truth. The patient set was split into training and validation set (n=30) and test set (n=5). Training was performed using nnU-Net with five-fold cross-validation. Ground truth contours were compared to algorithms A, B, an additional pre-trained open-source AI algorithm (C) and the resulting custom model (D*) in terms of Dice Similarity Coefficient (DSC) and 95th percentile Hausdorff Distance (HD95).
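The two comparison metrics named above can be sketched on small voxel sets as follows. This is a simplified illustration: HD95 is usually computed on surface points and in millimetres, whereas this sketch uses all voxel coordinates in voxel units, a nearest-rank percentile, and the maximum of the two directed distances. These choices and the function names are assumptions, not the study's implementation.

```python
import math


def dice(a, b):
    """Dice Similarity Coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def _directed_distances(a, b):
    """Distance from each point of a to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def _percentile95(values):
    """95th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # ceil(0.95 * n)
    return ordered[rank - 1]


def hd95(a, b):
    """Symmetric 95th percentile Hausdorff distance (both sets non-empty)."""
    return max(
        _percentile95(_directed_distances(a, b)),
        _percentile95(_directed_distances(b, a)),
    )


# Toy 2D masks for illustration only: two 2x2 squares offset by one column.
mask_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
mask_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
dsc = dice(mask_a, mask_b)
hd = hd95(mask_a, mask_b)
```

DSC rewards volumetric overlap, while HD95 penalises local boundary errors, which is why a contour distorted around a catheter can score well on DSC yet poorly on HD95.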
Results or Findings: D* showed higher robustness towards brachytherapy catheters than the other algorithms (Fig. 1). After cross-validation, a mean DSC of 0.965 and a mean HD95 of 4.1 mm were reached by D*. After applying AI algorithms A-C to the validation set, mean DSCs ranged from 0.921 (B) to 0.947 (C) (Fig. 2). Mean HD95 ranged from 7.4 mm (C) to 12 mm (A) (Fig. 3). On the test set, the DSC was 0.969 (D*), 0.954 (A), 0.968 (B) and 0.962 (C). Mean HD95 was 3.1 mm (D*), 7.0 mm (A), 4.0 mm (B) and 5.4 mm (C).
Conclusion: A custom model for liver brachytherapy segmentation was developed, providing contours less disrupted by catheters compared to existing auto-segmentation algorithms.
Limitations: The patient data sets were comparatively small. Since clinical contours were obtained by correcting results of algorithms A and B, the comparison could be biased.
Funding for this study: No funding or industrial support was received for this study.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
9 min
Developing an Artificial Intelligence Algorithm for T-staging Rectal Cancer MRI: a Pilot Study
Seema Toso, Geneva / Switzerland
Author Block: S. Muncner1, A. Wahd1, N. Frymire1, R. MacEwan1, S. Toso2, S. Liu1, A. Hareendranathan1, H. Wang1, J. L. Jaremko1; 1Edmonton/CA, 2Geneva/CH
Purpose: Pelvic MRI is the preferred imaging modality for staging rectal cancer (RC). Accurate staging is critical for treatment decision-making, and determines whether patients require neoadjuvant treatment. However, interpretation can be challenging despite review by multidisciplinary teams (MDT). Techniques to improve MRI staging may improve RC patient outcomes. In this pilot study, we evaluated the performance of a novel AI algorithm for RC MRI interpretation.
Methods or Background: Two expert interpreters labelled RC T-stage on 99 2D MRI images each (1.5T, T2-weighted) from 34 unique patients. 156 images (27 patients) were allocated for training and 42 images (seven patients) for testing, to prevent data leakage. An adapted version of Meta’s SAM2 foundation model was used to perform visual in-context learning (ICL) for image segmentation. Dice scores and diagnostic performance of AI T-stage (≥T3) prediction vs. human labels were calculated. Finally, an expert user awarded AI results a qualitative score (grade-A: incorrect T-stage; grade-B: correct T-stage, incorrect contours; grade-C: correct T-stage, correct contours).
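The patient counts above suggest that images were split at the patient level, so that no patient contributes images to both sets. A minimal sketch of such a split follows; the function name, the random assignment and the toy identifiers are assumptions for illustration, not the study's procedure.

```python
import random


def split_by_patient(image_patient_ids, n_test_patients, seed=0):
    """Split image indices so that each patient falls in exactly one set.

    image_patient_ids: one patient identifier per image, in image order.
    Returns (train_indices, test_indices).
    """
    patients = sorted(set(image_patient_ids))
    rng = random.Random(seed)  # fixed seed for a reproducible split
    rng.shuffle(patients)
    test_patients = set(patients[:n_test_patients])
    train_idx = [i for i, p in enumerate(image_patient_ids)
                 if p not in test_patients]
    test_idx = [i for i, p in enumerate(image_patient_ids)
                if p in test_patients]
    return train_idx, test_idx


# Toy identifiers for illustration only: 6 patients with 3 images each.
toy_ids = [f"patient_{i // 3}" for i in range(18)]
train_idx, test_idx = split_by_patient(toy_ids, n_test_patients=2)
train_patients = {toy_ids[i] for i in train_idx}
held_out_patients = {toy_ids[i] for i in test_idx}
```

Splitting by image instead of by patient would let near-identical slices from one patient appear in both sets and inflate test performance.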
Results or Findings: Of 42 test images, average Dice score was 63.6% (normal bowel 73.09%, T1/2 tumor 74.91%, T3 tumor 22.03%). For ≥T3-stage, average accuracy was 79%, and sensitivity and specificity averaged 75.13% and 87.50% respectively. On qualitative assessment, AI results were high-quality in 79% of cases (35.7% grade-3 predictions (n=15), 42.9% grade-2 (n=18)).
Conclusion: This pilot study demonstrates remarkably strong preliminary AI performance with 79% accuracy for T-stage.
Limitations: This study had a very small training set (156 images/27 patients). Ongoing testing including multicentre training data will improve the accuracy of the algorithm.
Funding for this study: Funding was provided by the Clinician Investigator Program (University of Alberta, Edmonton, Canada).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: University of Alberta Research Ethics Board Pro00076657