Research Presentation Session: Artificial Intelligence and Imaging Informatics

RPS 205 - Improving diagnosis and prognosis through AI in CNS diseases

March 4, 10:00 - 11:00 CET

6 min

Assessment of the stability of intracranial aneurysms using a deep learning model based on computed tomography angiography

Lu Zeng, Chongqing / China

Author Block: L. Zeng; Chongqing/CN
Purpose: The aim of this study was to construct a deep learning model (DLM) to identify unstable aneurysms on computed tomography angiography (CTA) images.
Methods or Background: The clinical data of 1041 patients with 1227 aneurysms were retrospectively analyzed from August 2011 to May 2021. Patients with aneurysms were divided into unstable (ruptured, evolving and symptomatic aneurysms) and stable (fortuitous, nonevolving and asymptomatic aneurysms) groups and randomly divided into training (833 patients with 991 aneurysms) and internal validation (208 patients with 236 aneurysms) sets. One hundred and ninety-seven patients with 229 aneurysms from another hospital were included in the external validation set. Six models based on a convolutional neural network (CNN) or logistic regression were constructed on the basis of clinical, morphological and deep learning (DL) features. The area under the curve (AUC), accuracy, sensitivity and specificity were calculated to evaluate the discriminating ability of the models.
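As a minimal sketch of how the four reported metrics can be computed from binary labels (1 = unstable) and model scores, the snippet below uses only the Python standard library. The function names, the 0.5 decision threshold, and the toy data are illustrative assumptions and not the authors' pipeline; the AUC is computed in its rank-based (Mann-Whitney) form.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Return (tp, fp, tn, fn) for binary labels and scores at a threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 0 and pred == 1:
            fp += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def basic_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity (assumes both classes are present)."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = unstable aneurysm, 0 = stable aneurysm.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
print(basic_metrics(labels, scores))
print(auc(labels, scores))
```

Unlike the thresholded metrics, the AUC does not depend on a cut-off, which is one reason it is commonly used to compare models across validation sets.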
Results or Findings: The AUCs of Models A (clinical), B (morphological) and C (DL features from the CTA image) in the external validation set were 0.5706, 0.9665 and 0.8453, respectively. The AUCs of Model D (clinical and DL features), Model E (clinical and morphological features) and Model F (clinical, morphological and DL features) in the external validation set were 0.8395, 0.9597 and 0.9696, respectively.
Conclusion: The CNN-based DLM, which integrates clinical, morphological and DL features, outperforms the other models in predicting intracranial aneurysm (IA) stability. The DLM has the potential to assess IA stability and support clinical decision-making.
Limitations: The maximum slices of aneurysm images were used to construct 2D models, and some important factors that may affect aneurysm stability may have been missed, possibly leading to bias. More advanced DL algorithms are needed.
Funding for this study: This study was supported by the Science and Technology Commission of Chongqing City, China (CSTB2023NSCQ-MSX0668), the Joint Project of Science and Health of Chongqing City, China (2023MSXM022) and the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202200407).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The local ethics committee (Banan Hospital, 2021015; Xinqiao Hospital, 202248201) approved this retrospective study and agreed to the informed consent waiver of patients.

6 min

Glioblastoma Response Prediction and Tumor and Organ-at-Risk Segmentation with Radiomics and Deep Learning

Alejandro Mora Rubio, Valencia / Spain

Author Block: A. Mora Rubio¹, C. Bravo Vergara¹, M. Beser-Robles¹, G. Ribas¹, P. Garcia Verdu¹, I. Popp², A. L. Grosu², L. Marti-Bonmati¹, M. Carles Fariña¹; ¹Valencia/ES, ²Freiburg/DE
Purpose: The current standard treatment for Glioblastoma Multiforme (GBM) involves radiation therapy (RT) and requires manual tumour segmentation, which is labour-intensive and susceptible to inter-observer variability. Additionally, given the high recurrence rate of GBM patients, accurate response prediction methods can help to improve patient prognosis stratification and optimize treatment plans. The aim of this study is to develop and evaluate automatic segmentation methods based on deep learning and assess the ability of mathematical models employing clinical and radiomics features (RF) to improve the accuracy of response prediction.
Methods or Background: The study included 253 patients with primary/recurrent GBM, recruited prospectively (185) or retrospectively at 13 institutions. The open-source BraTS2021 cohort of primary glioma patients was also used. The open-source nnU-Net framework was used to develop segmentation models for Enhancing Tumour, Peritumoral Edema, RT Planning Target Volume and six Organs-at-Risk, based on neuroradiologist and radiation oncologist segmentations. For response prediction in recurrent GBM in the prospective cohort, Cox proportional hazards and logistic regression models were applied to predict overall survival, time to progression, and early recurrence.
Results or Findings: The segmentation models achieved good to excellent performance, with an average Dice similarity coefficient (DSC) in the test set of 0.79 (0.69-0.98). Clinical and MR-RF showed significant discrimination between patients with early and late progression on validation and test sets (p < 0.05 in Kaplan-Meier curves). Wavelet transform RF and clinical features like age, methylation status, and tumour localisation were notably significant predictors.
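For readers unfamiliar with the overlap metric reported above, here is a minimal sketch of the Dice similarity coefficient for two binary masks, written with the Python standard library only. The flat 0/1 list representation, the function name, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not describe how the score was implemented.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical): 3 foreground voxels each, 2 of them shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(dice(pred, truth))  # 2 * 2 / (3 + 3)
```

A score of 1 means the automatic and manual contours coincide exactly, and 0 means they do not overlap at all.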
Conclusion: The good performance of the automatic segmentation models supports their use in clinical workflows to simplify procedures, reduce time investment, and increase robustness. Radiomics analysis suggested that the MR-RF model has potential for predicting time to progression.
Limitations: The complementary information provided by FET-PET has not yet been incorporated; this is the subject of ongoing work.
Funding for this study: This work was supported by the MATTO-GBM Project under the European TRANSCAN-3 ERA-NET 2022 Program, funded by ISCIII (AC23_1/00012), FAECC (TRANSCAN2022-784-104), and the European Union through the Next Generation EU Funds. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: All patients gave written informed consent according to institutional and federal guidelines. All study protocols were approved by the corresponding ethics committee.

6 min

Diagnostic accuracy of a commercial AI tool for intracranial large- and medium-vessel occlusion detection in a multicenter emergency-CTA cohort

Henrik Andersson, Lund / Sweden

Author Block: H. Andersson, B. Hansen, J. Wassélius; Lund/SE
Purpose: To evaluate the diagnostic accuracy of an AI tool (AIDOC VO), the first commercial AI tool designed to detect both large (LVO) and medium vessel occlusions (MeVO) on head‑and‑neck CT angiography (CTA), in a region-wide multicenter emergency setting.
Methods or Background: Prospective diagnostic‑accuracy study of consecutive emergency CTAs from 3 031 adults (mean age 67 years; 51 % women), acquired 1 March - 8 July 2024 across a ten-hospital healthcare region. AI analysed each scan; the routine radiology report served as comparator. Examinations flagged positive or doubtful by either test underwent rereading by interventional neuroradiologists to establish the reference standard. Sensitivity, specificity, predictive values, and accuracy were calculated for the primary VO analysis, with prespecified sub‑analyses for LVO and MeVO. Paired differences were tested with McNemar’s test.
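The paired comparison named above can be sketched as follows. McNemar's test uses only the discordant pairs: examinations where one reader (AI or radiology report) was correct and the other was not. The sketch computes the exact (binomial) two-sided p-value with the standard library; whether the authors used the exact or the chi-square form is not stated, and the function name and example counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases where only the first test was correct
    c: cases where only the second test was correct
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least as extreme as k.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, not taken from the study.
print(mcnemar_exact(3, 7))
```

Because concordant pairs do not enter the calculation, two tests with very similar sensitivity can still differ significantly if their errors fall on different examinations, and vice versa.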
Results or Findings: Of 3 031 CTAs, 2 804 (92.5 %) yielded valid AI output, among which VO was identified in 224 (8 %) examinations. VO sensitivity/specificity were 81.7%/99.6% for AI versus 81.2%/99.3% for the clinical radiology report (p = 0.91/p = 0.12). LVO sensitivity was 92.8% versus 87.0% (p=0.42); MeVO 76.1% versus 79.2% (p=0.55). Paired overall accuracy showed no significant differences (VO p=0.38; LVO p=0.06; MeVO p=0.76). AI identified VO in 42 examinations missed by radiologists (18.8% enhanced detection; 15 per 1,000) and generated 11 false alerts (3.9 per 1,000).
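The proportions and per-1,000 rates quoted above can be reproduced from the reported counts. The short check below assumes that the per-1,000 figures use the 2 804 examinations with valid AI output as the denominator, which the abstract does not state explicitly; the helper names are illustrative.

```python
# Counts quoted in the Results above.
TOTAL_CTAS = 3031
VALID_AI_OUTPUT = 2804
VO_POSITIVE = 224
AI_ONLY_DETECTIONS = 42  # VO identified by AI but missed by radiologists
FALSE_ALERTS = 11


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


def per_thousand(count, total):
    """Rate of `count` events per 1,000 examinations."""
    return 1000.0 * count / total


valid_share = percent(VALID_AI_OUTPUT, TOTAL_CTAS)             # about 92.5 %
vo_prevalence = percent(VO_POSITIVE, VALID_AI_OUTPUT)          # about 8 %
enhanced_share = percent(AI_ONLY_DETECTIONS, VO_POSITIVE)      # 18.75 %
# Assumed denominator: examinations with valid AI output.
enhanced_rate = per_thousand(AI_ONLY_DETECTIONS, VALID_AI_OUTPUT)
false_alert_rate = per_thousand(FALSE_ALERTS, VALID_AI_OUTPUT)

print(valid_share, vo_prevalence, enhanced_share, enhanced_rate, false_alert_rate)
```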
Conclusion: Stand-alone AI matched radiologist performance, with a similar number of enhanced detections and few false alerts, supporting complementary use in emergency stroke workflows.
Limitations: About 7.5% of CTAs failed quality control, only positive/doubtful cases were reread so some false negatives may have been missed, AI heat-map localization was not verified, and effects of AI assistance on readers were not assessed.
Funding for this study: Funding was provided by the Crafoord Foundation, VINNOVA, and SUS Stiftelser & Fonder; sponsors had no role in the study.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The Swedish Ethical Review Authority approved the study (#2023-00387-01) and waived informed consent.

6 min

Development of predictive models to identify the intracranial aneurysm responsible for subarachnoid hemorrhage in patients with multiple saccular aneurysms

Guangxian Wang, Chongqing / China

Author Block: G. Wang; Chongqing/CN
Purpose: To develop and test machine learning (ML) models using CTA to identify the intracranial aneurysm (IA) responsible for subarachnoid hemorrhage (SAH) accurately in patients with multiple saccular IAs and to determine whether these models outperform traditional predictive markers.
Methods or Background: A total of 207 SAH patients with 460 IAs from four hospitals were included and randomly divided into training (80%) and internal validation (20%) sets. Additionally, an external validation set comprising 65 patients with 147 IAs from four other hospitals was used. The predictive models were developed using ML methods that integrated the morphological features of IAs (e.g., size and shape) to identify the responsible IA. These models were then compared with traditional predictive markers that rely on hemorrhage patterns and the maximum IA size.
Results or Findings: The areas under the curves (AUCs) for the hemorrhage patterns and the maximum IA size were 0.496–0.505, 0.502–0.523, and 0.488–0.498 in the training, internal validation, and external validation sets, respectively. Among the 13 ML models, the best-performing models were the Gaussian process, logistic regression, and quadratic discriminant analysis models, with AUCs of 0.912, 0.894, and 0.890, respectively, for the training set; 0.869, 0.872, and 0.853, respectively, for the internal validation set; and 0.898, 0.892, and 0.897, respectively, for the external validation set. DeLong tests revealed no significant differences among these models, but all the models outperformed traditional predictive markers (P<0.001).
Conclusion: ML models that integrate multiple morphological features can predict the IA responsible for SAH accurately in patients with multiple IAs. These models outperform traditional predictive markers in identifying the responsible IA, thereby facilitating prompt and effective treatment.
Limitations: The results may not be applicable for predicting the rupture risk of unruptured IAs or for patients with a single IA.
Funding for this study: This study was supported by the Science and Technology Commission of Chongqing City, China (CSTB2023NSCQ-MSX0668) and the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202200407).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This multicenter study received approval from the ethics committees of our hospitals. Since this was a retrospective study, informed consent from patients was not needed.

6 min

Diffusion-based deep learning model for automated detection of focal cortical dysplasia using multi-modality PET-MRI

Maha Alshammari, London / United Kingdom

Author Block: M. Alshammari¹, M. Yakubu¹, E. Guedj², N. Girard², J. Cardoso¹, A. Hammers¹; ¹London/UK, ²Marseille/FR
Purpose: Co-registration of FDG PET and MRI has been shown to improve detectability of epileptogenic lesions in focal cortical dysplasia (FCD). In previous work, we generated pseudo-normal PET using a diffusion model, which achieved high sensitivity but resulted in an average of 22 false positives per case. Here, we integrate multi-modality data (co-registered PET and MRI) for the generation of pseudo-normal PET.
Methods or Background: A weakly-supervised 2D-UNet-based diffusion model with dual input channels (co-registered PET and MRI) was trained on 280 slices from 35 healthy controls, with synthetic FCD lesions in 140 slices (50%). The model was trained using 5-fold cross-validation, yielding 5 independent models. Each test case was processed through all 5 cross-validated models at 2 different noise levels with 5 samples each, generating 50 pseudo-normal reconstructions. Deviations between original and pseudo-normal images were quantified as voxel-wise robust Z-scores, enhanced using Probabilistic Threshold-Free Cluster Enhancement (pTFCE), and thresholded at an FWER-corrected value. Performance was evaluated on 10 independent synthetically lesioned test cases.
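The ensemble size and the deviation score described above can be illustrated for a single voxel. The sketch assumes the common median/MAD definition of a robust Z-score (with the 1.4826 factor that makes the MAD comparable to a standard deviation under normality); the abstract does not give the authors' exact formula, and the function name, the small `eps` stabiliser, and the toy values are hypothetical.

```python
from statistics import median

# Ensemble described in the Methods: 5 models x 2 noise levels x 5 samples.
FOLDS, NOISE_LEVELS, SAMPLES = 5, 2, 5
N_RECONSTRUCTIONS = FOLDS * NOISE_LEVELS * SAMPLES  # 50


def robust_z(observed, reconstructions, eps=1e-6):
    """Robust Z-score of one observed voxel value against its pseudo-normal samples.

    Assumed definition: (observed - median) / (1.4826 * MAD + eps).
    """
    med = median(reconstructions)
    mad = median([abs(r - med) for r in reconstructions])
    return (observed - med) / (1.4826 * mad + eps)


# Toy pseudo-normal values for one voxel (hypothetical, fewer than 50 for brevity).
recon = [1.0, 2.0, 3.0, 4.0, 5.0]
print(N_RECONSTRUCTIONS)
print(robust_z(0.0, recon))  # observed value well below the pseudo-normal median
```

Using the median and MAD rather than the mean and standard deviation keeps a few outlying reconstructions from inflating the spread and masking a genuine deviation.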
Results or Findings: The multi-modality model achieved 80% sensitivity (8/10 lesions detected) with 2.0±2.72 false-positive detections per case. This represents an 11-fold improvement in specificity compared to our previous PET-only approach while maintaining comparable detection sensitivity.
Conclusion: The multi-modality diffusion model successfully detects FCD lesions with dramatically improved specificity. The reduction from 22 to 2 false positives per case demonstrates significant clinical potential for automated epileptogenic zone localisation in presurgical epilepsy evaluation.
Limitations: Validation was performed on synthetically lesioned cases only. Further evaluation with clinical FCD cases and larger diverse datasets is required to establish robust real-world performance.
Funding for this study: The School of Biomedical Engineering and Imaging Sciences is supported by the Wellcome EPSRC Centre for Medical Engineering at King’s College London (WT 203148/Z/16/Z) and the Department of Health via the National Institute for Health Research (NIHR) comprehensive Biomedical Research Centre award to Guy’s & St Thomas’ NHS Foundation Trust in partnership with King’s College London and King’s College Hospital NHS Foundation Trust. MA is supported by the Saudi Arabia Cultural Bureau in London under the Saudi scholarship program.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:

6 min

Unmasking Bias in BrainAGE: The Role of Scanner Manufacturer, Field Strength and Sequence Parameters

Christian Bijan Fink, Düsseldorf / Germany

Author Block: V. L. Ivan¹, M. Vach¹, J. Caspers¹, D. Weiß¹, C. B. Fink¹, D. M. Hedderich², C. Rubbert¹; ¹Düsseldorf/DE, ²Munich/DE
Purpose: Brain Age Gap Estimation (BrainAGE) has emerged as a promising biomarker of brain health and disease. Yet, systematic bias may arise not only from study design but also from technical factors such as scanner manufacturer, field strength, or sequence parameters. This study aimed to quantify their influence on BrainAGE in healthy subjects.
Methods or Background: We included 2,414 cognitively normal participants from four population-based studies (ADNI, HCPA, OASIS3, PPMI). BrainAGE was computed using a standardized pipeline (CAT12/SPM12 preprocessing, PCA, Gaussian process regression) and a model trained on 2,953 independent controls. Differences in BrainAGE were assessed across cohorts, scanner manufacturers, field strengths (1.5T vs 3T), and T1-sequence acceleration using Welch’s t-tests or ANOVA with Tukey post-hoc tests. Mean absolute error (MAE) was also calculated.
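The quantities compared above can be sketched in a few lines. BrainAGE is the predicted brain age minus the chronological age, the MAE is the mean of the absolute gaps, and Welch's t-test compares two group means without assuming equal variances. The snippet uses the standard library only and returns the t statistic and Welch-Satterthwaite degrees of freedom; it does not compute a p-value, and the function names and toy numbers are illustrative assumptions rather than the study's pipeline.

```python
from math import sqrt
from statistics import mean, variance


def brain_age_gap(predicted, chronological):
    """BrainAGE per subject: predicted age minus chronological age (years)."""
    return [p - c for p, c in zip(predicted, chronological)]


def mean_absolute_error(gaps):
    """Mean absolute BrainAGE across subjects."""
    return mean([abs(g) for g in gaps])


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy values (hypothetical), e.g. gaps from two scanner groups.
gaps = brain_age_gap([60.0, 70.0, 65.0], [62.0, 75.0, 64.0])
print(gaps, mean_absolute_error(gaps))
print(welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A negative mean BrainAGE with a large MAE, as reported for the 3T scans, indicates a systematic offset rather than random scatter around zero.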
Results or Findings: Mean BrainAGE differed significantly between cohorts (ADNI: –5.9±5.5 years, HCPA: –4.1±6.2, OASIS3: –4.8±5.4, PPMI: –3.0±5.8; p<0.0001). Field strength had a strong impact: 1.5T scans yielded smaller BrainAGE (–2.2±4.8 years, MAE ~4.2) compared to 3T scans (–5.1±5.6, MAE ~6.7; p<0.05 across cohorts). Scanner manufacturer also mattered: in ADNI, Philips scanners produced a significantly smaller BrainAGE (–4.9±5.1, MAE 6.0) compared to GE (–6.3±6.2, MAE 7.3; p=0.008), while in PPMI Siemens differed from GE (p=0.008). By contrast, accelerated vs. unaccelerated 3T T1 acquisitions showed no significant effect (p=0.47).
Conclusion: Scanner manufacturer and field strength systematically bias BrainAGE in healthy cohorts, while sequence acceleration does not. These findings highlight that BrainAGE is not scanner-independent and emphasize caution when comparing results across studies or pooling multi-cohort data.
Limitations: The retrospective use of heterogeneous cohort data with only limited scanner parameter detail may confound effects of acquisition protocols with cohort-specific factors.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Positive ethical approval

6 min

Synergistic A.I.: Unique Approach of Dual Algorithm Integration Transforms Trauma Imaging Workflow

Carolien Margot Toxopeus, Amsterdam / Netherlands

Author Block: C. M. Toxopeus, D. Duyndam, M. Gorzeman, A. Driessen; Amsterdam/NL
Purpose: Increasing CT scan demand, particularly for brain and cervical spine trauma imaging, places significant pressure on emergency departments (ED). OLVG Hospital began evaluating algorithms from AIDOC for these scans in 2021. Previous studies in our hospital demonstrated that AIDOC implementation significantly reduced trauma patient throughput times for CT brain in the context of trauma. Since these scans are often combined with CT cervical spine, we investigated the potential for further ED throughput time reduction through a cervical spine fracture algorithm. Algorithm accuracy was thoroughly tested before release to emergency physicians.
Methods or Background: We analyzed 2,564 cases (June 2022 - June 2023) to assess algorithm accuracy. Training modules were developed for emergency physicians. Given lower sensitivity for cervical fractures in severe arthrosis, algorithm use was restricted to patients <65 years, excluding high-energy trauma or neurological deficits. Emergency physicians could consult radiologists for excluded categories. During pilot testing (December 2023 - April 2024), emergency physicians used the cervical algorithm during shifts.
Results or Findings: The accuracy study yielded a positive predictive value >50% and a negative predictive value >99.45%, and the algorithm was deemed safe within the context of our Level II trauma center. Emergency physician evaluation of cervical spine algorithm use was consistently positive, leading to definitive integration into our collaborative ED workflow.
Conclusion: This study demonstrates that combined application of AIDOC algorithms for CT brain and cervical spine in a Level II trauma center, together with emergency physician training: 1) is safe, 2) results in significant workload reduction for radiologists during shifts, 3) increases efficiency of patient flow in the ED, and 4) has a positive effect on collaboration between radiologists and clinicians.
Limitations: This study was conducted at a Level II trauma center, which may limit generalizability to other trauma centers.
Funding for this study: Innovation Fund of OLVG Hospital.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: