Research Presentation Session: Artificial Intelligence and Imaging Informatics

RPS 1305 - The radiomics reality check: building robust biomarkers for thoracic and abdominal disease

March 6, 09:30 - 11:00 CET

6 min

Robustness of Radiomics in Dual-Layer Spectral CT: A Phantom Study on the Impact of Acquisition and Reconstruction Parameters

Jinyi Jiang, Hangzhou City, Zhejiang Province / China

Author Block: J. Jiang¹, L. Shi¹, M. Lin¹, C. Huang¹, Y. Wang², T. OuYang¹, Q. Zhou¹, J. Hu¹, Y. Zhou¹; ¹Hangzhou/CN, ²Shanghai/CN
Purpose: To assess the impact of acquisition and reconstruction factors on the robustness of radiomics within Dual-Layer Spectral CT.
Methods or Background: A chest phantom containing 12 pulmonary nodules was scanned with different acquisition and reconstruction factors, including tube voltage (120 kV vs 100 kV), tube current (10–90 mAs vs 100 mAs), slice thickness (0.67 mm, 3 mm, 5 mm vs 1 mm), iterative reconstruction level (idose0–idose7 vs idose4), reconstruction kernel (smooth [A], standard [B], sharp [C], lung [E], y-Sharp [YA] vs y-Detail [YB]), collimation (16 × 0.625, 32 × 0.625, 64 × 0.625 vs 128 × 0.625), and pitch (0.66, 1.473 vs 1). A total of 31 different scanning sets of 40–100 keV virtual monochromatic images and conventional images were reconstructed. Regions of interest were segmented using a semi-automated approach, and 108 radiomics features were extracted. Reproducibility was quantified by the intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC), while variability was measured using the coefficient of variation (CV) and quartile coefficient of dispersion (QCD).
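The agreement and variability metrics named above can be illustrated with a minimal sketch. It assumes Lin's population-moment definition of the CCC, the sample standard deviation for the CV, and the default quartile method of Python's statistics module for the QCD; the abstract does not state which variants were used, and the function names are illustrative only.

```python
import statistics


def _mean(values):
    return sum(values) / len(values)


def ccc(x, y):
    """Lin's concordance correlation coefficient for two paired lists."""
    mx, my = _mean(x), _mean(y)
    # Population (1/n) variances and covariance, as in Lin's definition.
    vx = _mean([(a - mx) ** 2 for a in x])
    vy = _mean([(b - my) ** 2 for b in y])
    cov = _mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def cv_percent(values):
    """Coefficient of variation in percent (sample SD over absolute mean)."""
    return 100 * statistics.stdev(values) / abs(_mean(values))


def qcd_percent(values):
    """Quartile coefficient of dispersion in percent: (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 100 * (q3 - q1) / (q3 + q1)
```

Under these assumptions, a feature would count as stable across protocols when, for example, cv_percent of its values over the scanning sets falls below 10.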
Results or Findings: Across all virtual monochromatic and conventional images, a high percentage of features achieved ICC > 0.90 (median: 77.78%; interquartile range [IQR]: 72.22%-80.56%) and CCC > 0.90 (median: 75.93%; IQR: 69.44%-78.70%) when pitch, collimation, tube voltage, or iterative reconstruction level was modified, as well as under higher tube current (≥70 mAs) conditions. Reproducibility was low (median: 30.56%; IQR: 24.77%-39.12% for ICC; median: 27.78%; IQR: 23.15%-36.57% for CCC) when the reconstruction kernel or slice thickness was altered. The inter-protocol variability analysis showed that 30.56% and 51.85% of features had a CV < 10% and QCD < 10%, respectively.
Conclusion: Dual-Layer Spectral CT radiomics was robust to tube voltage, collimation, pitch, iterative reconstruction level, and tube current, but remained sensitive to slice thickness and reconstruction kernel.
Limitations: Phantom study.
Funding for this study: None
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:

6 min

Radiomics Feature Reproducibility Across Virtual Monoenergetic and VNC Reconstructions Using Native Phase as Reference

Andrey Ustalov, Moscow / Russia

Author Block: A. Ustalov, S. A. Shmeleva, E. V. Kondratyev, S. Tamaeva, V. Aznaurov, D. Bogomolov, V. Gurina, V. Shirokov, I. Gruzdev; Moscow/RU
Purpose: To evaluate the reproducibility of radiomics features extracted from virtual monoenergetic (MonoE) and virtual non-contrast (VNC) reconstructions of the liver parenchyma compared with native phase imaging, using a standardized 3D region of interest (ROI).
Methods or Background: Fifty patients who underwent abdominal CT were included. A spherical 3D ROI (15 mm diameter) was placed in a homogeneous, vessel-free area of the liver parenchyma. Segmentation was initially performed on one phase and propagated across other reconstructions. Radiomics features were extracted from: native phase (reference standard), VNC from arterial phase, VNC from portal venous phase, MonoE at 200 keV (portal).
All features were z-score normalized prior to analysis. Reproducibility was assessed using intra-class correlation coefficient (ICC 2,1), concordance correlation coefficient (CCC), within-subject coefficient of variation (wCV), mean absolute percentage error (MAPE), and paired t-tests with Benjamini–Hochberg FDR correction. A composite instability score was used to identify the least reproducible features.
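Two of the listed statistics can be sketched in a few lines. The example below is a minimal illustration of the Benjamini–Hochberg adjustment (producing q-values from raw p-values) and of MAPE against a reference; it is an assumption-level sketch with illustrative function names, not the authors' analysis code.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def mape_percent(reference, measured):
    """Mean absolute percentage error of measured values against a reference."""
    errors = [abs(m - r) / abs(r) for r, m in zip(reference, measured)]
    return 100 * sum(errors) / len(errors)
```

A feature would then be flagged for systematic bias when its q-value is below 0.05, matching the threshold quoted in the results.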
Results or Findings: VNC-portal demonstrated the highest agreement with native phase, with mean CCC >0.90 for shape and first-order features and the largest proportion of “core-like” reproducible features (≈45%). VNC-arterial showed slightly lower reproducibility, particularly for texture classes (GLCM, GLRLM). MonoE200 exhibited significantly reduced reproducibility, with numerous features showing CCC <0.70, MAPE >25%, and systematic bias (q<0.05). The least reproducible features were predominantly wavelet- and LoG-based texture metrics from MonoE200. Shape and first-order features remained robust across all reconstructions.
Conclusion: VNC-portal reconstructions provide radiomics features most comparable to native images and may serve as a reliable surrogate when native acquisitions are unavailable. MonoE200 reconstructions introduce substantial variability, particularly in high-frequency texture features, and should be used with caution in radiomics studies.
Limitations: Retrospective single-center design, limited sample size
Funding for this study: No external funding.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Approved by the local ethics committee, Protocol № IH-2025/03, 15 June 2025.

6 min

Augmented Intelligence for Crohn's Disease: Boosting Prognostic Accuracy by Integrating CT Radiomics with LLM-Processed Electronic Health Records Data

Zhoulei Li, Guangzhou / China

Author Block: Z. Li¹, Y. Yi², S. Li¹, Y. Wang¹, J. Lin¹, S-T. Feng¹, X. Li²; ¹Guangzhou/CN, ²Macau/MO
Purpose: Accurate early prediction of adverse outcomes in Crohn's disease (CD) is paramount for personalized therapy. Current approaches often operate in silos: quantitative CT radiomics models or qualitative clinical narrative assessments. We hypothesize that integrating structured imaging biomarkers with unstructured, context-rich data from Electronic Health Records (EHRs) via Large Language Models (LLMs) will create a superior, holistic predictive framework.
Methods or Background: In this multicenter study, we enrolled 212 CD patients from six hospitals. We developed a multimodal fusion model that synergistically combines: 1) A CT Radiomics Model (VAT-RM), extracting 850 features (texture, first-order, wavelet) from baseline CT using pyradiomics and employing a Support Vector Machine (SVM) classifier; and 2) An LLM-based Clinical Narrative Analyzer, processing unstructured EHR data (symptom history, medication usage, endoscopic reports) using Gemini-2.5 to generate quantitative clinical embeddings. The outputs of both models were integrated using a logistic regression meta-learner to generate the final prognostic prediction.
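The late-fusion step can be illustrated with a minimal sketch of a logistic regression meta-learner that takes the two base-model outputs as its only inputs. Everything here is an assumption for illustration: the toy scores, the plain gradient-descent fit, and the function names are hypothetical, and the abstract does not describe how the meta-learner was trained or regularized.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_scores, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on base-model outputs by batch gradient descent."""
    n_feat = len(base_scores[0])
    n = len(labels)
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(base_scores, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Fused probability for one patient from (radiomics score, LLM score)."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical toy data: (radiomics model score, LLM-derived score) per patient.
toy_scores = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.2, 0.3), (0.3, 0.1), (0.1, 0.4)]
toy_labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit_meta_learner(toy_scores, toy_labels)
```

In practice the base-model scores fed to a meta-learner are usually out-of-fold predictions, so that the fusion weights are not fitted on scores the base models produced for their own training cases.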
Results or Findings: The integrated multimodal model demonstrated exceptional performance, achieving an AUC of 0.920 (95% CI 0.871–0.956) in the external test cohort. This significantly outperformed both the standalone VAT-RM (AUC=0.882, P=0.039) and the best-performing standalone LLM (Gemini-2.5 AUC=0.791, P<0.001). The fusion model also showed superior calibration and reclassification metrics, indicating its enhanced clinical utility. The LLM component provided unique value in interpreting complex clinical narratives, such as subtle symptom progression and medication adherence patterns, complementing the robust imaging biomarkers.
Conclusion: Fusing CT radiomics with LLM-derived clinical insights creates a powerful synergistic model that surpasses either modality alone. This integrated approach offers a comprehensive and superior strategy for risk stratification in CD, paving the way for more precise and data-driven clinical decision support.
Limitations: Not applicable.
Funding for this study: None.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This study was approved by the Institutional Ethics Review Committee at the First Affiliated Hospital of Sun Yat-sen University (ethics number [2024]668).

6 min

Robustness of radiomics features between poly-energetic images and virtual mono-energetic images: synchronizing kiloelectron-volt level for better radiomics reproducibility

Jingyu Zhong, Shanghai / China

Author Block: J. Zhong, Y. Song, Z. Xu, H. Zhang, W. Yao; Shanghai/CN
Purpose: To find the kiloelectron-volt (keV) level of virtual mono-energetic images (VMIs) that best matches poly-energetic images (PEIs), in order to improve radiomics reproducibility for subsequent clinical analysis.
Methods or Background: A phantom of twenty diverse texture materials was scanned using single-source mode at tube voltages of 70, 80, 90, 100, 110, 120, 130, 140, and 150 kVp, and dual-source mode at tube voltage combinations of 70/150Sn, 80/140, 80/150Sn, 90/150Sn, and 100/150Sn kVp, all with a radiation dose of 5 mGy. Nine sets of PEIs were reconstructed as reference. Thirty-one sets of VMIs at 40 to 190 keV in 5 keV steps were reconstructed for each of the five dual-source scans, resulting in 155 sets of VMIs. Ninety-three radiomics features were extracted from each material using PyRadiomics. The reproducibility of features between PEIs and VMIs was evaluated using the intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC).
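The reconstruction counts quoted above follow directly from the keV grid, as this short arithmetic check shows (variable names are illustrative only):

```python
# VMI energy levels: 40 to 190 keV inclusive, in 5 keV steps.
kev_levels = list(range(40, 191, 5))
n_dual_source_scans = 5  # 70/150Sn, 80/140, 80/150Sn, 90/150Sn, 100/150Sn kVp

n_levels = len(kev_levels)                        # 31 levels per scan
total_vmi_sets = n_levels * n_dual_source_scans   # 155 VMI sets in total
```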
Results or Findings: According to ICC and CCC values, the keV levels of VMIs giving the highest radiomics reproducibility were 55, 60, 65, 65, 65, 70, 75, 75, and 75 keV for PEIs at 70, 80, 90, 100, 110, 120, 130, 140, and 150 kVp, respectively. For 120 kVp PEIs, the percentage of features with ICC > 0.90 and CCC > 0.90 peaked at 82.2% and 79.6%, respectively, with 70 keV VMIs.
Conclusion: The keV level of VMIs that provides the best radiomics reproducibility increased with the tube voltage of the PEIs. Synchronizing keV levels can provide better radiomics reproducibility between VMIs and PEIs. The keV level is important for the generalizability of radiomics models across conventional and spectral CT platforms.
Limitations: The limitations of the study are: (1) it was a phantom study; (2) the exact optimal keV level of VMIs for PEIs at each kVp was not confirmed; (3) the impact on diagnostic performance was not investigated.
Funding for this study: Funding was provided by National Natural Science Foundation of China (82302183, 82471935, 82271934), Research Fund of Health Commission of Shanghai Municipality (20244Y0214), Research Fund of Health Commission of Changning District, Shanghai Municipality (2023QN01), Laboratory Open Fund of Key Technology and Materials in Minimally Invasive Spine Surgery (2024JZWC-ZDA04, 2024JZWC-YBA07), and Research Fund of Tongren Hospital, Shanghai Jiao Tong University School of Medicine (TRKYRC-XX202204, TRYJ2021JC06).
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:

6 min

Machine Learning Survival Prediction in Esophageal Cancer Using Radiomics and Body Composition from Pretreatment and Follow-Up T12-Level CT

MingCheng Liu, Taichung / Taiwan, Chinese Taipei

Author Block: M. Liu, S-C. Lin, C-H. Liao, W-H. Chen, Y-J. Liu; Taichung/TW
Purpose: This study aimed to develop prognostic models for esophageal cancer by integrating body composition indices and radiomic features of skeletal muscle and adipose tissue at the T12 level from both pretreatment and follow-up CT scans, along with clinical and demographic data.
Methods or Background: This retrospective study included 212 esophageal cancer patients who underwent concurrent chemoradiotherapy, with both pretreatment and follow-up chest CT scans available. Body composition analysis (BOA) and radiomic features were extracted from skeletal muscle and adipose tissue at the T12 level using automated tools. Four feature subsets (no-radiomics, pretreatment only, follow-up only, and combined inputs) were developed using logistic regression with LASSO for feature selection, followed by Cox regression. Prognostic models—including nomogram, support vector classifier, logistic regression, and extra trees classifier—were constructed to predict 1-, 2-, and 3-year overall survival.
Results or Findings: The model integrating both BOA and radiomics from pretreatment and follow-up CT, combined with clinical data, achieved the highest AUC (0.91), sensitivity (0.81), and specificity (0.88) using the logistic regression model. The most predictive features included clinical variables, body composition indices, and radiomic features, particularly from follow-up VAT. Follow-up imaging contributed significantly to model performance, reinforcing its value in treatment response evaluation.
Conclusion: This is the first study to demonstrate that BOA indices and their corresponding radiomics at the T12-level from both pretreatment and follow-up CT scans—combined with clinical data—can provide accurate prognostic information for esophageal cancer. This approach offers a practical alternative when L3-level imaging is unavailable and supports the clinical integration of automated T12-based imaging biomarkers. The integration of these imaging features with clinical parameters enhances the prediction of survival outcomes.
Limitations: Conducted retrospectively at a single center, with a relatively small and heterogeneous patient cohort.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This retrospective study was approved by the IRB of Taichung Veterans General Hospital.

6 min

AI-Assisted CT Diagnosis of Appendicitis: Synergistic Gains Across Radiologist Experience Levels

Stefan Reischl, Munich / Germany

Author Block: S. Reischl, J. D. B. Brandt, F. Reischl, S. Ziegelmayer, A. Sauter, F. Lohöfer, M. R. Makowski, D. Rueckert, R. Braren; Munich/DE
Purpose: To evaluate the diagnostic performance of a deep learning system for appendicitis in CT and its synergistic effect when used as decision support across radiologists with varying experience levels.
Methods or Background: We developed AIppendix, an end-to-end deep learning pipeline combining a 3D Retina-UNet for detection and a 3D ResNet18 for classification of appendicitis and co-pathologies. Training was performed on 580 annotated CT scans with 116 cases reserved for validation and 162 independent cases for external testing. The system was assessed in a reader study including 12 participants: medical students (n=4), residents with >3 years of experience (n=3), and board-certified radiologists (n=5, including 2 gastrointestinal subspecialists). Readers classified if appendicitis was present on abdominal CT scans of patients with abdominal discomfort first unaided, then with AI support.
Results or Findings: The AI system achieved an accuracy of 0.93 (sensitivity 0.96, specificity 0.90) on internal data and 0.88/0.87 AUC on external testing, matching senior radiologists and surpassing residents in specificity. Without AI support, accuracy increased with experience (students 0.72, residents 0.91, senior radiologists 0.93). With AI assistance, performance improved in all groups. Students benefited most (accuracy 0.72→0.87; sensitivity 0.78→0.87; specificity 0.67→0.88). Residents improved from 0.91→0.97, mainly through specificity gains (0.83→0.96), while senior radiologists showed only marginal benefit (0.93→0.95). The AI thus compensated for inexperience, reduced false positives among residents, and acted as a safeguard for experts.
Conclusion: The model achieves radiologist-level performance in diagnosing appendicitis and demonstrates the strongest impact when combined with less experienced readers, harmonizing diagnostic quality across experience levels.
Limitations: Retrospective single-center study with moderate dataset size; end-to-end performance remains limited by detection accuracy.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The study was approved by the ethical review board of the Technical University of Munich. Informed consent was waived according to the regulations of the university for retrospective analyses.

6 min

CT-based radiomics for HCC recurrence prediction after liver transplantation: which choices really matter? Insights on segmentation margins, bin width, and contrast phase

Virginia Piva, Milan / Italy

Author Block: V. Piva, G. Zorzi, F. Rizzetto, G. Bruschi, C. De Mattia, A. Vanzulli, P. E. Colombo; Milano/IT
Purpose: The aim of this study was to evaluate the impact of CT contrast phase, segmentation margins, and the bin width used for radiomic feature extraction on the performance of a machine learning model for predicting post-transplant HCC recurrence.
Methods or Background: We retrospectively included 54 histologically confirmed HCC patients who underwent pre-transplant liver CT imaging between 2010 and 2019 at a single institution, yielding a total of 116 lesions. A single radiologist segmented the lesions with four different peritumoral expansions (PT 0, 5, 10 and 15 mm) on arterial, venous and delayed phases. Radiomic features were extracted using PyRadiomics with varying bin widths (3, 5, 10, 20, 25) and isotropic resampling. Principal Component Analysis (PCA) was used to select relevant features for each combination. A multilayer perceptron (MLP) model was trained with stratified 10-fold cross-validation and hyperparameters optimized via GridSearchCV. Model performance was evaluated across different combinations of segmentation margin, contrast phase, and bin width to compare their effect.
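To make the role of bin width concrete, the sketch below applies fixed-bin-width gray-level discretization, assuming the formula given in the PyRadiomics documentation (bin index = floor(x / W) - floor(min(x) / W) + 1). The intensity values are hypothetical and the function name is illustrative; this is not the study's extraction pipeline.

```python
import math


def discretize_fixed_bin_width(values, bin_width):
    """Map intensities to 1-based bin indices using a fixed bin width."""
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


# Hypothetical intensities (HU) inside a lesion mask.
hu_values = [-12, 0, 4, 9, 27]
bins_w5 = discretize_fixed_bin_width(hu_values, 5)    # finer gray levels
bins_w10 = discretize_fixed_bin_width(hu_values, 10)
bins_w25 = discretize_fixed_bin_width(hu_values, 25)  # coarser gray levels
```

Smaller bin widths preserve more gray levels (and more noise), while larger ones merge distinct intensities into the same bin, which is one plausible reason texture features and downstream model performance shift with this setting.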
Results or Findings: The MLP achieved its best performance on the arterial phase with no peritumoral expansion and a medium bin width of 10 (AUC 0.82, accuracy 0.80, specificity 0.87). With PT0–PT5, the arterial phase outperformed the delayed phase, achieving higher values for AUC (0.72–0.82 vs. 0.62–0.75), accuracy (0.78–0.80 vs. 0.77–0.79), and comparable specificity (0.87–0.88 vs. 0.88–0.89). Larger expansions (PT10–PT15) decreased performance across phases. The portal phase did not achieve predictive value (AUC<0.60). Model performance also declined with very small or very large bin widths (AUC<0.78).
Conclusion: CT-based radiomics can support prediction of post-transplant HCC recurrence; however, model performance is highly dependent on contrast phase, segmentation strategy, and feature extraction settings. Careful optimization of these factors is essential to achieve reliable predictions.
Limitations: The small patient cohort and its retrospective nature.
Funding for this study: None
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Institutional Review Board approved the retrospective data collection in anonymous form.

6 min

The impact of deep learning image reconstruction on image quality and diagnostic confidence of liver tumors in 40 keV virtual monochromatic imaging

Caijun Huang, Guiyang / China

Author Block: C. Huang, C. He; Guiyang/CN
Purpose: To investigate the effect of deep learning image reconstruction (DLIR) on the synchronous visualization of liver tumors in 40 keV virtual monochromatic images (VMIs).
Methods or Background: This prospective study enrolled 50 patients who underwent abdominal contrast-enhanced dual-energy CT (DECT) imaging; 40 keV VMIs were obtained during the venous phase. Images were reconstructed with different algorithms: filtered back projection (FBP), adaptive statistical iterative reconstruction V (ASIR-V 60%), DLIR-M, and DLIR-H. Measurements included liver SD and lesion-background contrast-to-noise ratio (CNR). Two radiologists independently scored image quality, lesion conspicuity and diagnostic confidence using a 5-point Likert scale. Lesions were categorized as hypervascular or hypovascular based on blood supply for subgroup analysis. Noise equivalent dose (NED) and image quality index (IQF) were calculated considering image noise and radiation dose.
Results or Findings: Liver SD was lowest in the DLIR-H group; there was no significant difference between ASIR-V 60% and DLIR-M, though both were lower than FBP. For liver lesion CNR, DLIR-H significantly outperformed ASIR-V 60% and FBP, while no significant difference was found between DLIR-M and ASIR-V 60%. In lesion conspicuity and diagnostic confidence assessments, for both hypovascular and hypervascular tumors, DLIR-H yielded superior scores compared to ASIR-V 60%. The NED was lowest in the DLIR-H group and comparable between ASIR-V 60% and DLIR-M, both of which were lower than FBP; conversely, the IQF showed the opposite trend.
Conclusion: 40 keV VMIs with DLIR-H markedly enhance the visualization and diagnostic confidence of liver tumors, demonstrating the highest dose-to-image quality efficiency.
Limitations: First, this was a single-center study with an uneven distribution of lesion types, so bias is difficult to avoid. Second, the number of cases was small, so diagnostic control studies could not be conducted. Third, images at other energy levels were not explored.
Funding for this study: No funding was received for this study.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The study was approved by Guiqian International Hospital Institutional Ethics Committee (NO. 20250001)

6 min

CT-based Radiomics in Colorectal Liver Metastases: Redefining Pre-operative Survival Prediction

Angela Ammirabile, Milan / Italy

Author Block: A. Ammirabile¹, G. Matteucci¹, F. Fiz², E. Lanza¹, L. Cavinato¹, A. Laghi¹, G. Torzilli¹, F. Ieva¹, L. Viganò¹; ¹Milan/IT, ²Genoa/IT
Purpose: Surgery with perioperative chemotherapy is potentially curative for colorectal liver metastases (CRLM). Selection of candidates should rely on survival prediction; however, available prognostic factors have limited reliability, and most biomarkers are assessed on surgical specimens. This study evaluated preoperative CT-based radiomics for overall survival (OS) prediction, focusing on the impact of CT–surgery interval and peritumoral tissue analysis and comparison with clinical scores.
Methods or Background: All consecutive patients undergoing CRLM resection (2010-2020) with contrast-enhanced CT performed ≤60 days before surgery and at least one CRLM ≥10 mm were considered. Manual tumor segmentation (Tumor-VOI) and automatic 5-mm peritumoral expansion (Margin-VOI) were performed on portal phase images. From each VOI, 110 IBSI-compliant radiomic features were extracted. Three prediction models were developed: Clinical, Clinical+Tumor-radiomics, Clinical+Tumor/Margin-radiomics. Feature selection used the Boruta algorithm, followed by Random Forest classification with 10-fold cross-validation. Models were evaluated in the whole cohort and in patients with CT-surgery interval ≤30 days. Inter-tumor heterogeneity was assessed with Tree-Edit Distance and Hierarchical Clustering to stratify patients, and resulting clusters were integrated into survival models.
Results or Findings: 306 patients were included (mean age 63 years; 187 men). Five-year survival was 40.9% (mean follow-up 34 months). At validation, the clinical model achieved C-index=0.629. Radiomics provided modest improvement in the entire cohort, with greater impact in the 212 patients with a CT-surgery interval ≤30 days: the Clinical+Tumor-radiomics model reached C-index=0.691, increasing to 0.717 with Margin-VOI features. Clinical–radiomic models outperformed established scores (Fong, GAME, m-CS; C-indices=0.553–0.613). Inter-tumor heterogeneity did not improve prediction.
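For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It assumes the common convention that a pair is comparable when the patient with the shorter follow-up had an event, with tied risk scores counted as half-concordant; the abstract does not state which C-index variant was used, and the function name is illustrative.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must have an observed event before patient j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk, earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.553 to 0.717 in context.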
Conclusion: Radiomic features of CRLM and peritumoral tissue improve preoperative survival prediction beyond clinical scores, with better performance at shorter CT-surgery intervals. Combined models may refine CRLM treatment strategies, acting as a preoperative filter.
Limitations: Retrospective single-center design; Manual segmentation; OS-only analysis.
Funding for this study: AIRC grant #2019−23822
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The study was performed according to the declaration of Helsinki and its later amendments. The local review board approved the study protocol (83/20). Because of the retrospective nature of the study, the need for informed consent was waived.

6 min

Multiple Instance Learning with Radiomics Features Enables Peritoneal Carcinomatosis Detection from CT Images: preliminary results

Konstantinos Vrettos, Heraklion / Greece

Author Block: K. Vrettos¹, B. Huang², I. Moberg², I. Rouvelas², M. Klontzas¹, A. Tzortzakakis²; ¹Heraklion/GR, ²Stockholm/SE
Purpose: Peritoneal carcinomatosis (PC) correlates with advanced cancer and its identification can significantly impact treatment planning and patient outcomes. Current detection methods such as CT and circulating protein biomarkers have limitations in terms of accuracy and efficiency. Radiomics, an AI-driven approach that extracts quantitative features from medical images, holds promise for improving PC detection. The aim of this work is to develop an open-access radiomics model that assists radiologists in identifying PC from CT scans, while maintaining human accountability and ethical oversight, through a human-in-the-loop approach.
Methods or Background: The model architecture is an attention-based Multiple Instance Learning (MIL) network, which utilizes an instance feature extractor, an attention mechanism and a final classifier operating on aggregated features. The dataset consists of 141 patients with gastric tumors who underwent laparoscopy and biopsy to detect PC. Radiomics features were extracted from CT images for 12 biopsy spots (the tumor was not included) corresponding to the conventional laparoscopy-based Peritoneal Cancer Index score. Boruta feature selection and interpretability analysis were performed. The model's performance was evaluated using AUC, accuracy, F1-score, sensitivity and specificity on a held-out test set. CLAIM guidelines were followed.
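The aggregation step of an attention-based MIL network can be sketched as follows. This is a simplified, assumption-level illustration: each patient is a bag of instance feature vectors (here standing in for the biopsy-spot radiomics), a single hypothetical scoring vector replaces the learned attention layers, and the function name is illustrative. It is not the authors' architecture.

```python
import math


def attention_pool(instances, scoring_vector):
    """Softmax-attention pooling of instance vectors into one bag vector."""
    # One scalar attention score per instance: tanh of a dot product.
    scores = [
        math.tanh(sum(a * b for a, b in zip(inst, scoring_vector)))
        for inst in instances
    ]
    # Numerically stable softmax so the weights sum to 1 across the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag representation: attention-weighted sum of the instance vectors.
    dim = len(instances[0])
    bag = [sum(w * inst[d] for w, inst in zip(weights, instances)) for d in range(dim)]
    return weights, bag
```

The per-instance weights are what make such a model inspectable: regions receiving high attention can be shown to the radiologist, which fits the human-in-the-loop use described in the abstract.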
Results or Findings: The radiomics model achieved an AUC of 0.7, accuracy of 0.81 and specificity of 0.94. The model's performance was non-inferior to expert radiologists' predictions of PC from CT scans. The human-in-the-loop approach yielded comparable results with an increased sensitivity of 0.8 when radiologist performance was combined with model performance. Analysis of existing literature revealed that the proposed model outperforms current PC detection approaches, including the CA125 blood test.
Conclusion: This study highlights the potential of radiomics-based models to assist radiologists in detecting PC on CT scans. It further shows that radiologist-AI collaboration can improve detection performance while ensuring transparency.
Limitations: Dataset size
Funding for this study: Funding from the Cancer Research Funds of Radiumhemmet
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: EPM diarienummer: Dnr 2018/970-31/1

6 min

Effectiveness of Deep-Learning Based Denoising on the Image Quality and Diagnostic Performance of Low-Dose Abdominal CT for Acute Appendicitis

CHUNGHWAN SHIN, Iksan / Korea, Republic of

Author Block: C. SHIN, Y. H. Lee; Iksan/KR
Purpose: The aim of this study is to evaluate the effectiveness of deep-learning based denoising images using low-dose abdominal CT in terms of image quality and diagnostic performance for evaluating acute appendicitis.
Methods or Background: We retrospectively analysed 53 patients who underwent low-dose abdominal CT for suspected acute appendicitis. Images were reconstructed using filtered back projection (FBP) and iterative reconstruction (IR), then processed using deep-learning-based denoising software (ClariCT.AI, Claripi), resulting in four image sets per patient. For quantitative analysis, image noise and signal-to-noise ratio (SNR) were measured. Two radiologists independently scored image quality (noise, sharpness, artifacts, overall) on a 4-point Likert scale and assessed the presence of acute appendicitis. Statistical analyses included repeated measures ANOVA, Friedman test, Wilcoxon signed-rank test, and McNemar’s test.
Results or Findings: The four image sets demonstrated significant differences in both noise and SNR (p<0.001). The noise progressively decreased in the order of FBP (25.53±1.98), IR (17.07±1.27), denoised FBP (11.45±0.86), and denoised IR (8.36±0.63). SNR was highest in denoised IR (23.11±2.84), followed by denoised FBP (16.92±2.04), IR (11.37±1.36), and FBP (7.65±0.93). Denoised FBP achieved higher overall quality than FBP, IR, and denoised IR for both readers (p<0.01), providing superior sharpness and fewer artifacts, while denoised IR images achieved lower image noise. All image sets showed 100% sensitivity for acute appendicitis. For reader 1, denoised images showed higher specificity (95.35% each) than undenoised FBP and IR (90.70% each). For reader 2, FBP showed lower specificity (88.38%) than the other image sets (95.35% each). No statistically significant difference in specificity was observed for either reader.
Conclusion: Deep-learning-based denoising markedly improves noise reduction in low-dose abdominal CT. Reconstructed images with denoising provided superior image quality and higher diagnostic accuracy than undenoised FBP or IR images for evaluating acute appendicitis.
Limitations: Quantitative assessment was performed in regions unrelated to the appendix.
Funding for this study: This study received no specific funding.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The study was approved with a waiver of informed consent by the institutional review board of Wonkwang University Hospital (IRB No.2023-12-031)

6 min

Deep-learning-based liver age estimation predicts major health outcomes

Robin Tibor Schirrmeister, Freiburg Im Breisgau / Germany

Author Block: R. T. Schirrmeister, M. Jung, M. Reisert, F. Bamberg, J. Weiß; Freiburg Im Breisgau/DE
Purpose: Various studies have shown that AI can estimate biological age from medical imaging and predict outcomes beyond traditional risk factors. In this study we developed a deep learning model (MR-LiverAge) that estimates liver age from MR imaging and investigated its association with different health outcomes.
Methods or Background: MR-LiverAge was developed in a two-step approach using 30025 subjects from the German National Cohort (NAKO): Model 1) segmentation of the liver from T1-weighted abdominal MRI; Model 2) takes the segmented liver mask of Model 1 as the only input and outputs a liver age estimate in years. Independent validation was performed in 40151 subjects of the UK Biobank (UKB). To account for potential confounders such as BMI, sex or liver fat, generalized additive models were fitted to predict the AI-estimated MR-LiverAge from those confounders. The remaining difference between MR-LiverAge and confounder-estimated age was used to group individuals into decelerated, normal and accelerated aging groups. Cox proportional hazards regression assessed the association between age groups and incident diabetes, liver disease, cardiovascular events and mortality.
Results or Findings: MR-LiverAge had a Pearson-R correlation of 80.4% in the NAKO and 50.9% in the UKB with chronological age. Individuals with accelerated MR-LiverAge in the UKB had a higher risk of incident diabetes (HR=1.66, 95% CI=1.37–2.01, p < 0.001), liver disease (HR=1.52, 95% CI=1.19–1.96, p=0.001), cardiovascular events (HR=1.38, 95% CI=1.1–1.73, p=0.005) and mortality (HR=1.45, 95% CI=1.16–1.81, p=0.001) independent of traditional risk factors.
Conclusion: MR-LiverAge has potential as a novel opportunistic imaging biomarker to estimate risk of major health outcomes.
Limitations: Results have only been obtained from MRI on the UKB dataset and would need to be further validated on other datasets and other modalities.
Funding for this study: We acknowledge support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), grant number 525002713.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Study has been approved by the local IRB.