Research Presentation Session: Artificial Intelligence and Imaging Informatics

RPS 805 - Automating diagnosis and pattern recognition: AI performance in chest radiography and lung disease

March 5, 10:00 - 11:00 CET

6 min
Artificial intelligence solution in B-lines detection on lung ultrasound
Martin Stevik, Martin / Slovakia
Author Block: M. Stevik1, M. Malík1, A. Dzian1, F. Babic2, Š. Vetešková1, M. Bundzel2, J. Magyar2, K. Zelenak1; 1Martin/SK, 2Košice/SK
Purpose: The main limitation of lung ultrasound (LUS) is its high operator dependency, which has led to significant interest in artificial intelligence (AI) approaches to LUS interpretation. The primary aim of this study was to evaluate the accuracy of a trained AI model in detecting B-lines in LUS videos using a newly designed hybrid solution that combines a convolutional neural network (CNN) with an analytical approach. The secondary aims were to evaluate the accuracy of a radiology resident new to LUS in detecting B-lines and to assess the educational potential of AI in LUS.
Methods or Background: In this single-center prospective study, machine-learning-based software, the LUS AI solution, was used for automated detection and marking of B-lines in LUS footage. Seventy-five consecutive patients were enrolled, yielding a total of 300 LUS videos. The videos were reviewed for the presence of B-lines by two expert radiologists and one radiology resident. The resident was then allowed to revise the initial conclusion regarding B-line presence based on the AI results.
Results or Findings: Accuracy, sensitivity, specificity, positive predictive value and negative predictive value of artificial intelligence in B-line detection were 0.85, 0.9, 0.832, 0.661 and 0.958, respectively. The resident’s values were 0.69, 0.575, 0.732, 0.438 and 0.958, respectively. The resident’s values after correction based on artificial intelligence results were 0.823, 0.912, 0.791, 0.613 and 0.961, respectively.
Conclusion: The artificial intelligence solution showed higher accuracy in B-line detection than the resident. It could play a role in residents’ education.
Limitations: The overall number of patients recruited was relatively small. The results of our AI program still show some limitations, which are due to the limited amount of data and the inability to use some data augmentation methods to address this issue.
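The five metrics reported above all derive from a 2x2 confusion table. The following is a minimal sketch of that arithmetic, not the authors' code. The abstract does not report the counts: the values in `example` are hypothetical, chosen only because they sum to 300 videos and round to the AI figures quoted above. The names `ConfusionCounts` and `diagnostic_metrics` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary test against a reference standard."""
    tp: int  # B-lines present, detected
    fp: int  # B-lines absent, detected
    fn: int  # B-lines present, missed
    tn: int  # B-lines absent, not detected


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, sensitivity, specificity, PPV and NPV."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts (NOT reported in the abstract), 300 videos in total.
example = ConfusionCounts(tp=72, fp=37, fn=8, tn=183)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a low proportion of positive videos, as in these illustrative counts, PPV stays modest even at high sensitivity, which is the pattern seen in the reported values.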
Funding for this study: This research is funded by the Slovak Research and Development Agency, grant number APVV 20-24-0454.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Approval was granted by the Ethics Committee of Jessenius Faculty of Medicine in Martin (No. EK 44/2021). Date of approval: 29 June 2021.
6 min
Design of a CT-Based Deep Learning Model to Predict the Metastatic Potential of Sub-centimetric Pulmonary Nodules in Oncology Patients
Funda Dinç, Muğla / Turkey
Author Block: N. E. ÖZEN1, O. Yeniceri1, F. Dinç1, S. Yılmaz2, N. G. Narin2; 1Mugla/TR, 2Muğla/TR
Purpose: Accurate determination of the malignant potential of pulmonary nodules smaller than 1 cm in oncology patients remains a diagnostic challenge. The smaller the nodule, the greater the uncertainty in interpretation, which can lead to staging inaccuracies and delays. This study aimed to investigate the utility of deep learning methods in predicting the malignancy of sub-centimetric pulmonary nodules detected on lung CT scans at the time of initial diagnosis.
Methods or Background: This is a preliminary report from a recently started project. A total of 933 nodules were analyzed, comprising 443 retrospectively confirmed benign nodules from patients without known malignancy and 490 malignant nodules that demonstrated interval growth and were reported as metastatic. Malignant nodules originated from primary tumors of the rectum, colon, kidney (renal cell carcinoma), prostate, uterus, cervix, ovary, larynx, and breast. Of the total dataset, 653 nodules were allocated to the training set, 139 to the validation set, and 141 to the test set. Model development was based on the ResNet-50 architecture with transfer learning.
Results or Findings: Using a dataset of 933 nodules, the model achieved a best validation accuracy of 91.3% and a test accuracy of 83.69%. The area under the ROC curve (AUC) was 0.896, demonstrating strong discriminative performance.
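For readers less familiar with the AUC, it can be read as the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign one. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Separately, the reported test accuracy of 83.69% matches 118 correct classifications out of 141 test nodules (118/141 ≈ 0.8369); that count is inferred here and is not stated in the abstract.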
Conclusion: Findings from this preliminary study suggest that deep learning–based approaches may provide valuable support in the staging process at the time of diagnosis, particularly in oncology patients presenting with sub-centimetric pulmonary nodules on CT scans in the presence of a known primary tumor elsewhere.
Limitations: This paper is a preliminary report of a deep learning study using a relatively small dataset. A larger dataset is in preparation.
Funding for this study: This study is currently being conducted without funding. However, an application has been submitted to TÜBİTAK 1001 for funding.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Predicting rapid progression and prognosis of idiopathic inflammatory myopathies-associated interstitial lung disease using AI-based quantitative CT analysis of pulmonary vessel-related structures
Yuhui Qiang, Beijing / China
Author Block: Y. Qiang, H. Wang, M. Liu, H. Dai; Beijing/CN
Purpose: Pulmonary vessel-related structure (PVRS) abnormalities in idiopathic inflammatory myopathies-related interstitial lung disease (IIM-ILD) remain poorly understood. This two-center study investigated PVRS parameters on HRCT as predictors of rapid progression and prognosis in IIM-ILD.
Methods or Background: A total of 578 IIM-ILD patients (412 females, median age 53 years) were drawn from the prospective ILD cohort of two centers. AI-based quantification of baseline HRCT assessed PVRS and interstitial lesions. An independent external cohort of 64 IIM-ILD patients (43 females, median age 54 years) from the second center was used to validate the generalizability of PVRS in predicting IIM-ILD progression.
Results or Findings: In the first center, 249 patients with rapidly progressive ILD (RP-ILD) exhibited significantly higher mean pulmonary vascular diameter (mPVD) (P<0.05) at shorter vascular-pleural distances, increased PVRS volume, and greater standard deviation of pulmonary vascular diameter (sdPVD) (P<0.001) compared to non-RP-ILD. Age (HR: 1.03, 95% CI: 1.01-1.06), ground glass opacity (GGO) percentage (HR: 1.04, 95% CI: 1.02-1.06), and sdPVD at 6mm and 18mm from the pleura were identified as independent risk factors for poor prognosis in anti-synthetase syndrome (ASS) patients (concordance index = 0.819). In contrast, age (HR: 1.06, 95% CI: 1.02-1.11), mPVD at 6mm from the pleura, and lactic dehydrogenase were independent risk factors for poor prognosis in anti-MDA5-positive dermatomyositis (MDA5+ DM) patients (concordance index = 0.835). Validation in the second center using the multivariate Cox regression model from the internal training cohort revealed predictive C-indices of 0.841 (ASS) and 0.814 (MDA5+ DM) in the external cohort.
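The concordance index (C-index) quoted above measures how often a model assigns a higher risk to the patient who has the event earlier. The following is a minimal sketch of Harrell's C-index under simplifying assumptions: pairs with tied survival times are ignored, and the follow-up times, event flags and risk scores are hypothetical values, not study data. It is not the authors' implementation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had the event (events[i] == 1)
    strictly before patient j's last follow-up. The pair is concordant when
    the earlier-event patient has the higher risk score; tied scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in months; event 1 = death, 0 = censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk_scores)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values between 0.814 and 0.841.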
Conclusion: Baseline PVRS parameters on HRCT serve as prognostic indicators for rapid progression and adverse prognosis in IIM-ILD.
Limitations: The quantitative PVRS could not differentiate between arterial and venous vessels.
Funding for this study: National Key Technologies R & D Program Precision Medicine Research, and the National Natural Science Foundation of China.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: China-Japan Friendship Hospital
6 min
Per-pixel Bone Attenuation Contribution Map Generation using Machine Learning for Chest Radiographs
Tina Dorosti, Neuried / Germany
Author Block: T. Dorosti, L. Kaster, M. Lochschmidt, J. B. Thalhammer, S. Peterhansl, F. Schaff, F. Pfeiffer, D. Pfeiffer; Munich/DE
Purpose: We aim to generate attenuation contribution masks for bone structures present in real and synthetic frontal chest radiographs (CXR) on a pixel level using machine learning. Such bone attenuation contribution (BAC) maps will allow for a personalized, per-pixel correction of beam hardening artifacts in novel imaging modalities such as X-ray dark-field (DF) imaging.
Methods or Background: A total of 5959 chest CT scans were retrieved from two publicly available datasets, Luna16 (n=656) and the RSNA PE challenge (n=5303). Additionally, CXRs from 72 subjects (33 healthy: 20 men, mean age [range] = 62.4 [34, 80]; 39 with COPD: 25 men, mean age [range] = 69.0 [47, 91]) were retrospectively selected (October 2018 to December 2019) from our in-house dataset. All CT scans and their corresponding 3D binary bone segmentations were forward-projected using a simulated X-ray spectrum to generate synthetic CXRs and relative bone thickness projections, referred to as BAC maps, respectively. A U-Net model was trained and tested on synthetic radiographs from the public datasets. Model performance was assessed quantitatively for the public synthetic data with the mean absolute percentage error (MAPE), Pearson correlation, and two-sided Student t distribution. For the real in-house CXRs, data was assessed qualitatively, as no reference BAC data is available for real radiographs.
Results or Findings: The predicted BAC maps showed low error rates and strong correlations with the reference. Specifically, for the Luna16 test set (n=131), an MAPE=18.1% and a correlation of 0.81 (P<0.001) were achieved. For the RSNA PE test data (n=1060), an MAPE=12.5% and a correlation of 0.91 (P<0.001) were obtained.
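The two quantitative measures above can be sketched for a pair of flattened maps as follows. This is an illustration under assumptions rather than the authors' evaluation code: the pixel values are hypothetical, and skipping reference pixels equal to zero (where the percentage error is undefined) is a choice made here, since the abstract does not say how such pixels were handled.

```python
import math


def mape(reference, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: pixels whose reference value is zero are skipped,
    because the relative error is undefined there.
    """
    terms = [abs((r - p) / r) for r, p in zip(reference, predicted) if r != 0]
    if not terms:
        raise ValueError("no non-zero reference pixels")
    return 100.0 * sum(terms) / len(terms)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical flattened BAC values (reference vs. predicted).
reference = [2.0, 4.0, 5.0, 10.0]
predicted = [1.8, 4.4, 5.0, 9.0]
error_percent = mape(reference, predicted)
correlation = pearson(reference, predicted)
print(f"MAPE = {error_percent:.1f}%, r = {correlation:.3f}")
```

The two measures answer different questions: MAPE captures the size of the per-pixel error, while the correlation captures whether the predicted map follows the spatial pattern of the reference.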
Conclusion: The U-Net successfully generated per-pixel BAC maps for synthetic and real CXRs, demonstrating potential for applications in DF image processing.
Limitations: The sample of real radiographs was restricted to healthy and COPD subjects from a single medical center.
Funding for this study: We acknowledge financial support through the European Research Council (ERC Synergy Grant SmartX, SyG 101167328), and the Free State of Bavaria under the Excellence Strategy of the Federal Government and the States, as well as by the Technical University of Munich – Institute for Advanced Study.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: All data was analyzed retrospectively and anonymously. The study was approved by the ethical review committee and was conducted in accordance with the regulations of our institution (approval code: 87/18 S, Institutional Review Board of the Faculty of Medicine, Technical University of Munich, Germany).
6 min
RadGuide-SSP-Net: A Radiomics-Guided Self-Training Semi-Supervised Deep Learning Framework for Multi-Class Classification of Pneumonia Subtype
Yuchi Tian, Shanghai / China
Author Block: Y. Tian1, F. Pan2, X. Liang1, L. Yang2; 1Shanghai/CN, 2Wuhan/CN
Purpose: RadGuide-SSP-Net introduces a novel radiomics-guided semi-supervised learning framework using knowledge distillation to accurately classify pneumonia subtypes (bacterial, viral, fungal, tuberculosis) from chest CT scans with minimal annotated data. Integrating radiomics with deep learning, it enhances diagnostic precision and offers scalable, cost-effective solutions for resource-constrained radiology settings.
Methods or Background: RadGuide-SSP-Net innovatively combines radiomics-based machine learning with 3D convolutional neural networks via knowledge distillation. A radiomics "teacher" model, trained on a small annotated subset, extracts high-dimensional features capturing lesion heterogeneity, morphology, and texture. These priors are distilled through soft-label generation and transferred to a 3D-ResNet-18 "student" model using Kullback-Leibler divergence loss, augmented by cross-entropy on labeled data. This semi-supervised approach leverages unlabeled data to enhance generalizable representations. A retrospective cohort of 1,148 chest CT scans (training:test=7:3) was enrolled.
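The combination of a Kullback-Leibler distillation term with cross-entropy on labeled scans can be sketched for a single scan as below. This is a minimal illustration, not the RadGuide-SSP-Net implementation: the weighting parameter `alpha`, the function names, and the example logits are all assumptions, and the abstract does not state how the two terms are balanced.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_probs, label=None, alpha=0.5):
    """Per-scan loss: KL to the teacher's soft labels, plus CE when labeled.

    alpha is a hypothetical weight between the two terms.
    Unlabeled scans (label=None) contribute only the distillation term.
    """
    student_probs = softmax(student_logits)
    kd = kl_divergence(teacher_probs, student_probs)
    if label is None:
        return kd
    ce = -math.log(student_probs[label])
    return alpha * kd + (1.0 - alpha) * ce


# Four classes: bacterial, viral, fungal, tuberculosis (hypothetical values).
teacher_probs = softmax([2.0, 1.0, 0.0, -1.0])
unlabeled_loss = distillation_loss([2.0, 1.0, 0.0, -1.0], teacher_probs)
labeled_loss = distillation_loss([0.5, 2.0, 0.1, -1.0], teacher_probs, label=1)
print(unlabeled_loss, labeled_loss)
```

The unlabeled loss is zero when the student reproduces the teacher's soft labels exactly, which is how the unlabeled scans can still provide a training signal.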
Results or Findings: Fully supervised 3D-ResNet-18 with 100% labeled data achieved a test set macro-AUC of 0.9178 (95% CI: 0.8946-0.9386). With 30% labeled data, its macro-AUC fell to 0.8495 (95% CI: 0.8223-0.8734). The radiomics-only model with 100% labeled data reached 0.9092 (95% CI: 0.8911-0.9269), dropping to 0.8853 (95% CI: 0.8646-0.9043) with 30% labeled data. RadGuide-SSP-Net, using 30% labeled and unlabeled data, achieved a macro-AUC of 0.9174 (95% CI: 0.8963-0.9373), surpassing both 30% labeled models and nearly matching the 100% labeled benchmark. These findings highlight that radiomics-derived features, though statistically engineered, can effectively guide deep learning in label-scarce settings, likely converging on patterns similar to data-driven features, enhancing label efficiency and generalizability for multi-disease diagnostics.
Conclusion: RadGuide-SSP-Net redefines medical imaging AI with radiomics-guided knowledge distillation, achieving superior pneumonia subtype classification with minimal labeled data. It enhances diagnostic accuracy, reduces radiologist workload, and is scalable in annotation-scarce settings, making it a transformative tool for precision diagnostics and global healthcare.
Limitations: The study was conducted at a single center.
Funding for this study: No
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Comparative evaluation of an open source versus commercial artificial intelligence solution for detection of acute lung pathologies on chest radiographs
Li Yi Tammy Chan, Singapore / Singapore
Author Block: L. Y. T. Chan, S. Y. Yee, Y. J. Toh, P. Yogendra; Singapore/SG
Purpose: To evaluate the accuracy of an open-source solution (CheXNeXt) and a commercial product (Rayscape CXR) for chest radiograph interpretation compared with board-certified radiologists.
Methods or Background: Chest radiographs (CXRs) are among the most frequently requested first-line investigations for suspected cardiopulmonary pathology. Rising imaging volumes and radiologist shortages have driven interest in Artificial Intelligence (AI) solutions, which have potential to triage urgent cases for expedited reporting and serve as diagnostic adjuncts to enhance efficiency and accuracy.

This single-institution retrospective study analysed 1003 emergency department CXRs performed after hours. Radiology reports served as the reference standard, whereby outcomes were extracted directly from report documentation. Two AI models (CheXNeXt and Rayscape CXR) independently analysed CXRs for pneumonia, pleural effusion, pneumothorax, and pulmonary oedema. Model performance was evaluated using sensitivity, specificity, and Gwet’s AC1, with agreement strength interpreted according to Landis and Koch. McNemar’s test assessed statistical differences between models.
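Both statistics named above can be computed from a 2x2 table of paired binary results. The sketch below uses the standard two-rater, two-category form of Gwet's AC1 and the uncorrected chi-square form of McNemar's test. The abstract does not say whether an exact or continuity-corrected variant was used, so this choice is an assumption, and all counts are hypothetical.

```python
import math


def gwet_ac1(a, b, c, d):
    """Gwet's AC1 for two raters and a binary outcome.

    a: both positive, b: model positive only,
    c: reference positive only, d: both negative.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Mean proportion of positive calls across the two raters.
    prevalence = ((a + b) / n + (a + c) / n) / 2.0
    chance = 2.0 * prevalence * (1.0 - prevalence)
    return (observed - chance) / (1.0 - chance)


def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) and its p-value.

    b and c are the two discordant cell counts of a paired 2x2 table.
    """
    if b + c == 0:
        return 0.0, 1.0
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one pathology, model vs. radiology report.
ac1 = gwet_ac1(a=40, b=5, c=3, d=52)
chi2, p_value = mcnemar(b=25, c=5)
print(f"AC1 = {ac1:.3f}, chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

AC1 is often preferred over Cohen's kappa when one category is rare, as with pneumothorax in an emergency cohort, because its chance-agreement term is less sensitive to skewed prevalence.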
Results or Findings: Rayscape CXR demonstrated higher overall sensitivity (97.4% vs. 91.7%), while CheXNeXt showed higher specificity (71.5% vs. 40.0%). For effusion, pneumothorax, and oedema, both models achieved almost perfect agreement with the reference standard, with Rayscape CXR (0.902–0.997) outperforming CheXNeXt (0.812–0.890). Conversely, CheXNeXt outperformed Rayscape CXR in pneumonia detection (0.829 vs. 0.765). McNemar’s tests revealed significant differences in error patterns across all pathologies (p <0.001).
Conclusion: Rayscape CXR had greater agreement for effusion, pneumothorax and oedema while CheXNeXt demonstrated greater agreement for pneumonia. Overall, Rayscape CXR demonstrated higher sensitivity and CheXNeXt showed higher specificity. These complementary strengths suggest context-specific deployment strategies: sensitivity-optimised models may aid triage of urgent cases, while specificity-focused models may provide confirmatory support.
Limitations: The study was limited by differing training definitions for pathologies used by each AI algorithm, potentially influencing comparative performance.
Funding for this study: No funding was received for this study.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Repeated evaluation of AI for lung nodule detection in chest radiographs: version-to-version evaluation in a multicentre study
Marlie Besouw, Boxmeer / Netherlands
Author Block: M. Besouw, M. De Rooij, M. J. Rutten, B. Van Ginneken, S. S. Schalekamp; Nijmegen/NL
Purpose: Artificial intelligence (AI) for lung nodule detection in chest radiographs is increasingly implemented, yet reproducible methods to evaluate new and updated AI product versions are lacking. Our Project AIR framework was established to benchmark CE-certified AI tools, and most recently performed in 2023. In this follow-up study, updated versions of AI products for lung nodule detection were assessed.
Methods or Background: Sixteen vendors were invited to participate in a new round for the evaluation of lung nodule detection in chest radiographs. Up to now, four commercial AI products have been evaluated. Performance was tested on the same hidden multicentre dataset of 386 scans. The primary outcome was the area under the receiver operating characteristic curve (AUC). Statistical comparison between versions was performed, and all results were benchmarked against the average performance of radiologists.
Results or Findings: Three of four products showed an increase in AUC (+0.07 to +0.09). The average AUC of these products significantly increased from 0.84 (95% CI 0.79–0.87) to 0.89 (95% CI 0.86–0.92) (p<0.05). For the other system, identical case-level outputs resulted in an unchanged AUC of 0.88. Across the evaluated products, AUCs in the current analysis ranged from 0.87 to 0.91. Three of the four products performed significantly better than the average radiologist, with an AUC of 0.81 (95% CI 0.77–0.85).
Conclusion: Our Project AIR framework enables repeated testing of new and updated AI products. For three out of four AI products, performance in lung nodule detection improved compared to 2023. All currently evaluated AI products performed at or above the average radiologist benchmark.
Limitations: Two vendor submissions were still being processed at the time of analysis, and not all products provided updated versions. Results are preliminary and limited to a single standardised dataset.
Funding for this study: This study was supported by the Netherlands Organisation for Scientific Research (NWO), project no. OSF23.1.018.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Agreement and Diagnostic Accuracy of an FDA- and CE-Cleared AI Solution Versus Junior Radiologist for Chest Radiographs
Nahdiya Sadaf, Hyderabad / India
Author Block: N. Sadaf, S. K. Marupaka, M. M. Sameer, R. P. Babu; Hyderabad/IN
Purpose: To compare the diagnostic performance of an FDA- and CE-cleared computer-aided detection (CADe) solution with a junior radiologist (JR) for chest radiograph (CXR) interpretation, using a senior radiologist (SR) as reference.
Methods or Background: In this retrospective, single-center study, 906 CXRs from patients aged ≥12 years (mean 47.7 years; M/F: 455/451) were analyzed. Each CXR was independently reviewed by: (a) an FDA- and CE-cleared CADe solution (DeepTek.ai), (b) a JR (<5 years’ experience), and (c) an SR (10 years’ experience). Diagnostic performance metrics (sensitivity, specificity, PPV, NPV, accuracy) were calculated using SR as reference. Reader agreement was assessed via Cohen’s κ. Institutional Review Board approval was obtained with a waiver of consent.
Results or Findings: AI demonstrated superior sensitivity for detecting suspicious CXRs (65% vs 53% for JR; p < 0.001), corresponding to 60 additional suspicious cases detected compared with JR. Agreement with SR in suspicious cases was also higher for AI (64.6% vs 53.4%). JR exhibited higher specificity for non-suspicious cases (90.3% vs 78.1%; p < 0.001). Overall accuracy was slightly higher for AI (70% vs 68%), with moderate overall concordance (κ=0.41 vs 0.40).
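Cohen's κ, used above for reader agreement, corrects the observed agreement for the agreement expected by chance from each reader's marginal rates. The following is a minimal sketch with hypothetical reads (1 = suspicious, 0 = non-suspicious); the function name and the ten example cases are illustrative and are not study data.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(
        (rater_a.count(k) / n) * (rater_b.count(k) / n) for k in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical reads of ten radiographs.
senior = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(senior, reader)
print(f"kappa = {kappa:.3f}")
```

In this toy case the raw agreement is 70% while κ is about 0.35, which shows why κ values near 0.40 can accompany accuracies near 70%.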
Conclusion: The AI solution outperformed a junior radiologist in detecting suspicious findings and showed strong concordance with an experienced radiologist, highlighting its potential as a reliable second reader in CXR interpretation. While JR excelled in non-suspicious-case specificity, AI’s superior detection of suspicious cases underscores its value in supporting early diagnosis and clinical decision-making.
Limitations: This study is limited by the use of a single senior radiologist as the reference standard, which could introduce bias, and by its retrospective, single-centre design, which may restrict the generalizability of the findings.
Funding for this study: This research received no external funding.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: