Research Presentation Session: Imaging Informatics and Artificial Intelligence

RPS 1605 - Artificial intelligence: real world results and large European projects

February 28, 16:00 - 17:30 CET

  • ACV - Research Stage 3
  • ECR 2025
  • 12 Lectures
  • 90 Minutes
  • 12 Speakers

Description

7 min
Assessing the effectiveness of artificial intelligence (AI) in prioritising CT Head interpretation: a stepped-wedge cluster-randomised trial (ACCEPT-AI)
Katrina Nash, Oxford / United Kingdom
Author Block: K. Nash1, K. Vimalesvaran2, R. Dharmadhikari3, M. Hall4, A. Novak1, S. Ather1, D. J. Lowe4, H. Shuaib2, *. Accept-Ai Investigators2; 1Oxford/UK, 2London/UK, 3Northumberland/UK, 4Glasgow/UK
Purpose: To evaluate the effectiveness of artificial intelligence (AI) in the prioritisation of non-contrast head computed tomography (NCCT) scans. We evaluated: 1) whether there was a reduction in report turnaround time (TAT) for prioritised NCCT, 2) the accuracy of the AI algorithm, and 3) the technical performance of the algorithm.
Methods or Background: This large-scale multi-centre trial was conducted across three emergency departments between November 2023 and July 2024. Individuals above the age of 18 who presented to the emergency department and underwent NCCT were included. Data collected included demographics; times from CT acquisition to report, referral, and discharge; and death within 28 days. Findings were categorised into prioritised findings (intracranial haemorrhage, mass effect, fracture), non-prioritised findings (atrophy and infarct), and no prioritised findings. The study was conducted in three stages: pre-implementation, implementation, and post-implementation. Baseline data were collected during the pre-implementation phase, whilst the implementation phase enabled training and integration of the AI tool into the radiology workflow at each trust. During the post-implementation phase, AI results were visible to radiologists. Radiology reports were coded to assess AI accuracy, with discrepant cases sent for ground truthing by two independent radiologists.
Results or Findings: 7,500 scans from the pre-implementation phase and 8,453 scans from the post-implementation phase have been included. The median TAT for prioritised scans was 34 minutes (interquartile range 20-55) in the pre-implementation phase and 35 minutes (22-59) in the post-implementation phase.
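The summary statistics above (median TAT with interquartile range) can be reproduced with Python's standard library. A minimal sketch on made-up turnaround times, not the trial data:

```python
from statistics import median, quantiles

def tat_summary(tats):
    """Median and interquartile range of report turnaround times (minutes)."""
    q1, _, q3 = quantiles(tats, n=4, method="inclusive")
    return median(tats), (q1, q3)

# Hypothetical TATs for five prioritised scans (illustration only)
print(tat_summary([20, 30, 34, 40, 55]))  # (34, (30.0, 40.0))
```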
Conclusion: We have successfully conducted a prospective trial of AI implementation across multiple centres in the United Kingdom. Initial results do not show a difference in TAT for NCCT reports after implementation of AI.
Limitations: TAT was calculated during a preliminary analysis of a larger dataset; full results will be available for presentation.
Funding for this study: This study has been awarded funding from NHSx AI Award (Award reference: AI_Award02354).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: The study protocol was approved by the Research Ethics Committee (REC) of East Midlands (Leicester Central), in May 2023 (REC 23/EM/0108) and was conducted in accordance with the principles of Good Clinical Practice.
7 min
ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation
Léo Machado, Paris / France
Author Block: L. Machado1, H. Philippe2, E. Ferreres2, J. Khlaut2, J. Gregory1, M. Ronot1, D. Tordjman2, P. Manceron2, P. Herent2; 1Clichy/FR, 2Paris/FR
Purpose: Carcinogenesis leads to tumors with diverse shapes and behaviors. Although RECIST 1.1 remains the standard for evaluation, its reliance on linear measurements and high inter-reader variability often result in misclassification. Volumetric biomarkers, such as total tumor burden, provide more information but need automated segmentation. Traditional segmentation models struggle with complex lesions, have a narrow focus, and lack interactivity. Foundation models with transformer architecture address these issues through zero-shot learning and visual prompts (point-click, bounding box). We developed ONCOPILOT, a foundation model that enhances RECIST 1.1 measurements, enabling volumetric analysis while integrating seamlessly into radiology workflows.
Methods or Background: ONCOPILOT was trained on 7,500 CT scans, including normal anatomy and oncological cases. Its segmentation performance was assessed against nnUnet, a state-of-the-art baseline, using the DICE coefficient. The evaluation also included comparisons with radiologists for RECIST 1.1 long-axis measurements, annotation speed, and inter-reader variability.
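The DICE coefficient used in this comparison is 2·|A∩B| / (|A|+|B|) for predicted and reference masks. A stdlib-only sketch on flat binary masks, purely illustrative (the study's actual evaluation pipeline is not described in the abstract):

```python
def dice_coefficient(pred, truth):
    """DICE = 2*|A ∩ B| / (|A| + |B|) for binary masks given as flat 0/1 sequences."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * intersection / total if total else 1.0  # both masks empty: perfect agreement

# Toy 1-D "masks": 2 overlapping voxels, 3 foreground voxels in each mask
print(dice_coefficient([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0]))  # ≈ 0.667
```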
Results or Findings: ONCOPILOT outperformed state-of-the-art models, achieving a mean DICE of 0.78 post-editing versus 0.70 for the baseline. Its RECIST error (7.4%, 1.1 mm) was not significantly different from radiologists' (8.6%, 1.3 mm). ONCOPILOT-assisted measurements were quicker than manual ones (17.2 vs. 20.6 seconds, p < .05), with reduced inter-reader variability (1.7 vs. 2.4 mm, p < .05).
Conclusion: ONCOPILOT is among the first foundation model applications in radiology, acting as an interactive AI assistant for oncological evaluation. It improves RECIST reproducibility, facilitates access to volumetric biomarkers, and seamlessly integrates into radiology workflows, reducing inter-reader variability and measurement time. This approach offers substantial potential to advance oncology research and enhance clinical care.
Limitations: ONCOPILOT showed reduced performance on small tumors, particularly lung lesions, which were overrepresented in the test set. Future iterations should use a more balanced dataset and cover a broader range of tumor types.
Funding for this study: This work was granted access to the HPC resources of IDRIS under the allocation 2024-AD011013489R2 made by GENCI.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: The data used are anonymous and publicly available.
7 min
Evaluating artificial intelligence for lung cancer detection on chest radiographs: multi-vendor comparison of diagnostic accuracy in a real-world UK population
Ahmed Maiter, Sheffield / United Kingdom
Author Block: A. Maiter, P. Metherall, J. Taylor, S. Matthews, K. Hocking, E. Burton, E. Anderson, A. Swift, C. S. Johns; Sheffield/UK
Purpose: Automated detection of lung cancer on chest radiographs by AI could streamline diagnostic pathways. This study evaluated the diagnostic accuracy of commercially available software from seven vendors using a large dataset of radiographs from a real-world UK population.
Methods or Background: Consecutive chest radiographs obtained at our tertiary UK centre were retrospectively identified. Chest radiographs requested from primary care for adult patients, regardless of indication, were eligible for inclusion. Software from the seven vendors evaluated each radiograph independently. The radiologist report for each radiograph was also interrogated. Diagnostic accuracy metrics were determined by comparing the software outputs and radiologist reports against the diagnosis of lung cancer by multidisciplinary team decision.
Results or Findings: 5,722 chest radiographs were included from 5,592 patients (median age 59 years, 54% female, 79% white, 1.6% prevalence of lung cancer). The software yielded the following (mean±SD): sensitivity 46±9%, specificity 95±4%, positive predictive value (PPV) 15±5%, negative predictive value (NPV) 99±0%, accuracy 70±4% and false positives per image (FPPI) 0.05±0.04. Radiologist reports yielded the following: sensitivity 66%, specificity 98%, PPV 36%, NPV 99%, accuracy 82% and FPPI 0.02.
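The metrics reported above all derive from per-image confusion-matrix counts, with FPPI taken here as false positives divided by the number of images. A sketch with hypothetical counts, not the study's data:

```python
def accuracy_metrics(tp, fp, tn, fn):
    """Diagnostic accuracy metrics from confusion-matrix counts (one label per image)."""
    n = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "accuracy": (tp + tn) / n,
        "fppi": fp / n,          # false positives per image
    }

# Hypothetical counts for a low-prevalence screening cohort (illustration only)
metrics = accuracy_metrics(tp=42, fp=280, tn=5350, fn=50)
print({k: round(v, 3) for k, v in metrics.items()})
```

Note how a low prevalence drives NPV up and PPV down even at high specificity, which is the pattern the study reports.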
Conclusion: All software demonstrated high specificity and NPV comparable to radiologists, and could add value by helping to exclude lung cancer. However, all software also showed lower sensitivity and PPV than radiologists, suggesting that they could increase the number of false negative and false positive results. While AI has potential to improve the efficiency of “straight to CT” and other diagnostic pathways for suspected lung cancer, the risks of diagnostic errors require careful consideration.
Limitations: This study used a retrospective dataset from a single centre. Further testing of software performance in multi-centre patient cohorts and evaluation of downstream impacts are essential prior to clinical deployment.
Funding for this study: This study was funded by the NHS South Yorkshire Integrated Care System.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: This study received local research ethics committee approval (23/EM/0186). The need for dedicated patient consent was waived.
7 min
Evaluation of AI-assisted Diagnosis of Clinically Significant Prostate Cancer on MRI at Scale: Preliminary Findings from the PI-CAI Consortium
Jasper Jonathan Twilt, Nijmegen / Netherlands
Author Block: J. J. Twilt1, A. Saha1, J. S. Bosma1, D. Yakar2, M. Elschot3, J. Veltman4, J. Fütterer1, H. Huisman1, M. De Rooij1; 1Nijmegen/NL, 2Groningen/NL, 3Trondheim/NO, 4Almelo/NL
Purpose: To assess whether utilizing an internationally validated prostate artificial intelligence (AI) system enhances the accuracy of prostate MRI evaluations in diagnosing clinically significant prostate cancer (csPCa; Gleason Grade ≥2), compared to non-assisted assessments, in a comprehensive international reader study.
Methods or Background: In this retrospective study, imaging and a prostate AI system developed and benchmarked on 10,207 examinations through an international confirmatory study (PI-CAI) were used. A total of 780 biparametric prostate MRI examinations (2015-2021) from men suspected of csPCa, with diagnostic-sufficient image quality and no prior csPCa findings or treatment, were included. Reference was established through histopathology and ≥3 years of follow-up. The AI was calibrated to generate patient-level csPCa suspicion scores (0-10) and associated lesion-detection maps using 420 examinations from three Dutch centers. The remaining 360 examinations (three Dutch, one Norwegian center) were used for outcome analysis. Sixty-one readers (53 centers, 17 countries) provided PI-RADS 3-5 annotations and patient-level csPCa suspicion scores (0-100) with and without AI assistance in two phases, separated by a 4-week washout period. Multi-reader, multi-case analysis compared diagnostic outcomes at a per-patient level.
Results or Findings: Preliminary analysis of 22 readers (1-13 years of prostate MRI experience) demonstrated that AI assistance improved csPCa diagnosis, with AUROCs of 0.919 (95% CI: 0.896-0.942) compared to 0.870 (95% CI: 0.831-0.908) without assistance. Sensitivity and specificity at PI-RADS ≥3 improved from 94.1% to 96.5% and 46.6% to 49.1%, respectively.
Conclusion: AI assistance improves csPCa diagnosis and holds promise for improving clinical outcomes. Further research is needed to confirm generalizability and assess workflow efficiency.
Limitations: Preliminary findings limit the sample size and subgroup analysis considering reader expertise. Reuse of data may introduce generalization bias, and AI was not used to guide histologic verification.
Funding for this study: EU Horizon 2020: ProCAncer-I (grant number 952159), Health~Holland (grant number LSHM20103).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Retrospective use of anonymous patient data was approved by institutional or regional review boards at each contributing center (identifiers: REK 2017/576; CMO 2016-3045; IRB 2018-597; ZGT23-37), and was conducted in accordance with the principles of the Declaration of Helsinki. Informed consent was waived.
7 min
The PANORAMA study results: Pancreatic Cancer Diagnosis - Radiologists meet AI
Megan Schuurmans, Nijmegen / Netherlands
Author Block: M. Schuurmans, N. Alves, P. Vendittelli, G. Litjens, J. J. Hermans, H. Huisman; Nijmegen/NL
Purpose: The PANORAMA study transparently evaluates radiologists and AI in detecting pancreatic ductal adenocarcinoma (PDAC) using contrast-enhanced computed tomography (CECT).
Methods or Background: This retrospective study includes 3338 abdominal CECTs of patients without prior history of treatment or positive histopathology findings of PDAC acquired between 2006 and 2021 from 5 centers (Netherlands, Norway, and Sweden). Of these, 2238 cases (676 PDAC) are publicly available to develop and train AI algorithms, and 100 and 1000 cases were sequestered for AI tuning and testing, respectively. The test set comprises data from two external centres not present in the other cohorts. A subset of 400 testing cases is used for the PANORAMA reader study. Both AI and radiologists indicate PDAC likelihood and localization of lesions. Additionally, radiologists provide a newly introduced PANC-RADS score to assess the urgency of expert referrals. AI is openly developed and evaluated using common metrics through the Grand Challenge platform. Patient-level performance and lesion localization performance are assessed using the area under the receiver operating characteristic curve (AUROC) and average precision (AP), respectively.
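The patient-level AUROC used here can be computed directly from continuous suspicion scores with the rank-based (Mann-Whitney) formulation. A stdlib-only sketch for illustration; it is not the Grand Challenge evaluation code:

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U rank formula; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: the two positive cases do not always outscore the negatives
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```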
Results or Findings: The PANORAMA reader study results will be presented for the first time at ECR 2025. Currently, 57 radiologists (39 centers, 13 countries, 2-30 years of experience, median: 9 years) participate in the study. The baseline AI algorithm (nnU-Net with cross-entropy loss) achieves 0.9776 AUROC and 0.7226 AP on the tuning set. There are currently 165 registered AI challenge participants across 11 teams.
Conclusion: Transparently benchmarked AI can enable early PDAC detection at the expert-radiologist level.
Limitations: While histopathology and follow-up were considered as the reference standard for the sequestered testing set, we could not guarantee this level of evidence for all cases in the public training set.
Funding for this study: This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 101016851, project PANCAIM.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Written ethics committee approval was obtained from every center participating in the study.
7 min
Development and Implementation of a Web-Based Platform for Clinical Validation of Artificial Intelligence Models in Five Cancer Types: An Initiative of the European CHAIMELEON Project
Adrian Galiana-Bordera, L'Alfàs del Pi / Spain
Author Block: A. Galiana-Bordera, J. Aquerreta-Escribano, P. M. Martínez Gironés, P. Lozano, G. Ribas, P. Jimenez, L. Cerda Alberich, I. Blanquer, L. Marti-Bonmati; Valencia/ES
Purpose: This study evaluates AI-based predictive oncology models for prostate, lung, breast, and colorectal cancers through the creation of an innovative web platform. The goal is to bridge the gap between AI development and clinical implementation, advancing AI adoption in healthcare as part of the European CHAIMELEON project.
Methods or Background: We developed a web platform with a microservices architecture, including a REST API, an ORTHANC PACS for image management, and Keycloak for security. The frontend is a custom-built application serving as a control panel for users. It allows clinicians to open a patient-specific DICOM viewer alongside clinical information, AI predictions, and validation buttons. This interface facilitates a three-stage case review: standard evaluation, AI-aided assessment, and final comparison with ground truth. AI models, developed through an open challenge using anonymized data, were integrated into the platform.
Results or Findings: The platform demonstrates efficiency, customizability, and scalability, enhancing oncological study reproducibility. It creates an environment where medical images, clinical data, and AI predictions coexist, adapting to various clinical and research settings. Over 70 clinicians are using the system to evaluate more than 2000 patients, with 60% reviewed in less than a month. This rapid adoption highlights the platform's user-friendly design and potential for improving clinical workflow.
Conclusion: This platform represents a significant advancement in validating and adopting AI models in oncology. It provides a foundation for integrating AI into clinical environments, benefiting both clinicians and patients. The study highlights the importance of user-friendly interfaces and structured evaluation processes in bridging the gap between AI development and clinical application.
Limitations: The study's retrospective nature limits long-term outcome assessment. Further research is needed on implementation across diverse hospital infrastructures, the learning curve for clinicians, and ethical and regulatory aspects of AI integration in clinical practice.
Funding for this study: This particular study did not receive direct funding. However, it was developed as part of the CHAIMELEON project, which received funding from the European Union's Horizon 2020 research and innovation program under Grant Agreement No. 952172.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Under the CHAIMELEON project, all necessary ethical committee approvals have been obtained for the use of medical images and the development of artificial intelligence models. These approvals cover the collection, anonymization, and utilization of patient data for research purposes. Furthermore, the web platform developed for evaluating these AI models has been registered and approved for use in clinical settings. This ensures compliance with data protection regulations and ethical standards in medical research. The project adheres to strict protocols for data handling and user access, maintaining patient privacy and data integrity throughout the study.
7 min
Hospital patients' attitudes towards AI worldwide: Results from the COMFORT study in 74 hospitals and 43 countries
Felix Busch, Munich / Germany
Author Block: F. Busch1, L. Hoffmann2, L. Xu2, L. Zhang3, L. Saba4, M. R. Makowski1, H. Aerts5, L. C. Adams1, K. Bressem1; 1Munich/DE, 2Berlin/DE, 3Nanjing/CN, 4Cagliari/IT, 5Boston, MA/US
Purpose: Too often, we see healthcare technology implementations that focus only on the clinician's point of view and undervalue the patient's perspective. Given the exponential rise in artificial intelligence (AI) applications in healthcare, this international, multicentre, cross-sectional study aimed to assess hospital patients' attitudes towards AI in healthcare worldwide.
Methods or Background: The present COMFORT study, involving 74 network hospitals in 43 countries, employed a quantitative 26-item instrument available in 26 languages on-site between February and November 2023.
Results or Findings: 13,806 patients from Europe (41.7%, n=5,764/13,806), Asia (25.2%, n=3,473/13,806), North America (16.5%, n=2,284/13,806), South America (9.7%, n=1,336/13,806), Africa (5.3%, n=728/13,806) and Oceania (1.6%, n=221/13,806) were included. Overall, 57.6% of respondents were positive about the use of AI in healthcare. Significant differences in attitudes were observed based on demographic characteristics, health status and technological literacy. Female participants and those in poorer health had less positive attitudes towards the use of AI in medicine. Conversely, higher levels of AI knowledge and frequent use of technological devices were associated with more positive attitudes. Notably, less than half of the participants expressed positive attitudes to all items related to trust in AI, with the lowest level of trust being in the accuracy of AI in providing information about treatment response. Patients showed a strong preference for explainable AI and clinician-led decision-making, even if this meant a slight compromise in accuracy.
Conclusion: This large-scale, multinational study provides a comprehensive perspective on patient attitudes towards AI in healthcare across six continents. The findings suggest the need for tailored AI implementation strategies that account for patient demographics, health status, and preferences for explainable AI and physician oversight. All study data has been made publicly available to encourage replication and further research.
Limitations: Non-probability sampling
Funding for this study: This research is funded by the European Union (101079894).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Ethical approval was obtained from Charité – Universitätsmedizin Berlin (EA4/213/22), which served as the lead institution.
7 min
Implementing Artificial Intelligence with a Multi-AI Platform across 15 centres: Experience and Strategy from an International Healthcare Organization
Alessandro Roncacci, The Hague / Netherlands
Author Block: D. Penha1, A. Juhos1, D. Tálos1, L. Rosa2, M. Santos2, E. Dias2, R. Paroczai1, R. Barone1, A. Roncacci1; 1Amsterdam/NL, 2Lisboa/PT
Purpose: Implementing artificial intelligence (AI) in radiology is challenging due to the variety of AI tools and the complexity of IT infrastructure and workflows. This presentation details a qualitative case study examining the implementation of six AI solutions using a multi-AI platform across 15 centers.
Methods or Background: A longitudinal qualitative case study was conducted in Portugal, in 15 radiology centers over two years (May to October 2024), focusing on the implementation of different AI tools (Veye Lung Nodules, Icobrain DM, Transpara, Keros/Polaris, qXR, and ARVA) using a multi-AI platform (Incepto).

Data collected included 833 days of work observations, 86 meeting observations, and data from the Incepto dashboard.
Results or Findings: The multi-country healthcare organization's AI team managed the process adhering to a standard operating procedure covering initiation, planning, implementation, and clinical/operational phases.

AI deployment started with Icobrain DM (11 centers) and Veye Lung Nodules (12 centers), which processed 1,699 and 49,278 exams respectively. Transpara was deployed across three centers, processing 5,791 studies over 19 months, and Keros/Polaris in two centers, with 706 studies over 18 months.
ARVA and qXR were not implemented, following a clinical decision.
Overall, the AI implementation was successful, with over 56,000 studies processed by four tools across 15 centers.
The study identified advantages of the multi-AI platform, including workflow integration, cost-effectiveness, and improved patient care. However, disadvantages such as complexity, integration challenges, and potential vendor lock-in were also noted.
Conclusion: This case study provides valuable insights for healthcare organizations considering AI implementation via multi-AI platform over stand-alone AI tools.
Limitations: The implementation of a multi-AI platform in radiology lacks a clear blueprint to follow, creating the need to define new procedures and metrics. These procedures suited our needs but may not apply to other radiology institutions.
Funding for this study: Not applicable
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Ethics committe and data/ legal department involved and with full approval of the whole project
7 min
From theory to practice: Re-identification Challenge to test imaging data anonymization effectiveness
Rocío Catalán Flores, Valencia / Spain
Author Block: R. Catalán Flores1, I. Gómez-Rico1, P. Jimenez1, J. Gomes Carvalho2, S. Mazzetti3, R. Martínez Martínez1, M. França2, D. Regge3, L. Marti-Bonmati1; 1Valencia/ES, 2Porto/PT, 3Torino/IT
Purpose: DICOM image de-identification is an effective measure to protect patient privacy and ensure compliance with the GDPR. However, no standardized methods guarantee the irreversible anonymization of DICOM images or provide evidence on the robustness of these procedures. Organizations such as NEMA propose de-identification profiles for DICOM metadata, but the risk to data protection is assumed by the entity responsible for de-identification, as the dangers of using these profiles cannot be accurately measured.

Given this context, the Re-identification Challenge serves as a technical audit to assess the robustness of DICOM image de-identification methods. The objectives of this project are to validate the robustness of these de-identification methods and to gain insight into organizing a challenge within the context of a European project.
Methods or Background: The Re-identification Challenge consisted of a single phase in which 68 selected participants were tasked with re-identifying 38 pseudonymized DICOM studies spanning multiple European hospitals (in Spain, Portugal and Italy), modalities and anatomical regions. The studies were pseudonymized locally using the de-identification profiles of the ChAImeleon and ProCancer-I European AI4HI projects.
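Metadata de-identification of the kind audited here works by dropping or replacing identifying attributes according to a profile. The sketch below is a simplified illustration in the spirit of the DICOM PS3.15 Basic Profile; the tag names and rules are example assumptions, not the actual ChAImeleon or ProCancer-I profiles:

```python
# Simplified example profile (assumed for illustration, not a real project profile)
REMOVE = {"PatientName", "PatientAddress", "InstitutionName"}
REPLACE = {"PatientID": "ANON", "PatientBirthDate": ""}

def deidentify(metadata):
    """Apply a de-identification profile to a dict of DICOM-like attributes."""
    clean = {}
    for tag, value in metadata.items():
        if tag in REMOVE:
            continue                          # drop directly identifying attributes
        clean[tag] = REPLACE.get(tag, value)  # replace pseudonymizable ones, keep the rest
    return clean

study = {"PatientName": "Doe^Jane", "PatientID": "12345",
         "PatientBirthDate": "19700101", "Modality": "CT",
         "InstitutionName": "Hospital X"}
print(deidentify(study))  # {'PatientID': 'ANON', 'PatientBirthDate': '', 'Modality': 'CT'}
```

Attributes outside the profile (here, Modality) pass through untouched; residual attributes like these are exactly what re-identification attacks probe.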
Results or Findings: Although 74% of the participants delivered results, none succeeded in re-identifying the studies. Based on the participants' reports of their attempts, vulnerabilities were discovered that allowed the candidate population to be narrowed by inferring patients' geographical region.
Conclusion: This challenge is a groundbreaking initiative that may pave the way for robust evaluations of de-identification methods and may culminate in the standardization of de-identification profiles in the field of radiological imaging.
Limitations: This challenge involved a limited sample of DICOM studies to ensure GDPR compliance and protect patients’ privacy. Participant selection was controlled through specific projects to prevent unauthorized access and reduce the risk of data breaches, marking a pioneering effort in de-identification research.
Funding for this study: This challenge is part of the ChAImeleon project, specifically of Work Package 10, titled 'Repository Sustainability'. ChAImeleon has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 952172.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Ethical considerations have been paramount throughout the Challenge's development, garnering approval from the Ethics Committee of the involved data providers institutions. Rigorous measures ensure privacy, security, and data legitimacy: patient consent forms were obtained for the medical studies; legal experts conducted a comprehensive data protection impact assessment; and the platform hosting the studies adheres to stringent security protocols and privacy policies.
7 min
Establishing an applied framework to establish the trustworthiness of an international secure data environment for AI in CT imaging (AICT Consortium)
John Kellas, Oxford / United Kingdom
Author Block: J. Kellas1, S. Van Wortswinkel2, E. Casany Pujol3, R. Lee1, E. R. Ranschaert4; 1Oxford/UK, 2Borgerhout/BE, 3Barcelona/ES, 4Ghent/BE
Purpose: The AICT consortium, funded by Horizon Europe (NetZeroAICT), consists of international clinical sites across 3 continents and proposes a comprehensive trustworthiness framework by systematically integrating ethical, legal, sustainability and stakeholder engagement elements throughout research and development. The goal is to ensure acceptability and trust in our research, innovation pipeline and AI-driven radiology applications.
Methods or Background: The NetZero AICT trustworthiness framework is based on the European Commission's Trustworthy AI model, including foundational elements—Lawfulness, Ethics, and Robustness—structured with key pillars: Human Agency, Technical Robustness, Privacy, Transparency, Diversity, Societal Wellbeing, and Accountability. The project adheres to GDPR compliance, local applicable privacy laws, and the EU AI Act, and integrates ethics and privacy by design, broad and deep public involvement, sustainability, innovation management, clinical validation, and regulatory compliance.
Results or Findings: This is an interim report (year 1 of the Horizon Europe program). Here, we showcase the ‘ethics by design’ approach and an applied model of patient and public involvement and engagement. A public advisory group (PAG) has been formed, with current membership drawn from 4 countries and diverse backgrounds, and with member representation on project leadership groups (adopting and adapting a tiered model implemented by large-scale UK health data infrastructure projects, including OpenSAFELY).
Feedback from the Public Advisory Group and the project team strongly supported the view that integrating ethics by design from the concept stage through to deployment has improved public confidence and ensured compliance with ethical and legal standards.
Conclusion: By incorporating ethics by design, legal compliance, sustainability considerations and stakeholder engagement, the (NetZero) AICT consortium established a reliable framework for radiology AI, promoting fairness and trustworthiness among users and stakeholders.
Limitations: This framework applies only to healthcare imaging AI.
Funding for this study: Horizon Europe and UK Research Innovation
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: HRA number 22/HRA/2302
7 min
Time impact of AI-Assisted knee MRI reading in a real-world multi-center study within a radiology network
Benoît Rizk, Villars-Sur-Glane / Switzerland
Author Block: B. Rizk1, P. Cordelle2, B. Dufour1, N. Heracleous1, C. Thouly1, P. Zille2, F. Zanca3; 1Sion/CH, 2Poitiers/FR, 3Leuven/BE
Purpose: We explored the impact on reading time of KEROS, a knee-MRI multifaceted AI algorithm, across three distinct interpretation workflows.
Methods or Background: Clinical routine data were gathered from ten different centers during the daily workflow of eight radiologists, including four musculoskeletal subspecialists (MSKs) and four general radiologists (GENs), two of whom were junior (0 to 1 year of experience in private practice). We used a standardized report relying on voice recognition, without secretarial formatting assistance. Data collection was performed in three phases: in Phase 1, radiologists generated reports without AI assistance; in Phase 2, the KEROS AI diagnosis was available before their interpretation; in Phase 3, AI findings were pre-filled and auto-integrated into the structured report. Reporting time was measured from report opening to validation using the radiology information system (RIS). The Kruskal-Wallis test (p<0.01) assessed significant differences, and time differences were calculated using weighted means.
Results or Findings: In Phase 1, 431 exams were read, with average reading times ranging from 10.8±9.5 to 26.3±9.3 minutes. Phase 2 had 429 exams, with reading times between 12.8±8.2 and 27.5±9.2 minutes. Finally, Phase 3 included 425 exams, with times ranging from 9.2±6.4 to 21.6±5.6 minutes. Cases were nearly equally read by MSKs and GENs. Compared to Phase 1, Phase 3 showed an average time reduction of 2.1 minutes (13%, p<0.01), primarily driven by GENs, who saved up to 3.1 minutes (17%, p<0.01).
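The Kruskal-Wallis comparison across phases amounts to ranking all reading times jointly and computing the H statistic. A stdlib-only sketch (without tie correction; the study presumably used a standard statistics package):

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for k independent samples."""
    values = [(x, g) for g, grp in enumerate(groups) for x in grp]
    values.sort(key=lambda t: t[0])
    n = len(values)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:  # walk runs of tied values, assigning average ranks
        j = i
        while j + 1 < n and values[j + 1][0] == values[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rank_sums[values[k][1]] += avg_rank
        i = j + 1
    return 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n + 1)

# Hypothetical per-phase reading times in minutes (illustration only)
print(kruskal_h([10.8, 12.0, 11.5], [9.2, 9.9, 10.1]))  # ≈ 3.857
```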
Conclusion: We observed an average 13.4% (p<0.01) reduction in reading time after implementing KEROS with pre-filled reporting (Phase 3 vs. Phase 1). Generalists, the primary users of AI-assisted knee MRI reading, saw an average 17.4% (p<0.01) decrease in their reading time. In clinical practice, AI-assisted knee MRI reporting saves time for general radiologists, who benefit the most from AI guidance.
Limitations: No limitations
Funding for this study: No funding
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: We obtained approval from the ethics committee in Switzerland for our research structure, including the data registry and patients' general consent for research. For the specific study presented here, only the timing of radiologists' reporting was used, not patient data.
7 min
PARROT: A Collaborative Polyglottal Annotated Radiology Reports Database for Open Testing of Large Language Models
Bastien Le Guellec, Lille / France
Author Block: B. Le Guellec1, K. Bressem2; 1Lille/FR, 2Munich/DE
Purpose: To create a database of annotated radiology reports in diverse languages on which to test Large Language Models (LLMs).
Methods or Background: Large Language Models (LLMs) represent one of the most important advancements in artificial intelligence in recent years. In medicine, they hold the potential to transform how physicians interact with and interpret medical data. However, current research and applications of LLMs in medicine predominantly focus on English-language datasets. This narrow focus raises significant concerns about the ability of LLMs to generalize across the thousands of other languages. Because of difficulties accessing high-quality data in low-resource languages, patients may be excluded from the benefits of AI-driven advancements in healthcare. To address this critical gap, we have launched the Polyglottal Annotated Reports for Open Testing (PARROT) project. PARROT gathers fictional medical reports from diverse linguistic and cultural backgrounds, manually annotated by experts with ICD-10 codes, and makes them freely accessible to the global research community. PARROT is a completely open-source initiative, inviting radiologists and medical professionals from around the world to contribute.
Results or Findings: 2,648 annotated radiology reports from 75 radiologists from 20 countries in 13 languages have been collected. The most prevalent languages were Polish (808 reports), French (480 reports) and Italian (285 reports). Contributions from the Global South included Ivory Coast, Mexico, Madagascar, Togo, Gabon, Argentina, Algeria, Turkey and China.
Conclusion: Thanks to the collaborative effort of 75 radiologists from 20 countries, PARROT is the largest multilingual open database of radiology reports to date.
Limitations: Annotation for this first iteration of PARROT is limited to ICD-10 codes. Most reports originated from European countries and European languages.
Funding for this study: No specific funding
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: None required

Notice

This session will not be streamed, nor will it be available on-demand!

CME Information

This session is accredited with 1.5 CME credits.

Moderators

  • Luis Marti-Bonmati

    Valencia / Spain

Speakers

  • Katrina Nash

    Oxford / United Kingdom
  • Léo Machado

    Paris / France
  • Ahmed Maiter

    Sheffield / United Kingdom
  • Jasper Jonathan Twilt

    Nijmegen / Netherlands
  • Megan Schuurmans

    Nijmegen / Netherlands
  • Adrian Galiana-Bordera

    L'Alfàs del Pi / Spain
  • Felix Busch

    Munich / Germany
  • Alessandro Roncacci

    The Hague / Netherlands
  • Rocío Catalán Flores

    Valencia / Spain
  • John Kellas

    Oxford / United Kingdom