Research Presentation Session: Artificial Intelligence and Imaging Informatics

RPS 905 - Typing your question instead of googling it: how chatbots are changing radiology practice

March 5, 13:00 - 14:00 CET

6 min
Promoting Sustainable Breast Imaging and Interventional Practices with an AI-Based Chatbot
Gianmarco Della Pepa, Milan / Italy
Author Block: G. Della Pepa, G. Irmici, C. De Berardinis, E. D'Ascoli, L. Corradini, G. Rossini, C. Depretto, G. P. Scaperrotta; Milan/IT
Purpose: To develop and evaluate an educational chatbot powered by a low-footprint Large Language Model (LLM), aimed at increasing awareness and knowledge of sustainable clinical practices in breast imaging among radiology professionals.
Methods or Background: GreenBreastBot was developed as a Custom GPT using GPT-3.5, a pre-trained LLM, ensuring negligible energy consumption per interaction. The chatbot was populated with bilingual (Italian/English) structured content derived from the ESR Green Radiology position paper, WHO climate-health recommendations, and internal Breast Unit guidelines. The tool adapts its explanations based on user expertise and delivers microlearning units, interactive quizzes, flashcards, and clinical scenarios. A four-week pilot was conducted in a tertiary Breast Unit. Participants included radiologists, residents, and radiographers. Outcomes included usage frequency, satisfaction (5-point Likert scale), and self-reported awareness before and after interaction.
Results or Findings: Twenty-nine professionals participated: 12 consultants, 11 residents, and 6 radiographers. Users completed an average of 3.6 chatbot sessions per week. Overall satisfaction was high (mean 4.5/5); 91% found the chatbot useful or very useful. Post-intervention, 67% of participants reported improved awareness of sustainable imaging practices, with greatest gains in understanding paperless consent workflows, appropriateness in follow-up imaging, and environmentally conscious interventional preparation. No significant technical barriers were reported.
Conclusion: GreenBreastBot demonstrates that a pre-trained, low-energy LLM chatbot can effectively deliver eco-education in breast imaging. Its integration of institutional and international guidelines enables scalable, impactful, and environmentally coherent training for radiology teams.
Limitations: The main limitations are the monocentric design and the absence of objective performance measures.
Funding for this study: None
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
ChatGPT for Structured Reporting in CT Brain: Resident Experience
Karanvir Singh Chhabra, Jalandhar / India
Author Block: K. S. Chhabra1, D. B. Dahiphale2, S. S. Sarda2; 1Jalandhar/IN, 2Aurangabad/IN
Purpose: To assess the feasibility and utility of ChatGPT in generating structured CT brain reports, evaluate its impact on reporting time and accuracy, and document resident perspectives in a tertiary care setting.
Methods or Background: This pilot study included 20 radiology residents who reported 200 CT brain examinations between January and April 2025. Each case was reported twice: once using conventional free-text reporting and once with ChatGPT-assisted structured reporting. Residents used a standardised prompt library covering common CT brain findings such as haemorrhage, infarct, mass effect, hydrocephalus, and extra-axial collections. Metrics assessed included reporting time, completeness (based on a 10-point checklist), inter-observer consistency, and resident satisfaction. Accuracy was validated against consultant-reviewed reference reports.
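The abstract describes a standardised prompt library paired with a 10-point completeness checklist. As a purely illustrative sketch (the section names and wording below are assumptions, not the study's actual templates), such a prompt builder might look like this:

```python
# Hypothetical structured-reporting prompt builder in the spirit of the
# standardised prompt library described above. Checklist wording is an
# illustrative assumption, not the study's actual 10-point checklist.

CHECKLIST = [
    "Haemorrhage (presence, location)",
    "Infarct (territory, age)",
    "Mass effect",
    "Midline shift (mm)",
    "Ventricular status / hydrocephalus",
    "Extra-axial collections",
    "Basal cisterns",
    "Bone windows / fractures",
    "Paranasal sinuses and mastoids",
    "Comparison with prior imaging",
]

def build_prompt(clinical_history: str, raw_findings: str) -> str:
    """Assemble a prompt asking the LLM to reformat free-text findings
    into a structured CT brain report covering every checklist item."""
    sections = "\n".join(f"- {item}" for item in CHECKLIST)
    return (
        "You are assisting with a structured non-contrast CT brain report.\n"
        f"Clinical history: {clinical_history}\n"
        f"Resident's draft findings: {raw_findings}\n"
        "Rewrite the findings as a structured report with one line per "
        "checklist item, stating 'No abnormality' where applicable:\n"
        + sections
    )
```

Scoring completeness then reduces to checking the generated report against the same checklist, which is one plausible way the 10-point metric could be operationalised.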
Results or Findings: ChatGPT-assisted structured reporting reduced mean reporting time from 14.2 minutes to 9.1 minutes (36% reduction). Completeness scores improved significantly (mean 9.3/10 vs 7.8/10, p<0.01), with better coverage of critical elements such as haemorrhage location, mass effect, and ventricular status. Inter-observer agreement improved, particularly for standardised terminology. Accuracy compared with consultant reports was maintained, with no significant increase in errors. Resident feedback highlighted improved clarity and confidence, though some noted occasional generic or redundant phrasing requiring manual refinement.
Conclusion: ChatGPT shows promise as a practical tool for structured reporting in CT brain studies, enhancing efficiency, completeness, and inter-observer consistency without compromising accuracy. While human oversight remains essential, integration of AI-driven structured templates can support radiology training and streamline reporting in high-volume settings.
Limitations: Single-centre pilot design.
Funding for this study: No funding was received for this study.
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Achieving Truly Informed Consent? A Prospective Controlled Trial Using Retrieval-Augmented Generation Before CT Examinations
Felix Busch, Munich / Germany
Author Block: F. Busch, T. Lemke, S. Ziegelmayer, M. Graf, A. W. Marka, P. Prucker, M. R. Makowski, K. K. Bressem, L. C. Adams; Munich/DE
Purpose: This prospective comparative study aimed to investigate the feasibility, usability, and effectiveness of a Retrieval-Augmented Generation (RAG)-powered Patient Information Assistant (PIA) chatbot for pre-CT information counseling, compared to standard physician-led consultation and informed consent procedures.
Methods or Background: Eighty-six patients scheduled for CT imaging (November-December 2024) were randomly assigned to either the PIA group (n=43), receiving pre-CT information via a RAG-powered chatbot, or the control group (n=43), receiving standard doctor-led consultation. Patient satisfaction, information clarity, comprehension, and concerns were assessed using six ten-point Likert-scale questions. Consultation duration was recorded, and patients in the PIA group indicated their preferred mode of future counseling. Two radiologists independently evaluated each PIA session based on five criteria: overall quality, scientific and clinical evidence, clinical usefulness and relevance, consistency, and up-to-dateness.
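The core of a RAG-powered assistant like the PIA is retrieving vetted patient-information passages and answering only from them. The following is a minimal stand-in sketch: the snippets and the word-overlap scorer are illustrative assumptions, whereas a production system would use embedding-based search over curated documents.

```python
# Minimal retrieval-augmented generation (RAG) sketch. The snippet pool
# and overlap scorer are illustrative stand-ins for a real embedding
# index over vetted patient-information documents.

SNIPPETS = [
    "Iodinated contrast media may rarely cause allergic-like reactions.",
    "Patients should report kidney disease before contrast-enhanced CT.",
    "A CT scan typically lasts only a few minutes and is painless.",
]

def retrieve(question: str, snippets, k: int = 2):
    """Rank snippets by simple word overlap with the question."""
    q = set(question.lower().split())
    scored = sorted(snippets,
                    key=lambda s: -len(q & set(s.lower().split())))
    return scored[:k]

def build_grounded_prompt(question: str) -> str:
    """Compose an LLM prompt that must answer only from retrieved context."""
    context = "\n".join(retrieve(question, SNIPPETS))
    return ("Answer the patient's question using only this context:\n"
            f"{context}\nQuestion: {question}")
```

Grounding the model's answer in retrieved text is what distinguishes a RAG chatbot from a free-running LLM and is the usual safeguard against hallucinated medical advice.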
Results or Findings: Both groups reported similarly high ratings for information clarity (PIA: 8.64 ± 1.69; control: 8.86 ± 1.28; p=0.82) and overall comprehension (PIA: 8.81 ± 1.40; control: 8.93 ± 1.61; p=0.35). Physician-led consultations more effectively alleviated patient concerns (8.30 ± 2.63 vs. 6.46 ± 3.29; p=0.003). Patients in the PIA group required significantly shorter subsequent consultation times (median: 120 s [IQR: 100-140] vs. 195 s [IQR: 170-220]; p=0.04). Radiologists rated PIA chats favorably across all evaluated categories.
Conclusion: A RAG-powered PIA chatbot can effectively deliver pre-CT information while reducing physician consultation time. Although patient satisfaction and comprehension were comparable to standard consultations, physician-led interactions remained superior in addressing patient concerns. These findings highlight the potential of AI-based chatbot solutions to streamline patient counseling for imaging procedures. At the same time, physician engagement remains crucial for addressing patient worries, suggesting a complementary role for both approaches in clinical practice.
Limitations: Single-center design, small sample size, and self-reported outcomes.
Funding for this study: None.
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Technical University of Munich (2024-469-S-KK)
6 min
AI-Supported MR Safety Assessment of Implanted Devices: First Clinical Evaluation
Hanna Kreutzer, Aachen / Germany
Author Block: H. Kreutzer, D. Rashid, D. Truhn, S. Nebelung; Aachen/DE
Purpose: MR safety checks for patients with implanted devices are time-consuming and error-prone. Clinicians must identify the exact device model, retrieve the manufacturer’s handbook, and extract applicable scanning conditions. We developed an AI agent that streamlines device-specific MR eligibility assessment using manufacturer documentation and scientific literature.
Methods or Background: The agent is built in LangGraph with a router node classifying user queries. Device-specific queries are directed to a retrieval-augmented generation (RAG) pipeline that utilizes manufacturer handbooks. General MR safety queries are handled by a separate RAG pipeline that utilizes peer-reviewed scientific literature. A central GPT-4.1 node composes the final output.
A web-based interface (chatbot-like) allows free-text queries or image uploads of implant ID-cards, which are analysed with GPT-4.1 Vision. The interface displays both the reasoning steps and the retrieved handbook/literature pages for transparency.
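The routing logic described above can be caricatured in a few lines of plain Python. This is a simplified stand-in only: the actual system uses LangGraph nodes with GPT-4.1 in the classifier and composer roles, and the keyword rules and canned excerpts below are illustrative assumptions.

```python
# Simplified stand-in for the described architecture: a router classifies
# the query, device-specific queries go to a handbook RAG pipeline,
# general safety queries to a literature RAG pipeline, and a composer
# merges the retrieved evidence into the final answer. Keyword rules and
# excerpt strings are illustrative assumptions.

def route(query: str) -> str:
    """Crude keyword router; the real system uses an LLM classifier."""
    device_terms = ("pacemaker", "implant", "model", "serial", "lead")
    return "device" if any(t in query.lower() for t in device_terms) else "general"

def handbook_rag(query: str) -> str:
    """Placeholder for retrieval over manufacturer handbooks."""
    return "Handbook excerpt: scan at 1.5 T, normal operating mode."

def literature_rag(query: str) -> str:
    """Placeholder for retrieval over peer-reviewed MR safety literature."""
    return "Literature excerpt: SAR limits for conditional devices."

def answer(query: str) -> str:
    """Compose the final output from the routed retrieval result."""
    kind = route(query)
    evidence = handbook_rag(query) if kind == "device" else literature_rag(query)
    return f"[{kind} query] {evidence}"
```

Separating the two retrieval pipelines mirrors the design choice in the abstract: device handbooks and the scientific literature have different authority and formats, so keeping them in distinct RAG paths lets the interface display exactly which source grounded each answer.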
Evaluation was performed in consecutive patients with cardiac devices from our hospital. An MR physicist documented the final safety decision (scan eligibility and protocol parameters), which served as the reference standard.
Results or Findings: The agent’s recommendation was correct in 15/19 cases. In the remaining four cases, the system flagged missing documentation, thereby avoiding unsupported recommendations. Importantly, no incorrect recommendations were made. Correct guideline source pages were displayed in 13 of the 15 correct cases.
Conclusion: An AI agent grounded in manufacturer guidance can reliably answer MR safety questions. Early testing demonstrates promising accuracy and interpretability, with transparent display of reasoning and sources. If scaled beyond cardiac devices and expanded into comprehensive device databases, such agents have the potential to fundamentally transform MR safety practice by accelerating workflows, reducing errors, and setting new standards for patient safety in radiology.
Limitations: Some device manuals were unavailable. Evaluation was restricted to cardiac devices. Use of GPT-4.1 requires anonymised data.
Funding for this study: This research is supported by the Deutsche Forschungsgemeinschaft - DFG (701010997, 517243167, 515639690), the German Federal Ministry of Research, Technology and Space (Transform Liver - 031L0312C, DECIPHER-M, 01KD2420B) and the European Union Research and Innovation Programme (ODELIA - GA 101057091, SAGMA - GA 101222556).
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information:
6 min
Energy Usage of Large Language Models and Segmentation Models in Radiology
Martin Segeroth, Basel / Switzerland
Author Block: M. Segeroth, S. Yang, J. Wasserthal, J. Cyriac, T. Heye, E. M. Merkle, M. Bach, J. Vosshenrich; Basel/CH
Purpose: Neural networks, in particular large language models (LLMs), are increasingly valuable tools that support human tasks rather than simply automating them. However, their use requires substantial amounts of energy. In clinical practice, justified privacy concerns favor open-source models deployed on local infrastructure, which also makes their energy consumption directly measurable.
Methods or Background: Within our institutional healthcare network, we deployed privateGPT and Ollama as the primary platforms for LLM utilization, and Nora for image analysis. The models were hosted on a server equipped with eight NVIDIA A100 GPUs (80 GB each). For LLM experiments, we tested Llama3-70B, and for medical image segmentation, we used TotalSegmentator. Task scheduling was managed with Slurm 23.11.4, while energy consumption was monitored using nvidia-smi 550.163.01 and turbostat 2023.11.07. Additional overall server-level measurements were performed.
Results or Findings: The server’s eight GPUs are each rated for a maximum power draw of 400 W, yet during our tests total peak power consumption reached 4235 W, with more than 1000 W attributable to non-GPU components. Idle consumption was 63 W per GPU and 1150 W for the full server. A single LLM request consumed 5.94 Wh (95% CI: 5.87–5.98 Wh), with GPU utilization at 86.39% (CI: 86.39–86.39%). TotalSegmentator training for MRI segmentations required 8389.14 Wh (CI: 8193.84–8730.37 Wh), with GPU utilization at 78.93% (CI: 78.93–78.93%). Inference with TotalSegmentator consumed 0.96 Wh (CI: 0.96–0.97 Wh) per case for tissue types, and complete MRI segmentation required 1.47 Wh (CI: 1.47–1.48 Wh).
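Per-task watt-hour figures like those above are typically derived by integrating periodically sampled power draw (e.g. repeated `nvidia-smi --query-gpu=power.draw` readings) over the task's duration. A minimal sketch of that integration step, with illustrative sample values rather than the study's data:

```python
# Convert a series of power samples (watts) taken at a fixed interval
# into energy in watt-hours via trapezoidal integration. Sample values
# below are illustrative, not measurements from the study.

def energy_wh(samples_w, interval_s: float) -> float:
    """Trapezoidal integration of power samples (W) into watt-hours."""
    joules = sum((a + b) / 2 * interval_s
                 for a, b in zip(samples_w, samples_w[1:]))
    return joules / 3600.0

# A constant 360 W draw held for 60 s is 360 * 60 / 3600 = 6 Wh,
# roughly the scale reported per LLM request above.
assert abs(energy_wh([360.0, 360.0], 60.0) - 6.0) < 1e-9
```

Server-level meters capture the non-GPU overhead (CPUs, fans, power supplies) that per-GPU counters miss, which is why the abstract reports both measurements.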
Conclusion: Neural networks in clinical deployment consume a noticeable amount of energy, with individual tasks requiring 1–6 Wh, roughly five to thirty times more than a typical Google search (~0.2 Wh). Nonetheless, their ability to augment clinical performance and support decision-making can justify the additional energy expenditure.
Limitations: Additional models and hardware are under evaluation.
Funding for this study: None
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: