Same studies, different scores: how RQS, RQS2, and METRICS shape radiomics quality assessment
Author Block: M. Bobowicz, M. Kosno, K. P. Brzozowski, E. Szurowska; Gdańsk/PL
Purpose: Radiomics has emerged as a powerful method for extracting quantitative insights from medical imaging, with the potential to enhance diagnostic and prognostic precision. We aimed to critically evaluate three quality assessment tools, the Radiomics Quality Score (RQS), RQS2, and the METhodological RadiomICs Score (METRICS), to assess their robustness, reliability, and practical utility, thereby guiding reproducible and high-quality radiomics research.
Methods or Background: A comprehensive search of the PubMed, Embase, Scopus, Web of Science, and IEEE Xplore databases identified studies that used MRI radiomics to predict pathological complete response (pCR) in breast cancer patients receiving neoadjuvant therapy. Study quality was assessed using RQS, RQS2, and METRICS. We compared the three tools across seven overarching categories: study design and protocol, imaging protocol quality, image preparation and processing, segmentation and ROI definition, feature extraction and selection, model building and validation, and reporting, transparency, and open science. Inter-reader agreement was evaluated with Cohen’s κ, and the reliability of the overall scores was determined using the intraclass correlation coefficient (ICC).
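As an illustration of the inter-reader agreement measure named above, the following is a minimal sketch of Cohen's κ for two readers scoring the same checklist items. The function name `cohens_kappa` and the ratings are hypothetical and are not study data; the abstract does not report how κ was computed in practice.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers assigning nominal labels to the same items."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("Both readers must rate the same non-empty set of items")
    n = len(reader_a)
    # Observed agreement: fraction of items given the same label by both readers.
    p_o = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement, from each reader's marginal label frequencies.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both readers used one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical item-level ratings (1 = criterion met, 0 = not met).
reader_1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
reader_2 = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the readers agree on 8 of 10 items, but because both mark 6 of 10 items as met, about half of that agreement is expected by chance, which κ discounts.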
Results or Findings: RQS and RQS2 emphasise clinical aspects, while METRICS provides a more holistic perspective. This difference is reflected in the higher median score achieved with METRICS than with RQS or RQS2. Correlations between total scores were weak to moderate: ρ = 0.312 (p ≈ 0.068) for RQS vs. RQS2, ρ = 0.180 (p ≈ 0.302) for RQS vs. METRICS, and ρ = 0.412 (p = 0.014) for RQS2 vs. METRICS. Only the RQS2 vs. METRICS correlation was statistically significant at the 0.05 level.
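To make the pairwise comparison of total scores concrete, here is a minimal sketch of a rank correlation between two tools' scores for the same studies. It assumes that ρ denotes Spearman's rank correlation, which the abstract does not state. The helper names and the scores are hypothetical and are not study data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical total scores of five studies under two different tools.
tool_a_scores = [10, 20, 30, 40, 50]
tool_b_scores = [1, 3, 2, 5, 4]
rho = spearman_rho(tool_a_scores, tool_b_scores)
print(f"rho = {rho:.3f}")
```

A rank-based measure is a natural fit here because the tools use different scales, so only the ordering of studies under each tool is directly comparable. The sketch does not compute a p-value.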
Conclusion: The METRICS tool is the most equitable choice, as it addresses multiple facets within each of the overarching categories. However, each research problem has unique characteristics, and the effectiveness of RQS, RQS2, and METRICS may differ between problems. Therefore, at least two of these tools should be used when evaluating a model's quality.
Limitations: None
Funding for this study: This project has received funding from the Digital Europe Programme under grant agreement No. 101100633 (EUCAIM) and from the European Union’s Horizon Europe and Horizon 2020 research and innovation programmes under grant agreements No. 101057699 (RadioVal) and No. 952103 (EuCanImage).
Has your study been approved by an ethics committee? Not applicable
Ethics committee - additional information: