Author Block: C. Pauling, O. Arthurs, B. Kanber, S. C. Shelmerdine; London/UK
Purpose: This study assessed the generalisability of an artificial intelligence (AI) model, trained on open-source data, for the detection of fractures and other abnormalities in paediatric wrist radiographs, using a novel, external, multi-centre dataset.
Methods or Background: A novel retrospective case dataset was curated from two paediatric trauma centres in London, England. The dataset comprises 865 images, with a patient age of 10.4 ± 3.5 years (mean ± standard deviation). Ground truth annotations for the external test dataset were established by the consensus opinion of at least two paediatric radiologists. To imitate real-world prospective data, no pre-processing was applied to the external data and only invalid scans were excluded.
A YOLOv7-X model was trained on GRAZPEDWRI-DX, an open-source paediatric wrist trauma radiograph dataset. After achieving optimal performance on the test split of this dataset, the model was used to perform inference on the novel external data, and the performance metrics on the two datasets were compared.
Results or Findings: The sensitivity of the model for the detection of fractures was 89.0% on the test split of the open-source data. When evaluated on the novel external data, the sensitivity decreased by 32.6%. The reduction in the performance of the model across all detection classes was less severe, with a change in mean Average Precision (mAP) of [email protected] (-0.067 mAP@[0.5:0.95]).
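For readers who want to reproduce the sensitivity arithmetic, the following is a minimal sketch and not the authors' evaluation code. The function names and the true-positive/false-negative counts are hypothetical, chosen only to reproduce the reported 89.0% baseline. The abstract does not state whether the 32.6% decrease is in percentage points or relative to the baseline, so the sketch computes both readings (roughly 56.4% and 60.0% respectively).

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (recall): share of actual fractures the model detected."""
    positives = true_positives + false_negatives
    if positives == 0:
        raise ValueError("sensitivity is undefined without positive cases")
    return true_positives / positives


def after_decrease(baseline_pct: float, decrease: float, relative: bool) -> float:
    """Apply a reported decrease to a baseline percentage.

    relative=False treats the decrease as percentage points;
    relative=True treats it as a percentage of the baseline.
    """
    if relative:
        return baseline_pct * (1 - decrease / 100)
    return baseline_pct - decrease


# Hypothetical counts, chosen only so that sensitivity equals 89.0%.
internal_pct = 100 * sensitivity(true_positives=890, false_negatives=110)

# Assumption: both readings are shown because the source does not say which applies.
absolute_reading = after_decrease(internal_pct, 32.6, relative=False)
relative_reading = after_decrease(internal_pct, 32.6, relative=True)

print(f"internal: {internal_pct:.1f}%")
print(f"external (percentage-point reading): {absolute_reading:.1f}%")
print(f"external (relative reading): {relative_reading:.1f}%")
```

Under either reading, the external sensitivity falls well below the internal figure, which is consistent with the conclusion drawn below.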
Conclusion: The model failed to generalise adequately to an external dataset, as evidenced by a notable decline in fracture detection sensitivity. It is critically important that AI models intended for use in a prospective clinical setting are externally validated. Additionally, data quality and pre-processing procedures can significantly impact model performance.
Limitations: The open-source training dataset contains annotations for additional pathologies that are not included in the external test dataset.
Funding for this study: CP is funded by the Great Ormond Street Hospital Children’s Charity (GOSHCC) (Award Number: VS0618).
OJA is funded by an NIHR Career Development Fellowship (NIHR-CDF-2017-10-037).
SCS is funded by an NIHR Advanced Fellowship Award (NIHR-301322).
Has your study been approved by an ethics committee? Yes
Ethics committee - additional information: Ethical approval was provided by the National Health Service (NHS) Health Research Authority (HRA) (IRAS ID: 274278, REC reference: 22/PR/0334).