Research Presentation Session

RPS 604b - Deep learning in chest radiograph and chest CT interpretation

Lectures

1
RPS 604b - Leveraging deep learning artificial intelligence in detecting mismatched anatomy in chest images acquired with abdomen protocol: prevalence analysis and performance metrics

06:08 K. Younis, Waukesha / US

Purpose:

In 2018, the two highest-volume categories of x-ray procedures performed in the main radiology departments were chest (44%) and abdomen/pelvis (18%). X-ray images may be acquired using a protocol intended for a different anatomy, which results in incorrect x-ray exposure parameters and image processing algorithms being applied. Incorrect DICOM tags may also prevent the consistent and reliable use of hanging protocols in PACS. We propose an AI algorithm that automatically detects protocol mismatch in Chest Frontal (CF) x-ray images and issues a warning to the technologist, enabling the technologist to reprocess the image, or to reacquire the exposure with the right protocol, and to correct the DICOM tags.

Methods and materials:

We curated 732 unique x-ray images acquired with an abdomen protocol from mobile and fixed systems in the USA and Ireland. All images were evaluated by a radiologic technologist and divided into two groups, namely chest and abdomen, and the prevalence of mismatch was studied. A deep learning CF detection algorithm was developed using 29,778 unique x-ray images from the USA and Canada.

Results:

The prevalence of anatomy mismatch (chest exams taken with the abdomen protocol) varied among the sites studied, with an average of 11.2% and a maximum of 12% at the largest hospital. Performance of the CF detection algorithm was evaluated using a confusion matrix analysis on the 732 unique images. The overall accuracy was 96.9%. The false-positive rate of flagging true abdomen images as chest was only 1.5%.
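The metrics quoted above follow directly from the four cells of a binary confusion matrix. The sketch below shows one way to compute them, treating chest as the positive class. The function name and the example counts are hypothetical and are not taken from the study, which does not report its raw counts.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Summary metrics for a binary chest-vs-abdomen detector.

    Positive class = chest. The four counts are supplied by the caller:
    tp: chest images flagged as chest
    fp: abdomen images flagged as chest
    tn: abdomen images left unflagged
    fn: chest images left unflagged
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),  # abdomen flagged as chest
        "sensitivity": tp / (tp + fn),          # chest correctly flagged
    }


# Hypothetical counts for illustration only; NOT the study's data.
example = confusion_metrics(tp=45, fp=5, tn=95, fn=5)
print(example)
```

Because the false-positive rate is computed over true abdomen images only, it can be low even when overall accuracy is dominated by the larger class.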

Conclusion:

A deep learning quality algorithm and workflow enhancement can be used to warn a radiographer if a chest image is acquired using a protocol for a different anatomy, so that the DICOM tags can be corrected and the image reprocessed or the exposure reacquired with the right protocol.

Limitations:

n/a

Ethics committee approval

n/a

Funding:

No funding was received for this work.

2
RPS 604b - Quantitative analysis of airway and parenchymal lesions in idiopathic pulmonary fibrosis using an artificial intelligence-based technology

06:32 T. Handa, Kyoto / JP

Purpose:

Previous studies showed that the severity of traction bronchiectasis (TBE) might be associated with prognosis in idiopathic pulmonary fibrosis (IPF). This study aimed to quantify airway volumes as an indicator of TBE together with parenchymal lesions on chest high-resolution computed tomography (HRCT) and investigate their clinical significance in patients with IPF.

Methods and materials:

A total of 103 IPF patients who visited Kyoto University Hospital and underwent chest HRCT and pulmonary function tests were enrolled. Artificial intelligence-based image analysis software was developed in collaboration with FUJIFILM Corporation. The software automatically measured airway volumes peripheral to the main bronchi, as well as volumes of parenchymal lesions. These volumes were expressed as percentages of total lung volume, and their associations with pulmonary function and survival were analysed.

Results:

Airway volumes had a moderate negative correlation with %FVC and a strong positive correlation with FEV1/FVC. In univariate analysis, airway volumes, as well as volumes of some parenchymal lesions including consolidation and interstitial lung disease (ILD) (the sum of ground-glass opacity, reticulation, and honeycombing), were significantly associated with survival. In multivariate analysis, consolidation (hazard ratio [HR], 1.54; 95% confidence interval [95%CI], 1.25–1.90) and ILD volumes (HR, 1.06; 95%CI, 1.02–1.10), but not airway volumes, were independently associated with survival.

Conclusion:

Airway volumes are novel parameters associated with pulmonary function and survival in IPF. Consolidation volume might also be a novel prognostic imaging biomarker in IPF.

Limitations:

This is a single-centre, retrospective study with a moderate number of patients. Serial changes of CT parameters were not assessed.

Ethics committee approval

The Institutional Review Board of Kyoto University approved this retrospective study (IRB approval number R1353).

Funding:

This study was supported by a grant from FUJIFILM Corporation.

3
RPS 604b - 3D computer-aided volumetry (CADv) system with AI system: a comparison of quantitative nodule component measurement accuracy and pulmonary nodule differentiation capability on repeated CT examination

07:38 Y. Ohno, Kobe / JP

Purpose:

To compare the capability for nodule component measurement and nodule differentiation between computer-aided volumetry (CADv) with and without convolutional neural network (CNN) on repeated CT in routine clinical practice.

Methods and materials:

A total of 170 consecutive patients with 215 pulmonary nodules (103 malignant and 112 benign) detected at initial CT underwent follow-up CT, pathological and bacterial examinations, treatment, or more than 2 years of follow-up. In this study, each CADv automatically assessed solid and ground-glass opacity (GGO) component volumes as well as total nodule (TN) volume. In each patient, TN volume change per day (TN/day) and doubling time (DT) were also automatically evaluated from two serial CTs by each CADv. To evaluate the accuracy of volume measurement, the gold standard of each volume was computationally determined by the STAPLE method. To determine the utility of CNN, the measurement error of each volume was compared between the two CADvs by t-test. To compare the capability for nodule differentiation, ROC analysis was performed. Finally, diagnostic accuracies were compared among all indexes determined by both CADvs by McNemar’s test.
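The abstract does not state how TN/day and DT were computed. As an illustration only, the sketch below assumes the commonly used definitions: a linear volume change per day, and a doubling time derived from exponential growth between two scans. Function names, units, and example values are hypothetical.

```python
import math


def volume_change_per_day(v1, v2, days):
    """Linear volume change per day between two scans (e.g. mm^3/day)."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (v2 - v1) / days


def doubling_time(v1, v2, days):
    """Volume doubling time in days, assuming exponential growth.

    DT = days * ln(2) / ln(v2 / v1).
    A shrinking nodule gives a negative value; no change gives infinity.
    """
    if v1 <= 0 or v2 <= 0 or days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1 == v2:
        return math.inf
    return days * math.log(2) / math.log(v2 / v1)


# Hypothetical nodule: 100 mm^3 at baseline, 200 mm^3 after 90 days.
rate = volume_change_per_day(100.0, 200.0, 90)
dt = doubling_time(100.0, 200.0, 90)
print(rate, dt)
```

Under these assumptions, a nodule that exactly doubles between scans has a doubling time equal to the scan interval.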

Results:

Measurement errors of GGO and TN with CNN were significantly smaller than those without CNN (p<0.05). The area under the curve (Az) of TN/day with CNN (Az=0.94) was significantly larger than that of others (p<0.0001). In addition, Az of DT with CNN (Az=0.67) was significantly larger than that without CNN (Az=0.58, p=0.03).

Conclusion:

CADv with CNN is more useful than without CNN for quantitative nodule component assessment and nodule differentiation on routine CTs.

Limitations:

A limited study population.

Ethics committee approval

This prospective study was approved by the institutional review board of Kobe University Graduate School of Medicine, and written informed consent was obtained from each subject.

Funding:

This study was financially and technically supported by Canon Medical Systems Corporation.

4
RPS 604b - Objectively evaluating the labelling accuracy of the Stanford CheXpert dataset: a multi-reader study

06:22 V. Venugopal, New Delhi / IN

Purpose:

To quantify the labelling accuracy of the CheXpert dataset released by Stanford University by comparing it with labels established by radiologists.

Methods and materials:

284 frontal chest x-rays were randomly extracted from the CheXpert dataset and read by 3 radiologists (R1, R2, R3) having 12, 14, and 32 years of experience, respectively. Each radiologist reported ‘yes’ or ‘no’ for all the labels provided in the CheXpert dataset. ‘Support devices’ were excluded due to the inherent unclear nature of the label. Percentage observed agreement was calculated for all three radiologists and the consensus of radiologists to the CheXpert labels. In CheXpert, ‘-1’ label was attributed to ‘uncertain’ presence of findings in the image. We evaluated 4 scenarios with ‘-1’ treated as ‘Yes’, ‘No’, ‘N/A’, and a separate third label. Additionally, we intended to open-source our labels for these 284 images.
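A minimal sketch of percentage observed agreement under the four treatments of the ‘-1’ label is given below. How each scenario was implemented in the study is not described, so the mapping here is an assumption: ‘N/A’ drops the image from the comparison, and the separate third label counts every uncertain image as a disagreement because readers answered only ‘yes’ or ‘no’. The function name and example labels are hypothetical.

```python
def observed_agreement(reader, chexpert, uncertain_as):
    """Fraction of images where a reader's label matches the dataset label.

    reader:       list of 0/1 labels from a radiologist
    chexpert:     list of 0/1/-1 dataset labels (-1 = uncertain)
    uncertain_as: 'yes', 'no', 'na' (exclude) or 'separate' (keep -1)
    """
    if uncertain_as not in ("yes", "no", "na", "separate"):
        raise ValueError("unknown scenario: " + str(uncertain_as))
    remap = {"yes": 1, "no": 0}
    matches = 0
    counted = 0
    for r, c in zip(reader, chexpert):
        if c == -1:
            if uncertain_as == "na":
                continue  # uncertain images are left out entirely
            c = remap.get(uncertain_as, -1)  # 'separate' keeps -1
        counted += 1
        matches += int(r == c)
    return matches / counted if counted else float("nan")


# Hypothetical labels for one finding across five images.
reader_labels = [1, 0, 1, 0, 1]
dataset_labels = [1, 0, -1, -1, 0]
results = {s: observed_agreement(reader_labels, dataset_labels, s)
           for s in ("yes", "no", "na", "separate")}
print(results)
```

Observed agreement does not correct for agreement expected by chance, so it tends to look high for rare findings.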

Results:

The mean percentage observed agreement for R1 was 0.80, 0.83, 0.83, and 0.79, for R2 was 0.79, 0.82, 0.82, and 0.78, and for R3 was 0.79, 0.82, 0.82, and 0.78 for each of the four scenarios of ‘-1’ labels. Percentage observed agreement between the consensus of 3 radiologists and CheXpert was highest for ‘fractures’ (0.94), ‘pneumothorax’ (0.93), and ‘pneumonia’ (0.87), and lowest for ‘lung opacity’ (0.63), ‘atelectasis’ (0.72), and ‘pleural effusion’ (0.76).

Conclusion:

Our study demonstrates that labels extracted from Stanford’s database using natural language processing are accurate and can be used for training and validating deep learning algorithms. More such open-source datasets can help in the development of many more algorithms.

Limitations:

A small sample size of 284 x-rays is a major limitation of this study.

Ethics committee approval

n/a

Funding:

No funding was received for this work.

5
RPS 604b - Quantitative image quality comparison of bone suppression images generated by dual-energy subtraction techniques and deep learning-based software

06:11 A. Son, Seoul / KR

Purpose:

To investigate the quantitative image quality of bone suppression image (BSI) generated by deep learning-based (DL) software compared with dual-energy subtraction (DES) techniques.

Methods and materials:

This prospective study included 40 adult patients who underwent two digital chest radiographs (CXR) using x-ray equipment with DES and x-ray equipment with DL software. In intercostal and bone regions, respectively, 720 regions of interest (ROIs) were extracted from the original CXR and BSI. For the comparison of objective image quality, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) were calculated from ROIs extracted from the original CXR and BSI, and compared between the DES technique and DL software groups.
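For reference, the two metrics can be sketched as follows for a pair of ROIs given as flat lists of pixel values. This is a simplified illustration, not the study's implementation: SSIM is normally averaged over local sliding windows, whereas this version uses a single window covering the whole ROI, and the 8-bit dynamic range (`max_value=255`) and the constants 0.01 and 0.03 are the conventional defaults, assumed here.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized ROIs."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical ROIs
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM over a whole ROI (simplified, no sliding window)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * max_value) ** 2  # stabilises the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 ROIs flattened to lists; values are illustrative only.
roi_original = [10, 20, 30, 40]
roi_suppressed = [11, 21, 31, 41]
print(psnr(roi_original, roi_suppressed))
print(100 * global_ssim(roi_original, roi_suppressed))  # as a percentage
```

Multiplying SSIM by 100 gives a percentage, matching the way the values are reported in the results.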

Results:

In the intercostal regions, PSNR and SSIM of BSI generated by the DL software were significantly higher than those of the DES technique (31.45±6.87 dB vs. 29.85±2.31 dB; and 97.51±8.76 % vs. 91.26±5.13 %; all P < 0.001). In bone regions, PSNR of BSI generated by the DL software was significantly lower than that of the DES technique (20.93±3.18 vs. 34.37±3.22 dB), but SSIM of BSI generated by the DL software was significantly higher than that of the DES technique (94.57±8.76 vs. 87.77±5.16 %) (all P < 0.001).

Conclusion:

DL software creates effective bone removal images while maintaining the image quality of the soft tissue.

Limitations:

The image quality of the BSI generated by DES may not be an absolute criterion because the image quality of x-ray images can be affected by x-ray exposure parameters.

Ethics committee approval

The present study protocol was reviewed and approved by the Institutional Review Board of Asan Medical Center (approval No. 2018-1348) and all patients gave written, informed consent.

Funding:

This research was supported by a research grant from Samsung Electronics (2018).
