Test-retest repeatability of a deep learning architecture in detecting and segmenting clinically significant prostate cancer on apparent diffusion coefficient (ADC) maps

Eur Radiol. 2021 Jan;31(1):379-391. doi: 10.1007/s00330-020-07065-4. Epub 2020 Jul 23.

Abstract

Objectives: To evaluate short-term test-retest repeatability of a deep learning architecture (U-Net) in slice- and lesion-level detection and segmentation of clinically significant prostate cancer (csPCa: Gleason grade group > 1) using diffusion-weighted imaging fitted with monoexponential function, ADCm.

Methods: One hundred twelve patients with prostate cancer (PCa) underwent 2 prostate MRI examinations on the same day. PCa areas were annotated using whole mount prostatectomy sections. Two U-Net-based convolutional neural networks were trained on three different ADCm b value settings for (a) slice- and (b) lesion-level detection and (c) segmentation of csPCa. Short-term test-retest repeatability was estimated using intra-class correlation coefficient (ICC(3,1)), proportionate agreement, and dice similarity coefficient (DSC). A 3-fold cross-validation was performed on training set (N = 78 patients) and evaluated for performance and repeatability on testing data (N = 34 patients).

Results: For the three ADCm b value settings, repeatability of mean ADCm of csPCa lesions was ICC(3,1) = 0.86-0.98. Two CNNs with U-Net-based architecture demonstrated ICC(3,1) in the range of 0.80-0.83, agreement of 66-72%, and DSC of 0.68-0.72 for slice- and lesion-level detection and segmentation of csPCa. Bland-Altman plots suggest that there is no systematic bias in agreement between inter-scan ground truth segmentation repeatability and segmentation repeatability of the networks.

Conclusions: For the three ADCm b value settings, two CNNs with U-Net-based architecture were repeatable for the problem of detection of csPCa at the slice-level. The network repeatability in segmenting csPCa lesions is affected by inter-scan variability and ground truth segmentation repeatability and may thus improve with better inter-scan reproducibility.

Key points: • For the three ADCm b value settings, two CNNs with U-Net-based architecture were repeatable for the problem of detection of csPCa at the slice-level. • The network repeatability in segmenting csPCa lesions is affected by inter-scan variability and ground truth segmentation repeatability and may thus improve with better inter-scan reproducibility.

Keywords: Diffusion MRI; Neural network models; Prostate cancer; Test-retest reliability.

MeSH terms

  • Deep Learning*
  • Diffusion Magnetic Resonance Imaging
  • Humans
  • Male
  • Neural Networks, Computer
  • Prostatic Neoplasms* / diagnostic imaging
  • Reproducibility of Results