3D conditional generative adversarial networks for high-quality PET image estimation at low dose
Introduction
Positron emission tomography (PET) has been widely applied in hospitals and clinics for disease diagnosis and intervention (Daerr et al., ; Mosconi et al., 2008; Huang et al., 2014; Cerami et al., 2015). Different from other imaging techniques, such as computed tomography (CT) and magnetic resonance imaging (MRI), PET is a functional imaging technique that produces three-dimensional in-vivo observations of metabolic processes in the human body (Karnabi, 2017). Specifically, the PET system detects pairs of gamma rays emitted indirectly from a radioactive tracer, which is injected into the human body on a biologically active molecule. Three-dimensional PET images of the tracer concentration within the human body are then reconstructed using computer analysis (Bailey et al., 2005).
To obtain high-quality PET images for diagnostic needs, a full dose of radioactive tracer is usually preferred. This inevitably raises concerns about potential health hazards. According to the report “Biological Effects of Ionizing Radiation (BEIR VII)”,1 the increased risk of cancer incidence is 10.8% per Sv, so one brain PET scan increases lifetime cancer risk by about 0.04%. Although this number is small, the risks are multiplied for patients who undergo multiple PET scans during their treatment, and they are more serious for pediatric patients. Although it is desirable to reduce the dose during PET scanning, a major drawback of dose reduction is that more noise may appear in the reconstructed PET images, resulting in poor image quality.
A range of methods have been proposed to improve image quality and reduce noise and artifacts in PET images while preserving crucial image details. To denoise PET images while preserving structural information, Bagci and Mollura (2013) used the singular value thresholding concept and Stein's unbiased risk estimate to optimize a soft thresholding rule. To address the resolution loss associated with denoising, Pogam et al. (2013) considered a strategy that combines the complementary wavelet and curvelet transforms. Mejia et al. (2014) proposed a multi-resolution approach for noise reduction of PET images in the transform domain, modeling each sub-band as a group of different regions separated by boundaries.
The aforementioned methods are mainly designed to improve the image quality of full-dose PET images. In contrast, the goal of this study is to estimate a high-quality full-dose PET image from the low-dose PET image, which is an innovative and promising research field. To the best of our knowledge, there are only a few works along this research direction. Specifically, Kang et al. (2015) proposed to train a regression forest to estimate the full-dose PET image in a voxel-wise strategy. In (Wang et al., 2016), a mapping-based sparse representation method was adopted for full-dose PET prediction, utilizing both low-dose PET and multimodal MRI. To take advantage of a large number of missing-modality training samples, the authors further developed a semi-supervised tripled dictionary learning method for full-dose PET image prediction (Wang et al., 2017). An et al. (2016) proposed a multi-level canonical correlation analysis framework to map the low-dose and full-dose PET into a common space and perform patch-based sparse representation for estimation. Although the above sparse learning based methods showed good estimation performance, a major limitation is that they operate on small patches and adopt a voxel-wise estimation strategy, which is very time-consuming when testing on new samples. Also, the final estimate of each voxel is obtained by averaging the overlapping patches, resulting in over-smoothed images that lack the texture of typical full-dose PET images. This smoothing effect may limit the quantification of small structures in the estimated PET images.
Convolutional neural networks (CNN) have drawn a tremendous amount of attention in machine learning and medical image analysis (Kamnitsas et al., 2017; Kleesiek et al., 2016; Valverde et al., 2017; Dolz et al., 2017; Kawahara et al., 2017). In the PET estimation research field, Xiang et al. (2017) proposed a deep auto-context CNN that estimates the full-dose PET image based on local patches in the low-dose PET image. This regression method integrated multiple CNN modules following the auto-context strategy to iteratively improve the tentatively estimated PET image. However, the authors extracted only the axial slices from the 3D images and treated them as independent 2D images for training the deep architecture. This inevitably causes a loss of information in the sagittal and coronal directions and discontinuous estimation results across slices.
Recently, generative adversarial networks (GANs) have attracted widespread attention since their introduction (Goodfellow et al., 2014; Denton et al., 2015; Chen et al., 2016; Ledig et al., 2016; Wu et al., 2016). GANs are generative models comprising two units, namely a generator and a discriminator. The generator learns to map input low-dimensional vectors, drawn from some pre-specified distribution, to plausible counterfeits. The discriminator learns to distinguish between the generated distribution and the real data distribution. Using GANs, many studies have achieved encouraging performance through architectural improvements and modifications to the training scheme (Radford et al., 2015; Arjovsky and Bottou, 2017; Arjovsky et al., 2017; Berthelot et al., 2017; Zhao et al., 2016; Bi et al., 2017). Previous works have also explored GANs in conditional settings, i.e., conditional GANs (Mirza and Osindero, 2014; Reed et al., 2016; Isola et al., 2016). Just as GANs learn a generative model of data, conditional GANs learn a conditional generative model of data. In the real world, there is abundant 3D image data, such as 3D medical images; however, most applications of GANs focus on 2D images. To tackle 3D medical images, Wolterink et al. (2017) proposed a 3D conditional GAN model for noise reduction in low-dose CT images. In this paper, inspired by the remarkable success of GANs and to overcome the limitations of existing estimation methods, we propose a novel end-to-end framework based on 3D conditional GANs (3D c-GANs) to estimate the high-quality full-dose PET image from the corresponding low-dose PET image. Like the original GANs, the training procedure of our proposed 3D c-GANs is similar to a two-player min-max game, in which a generator network (G) and a discriminator network (D) are trained alternately to respectively minimize and maximize an objective function. The novelties and contributions of the paper are as follows.
- i
To ensure the same size of the input and output of the generator network, we utilize both convolutional and up-convolutional layers in our generator architecture instead of using the traditional CNN network which just includes convolutional layers.
- ii
To render the same underlying information between low-dose and full-dose PET images, we adopt a 3D U-net-like deep architecture as the generator network and use the skip connections strategy to combine hierarchical features for generating the estimated image. The detailed U-net-like architecture will be fully described in Section 2.2.1. The trained U-net-like generator can be directly applied to test images to synthesize the corresponding full-dose PET images, which is very efficient compared with those voxel-wise estimation methods.
- iii
To take into account the differences between the estimated full-dose PET image and the ground truth (i.e., the real full-dose PET image), an estimation error loss is included in the objective function to enhance the robustness of the proposed approach. Different from (Wolterink et al., 2017), we employ the L1 distance instead of the L2 distance to encourage less blurring.
- iv
To further improve the estimated image quality, we propose a concatenated 3D c-GANs based progressive refinement scheme.
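As a concrete illustration of points (i) and (ii), the following minimal PyTorch sketch shows a 3D U-net-like generator whose strided convolutions and up-convolutions keep the output the same size as the input, with a skip connection concatenating encoder features into the decoder. The network depth, channel counts, and kernel sizes here are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a 3D U-net-like generator: a convolutional encoder,
# an up-convolutional decoder, and a skip connection that concatenates
# encoder features into the decoder (hyperparameters are assumptions).
import torch
import torch.nn as nn

class UNet3DGenerator(nn.Module):
    def __init__(self, in_ch=1, base=16):
        super().__init__()
        # Encoder: strided 3D convolutions halve each spatial dimension.
        self.enc1 = nn.Sequential(
            nn.Conv3d(in_ch, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True))
        self.enc2 = nn.Sequential(
            nn.Conv3d(base, base * 2, 4, stride=2, padding=1),
            nn.BatchNorm3d(base * 2), nn.LeakyReLU(0.2, inplace=True))
        # Decoder: up-convolutions restore the original resolution.
        self.dec2 = nn.Sequential(
            nn.ConvTranspose3d(base * 2, base, 4, stride=2, padding=1),
            nn.BatchNorm3d(base), nn.ReLU(inplace=True))
        # The skip connection doubles the channel count at this level.
        self.dec1 = nn.ConvTranspose3d(base * 2, in_ch, 4, stride=2, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d2 = self.dec2(e2)
        # Skip connection: concatenate encoder and decoder features.
        return self.dec1(torch.cat([d2, e1], dim=1))

g = UNet3DGenerator()
low_dose = torch.randn(1, 1, 32, 32, 32)  # dummy low-dose PET patch
est = g(low_dose)
assert est.shape == low_dose.shape        # output matches input size
```

Because the trained generator maps a whole 3D volume in a single forward pass, no voxel-wise sliding or patch averaging is needed at test time.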
The rest of this paper is organized as follows. Section 2 introduces the proposed 3D c-GANs architecture and methodology. The experimental setup is described in Section 3, and Section 4 presents the experimental results. Finally, we discuss and conclude the paper in Section 5.
Methodology
Fig. 1 illustrates the proposed 3D c-GANs training procedure, which comprises two networks: the generator network G and the discriminator network D. The generator network takes a low-dose PET image and generates an estimated PET image that approximates its corresponding real full-dose PET image. The discriminator network takes a pair of images as input, including the low-dose PET image and the corresponding real/estimated full-dose PET image, and it aims to differentiate between the real and the estimated full-dose PET images.
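The alternating training just described can be sketched as follows, with toy stand-in networks: D scores a (low-dose, full-dose) image pair, and G is additionally penalized by the L1 estimation error. The network definitions, optimizer settings, and loss weight are illustrative assumptions, not the paper's actual configuration.

```python
# Schematic of the alternating min-max training of G and D.
# G and D below are toy stand-ins; hyperparameters are assumptions.
import torch
import torch.nn as nn

G = nn.Conv3d(1, 1, 3, padding=1)                 # stand-in generator
D = nn.Sequential(nn.Conv3d(2, 1, 3, padding=1),  # takes an image pair
                  nn.AdaptiveAvgPool3d(1), nn.Flatten())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
lam = 100.0                                       # L1 weight (assumed)

low, full = torch.randn(2, 1, 1, 16, 16, 16)      # one dummy training pair
for _ in range(2):                                # alternate D and G updates
    # --- Discriminator step: real pair -> 1, estimated pair -> 0 ---
    opt_d.zero_grad()
    est = G(low).detach()
    d_loss = bce(D(torch.cat([low, full], 1)), torch.ones(1, 1)) + \
             bce(D(torch.cat([low, est], 1)), torch.zeros(1, 1))
    d_loss.backward()
    opt_d.step()
    # --- Generator step: fool D, plus the L1 estimation-error loss ---
    opt_g.zero_grad()
    est = G(low)
    g_loss = bce(D(torch.cat([low, est], 1)), torch.ones(1, 1)) + \
             lam * nn.functional.l1_loss(est, full)
    g_loss.backward()
    opt_g.step()
```

The `.detach()` in the discriminator step stops gradients from the D update flowing into G, which is what makes the two updates truly alternate.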
Data acquisition
We evaluate our 3D c-GANs based method on a real human brain dataset, including two categories: 8 normal subjects and 8 subjects diagnosed as mild cognitive impairment (MCI). The detailed demographic information of these subjects is summarized in Table 1.
The PET scans were acquired on the Siemens Biograph mMR PET-MR scanner. For the full-dose PET scans, subjects in our cohort were administered an average of 203 MBq (range: 191 MBq to 229 MBq) of 18F-2-deoxyglucose (18FDG). This corresponds to an effective dose
Experimental results
To study the effectiveness of the proposed 3D c-GANs method, our experiment explores the following questions:
- ⁃
Compared with 2D model, does the 3D c-GANs model gain better performance?
- ⁃
Compared with the model using only the generator network (i.e., the 3D U-net-like network), does adversarial training in the 3D c-GANs model improve the estimation performance?
- ⁃
Does the model adopting the concatenated 3D c-GANs based progressive refinement scheme improve the estimation quality?
- ⁃
For the lesion regions
Conclusion
In this paper, we have proposed a novel end-to-end framework based on 3D c-GANs to estimate high-quality full-dose PET images from low-dose PET images. Different from 2D models that consider the image appearance slice by slice, our proposed method is carried out in a 3D manner, avoiding the discontinuous cross-slice estimation that occurs in 2D GAN models. To render the same underlying information between the low-dose and full-dose PET images, we employ a 3D U-net-like
Conflicts of interest
The authors declare no conflict of interest.
Acknowledgements
This work was supported by National Natural Science Foundation of China (NSFC61701324) and Australian Research Council (ARC DE160100241).
References (43)
- et al.
Brain metabolic maps in mild cognitive impairment predict heterogeneity of progression to dementia
NeuroImage Clin.
(2015) - et al.
Motion compensation for brain PET imaging using wireless MR active markers in simultaneous PET–MR: phantom and non-human primate studies
NeuroImage
(2014) - et al.
Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation
Med. Image Anal.
(2017) - et al.
BrainNetCNN: convolutional neural networks for brain networks; towards predicting neurodevelopment
NeuroImage
(2017) - et al.
Positron emission tomography-computed tomography standardized uptake values in clinical practice and assessing response to therapy
- et al.
Deep MRI brain extraction: a 3D convolutional neural network for skull stripping
NeuroImage
(2016) - et al.
Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains
NeuroImage
(2004) - et al.
LABEL: pediatric brain extraction using learning-based meta-algorithm
NeuroImage
(2012) - et al.
Advances in functional and structural MR image analysis and implementation as FSL
NeuroImage
(2004) - et al.
Improving automated multiple sclerosis lesion segmentation with a cascaded 3D convolutional neural network approach
NeuroImage
(2017)
Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI
Neurocomputing
Multi-level canonical correlation analysis for PET image estimation
IEEE Trans. Image Process.
Towards Principled Methods for Training Generative Adversarial Networks
Wasserstein GAN
Denoising PET images using singular value thresholding and Stein's unbiased risk estimate
Positron Emission Tomography: Basic Sciences
BEGAN: Boundary Equilibrium Generative Adversarial Networks
Synthesis of positron emission tomography (PET) images via multi-channel generative adversarial networks (GANs)
InfoGAN: interpretable representation learning by information maximizing generative adversarial nets
Deep generative image models using a laplacian pyramid of adversarial networks