NeuroImage

Volume 174, 1 July 2018, Pages 550-562

3D conditional generative adversarial networks for high-quality PET image estimation at low dose

https://doi.org/10.1016/j.neuroimage.2018.03.045

Highlights

  • To preserve the underlying information shared between the low-dose and full-dose PET images, a 3D U-net-like deep architecture that combines hierarchical features through skip connections is designed as the generator network to synthesize the full-dose image.

  • To ensure that the synthesized PET image is close to the real one, we take the estimation error loss into account, in addition to the discriminator feedback, when training the generator network.

  • A concatenated 3D c-GANs based progressive refinement scheme is also proposed to further improve the quality of estimated images.

Abstract

Positron emission tomography (PET) is a widely used imaging modality, providing insight into both the biochemical and physiological processes of the human body. Usually, a full-dose radioactive tracer is required to obtain high-quality PET images for clinical needs. This inevitably raises concerns about potential health hazards. On the other hand, dose reduction increases the noise in the reconstructed PET images, which degrades the image quality to a certain extent. In this paper, in order to reduce radiation exposure while maintaining high image quality, we propose a novel method based on 3D conditional generative adversarial networks (3D c-GANs) to estimate high-quality full-dose PET images from low-dose ones. Generative adversarial networks (GANs) include a generator network and a discriminator network which are trained simultaneously, each trying to beat the other. Similar to GANs, in the proposed 3D c-GANs, we condition the model on an input low-dose PET image and generate a corresponding output full-dose PET image. Specifically, to preserve the underlying information shared between the low-dose and full-dose PET images, a 3D U-net-like deep architecture that combines hierarchical features through skip connections is designed as the generator network to synthesize the full-dose image. To ensure that the synthesized PET image is close to the real one, we take the estimation error loss into account, in addition to the discriminator feedback, when training the generator network. Furthermore, a concatenated 3D c-GANs based progressive refinement scheme is proposed to further improve the quality of the estimated images. Validation was done on a real human brain dataset including both normal subjects and subjects diagnosed with mild cognitive impairment (MCI). Experimental results show that our proposed 3D c-GANs method outperforms both the benchmark and the state-of-the-art methods in qualitative and quantitative measures.

Introduction

Positron emission tomography (PET) has been widely applied in hospitals and clinics for disease diagnosis and intervention (Daerr et al.; Mosconi et al., 2008; Huang et al., 2014; Cerami et al., 2015). Different from other imaging techniques, such as computed tomography (CT) and magnetic resonance imaging (MRI), PET is a functional imaging technique that produces three-dimensional in-vivo observations of the metabolic processes of the human body (Karnabi, 2017). Specifically, the PET system detects pairs of gamma rays emitted indirectly by a radioactive tracer, which is injected into the human body on a biologically active molecule. Then, three-dimensional PET images of the tracer concentration within the body are reconstructed by computer analysis (Bailey et al., 2005).

To obtain high-quality PET images for diagnostic needs, a full-dose radioactive tracer is usually preferred. This inevitably raises concerns about potential health hazards. According to the report “Biological Effects of Ionizing Radiation (BEIR VII)”,1 the increased risk of cancer incidence is 10.8% per Sv, so one brain PET scan increases lifetime cancer risk by about 0.04%. Although this number is small, the risks are multiplied for patients who undergo multiple PET scans during their treatment, and they are more serious for pediatric patients. Although it is desirable to reduce the dose during PET scanning, a major drawback of dose reduction is that more noise appears in the reconstructed PET images, resulting in poor image quality.
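
As a rough check of the arithmetic above (a sketch only; the assumed effective dose of roughly 3.7 mSv for a single 18F-FDG brain scan is our illustrative value and is not stated in this section):

    \text{risk per scan} \approx 0.108\,\mathrm{Sv}^{-1} \times 0.0037\,\mathrm{Sv} \approx 4 \times 10^{-4} = 0.04\,\%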

A range of methods have been proposed to improve image quality and reduce noise and artifacts in PET images while preserving crucial image details. To denoise PET images while preserving structural information, Bagci and Mollura (2013) used the singular value thresholding concept and Stein's unbiased risk estimate to optimize a soft-thresholding rule. To address the resolution loss associated with denoising, Pogam et al. (2013) considered a strategy that combines the complementary wavelet and curvelet transforms. Mejia et al. (2014) proposed a multi-resolution approach for noise reduction of PET images in the transform domain, modeling each sub-band as a group of different regions separated by boundaries.

The aforementioned methods are mainly designed to improve the image quality of full-dose PET images. In contrast, the goal of this study is to estimate a high-quality full-dose PET image from a low-dose PET image, which is an innovative and promising research direction. To the best of our knowledge, there are only a few works along this direction. Specifically, Kang et al. (2015) proposed to train a regression forest to estimate the full-dose PET image in a voxel-wise manner. In (Wang et al., 2016), a mapping-based sparse representation method was adopted for full-dose PET prediction, utilizing both low-dose PET and multimodal MRI. To take advantage of a large number of missing-modality training samples, the authors further developed a semi-supervised tripled dictionary learning method for full-dose PET image prediction (Wang et al., 2017). An et al. (2016) proposed a multi-level canonical correlation analysis framework to map the low-dose and full-dose PET images into a common space and perform patch-based sparse representation for estimation. Although the above sparse learning based methods showed good estimation performance, they are based on small patches and adopt a voxel-wise estimation strategy, which is very time-consuming when testing on new samples. Also, the final estimate of each voxel is obtained by averaging the overlapping patches, resulting in over-smoothed images that lack the texture present in typical full-dose PET images; a sketch of this overlap-averaging step is given below. This smoothing effect may limit the quantification of small structures in the estimated PET images.
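
To make the over-smoothing mechanism concrete, the following is a minimal NumPy sketch (our illustration, not the cited authors' code) of how voxel-wise patch estimates are typically aggregated; because every voxel's final value is the mean over all patches covering it, the averaging acts like a local mean filter that washes out texture:

    import numpy as np

    def aggregate_patch_estimates(patch_preds, image_shape, patch_size):
        """Average overlapping patch-wise predictions into one volume.

        patch_preds maps the (z, y, x) corner of each patch to its
        predicted cube of side patch_size. The voxel-wise mean over
        overlaps is what produces the over-smoothing noted above.
        """
        acc = np.zeros(image_shape, dtype=np.float64)   # running sum
        cnt = np.zeros(image_shape, dtype=np.float64)   # overlap counts
        for (z, y, x), pred in patch_preds.items():
            sl = (slice(z, z + patch_size),
                  slice(y, y + patch_size),
                  slice(x, x + patch_size))
            acc[sl] += pred
            cnt[sl] += 1.0
        return acc / np.maximum(cnt, 1.0)               # voxel-wise mean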

Convolutional neural networks (CNNs) have drawn a tremendous amount of attention in machine learning and medical image analysis (Kamnitsas et al., 2017; Kleesiek et al., 2016; Valverde et al., 2017; Dolz et al., 2017; Kawahara et al., 2017). In the PET estimation field, Xiang et al. (2017) proposed a deep auto-context CNN that estimates the full-dose PET image based on local patches in the low-dose PET. This regression method integrated multiple CNN modules following the auto-context strategy to iteratively improve the tentatively estimated PET image. However, the authors extracted only axial slices from the 3D images and treated them as independent 2D images for training the deep architecture. This inevitably causes a loss of information in the sagittal and coronal directions and discontinuous estimation results across slices.

Recently, generative adversarial networks (GANs) have attracted widespread attention since their introduction (Goodfellow et al., 2014; Denton et al., 2015; Chen et al., 2016; Ledig et al., 2016; Wu et al., 2016). GANs are generative models comprising two units, namely a generator and a discriminator. The generator learns to map input low-dimensional vectors, drawn from some pre-specified distribution, to plausible counterfeits. The discriminator learns to distinguish between the generated distribution and the real data distribution. Using GANs, many studies have achieved encouraging performance through architectural improvements and modifications to the training scheme (Radford et al., 2015; Arjovsky and Bottou, 2017; Arjovsky et al., 2017; Berthelot et al., 2017; Zhao et al., 2016; Bi et al., 2017). Previous works have also explored GANs in conditional settings, i.e., conditional GANs (Mirza and Osindero, 2014; Reed et al., 2016; Isola et al., 2016). Just as GANs learn a generative model of data, conditional GANs learn a conditional generative model of data. In the real world, much image data, such as medical images, is inherently 3D; however, many GAN applications focus on 2D images. To tackle 3D medical images, Wolterink et al. (2017) proposed a 3D conditional GAN model for noise reduction in low-dose CT images. In this paper, inspired by the remarkable success of GANs and to overcome the limitations of existing estimation methods, we propose a novel end-to-end framework based on 3D conditional GANs (3D c-GANs) to estimate the high-quality full-dose PET image from the corresponding low-dose PET image. Like the original GANs, the training procedure of our proposed 3D c-GANs is a two-player min-max game in which a generator network (G) and a discriminator network (D) are trained alternately to respectively minimize and maximize an objective function. The novelties and contributions of the paper are as follows.
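
Concretely, writing x for a low-dose PET image and y for its real full-dose counterpart, this objective, including the L1 estimation error term introduced in contribution (iii) below, can be stated in the standard conditional-GAN form (the trade-off weight λ is our placeholder symbol, not necessarily the notation used in the full text):

    G^{*} = \arg\min_{G}\max_{D}\;
      \mathbb{E}_{x,y}\big[\log D(x,y)\big]
      + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
      + \lambda\,\mathbb{E}_{x,y}\big[\lVert y - G(x)\rVert_{1}\big]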

  • (i) To ensure that the input and output of the generator network have the same size, we utilize both convolutional and up-convolutional layers in our generator architecture, instead of a traditional CNN that includes only convolutional layers.

  • (ii) To preserve the underlying information shared between low-dose and full-dose PET images, we adopt a 3D U-net-like deep architecture as the generator network and use skip connections to combine hierarchical features for generating the estimated image (a minimal sketch is given after this list). The detailed U-net-like architecture is fully described in Section 2.2.1. The trained U-net-like generator can be directly applied to test images to synthesize the corresponding full-dose PET images, which is very efficient compared with voxel-wise estimation methods.

  • (iii) To take into account the differences between the estimated full-dose PET image and the ground truth (i.e., the real full-dose PET image), the estimation error loss is included in the objective function to enhance the robustness of the proposed approach. Different from (Wolterink et al., 2017), we employ the L1 norm instead of the L2 distance to encourage less blurring.

  • (iv) To further improve the estimated image quality, we propose a concatenated 3D c-GANs based progressive refinement scheme.
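
The following is a minimal PyTorch sketch of contributions (i) and (ii): a 3D U-net-like generator with one downsampling/upsampling level and a skip connection, so that the output volume matches the input size. The channel widths, kernel sizes, and depth are illustrative placeholders, not the paper's exact configuration:

    import torch
    import torch.nn as nn

    class UNet3DGenerator(nn.Module):
        """Toy 3D U-net-like generator: convolution to encode, strided
        convolution to downsample, up-convolution to upsample, and a
        skip connection so the output matches the input size."""
        def __init__(self, ch=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv3d(1, ch, 3, padding=1), nn.ReLU())
            self.down = nn.Sequential(nn.Conv3d(ch, 2 * ch, 4, stride=2, padding=1), nn.ReLU())
            self.up = nn.Sequential(nn.ConvTranspose3d(2 * ch, ch, 4, stride=2, padding=1), nn.ReLU())
            self.out = nn.Conv3d(2 * ch, 1, 3, padding=1)  # 2*ch after the skip concat

        def forward(self, x):
            e = self.enc(x)                  # full-resolution features
            d = self.up(self.down(e))        # coarse features, upsampled back
            d = torch.cat([d, e], dim=1)     # skip connection merges hierarchies
            return self.out(d)               # same spatial size as the input

    # low = torch.randn(1, 1, 64, 64, 64)   # a 3D low-dose PET patch
    # est = UNet3DGenerator()(low)          # est.shape == low.shape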

The rest of this paper is organized as follows. Section 2 introduces our proposed 3D c-GANs architecture and methodology. The experimental setup is described in Section 3, and Section 4 presents the experimental results. Finally, we discuss and conclude the paper in Section 5.

Methodology

Fig. 1 illustrates the proposed 3D c-GANs training procedure, which consists of two networks: the generator network G and the discriminator network D. The generator network takes a low-dose PET image and generates an estimated PET image that approximates its corresponding real full-dose PET image. The discriminator network takes a pair of images as input, including both the low-dose PET image and the corresponding real/estimated full-dose PET image, and it aims to differentiate between the real and the estimated full-dose PET images.
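
A condensed PyTorch-style sketch of this alternating min-max training (our illustration under assumed names; the paper's exact loss weighting and discriminator architecture may differ). D here scores a concatenated (low-dose, full-dose) pair, and G is additionally penalized with the L1 estimation error term:

    import torch
    import torch.nn.functional as F

    def train_step(G, D, opt_G, opt_D, low, full, lam=100.0):
        """One alternating update. D (assumed to end in a sigmoid) scores
        a concatenated (low-dose, full-dose) pair; G is trained to fool D
        while staying close to the real full-dose image under L1."""
        # Update the discriminator: real pairs -> 1, estimated pairs -> 0.
        fake = G(low).detach()
        d_real = D(torch.cat([low, full], dim=1))
        d_fake = D(torch.cat([low, fake], dim=1))
        loss_D = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
                  + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
        opt_D.zero_grad(); loss_D.backward(); opt_D.step()

        # Update the generator: fool D and minimize the L1 estimation error.
        fake = G(low)
        d_fake = D(torch.cat([low, fake], dim=1))
        loss_G = (F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
                  + lam * F.l1_loss(fake, full))
        opt_G.zero_grad(); loss_G.backward(); opt_G.step()
        return loss_D.item(), loss_G.item()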

Data acquisition

We evaluate our 3D c-GANs based method on a real human brain dataset comprising two categories: 8 normal subjects and 8 subjects diagnosed with mild cognitive impairment (MCI). The detailed demographic information of these subjects is summarized in Table 1.

The PET scans were acquired on a Siemens Biograph mMR PET-MR scanner. For the full-dose PET scans, subjects in our cohort were administered an average of 203 MBq (range: 191 MBq to 229 MBq) of 18F-2-deoxyglucose (18FDG). This corresponds to an effective dose

Experimental results

To study the effectiveness of the proposed 3D c-GANs method, our experiment explores the following questions:

  • Compared with the 2D model, does the 3D c-GANs model achieve better performance?

  • Compared with the model that uses only the generator network (i.e., the 3D U-net-like network), does the adversarial training in the 3D c-GANs model improve the estimation performance?

  • Does adopting the concatenated 3D c-GANs based progressive refinement scheme improve the estimation quality?

  • For the lesion regions

Conclusion

In this paper, we have proposed a novel end-to-end framework based on 3D c-GANs to estimate high-quality full-dose PET images from low-dose PET images. Different from 2D GAN models that consider the image appearance slice by slice, our proposed method operates in a 3D manner, avoiding the discontinuous cross-slice estimation that occurs in 2D models. To preserve the underlying information shared between the low-dose and full-dose PET images, we employ a 3D U-net-like

Conflicts of interest

The authors declare no conflict of interest.

Acknowledgements

This work was supported by National Natural Science Foundation of China (NSFC61701324) and Australian Research Council (ARC DE160100241).

References (43)

  • L. Xiang et al.

    Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI

    Neurocomputing

    (2017)
  • L. An et al.

    Multi-level canonical correlation analysis for PET image estimation

    IEEE Trans. Image Process.

    (2016)
  • M. Arjovsky et al.

    Towards Principled Methods for Training Generative Adversarial Networks

    (2017)
  • M. Arjovsky et al.

    Wasserstein GAN

    (2017)
  • U. Bagci et al.

Denoising PET images using singular value thresholding and Stein's unbiased risk estimate

(2013)
  • D.L. Bailey et al.

    Positron Emission Tomography: Basic Sciences

    (2005)
  • D. Berthelot et al.

    BEGAN: Boundary Equilibrium Generative Adversarial Networks

    (2017)
  • L. Bi et al.

Synthesis of positron emission tomography (PET) images via multi-channel generative adversarial networks (GANs)

(2017)
  • X. Chen et al.

InfoGAN: interpretable representation learning by information maximizing generative adversarial nets

(2016)
  • Daerr, S., Brendel, M., Zach, C., Mille, E., Schilling, D., Zacherl, M. J., Burger, K., Danek, A., Pogarell, O.,...
  • E.L. Denton et al.

Deep generative image models using a Laplacian pyramid of adversarial networks

(2015)