3D conditional generative adversarial networks for high-quality PET image estimation at low dose
Introduction
Positron emission tomography (PET) has been widely applied in hospitals and clinics for disease diagnosis and intervention (Daerr et al., ; Mosconi et al., 2008; Huang et al., 2014; Cerami et al., 2015). Different from other imaging techniques, such as computed tomography (CT) and magnetic resonance imaging (MRI), PET is a functional imaging technique that produces three-dimensional in-vivo observations of metabolic processes in the human body (Karnabi, 2017). Specifically, the PET system detects pairs of gamma rays emitted indirectly from a radioactive tracer, which is injected into the human body on a biologically active molecule. Three-dimensional PET images of the tracer concentration within the human body are then reconstructed using computer analysis (Bailey et al., 2005).
To obtain high-quality PET images for diagnostic needs, a full dose of radioactive tracer is usually preferred. This inevitably raises concerns about potential health hazards. According to the report “Biological Effects of Ionizing Radiation (BEIR VII)”,1 the increased risk of cancer incidence is 10.8% per Sv, so one brain PET scan increases lifetime cancer risk by about 0.04%. Although this number is small, the risks are multiplied for patients who undergo multiple PET scans during their treatment, and they are more serious for pediatric patients. Although it is desirable to reduce the dose during PET scanning, a major drawback of dose reduction is that more noise may appear in the reconstructed PET images, resulting in poor image quality.
A range of methods have been proposed to improve image quality and reduce noise and artifacts in PET images while preserving crucial image details. To denoise PET images while preserving structural information, Bagci and Mollura (2013) used the singular value thresholding concept and Stein's unbiased risk estimate to optimize a soft thresholding rule. To address the resolution loss associated with denoising, Pogam et al. (2013) considered a strategy that combines the complementary wavelet and curvelet transforms. Mejia et al. (2014) proposed a multi-resolution approach for noise reduction of PET images in the transform domain, modeling each sub-band as a group of different regions separated by boundaries.
The aforementioned methods are mainly designed to improve the image quality of full-dose PET images. In contrast, the goal of this study is to estimate a high-quality full-dose PET image from the low-dose PET image, which is an innovative and promising research field. To the best of our knowledge, there are only a few works along this research direction. Specifically, Kang et al. (2015) proposed to train a regression forest to estimate the full-dose PET image in a voxel-wise strategy. In (Wang et al., 2016), a mapping-based sparse representation method was adopted for full-dose PET prediction, utilizing both low-dose PET and multimodal MRI. To take advantage of a large number of missing-modality training samples, the authors further developed a semi-supervised tripled dictionary learning method for full-dose PET image prediction (Wang et al., 2017). An et al. (2016) proposed a multi-level canonical correlation analysis framework to map the low-dose and full-dose PET into a common space and perform patch-based sparse representation for estimation. Although the above sparse learning based methods showed good estimation performance, a major limitation is that they operate on small patches and adopt a voxel-wise estimation strategy, which is very time-consuming when testing on new samples. Also, the final estimate of each voxel is obtained by averaging the overlapping patches, resulting in over-smoothed images that lack the texture of typical full-dose PET images. This smoothing effect may limit the quantification of small structures in the estimated PET images.
Convolutional neural networks (CNN) have drawn a tremendous amount of attention in machine learning and medical image analysis (Kamnitsas et al., 2017; Kleesiek et al., 2016; Valverde et al., 2017; Dolz et al., 2017; Kawahara et al., 2017). In the PET estimation research field, Xiang et al. (2017) proposed a deep auto-context CNN that estimates the full-dose PET image based on local patches in the low-dose PET image. This regression method integrated multiple CNN modules following the auto-context strategy to iteratively improve the tentatively estimated PET image. However, the authors extracted only the axial slices from the 3D images and treated them as independent 2D images for training the deep architecture. This inevitably causes a loss of information in the sagittal and coronal directions and discontinuous estimation results across slices.
Recently, generative adversarial networks (GANs) have attracted widespread attention since their introduction (Goodfellow et al., 2014; Denton et al., 2015; Chen et al., 2016; Ledig et al., 2016; Wu et al., 2016). GANs are generative models comprising two units, namely a generator and a discriminator. The generator learns to map input low-dimensional vectors, drawn from some pre-specified distribution, to plausible counterfeits. The discriminator learns to distinguish between the generated distribution and the real data distribution. Using GANs, many studies have achieved encouraging performance through architectural improvements and modifications to the training scheme (Radford et al., 2015; Arjovsky and Bottou, 2017; Arjovsky et al., 2017; Berthelot et al., 2017; Zhao et al., 2016; Bi et al., 2017). Previous works have also explored GANs in conditional settings, i.e., conditional GANs (Mirza and Osindero, 2014; Reed et al., 2016; Isola et al., 2016). Just as GANs learn a generative model of data, conditional GANs learn a conditional generative model of data. In the real world, there is abundant 3D image data, such as 3D medical images; however, most applications of GANs focus on 2D images. To tackle 3D medical images, Wolterink et al. (2017) proposed a 3D conditional GAN model for noise reduction in low-dose CT images. In this paper, inspired by the remarkable success of GANs and to overcome the limitations of existing estimation methods, we propose a novel end-to-end framework based on 3D conditional GANs (3D c-GANs) to estimate the high-quality full-dose PET image from the corresponding low-dose PET image. Like the original GANs, the training procedure of our proposed 3D c-GANs is similar to a two-player min-max game, in which a generator network (G) and a discriminator network (D) are trained alternately to respectively minimize and maximize an objective function. The novelties and contributions of the paper are as follows.
- i
To ensure the same size of the input and output of the generator network, we utilize both convolutional and up-convolutional layers in our generator architecture instead of using the traditional CNN network which just includes convolutional layers.
- ii
To render the same underlying information between low-dose and full-dose PET images, we adopt a 3D U-net-like deep architecture as the generator network and use the skip connections strategy to combine hierarchical features for generating the estimated image. The detailed U-net-like architecture will be fully described in Section 2.2.1. The trained U-net-like generator can be directly applied to test images to synthesize the corresponding full-dose PET images, which is very efficient compared with those voxel-wise estimation methods.
- iii
To take into account the differences between the estimated full-dose PET image and the ground truth (i.e., the real full-dose PET image), an estimation error loss is included in the objective function to enhance the robustness of the proposed approach. Different from (Wolterink et al., 2017), we employ the L1 distance instead of the L2 distance to encourage less blurring.
- iv
To further improve the estimated image quality, we propose a concatenated 3D c-GANs based progressive refinement scheme.
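As a concrete illustration of points (i) and (ii), the following minimal PyTorch sketch shows a 3D U-net-like generator whose strided convolutions and up-convolutions keep the output the same size as the input, with a skip connection concatenating encoder features into the decoder. The network depth, channel counts, and kernel sizes here are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a 3D U-net-like generator: a convolutional encoder,
# an up-convolutional decoder, and a skip connection that concatenates
# encoder features into the decoder (hyperparameters are assumptions).
import torch
import torch.nn as nn

class UNet3DGenerator(nn.Module):
    def __init__(self, in_ch=1, base=16):
        super().__init__()
        # Encoder: strided 3D convolutions halve each spatial dimension.
        self.enc1 = nn.Sequential(
            nn.Conv3d(in_ch, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True))
        self.enc2 = nn.Sequential(
            nn.Conv3d(base, base * 2, 4, stride=2, padding=1),
            nn.BatchNorm3d(base * 2), nn.LeakyReLU(0.2, inplace=True))
        # Decoder: up-convolutions restore the original resolution.
        self.dec2 = nn.Sequential(
            nn.ConvTranspose3d(base * 2, base, 4, stride=2, padding=1),
            nn.BatchNorm3d(base), nn.ReLU(inplace=True))
        # The skip connection doubles the channel count at this level.
        self.dec1 = nn.ConvTranspose3d(base * 2, in_ch, 4, stride=2, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d2 = self.dec2(e2)
        # Skip connection: concatenate encoder and decoder features.
        return self.dec1(torch.cat([d2, e1], dim=1))

g = UNet3DGenerator()
low_dose = torch.randn(1, 1, 32, 32, 32)  # dummy low-dose PET patch
est = g(low_dose)
assert est.shape == low_dose.shape        # output matches input size
```

Because the trained generator maps a whole 3D volume in a single forward pass, no voxel-wise sliding or patch averaging is needed at test time.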
The rest of this paper is organized as follows. Section 2 introduces the proposed 3D c-GANs architecture and methodology. The experimental setup is described in Section 3, and Section 4 presents the experimental results. Finally, we discuss and conclude the paper in Section 5.
Methodology
Fig. 1 illustrates the proposed 3D c-GANs training procedure, which comprises two networks: the generator network G and the discriminator network D. The generator network takes a low-dose PET image and generates an estimated PET image that approximates its corresponding real full-dose PET image. The discriminator network takes a pair of images as input, including the low-dose PET image and the corresponding real/estimated full-dose PET image, and it aims to differentiate between the real and the estimated full-dose PET images.
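The alternating training just described can be sketched as follows, with toy stand-in networks: D scores a (low-dose, full-dose) image pair, and G is additionally penalized by the L1 estimation error. The network definitions, optimizer settings, and loss weight are illustrative assumptions, not the paper's actual configuration.

```python
# Schematic of the alternating min-max training of G and D.
# G and D below are toy stand-ins; hyperparameters are assumptions.
import torch
import torch.nn as nn

G = nn.Conv3d(1, 1, 3, padding=1)                 # stand-in generator
D = nn.Sequential(nn.Conv3d(2, 1, 3, padding=1),  # takes an image pair
                  nn.AdaptiveAvgPool3d(1), nn.Flatten())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
lam = 100.0                                       # L1 weight (assumed)

low, full = torch.randn(2, 1, 1, 16, 16, 16)      # one dummy training pair
for _ in range(2):                                # alternate D and G updates
    # --- Discriminator step: real pair -> 1, estimated pair -> 0 ---
    opt_d.zero_grad()
    est = G(low).detach()
    d_loss = bce(D(torch.cat([low, full], 1)), torch.ones(1, 1)) + \
             bce(D(torch.cat([low, est], 1)), torch.zeros(1, 1))
    d_loss.backward()
    opt_d.step()
    # --- Generator step: fool D, plus the L1 estimation-error loss ---
    opt_g.zero_grad()
    est = G(low)
    g_loss = bce(D(torch.cat([low, est], 1)), torch.ones(1, 1)) + \
             lam * nn.functional.l1_loss(est, full)
    g_loss.backward()
    opt_g.step()
```

The `.detach()` in the discriminator step stops gradients from the D update flowing into G, which is what makes the two updates truly alternate.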
Data acquisition
We evaluate our 3D c-GANs based method on a real human brain dataset, including two categories: 8 normal subjects and 8 subjects diagnosed as mild cognitive impairment (MCI). The detailed demographic information of these subjects is summarized in Table 1.
The PET scans were acquired on the Siemens Biograph mMR PET-MR scanner. For the full-dose PET scans, subjects in our cohort were administered an average of 203 MBq (range: 191 MBq to 229 MBq) of 18F-2-deoxyglucose (18FDG). This corresponds to an effective dose
Experimental results
To study the effectiveness of the proposed 3D c-GANs method, our experiment explores the following questions:
- ⁃
Compared with 2D model, does the 3D c-GANs model gain better performance?
- ⁃
Compared with the model using only the generator network (i.e., the 3D U-net-like network), does adversarial training in the 3D c-GANs model improve the estimation performance?
- ⁃
Does the model adopting the concatenated 3D c-GANs based progressive refinement scheme improve the estimation quality?
- ⁃
For the lesion regions
Conclusion
In this paper, we have proposed a novel end-to-end framework based on 3D c-GANs to estimate high-quality full-dose PET images from low-dose PET images. Different from 2D models that consider the image appearance slice by slice, our proposed method is carried out in a 3D manner, avoiding the discontinuous cross-slice estimation that occurs in 2D GAN models. To render the same underlying information between the low-dose and full-dose PET images, we employ a 3D U-net-like
Conflicts of interest
The authors declare no conflict of interest.
Acknowledgements
This work was supported by National Natural Science Foundation of China (NSFC61701324) and Australian Research Council (ARC DE160100241).
References (43)
- et al.
Brain metabolic maps in mild cognitive impairment predict heterogeneity of progression to dementia
NeuroImage Clin.
(2015) - et al.
Motion compensation for brain PET imaging using wireless MR active markers in simultaneous PET–MR: phantom and non-human primate studies
NeuroImage
(2014) - et al.
Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation
Med. Image Anal.
(2017) - et al.
BrainNetCNN: convolutional neural networks for brain networks; towards predicting neurodevelopment
NeuroImage
(2017) - et al.
Positron emission tomography-computed tomography standardized uptake values in clinical practice and assessing response to therapy
- et al.
Deep MRI brain extraction: a 3D convolutional neural network for skull stripping
NeuroImage
(2016) - et al.
Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains
NeuroImage
(2004) - et al.
LABEL: pediatric brain extraction using learning-based meta-algorithm
NeuroImage
(2012) - et al.
Advances in functional and structural MR image analysis and implementation as FSL
NeuroImage
(2004) - et al.
Improving automated multiple sclerosis lesion segmentation with a cascaded 3D convolutional neural network approach
NeuroImage
(2017)
Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI
Neurocomputing
Multi-level canonical correlation analysis for PET image estimation
IEEE Trans. Image Process.
Towards Principled Methods for Training Generative Adversarial Networks
Wasserstein GAN
Denoising PET images using singular value thresholding and Stein's unbiased risk estimate
Positron Emission Tomography: Basic Sciences
BEGAN: Boundary Equilibrium Generative Adversarial Networks
Synthesis of positron emission tomography (PET) images via multi-channel generative adversarial networks (GANs)
InfoGAN: interpretable representation learning by information maximizing generative adversarial nets
Deep generative image models using a laplacian pyramid of adversarial networks