Review
Deep learning on image denoising: An overview
Introduction
Digital image devices have been widely applied in many fields, including recognition of individuals (Lei et al., 2016, Wen, Xu et al., 2020, Wen, Zhang et al., 2020) and remote sensing (Du, Wei, & Liu, 2019). The captured image is a degraded version of the latent observation, where the degradation process is affected by factors such as lighting and noise corruption (Zha et al., 2018, Zhang and Zuo, 2017). Specifically, noise is generated during the transmission and compression of the unknown latent observation (Xu, Zhang, & Zhang, 2018c). It is therefore essential to use image denoising techniques to remove the noise and recover the latent observation from the given degraded image.
Image denoising techniques have attracted much attention over the past 50 years (Bernstein, 1987, Xu, Zhang, Zuo et al., 2015). At the outset, nonlinear and non-adaptive filters were used for image applications (Huang, 1971). Unlike linear filters, nonlinear filters can preserve edge information while suppressing noise (Pitas & Venetsanopoulos, 1986). Adaptive nonlinear filters relied on local signal-to-noise ratios to derive an appropriate weighting factor for removing noise from an image corrupted by a combination of additive random, signal-dependent, and impulse noise (Bernstein, 1987). Non-adaptive filters can simultaneously use edge information and signal-to-noise ratio information to estimate the noise (Hong & Bao, 2000). Over time, machine learning methods, such as sparsity-based methods, were successfully applied to image denoising (Dabov, Foi, Katkovnik, & Egiazarian, 2007). The non-locally centralized sparse representation (NCSR) method used nonlocal self-similarity to optimize the sparse model and obtained high performance in image denoising (Dong, Zhang, Shi, & Li, 2012). To reduce computational costs, a dictionary learning method was used to quickly filter the noise (Elad & Aharon, 2006). To recover the detailed information of the latent clean image, prior knowledge (e.g., total variation regularization) can smooth the noisy image in order to deal with the corruption (Osher et al., 2005, Ren, Zuo, Zhang et al., 2019).
More competitive methods for image denoising can be found in Mairal et al., 2009, Zuo et al., 2014 and Zhang, Zuo, Chen, Meng and Zhang (2017), including the Markov random field (MRF) (Schmidt & Roth, 2014), the weighted nuclear norm minimization (WNNM) (Gu, Zhang, Zuo, & Feng, 2014), learned simultaneous sparse coding (LSSC) (Mairal et al., 2009), cascade of shrinkage fields (CSF) (Schmidt & Roth, 2014), trainable nonlinear reaction diffusion (TNRD) (Chen & Pock, 2016) and gradient histogram estimation and preservation (GHEP) (Zuo et al., 2014).
Although most of the above methods achieved reasonably good performance in image denoising, they suffered from several drawbacks (Lucas, Iliadis, Molina, & Katsaggelos, 2018): the need for optimization methods at the test phase, manually set parameters, and a separate model for each denoising task. Recently, as architectures became more flexible, deep learning techniques gained the ability to overcome these drawbacks (Lucas et al., 2018).
Deep learning technologies were first used in image processing in the 1980s (Fukushima & Miyake, 1982) and were first applied to image denoising by Chiang and Sullivan (1989) and Zhou, Chellappa, and Jenkins (1987). The earliest denoising work used a neural network with a known shift-invariant blur function and additive noise to recover the latent clean image. After that, a neural network used weighting factors to remove complex noise (Chiang & Sullivan, 1989). To reduce the high computational costs, a feedforward network was proposed to make a tradeoff between denoising efficiency and performance (Tamura, 1989). The feedforward network smoothed the given corrupted image with Kuwahara filters, which were similar to convolutions. In addition, this research showed that using the mean squared error (MSE) as a loss function was not unique to neural networks (Greenhill and Davies, 1994, de Ridder et al., 1999). Subsequently, more optimization algorithms were used to accelerate the convergence of the trained network and to improve denoising performance (Bedini and Tonazzini, 1992, de Figueiredo and Leitao, 1992, Gardner et al., 1989). The combination of maximum entropy and primal–dual Lagrangian multipliers, which enhanced the expressive ability of neural networks, proved to be a good tool for image denoising (Bedini & Tonazzini, 1990). To further balance fast execution against denoising performance, greedy and asynchronous algorithms were applied in neural networks (Paik & Katsaggelos, 1992). Alternatively, designing a novel network architecture proved very competitive in eliminating noise, through either increasing the depth or changing the activation function (Sivakumar & Desai, 1993). Cellular neural networks (CENNs) mainly used nodes with templates to obtain an averaging function and effectively suppress noise (Nossek and Roska, 1993, Sivakumar and Desai, 1993).
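The MSE loss mentioned above remains the most common training objective for denoising networks. A minimal sketch in plain Python (the function name and flat-list representation are illustrative, not from any cited work):

```python
def mse_loss(denoised, clean):
    """Mean squared error between a denoised image and its clean target.

    Both images are flat lists of pixel intensities of equal length.
    """
    assert len(denoised) == len(clean)
    return sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(denoised)

# Example: a 2x2 patch, flattened
clean = [0.0, 0.5, 0.5, 1.0]
denoised = [0.1, 0.4, 0.6, 1.0]
print(mse_loss(denoised, clean))  # ≈ 0.0075
```

In practice this is averaged over mini-batches of image patches rather than single images.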
Although CENNs can obtain good denoising results, they require the template parameters to be set manually. To resolve this problem, gradient descent was developed for training them (Lee and de Gyvez, 1996, Zamparelli, 1997). To a certain degree, these deep techniques improved denoising performance. However, these networks did not easily allow the addition of new plug-in units, which limited their real-world applications (Fukushima, 1980).
For these reasons, convolutional neural networks (CNNs) were proposed (Lo et al., 1995, Ren, Pan et al., 2020). An early CNN, LeNet, found real-world application in handwritten digit recognition (LeCun, Bottou, Bengio, Haffner, et al., 1998). However, due to the following drawbacks, CNNs were not widely applied for some time (Krizhevsky, Sutskever, & Hinton, 2012). First, deep CNNs suffered from vanishing gradients. Second, activation functions such as the sigmoid (Marreiros, Daunizeau, Kiebel, & Friston, 2008) and tanh (Jarrett, Kavukcuoglu, Ranzato, & LeCun, 2009) resulted in high computational cost. Third, hardware platforms of the time did not support such complex networks. This changed in 2012, when AlexNet won that year's ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) (Krizhevsky et al., 2012). After that, deep network architectures (e.g., VGG, Simonyan & Zisserman, 2014, and GoogLeNet, Szegedy et al., 2015) were widely applied in the fields of image (Li, Li et al., 2020, Wang et al., 2018, Wu and Xu, 2019) and video processing (Liu, Lu et al., 2017, Yuan, Li et al., 2020), natural language processing (Duan et al., 2018) and speech processing (Zhang et al., 2018), especially low-level computer vision (Peng et al., 2019, Tian et al., 2019).
Deep networks were first applied to image denoising in 2015 (Liang and Liu, 2015, Xu, Zhang, Zhang et al., 2015). The proposed networks did not require manually set parameters for removing the noise. Since then, deep networks have been widely applied to speech (Zhang et al., 2015), video (Yuan, Fan and He, 2020) and image restoration (Ren, Shang et al., 2020, Tian, Xu, Zuo et al., 2020). Mao, Shen, and Yang (2016) used multiple convolutions and deconvolutions to suppress the noise and recover the high-resolution image. To address multiple low-level tasks with one model, a denoising CNN (DnCNN) (Zhang, Zuo, Chen et al., 2017) consisting of convolutions, batch normalization (BN) (Ioffe & Szegedy, 2015), rectified linear units (ReLU) (Nair & Hinton, 2010) and residual learning (RL) (He, Zhang, Ren, & Sun, 2016) was proposed to deal with image denoising, super-resolution, and JPEG image deblocking. Taking into account the tradeoff between denoising performance and speed, a color non-local network (CNLNet) (Lefkimmiatis, 2017) combined non-local self-similarity (NLSS) with a CNN to efficiently remove color-image noise.
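The residual learning strategy used by DnCNN trains the network to predict the noise rather than the clean image, so the clean estimate is the noisy input minus the network output. A minimal sketch, with a stand-in `predict_residual` in place of the actual stacked Conv+BN+ReLU layers (which this overview does not specify in code):

```python
def predict_residual(noisy):
    """Stand-in for a trained residual-predicting CNN.

    Here it naively estimates the noise as each pixel's deviation from
    the patch mean; a real DnCNN learns this mapping with Conv+BN+ReLU.
    """
    mean = sum(noisy) / len(noisy)
    return [p - mean for p in noisy]

def denoise_residual(noisy):
    # Residual learning: clean estimate = noisy input - predicted noise.
    residual = predict_residual(noisy)
    return [p - r for p, r in zip(noisy, residual)]

flat_patch = [0.52, 0.48, 0.51, 0.49]  # noisy samples of a constant patch
print(denoise_residual(flat_patch))    # each value ≈ 0.5
```

Learning the (roughly zero-mean) residual instead of the clean image tends to ease optimization for deep networks, which is the motivation given for RL in DnCNN.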
In terms of blind denoising, a fast and flexible denoising CNN (FFDNet) (Zhang, Zuo, & Zhang, 2018a) took a noise-level map together with the noisy image patch as the input of the denoising network to improve denoising speed and handle blind denoising. To handle unpaired noisy images, a generative adversarial network (GAN) CNN blind denoiser (GCBD) (Chen, Chen, Chao and Yang, 2018) first generated the ground truth, then fed the obtained ground truth into the GAN to train the denoiser. Alternatively, a convolutional blind denoising network (CBDNet) (Guo, Yan, Zhang, Zuo, & Zhang, 2019) removed the noise from a given real noisy image with two sub-networks, one in charge of estimating the noise of the real noisy image and the other of obtaining the latent clean image. For more complex corrupted images, a deep plug-and-play super-resolution (DPSR) method (Zhang, Zuo and Zhang, 2019) was developed to estimate the blur kernel and noise and recover a high-resolution image. Although other important research has been conducted in the field of image denoising in recent years, there have been only a few reviews summarizing deep learning techniques for image denoising (Tian, Xu, Fei, & Yan, 2018). Although Tian et al. (2018) covered a good deal of work, it lacked detailed classification of deep learning methods for image denoising. For example, related work pertaining to unpaired real noisy images was not covered. To this end, we aim to provide an overview of deep learning for image denoising, in terms of both applications and analysis. Finally, we discuss the state-of-the-art methods for image denoising, including how they can be further expanded to respond to future challenges, as well as potential research directions. An outline of this survey is shown in Fig. 1.
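FFDNet's input trick is to stack a uniform noise-level map onto the noisy image channels, so a single model can handle many noise levels. A schematic sketch of that input construction (the function name, nested-list representation, and chosen sigma are illustrative; FFDNet additionally applies pixel-shuffle downsampling, which is omitted here):

```python
def ffdnet_input(noisy_channels, sigma):
    """Stack a uniform noise-level map onto the noisy image channels.

    noisy_channels: list of channels, each a nested list of rows.
    sigma: assumed noise standard deviation, broadcast to a full map.
    """
    h = len(noisy_channels[0])
    w = len(noisy_channels[0][0])
    noise_map = [[sigma] * w for _ in range(h)]
    return noisy_channels + [noise_map]

# A 2x2 grayscale patch with an assumed noise level of 25/255
patch = [[[0.2, 0.8], [0.5, 0.4]]]
stacked = ffdnet_input(patch, 25 / 255)
print(len(stacked))  # 2 channels: image + noise-level map
```

Varying the map (even spatially) at test time is what lets one trained model cover different, including non-uniform, noise levels.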
This overview covers more than 200 papers about deep learning for image denoising in recent years. The main contributions in this paper can be summarized as follows.
1. The overview illustrates the effects of deep learning methods on the field of image denoising.
2. The overview summarizes the solutions of deep learning techniques for different types of noise (i.e., additive white noise, blind noise, real noise and hybrid noise) and analyzes the motivations and principles of these methods in image denoising, where blind noise denotes noise of unknown types. Finally, we evaluate the denoising performance of these methods in terms of quantitative and qualitative analyses.
3. The overview points out some potential challenges and directions for deep learning in the use of image denoising.
The rest of this overview is organized as follows.
Section 2 discusses the popular deep learning frameworks for image applications. Section 3 presents the main categories of deep learning in image denoising, as well as a comparison and analysis of these methods. Section 4 offers a performance comparison of these denoising methods. Section 5 discusses the remaining challenges and potential research directions. Section 6 offers the authors’ conclusions.
Fundamental frameworks of deep learning methods for image denoising
This section offers a discussion of deep learning, including the ideas behind it, the main network frameworks (techniques), and the hardware and software that form the basis of the deep learning techniques for image denoising covered in this survey.
Deep learning techniques for additive white noisy-image denoising
Due to the scarcity of real noisy images, additive white noisy images (AWNIs) are widely used to train denoising models (Jin, McCann, Froustey, & Unser, 2017). AWNIs include Gaussian, Poisson, salt, pepper and multiplicative noisy images (Farooque & Rohankar, 2013). There are several deep learning techniques for AWNI denoising, including CNN/NN; the combination of CNN/NN and common feature extraction methods; and the combination of optimization methods and CNN/NN.
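Training pairs for such denoisers are typically synthesized by adding zero-mean Gaussian noise of a chosen standard deviation to clean images. A minimal sketch using only the standard library (the function name, flat-list image representation, and [0, 1] intensity range are assumptions for illustration):

```python
import random

def add_gaussian_noise(image, sigma, seed=None):
    """Return image + zero-mean Gaussian noise with standard deviation sigma.

    image is a flat list of pixel intensities in [0, 1]; corrupted values
    are clipped back into [0, 1].
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in image]

clean = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy = add_gaussian_noise(clean, sigma=25 / 255, seed=0)
print(noisy)
```

The (clean, noisy) pair then serves as (target, input) for supervised training; sigma values such as 15, 25 and 50 (on a 0-255 scale) are common benchmark settings.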
Training datasets
The training datasets are divided into two categories: gray-noisy and color-noisy images. Gray-noisy image datasets can be used to train Gaussian denoisers and blind denoisers. They include the BSD400 dataset (Bigdeli, Zwicker, Favaro, & Jin, 2017) and the Waterloo Exploration Database (Ma et al., 2016). The BSD400 dataset is composed of 400 images in .png format, cropped to a size of 180 × 180 for training a denoising model. The Waterloo Exploration Database consists of 4744 natural images.
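The patch preparation described above (cropping training images to a fixed size) can be sketched as follows; the 180 × 180 size comes from the text, while the function name, stride parameter and nested-list representation are assumptions for illustration:

```python
def crop_patches(image, patch, stride):
    """Extract patch x patch crops from a nested-list image with the given stride.

    Returns a list of crops; crops near the right/bottom edge that do not
    fit are discarded, as is common in simple patch extraction.
    """
    h, w = len(image), len(image[0])
    crops = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crops.append([row[left:left + patch] for row in image[top:top + patch]])
    return crops

# Toy example: a 4x4 "image" cropped into non-overlapping 2x2 patches
img = [[r * 4 + c for c in range(4)] for r in range(4)]
print(len(crop_patches(img, patch=2, stride=2)))  # 4 crops
```

With a stride smaller than the patch size, overlapping crops are produced, which is a common way to enlarge the effective training set.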
Discussion
Deep learning techniques are seeing increasing use in image denoising. This paper offers a survey of these techniques in order to help readers understand these methods. In this section, we present the potential areas of further research for image denoising and point out several as yet unsolved problems.
Deep learning techniques for image denoising are mainly effective in increasing denoising performance and efficiency, and in performing complex denoising tasks. Solutions for improving
Conclusion
In this paper, we compare, study and summarize the deep networks used for image denoising. First, we show the basic frameworks of deep learning for image denoising. Then, we present deep learning techniques for different noisy tasks, including additive white noisy images, blind denoising, real noisy images and hybrid noisy images. Next, for each category of noisy tasks, we analyze the motivation and theory of the denoising networks. Finally, we compare the denoising results, efficiency and visual effects
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This paper is partially supported by the National Natural Science Foundation of China under Grant No. 61876051, in part by the Shenzhen Municipal Science and Technology Innovation Council under Grant No. JSGG20190220153602271, and in part by the Natural Science Foundation of Guangdong Province under Grant No. 2019A1515011811.