Non-local statistical label fusion for multi-atlas segmentation
Highlights
► We propose a novel statistical fusion algorithm, Non-Local STAPLE (NLS).
► NLS seamlessly integrates intensity into the estimation process.
► NLS provides a theoretically consistent model of multi-atlas observation error.
► NLS largely diminishes the need for large atlas sets and high-quality registration.
► We demonstrate superior performance over state-of-the-art approaches.
Introduction
Segmentation of anatomical structures on medical images is essential for scientific inquiry into the complex relationships between biological structure and function as well as clinical diagnosis, treatment, and assessment. The long-held “gold standard” for highly robust segmentation has been through expert manual delineation (Crespo-Facorro et al., 1999, Tsang et al., 2008). Yet, manual delineation is extremely resource consuming and plagued by inter- and intra-rater variability (e.g., 10–20% by volume (Ashton et al., 2003, Joe et al., 1999)). Alternatively, fully-automated algorithms often result in robust and accurate estimations for specific classes of problems (e.g., brain-tissue classification (Cocosco et al., 2003, Van Leemput et al., 1999, Wells et al., 1996), optic nerve segmentation (Noble and Dawant, 2011)). Unfortunately, the success of automated techniques is often dependent upon the application, modality, and image quality (Fischl et al., 2002, Heckemann et al., 2006, Rohlfing et al., 2004a, Yeo et al., 2008).
Atlas-based segmentation methods form a middle-ground between fully-manual and fully-automatic segmentation approaches (Collins et al., 1995, Gee et al., 1993). In atlas-based models, spatial information is transferred from an existing dataset (labeled atlas) to a previously unseen context (target) through deformable registration. Proposed extensions enable the summary of multiple atlases into a common coordinate system by constructing (1) unbiased average atlases (Guimond et al., 2000, Joshi et al., 2004) and (2) target-specific atlases (Commowick et al., 2009, Ericsson et al., 2008). Yet, the accuracy of single-atlas based methods is limited by bias concerns and imperfect correspondence to the target (Ashburner and Friston, 2005, Han and Fischl, 2007). Thus, an alternative strategy that independently utilizes multiple atlases (i.e., multi-atlas segmentation) has come to represent the de facto standard baseline for atlas techniques. In multi-atlas segmentation (Heckemann et al., 2006, Rohlfing et al., 2004b), multiple atlases are separately registered to the target and the voxelwise label conflicts between the registered atlases are resolved using label fusion.
Perhaps surprisingly, a majority vote, the simplest fusion strategy, has been shown to result in highly robust segmentations (Aljabar et al., 2009, Heckemann et al., 2006, Rohlfing et al., 2004a, Rohlfing and Maurer, 2007). More recently, weighted voting strategies that use global (Artaechevarria et al., 2009, Chen et al., 2012), local (Isgum et al., 2009, Sabuncu et al., 2010, Wang et al., 2011), semi-local (Sabuncu et al., 2010, Wang et al., 2012), and non-local (Coupé et al., 2011) intensity similarity metrics have demonstrated consistent improvement in segmentation accuracy. Particularly for neurological applications, highly local weights have provided the most consistent results in segmentation quality (Artaechevarria et al., 2009, Sabuncu et al., 2010).
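To make the weighted-voting idea concrete, the following is a minimal sketch of locally weighted voting with a Gaussian intensity-similarity kernel. The function name, the flattened-array data layout, and the kernel width `sigma` are illustrative assumptions here, not any specific published implementation:

```python
import numpy as np

def locally_weighted_vote(atlas_labels, atlas_intensities, target_intensity,
                          n_labels, sigma=0.1):
    """Fuse registered atlas labels with local intensity-similarity weights.

    atlas_labels      : (n_atlases, n_voxels) int array of propagated labels
    atlas_intensities : (n_atlases, n_voxels) registered atlas intensities
    target_intensity  : (n_voxels,) target image intensities
    """
    # Gaussian weight on the voxelwise intensity difference between each
    # registered atlas and the target (one weight per atlas per voxel).
    diff = atlas_intensities - target_intensity[np.newaxis, :]
    weights = np.exp(-(diff ** 2) / (2 * sigma ** 2))

    # Accumulate weighted votes per label and pick the argmax at each voxel.
    n_voxels = atlas_labels.shape[1]
    votes = np.zeros((n_labels, n_voxels))
    for j in range(atlas_labels.shape[0]):
        np.add.at(votes, (atlas_labels[j], np.arange(n_voxels)), weights[j])
    return votes.argmax(axis=0)
```

A global variant would collapse `weights` to one scalar per atlas, and a semi-local or non-local variant would compute the weight from a surrounding patch rather than a single voxel, which is the distinction the cited methods explore.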
In contrast to ad hoc voting, statistical fusion strategies (e.g., Simultaneous Truth and Performance Level Estimation, STAPLE (Warfield et al., 2004)) directly integrate a stochastic model of rater behavior into the estimation process. Despite elegant theory and success on human raters, applications to the multi-atlas context have proven problematic (Asman and Landman, 2011a, Sabuncu et al., 2010, Wang et al., 2011, Wang et al., 2012). In response, a myriad of advancements to the STAPLE framework have been proposed to account for (1) spatially varying task difficulty (Asman and Landman, 2011b, Rohlfing et al., 2004b), (2) spatially varying rater performance (Asman and Landman, 2011a, Asman and Landman, 2012a, Commowick et al., 2012, Weisenfeld and Warfield, 2011), and (3) instabilities in the rater performance level parameters (Commowick and Warfield, 2010, Landman et al., 2011b). Yet, these advanced techniques remain inherently models of human observation error as they fail to directly incorporate the image intensity differences between the atlases and the target. Moreover, initial attempts to incorporate intensity into the STAPLE framework have relied upon ad hoc extensions that simply ignore voxels based upon a priori similarity measures (Cardoso et al., 2011, Weisenfeld and Warfield, 2011).
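For reference, the core STAPLE estimate can be sketched as a small EM loop over per-rater confusion matrices. This is a deliberately simplified reading of the cited model (fixed flat label prior, no spatial priors or refinements), with all names, the initialization, and the iteration count chosen for illustration only:

```python
import numpy as np

def staple(D, n_labels, n_iters=20):
    """Simplified STAPLE-style EM sketch for categorical label fusion.

    D : (n_raters, n_voxels) array of observed label decisions.
    Returns W, the (n_labels, n_voxels) posterior over the true label, and
    theta, the (n_raters, n_labels, n_labels) confusion matrices with
    theta[j, s, s'] = P(rater j reports s' | true label is s).
    """
    R, N = D.shape
    # Initialize each rater as nearly perfect (diagonally dominant).
    theta = np.full((R, n_labels, n_labels), 0.05 / max(n_labels - 1, 1))
    for j in range(R):
        np.fill_diagonal(theta[j], 0.95)
    prior = np.full(n_labels, 1.0 / n_labels)  # fixed flat label prior

    for _ in range(n_iters):
        # E-step: posterior of the true label given all rater decisions.
        W = np.tile(prior[:, None], (1, N))
        for j in range(R):
            W *= theta[j][:, D[j]]        # theta[j][s, D[j, i]] per voxel i
        W /= W.sum(axis=0, keepdims=True)

        # M-step: re-estimate each rater's confusion matrix from W.
        for j in range(R):
            for s_obs in range(n_labels):
                theta[j][:, s_obs] = W[:, D[j] == s_obs].sum(axis=1)
            theta[j] += 1e-10             # guard against unobserved labels
            theta[j] /= theta[j].sum(axis=1, keepdims=True)
    return W, theta
```

Note that nothing in this loop ever consults image intensities; the atlases enter only through their label decisions `D`, which is precisely the limitation the text identifies.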
Regardless of the approach, label fusion models have consistently made an implicit assumption that the use of multiple atlases results in a voxelwise, collectively unbiased representation of the target. This assumption is manifested through the fact that nearly all fusion algorithms determine the optimal label using only directly corresponding intensity and label information. Ergo, multi-atlas methods are generally dependent upon highly accurate registration and the use of large numbers of atlases. We are left with several problems in multi-atlas segmentation: (1) a dependence on large-scale, high-quality registrations; (2) voting-based algorithms that lack the theoretical underpinning of statistical fusion observation models; and (3) statistical fusion algorithms that fail to incorporate intensity information. Thus, previous approaches have failed to accurately model the stochastic process of registered atlas observation error.
Meanwhile, a relatively new framework in the field of image analysis, non-local means, has gained momentum in terms of quantifying complex image characteristics (e.g., noise structure, spatially varying correspondence). In non-local means, images are deconstructed into a collection of small volumetric patches and the similarity or correspondence between these patches is quantified to learn the underlying image structure (Buades et al., 2005). The non-local means framework has emerged in the context of image de-noising (Buades et al., 2005, Coupé et al., 2006, Kervrann et al., 2007, Liu et al., 2008, Manjón et al., 2008, Van De Ville and Kocher, 2009). However, more recent work has demonstrated the applicability of non-local means to new applications such as synthesizing image contrast (Roy et al., 2010a), in-painting (Sun and Tappen, 2011), and image segmentation (Coupé et al., 2011, Roy et al., 2010b).
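The non-local means principle is compact in code: the estimate at each sample is a patch-similarity-weighted average over a search window, so weights follow structural similarity rather than spatial distance alone. The 1-D denoising sketch below is illustrative only; the function name, window sizes, and smoothing parameter `h` are assumptions:

```python
import numpy as np

def nlmeans_1d(signal, patch_radius=1, search_radius=5, h=0.1):
    """1-D non-local means sketch in the spirit of Buades et al. (2005)."""
    n = len(signal)
    padded = np.pad(signal, patch_radius, mode='reflect')
    out = np.empty(n)
    for i in range(n):
        p_i = padded[i:i + 2 * patch_radius + 1]        # patch around i
        lo, hi = max(0, i - search_radius), min(n, i + search_radius + 1)
        weights = np.empty(hi - lo)
        for k, j in enumerate(range(lo, hi)):
            p_j = padded[j:j + 2 * patch_radius + 1]    # candidate patch
            d2 = np.mean((p_i - p_j) ** 2)              # patch distance
            weights[k] = np.exp(-d2 / (h ** 2))
        out[i] = np.dot(weights, signal[lo:hi]) / weights.sum()
    return out
```

The same weighting machinery transfers directly to segmentation: instead of averaging intensities, one averages (or votes over) the labels attached to the most similar patches.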
Herein, we propose a novel statistical fusion algorithm (Non-Local STAPLE – NLS) that reformulates the STAPLE framework from a non-local means perspective. NLS models the registered atlases as collections of volumetric patches containing both intensity and label information and uses the non-local criteria (Buades et al., 2005, Coupé et al., 2011) to resolve imperfect correspondence. Through this reformulation, we seamlessly integrate exogenous intensity information into the estimation process to provide a theoretically consistent model of multi-atlas observation error. NLS provides a model in which we learn which label each atlas would have observed given perfect correspondence with the target. This presentation is an extension and generalization of a recently published conference paper (Asman and Landman, 2012b). Herein, we provide additional examples, derivations and insights that were not part of the original conference publication.
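As a purely schematic illustration of the non-local correspondence idea (not the exact NLS model, which is derived in the Theory section), one can replace the directly corresponding atlas label at a voxel with a patch-similarity-weighted label distribution drawn from a search window around it. Everything below, from the 1-D setting to the names and parameter values, is a simplifying assumption:

```python
import numpy as np

def nonlocal_atlas_observation(target, atlas_img, atlas_seg, i,
                               search_radius=2, patch_radius=1,
                               n_labels=2, h=0.1):
    """Estimate what one registered atlas 'would have observed' at target
    voxel i under imperfect correspondence: weight the atlas labels in a
    search window by patch similarity to the target and return the
    resulting distribution over labels.
    """
    pad = patch_radius
    t = np.pad(target, pad, mode='reflect')
    a = np.pad(atlas_img, pad, mode='reflect')
    p_t = t[i:i + 2 * pad + 1]                      # target patch at i
    obs = np.zeros(n_labels)
    lo = max(0, i - search_radius)
    hi = min(len(target), i + search_radius + 1)
    for j in range(lo, hi):
        p_a = a[j:j + 2 * pad + 1]                  # atlas patch at j
        w = np.exp(-np.mean((p_t - p_a) ** 2) / h ** 2)
        obs[atlas_seg[j]] += w                      # weight the atlas label
    return obs / obs.sum()
```

Conceptually, such a soft observation is what lets intensity enter the statistical model: the rater-performance machinery then operates on these correspondence-resolved observations rather than on the raw registered labels.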
In this manuscript, we begin by deriving the theoretical basis and the parameters for initialization and convergence governing NLS. Next, we demonstrate significant improvement over the state-of-the-art fusion algorithms on two distinct datasets: (1) computed tomography (CT) images for thyroid segmentation and (2) structural magnetic resonance (MR) images for whole-brain segmentation. For whole-brain segmentation, we demonstrate that NLS dramatically lessens the need for large-scale and highly accurate non-rigid registration. Lastly, we provide insight into the sensitivity of NLS to the various model parameters, assess the optimality of the algorithm, and provide a comparison to a direct application of non-local voting.
Theory
The following presentation provides the theoretical model governing NLS in the commonly used Expectation–Maximization (EM) framework (Dempster et al., 1977). For clarity and consistency, the notation closely follows the presentation of the original STAPLE algorithm (Warfield et al., 2004).
Methods and results
An implementation of the Non-Local STAPLE algorithm is available as part of the Java Image Science Toolkit (JIST, www.nitrc.org/projects/jist).
Discussion
Non-Local STAPLE represents the first statistical fusion algorithm that seamlessly incorporates intensity into the estimation process and creates a cohesive theoretical model specifically targeting registered atlas observation behavior. Additionally, NLS largely overcomes several of the current obstacles that plague multi-atlas segmentation, including the need for high-quality non-rigid registration and large numbers of atlases. These goals are accomplished through the reformulation of the STAPLE framework from a non-local means perspective.
Conclusions
We have derived and investigated Non-Local STAPLE, a new statistical fusion algorithm for multi-atlas segmentation. Through a reformulation from a non-local means perspective, NLS represents the first statistical fusion algorithm that (1) creates a cohesive theoretical model specifically targeting registered atlas observation behavior, and (2) seamlessly incorporates intensity into the core of the STAPLE estimation framework. As a result, NLS largely overcomes the need for high-quality non-rigid registration and large numbers of atlases.
Acknowledgements
This research was supported by NIH Grants 1R21NS064534 (Prince/Landman), 2R01EB006136 (Dawant), 1R03EB012461 (Landman) and R01EB006193 (Dawant). This work was conducted in part using the resources of the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University, Nashville, TN. The authors are grateful to Dr. Benoit Dawant for the labeled thyroid dataset and Dr. Andrew Worth (NeuroMorphometrics, Inc.) for the exquisitely labeled whole-brain dataset.
References (63)
- et al. Multi-atlas based segmentation of brain images: atlas selection and its effect on accuracy. Neuroimage (2009).
- et al. Unified segmentation. Neuroimage (2005).
- et al. A fully automatic and robust brain MRI tissue classification method. Medical Image Analysis (2003).
- et al. Patch-based segmentation using expert priors: application to hippocampus and ventricle segmentation. Neuroimage (2011).
- et al. Human frontal cortex: an MRI-based parcellation method. Neuroimage (1999).
- et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron (2002).
- et al. Multi-atlas multi-shape segmentation of fetal brain MRI for volumetric and morphometric analysis of ventriculomegaly. Neuroimage (2012).
- et al. Average brain models: a convergence study. Computer Vision and Image Understanding (2000).
- et al. Automatic anatomical brain MRI segmentation combining label propagation and decision fusion. Neuroimage (2006).
- et al. A global optimisation method for robust affine registration of brain images. Medical Image Analysis (2001).
- Unbiased diffeomorphic atlas construction for computational anatomy. Neuroimage.
- Fast and robust multi-atlas segmentation of brain magnetic resonance images. Neuroimage.
- MRI denoising using non-local means. Medical Image Analysis.
- An atlas-navigated optimal medial axis and deformable model algorithm (NOMAD) for the segmentation of the optic nerves and chiasm in MR and CT images. Medical Image Analysis.
- Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains. Neuroimage.
- Combining atlas based segmentation and intensity classification with nearest neighbor transform and accuracy weighted vote. Medical Image Analysis.
- Combination strategies in multi-atlas image segmentation: application to brain MR data. IEEE Transactions on Medical Imaging.
- Accuracy and reproducibility of manual and semiautomated quantification of MS lesions by MRI. Journal of Magnetic Resonance Imaging.
- Characterizing spatially varying performance to improve multi-atlas multi-label segmentation. Information Processing in Medical Imaging (IPMI).
- Robust statistical label fusion through consensus level, labeler accuracy and truth estimation (COLLATE). IEEE Transactions on Medical Imaging.
- Formulating spatially varying performance in the statistical fusion framework. IEEE Transactions on Medical Imaging.
- Non-local STAPLE: an intensity-driven multi-atlas rater model. Medical Image Computing and Computer-Assisted Intervention (MICCAI).
- Simultaneous segmentation and statistical label fusion.
- Dynamic programming and Lagrange multipliers. Proceedings of the National Academy of Sciences of the United States of America.
- A non-local algorithm for image denoising. Computer Vision and Pattern Recognition (CVPR), IEEE.
- Evaluation of multiple-atlas-based strategies for segmentation of the thyroid gland in head and neck CT images for IMRT. Physics in Medicine and Biology.
- Automatic 3-D model-based neuroanatomical segmentation. Human Brain Mapping.
- Estimating a reference standard segmentation with spatially varying performance parameters: local MAP STAPLE. IEEE Transactions on Medical Imaging.