Elsevier

NeuroImage

Volume 39, Issue 4, 15 February 2008, Pages 1654-1665
The impact of skull-stripping and radio-frequency bias correction on grey-matter segmentation for voxel-based morphometry

https://doi.org/10.1016/j.neuroimage.2007.10.051

Abstract

This study evaluates the application of (i) skull-stripping methods (hybrid watershed algorithm (HWA), brain surface extractor (BSE) and brain-extraction tool (BET2)) and (ii) bias correction algorithms (nonparametric nonuniform intensity normalisation (N3), bias field corrector (BFC) and FMRIB's automated segmentation tool (FAST)) as pre-processing pipelines for the technique of voxel-based morphometry (VBM) using statistical parametric mapping v.5 (SPM5). The pipelines were evaluated using a BrainWeb phantom, and those that performed consistently were further assessed using artificial-lesion masks applied to 10 healthy controls compared to the original unlesioned scans, and finally, 20 Alzheimer's disease (AD) patients versus 23 controls. In each case, pipelines were compared to each other and to those from default SPM5 methodology. The BET2 + N3 pipeline was found to produce the least miswarping to template induced by real abnormalities, and performed consistently better than the other methods for the above experiments. Occasionally, the clusters of significant differences located close to the boundary were dragged out of the glass-brain projections—this could be corrected by adding background noise to low-probability voxels in the grey matter segments. This method was confirmed in a one-dimensional simulation and was preferable to threshold and explicit (simple) masking which excluded true abnormalities.

Introduction

Voxel-based morphometry or VBM (Ashburner and Friston, 2000) is a frequently employed method for evaluating regional differences in grey-matter density. Data sets can be compared on a voxel-by-voxel basis since the structural magnetic resonance (MR) images are normalised to the same standard space, and segmented into grey matter (GM) and white matter (WM) prior to statistical analysis. The latest statistical parametric mapping (SPM) release, SPM5, enables spatial normalisation, tissue classification and radio-frequency (r.f.) bias correction (BC) to be combined within the same model (Ashburner and Friston, 2005).

Spatial normalisation inaccuracy is an important limiting factor for the validity of the VBM results (Bookstein, 2001). Systematic miswarping to template of a given structure across groups may lead to either false positives or false negatives. Therefore, spatial smoothing is also required prior to statistical analysis in order to cope, not only with miswarping, but also with inter-subject variations in anatomy, to improve the signal-to-noise ratio and to render the data more normally distributed. Smoothing MR data with a Gaussian filter, however, limits spatial selectivity and significant clusters should not be interpreted as anatomically precise results.

A recent study found that VBM using SPM2 can be improved by skull-stripping MR images prior to analysis (Fein et al., 2006). Many other studies have contrasted the performance of different skull-stripping algorithms (Boesen et al., 2004; Fennema-Notestine et al., 2006; Rex et al., 2004; Segonne et al., 2004; Smith, 2002; Zhuang et al., 2006), but did not assess the impact on VBM. It is also clear that correcting for nonuniform tissue intensities caused by r.f. bias may improve VBM sensitivity and accuracy; several bias correction methods have been reported and quantitatively compared to each other by Arnold et al. (2001), Shattuck et al. (2001), Sled et al. (1998) and Vovk et al. (2007). Given that SPM5 has incorporated nonuniformity correction as well as image registration and tissue classification in its generative model, the purpose of the present study was to investigate the impact that this new SPM approach would have on smoothed, modulated, normalised GM segments in comparison to those derived from the most widely used pre-processing methods; these are enumerated below:

  • Skull-stripping

    • Hybrid watershed algorithm (HWA) using atlas information in FreeSurfer v.3.04 (http://surfer.nmr.mgh.harvard.edu). HWA makes use of local statistics for the template deformation and integrates an atlas-based term constraining the shape of the brain (Segonne et al., 2004).

    • Brain surface extractor (BSE) in BrainSuite v.2.0 (http://brainsuite.usc.edu). BSE combines edge-detection and morphology-based techniques. Adaptive anisotropic diffusion, edge detection and morphological erosions are used to identify the brain component (Shattuck et al., 2001).

    • Brain extraction tool v.2.1 (BET2) in FSL (http://www.fmrib.ox.ac.uk/fsl). BET2 is based on regional properties of the image; the forces pushing the template outward are locally computed at each vertex (Smith et al., 2002).

  • Bias correction

    • Nonparametric nonuniform intensity normalisation v.1.10 (N3) in FreeSurfer v.3.04. N3 corrects intensity nonuniformities without requiring a model of tissue classes. It uses a deconvolution kernel to sharpen the histogram plots that have been smoothed by the bias field (Sled et al., 1998).

    • Bias field corrector (BFC) in BrainSuite v.2.0. BFC computes local estimates of bias fields uniformly spaced throughout the volume using an adaptive partial volume tissue model (Shattuck et al., 2001).

    • FMRIB's Automated Segmentation Tool v.3.53 (FAST) in FSL. FAST uses a hidden Markov random field model and an associated expectation–maximisation algorithm to classify the brain into different tissue types and to correct for intensity nonuniformities (Zhang et al., 2001).

The pre-processing pipelines were first evaluated with a phantom, and those that showed the best results were then used for VBM analyses of simulated lesions in healthy controls and real patient data.

Methods

The pre-processing pipelines were evaluated on the T1-weighted MRI BrainWeb phantom (http://www.bic.mni.mcgill.ca/brainweb/selection_normal.html) with a noise level of 3% and 40% bias as shown in Fig. 1. This phantom was chosen because it resembles, in terms of signal-to-noise ratio (SNR) and intensity r.f. bias, the average scan used in this study, acquired with a 1.5-T GE Signa MRI scanner (GE Medical Systems, Milwaukee, WI, USA) using a T1-weighted 3-dimensional (3-D) inversion-recovery fast

Simulated-lesion study

Although results derived from phantom studies can offer useful insights, they do not necessarily capture the full complexity of real scan conditions (e.g. motion artefacts, complex r.f. bias fields, etc.). Therefore the next step was to evaluate the performance of the pipelines using real data. However, assessing the impact of these methods on real MR data with patient data sets is limited by the lack of a ground-truth T-map. Therefore, we created simulated lesions to the temporal lobes (right

Methods

The same volumes and processing steps described in the simulated-lesion study were used in this experiment. However, the Haircut procedure followed tissue classification; random noise was added to low-probability voxels from the GM segments prior to smoothing. An empirical threshold of 0.05 was set, an order of magnitude lower than the GM probabilities expected in GM regions. Noise uniformly distributed between 0 and 0.05 was added only to probabilities lying below this threshold. In addition, a
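The noise-addition step described above can be illustrated with a minimal sketch. It assumes the GM segment is available as a flat sequence of probabilities; the function name `haircut`, its parameters and the example values are hypothetical and are not taken from the authors' implementation, which would operate on a full 3-D volume.

```python
import random


def haircut(gm_probs, threshold=0.05, seed=None):
    """Return a copy of gm_probs with uniform noise added to low values.

    Values strictly below `threshold` receive noise drawn uniformly from
    [0, threshold]; all other values are returned unchanged.
    The name and signature are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    out = []
    for p in gm_probs:
        if p < threshold:
            # Low-probability voxel: add background noise
            p = p + rng.uniform(0.0, threshold)
        out.append(p)
    return out


# Hypothetical GM probabilities along a short line of voxels
example = [0.0, 0.01, 0.049, 0.05, 0.3, 0.9]
noisy = haircut(example, seed=0)
```

In this sketch only the first three example values are altered; values at or above the 0.05 threshold pass through untouched, so genuine GM signal is not modified.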

Methods

We designed a synthetic 1-D controls-versus-patients experiment, in which the central row of the mid-axial slice from the re-sampled phantom fuzzy (91 voxels) was used as a baseline to generate data, as shown in Fig. 5. “Healthy” regions of synthetic patients and all synthetic controls were obtained as follows:

  • if Pbaseline > 0.025, then P = |Pbaseline + Ps|, where < Ps > = 0 and σ = 0.2

  • if Pbaseline < 0.025, then P = Pbaseline

Ps was a random number from the normal distribution with mean parameter, < Ps >, and
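The generation rule for “healthy” synthetic data can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the operator between Pbaseline and Ps is taken to be addition (because Ps is symmetric about zero, subtraction would give the same distribution), the case Pbaseline = 0.025 is not specified in the text and is left unchanged here, and the function name and the toy baseline are hypothetical.

```python
import random


def synthetic_healthy_row(baseline, cutoff=0.025, mean=0.0, sigma=0.2, seed=None):
    """Generate one synthetic 'healthy' row from a baseline probability row.

    Assumed rule: P = |P_baseline + P_s| when P_baseline > cutoff,
    otherwise P = P_baseline, with P_s drawn from a normal distribution.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    row = []
    for p_base in baseline:
        if p_base > cutoff:
            p_s = rng.gauss(mean, sigma)      # P_s ~ N(mean, sigma)
            row.append(abs(p_base + p_s))     # absolute value keeps P non-negative
        else:
            row.append(p_base)                # background voxels are copied as-is
    return row


# Hypothetical 91-voxel baseline: background, a GM-like plateau, background
toy_baseline = [0.0] * 10 + [0.8] * 71 + [0.0] * 10
subject_row = synthetic_healthy_row(toy_baseline, seed=42)
```

Calling the function repeatedly with different seeds would yield a set of synthetic subjects that share the same baseline but differ in their noise realisation.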

Patient study

Having demonstrated the comparative validity and accuracy of the methods in phantom, simulated-lesion and synthetic data, the following study compared the impact on real patient data.

Discussion

It is known that clusters of statistically significant differences found in VBM may lead to ambiguous results due to systematic miswarping of a particular region across groups (Bookstein, 2001); this is a topic of ongoing discussion among the scientific community. This study was not intended to solve this problem, but rather, to evaluate the impact that pre-VBM pipelines (skull-stripping and bias correction) have on VBM results by comparing them to each other and to those derived from

Conclusion

In conclusion, the experiments indicated that although SPM5 includes segmentation, spatial normalisation and bias correction in the same model, pre-processing with BET2 + N3 prior to SPM5 appeared to improve VBM. We also reported that artefactual displacement of significant clusters from the cortical surface can be successfully corrected by addition of background noise to low-probability GM voxels (Haircut method); this has advantages over threshold and explicit masking in that it does not

Acknowledgments

We gratefully acknowledge Professor John R. Hodges for identifying patients as well as the participants themselves and their relatives for their continued support with our research. This research was funded by the Medical Research Council (MRC), U.K.
