Elsevier

NeuroImage

Volume 52, Issue 4, 1 October 2010, Pages 1261-1267

Evaluation of automated techniques for the quantification of grey matter atrophy in patients with multiple sclerosis

https://doi.org/10.1016/j.neuroimage.2010.05.029

Abstract

Several methods exist and are frequently used to quantify grey matter (GM) atrophy in multiple sclerosis (MS). Fundamental to all available techniques is the accurate segmentation of GM in the brain, a difficult task confounded even further by the pathology present in the brains of MS patients. In this paper, we examine the segmentations of six different automated techniques and compare them to a manually defined reference standard. Results demonstrate that, although the algorithms perform similarly to manual segmentations of cortical GM, severe shortcomings are present in the segmentation of deep GM structures. This deficiency is particularly relevant given the current interest in the role of GM in MS and the numerous reports of atrophy in deep GM structures.

Introduction

Multiple sclerosis (MS) is a chronic inflammatory disorder of the central nervous system. Focal white matter (WM) lesions represent the hallmark pathological finding of MS; however, increasing evidence from pathological studies has underscored the importance of grey matter (GM) involvement as well (Kutzelnigg et al., 2005, Peterson et al., 2001, Stadelmann et al., 2008). GM pathology does not seem to correlate with focal WM lesions (Bo et al., 2007, Caramanos et al., 2009), and neocortical GM volume loss has been shown to be related to worsening cognition (Amato et al., 2007). As our appreciation for the importance of GM pathology grows, reliable imaging methods are essential to accurately measure and analyze GM pathology in MS.

One of the challenges in classifying GM and WM in the brains of patients with MS results from the presence of WM lesions. Previous studies (Chard et al., 2002a, Sanfilipo et al., 2005) have shown that lesions lead to misclassifications of other tissues, the majority of which are WM erroneously labeled as GM. Even the most basic correction method, adding the lesion volume to the segmented WM volume, may be insufficient to obtain accurate volumes for the WM and GM compartments, because it presumes that all segmentation failures involve only WM lesions.
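To make the limitation of the basic correction concrete, the following is a minimal sketch on a toy one-dimensional "image". It is illustrative only and is not the procedure used in the study: the label codes, the function names, the toy arrays, and the alternative of relabeling lesion voxels as WM are all assumptions introduced here.

```python
import numpy as np

# Hypothetical integer label codes for the two tissue classes of interest.
GM, WM = 2, 3


def count_volumes(labels, voxel_mm3):
    """Return GM and WM volumes (mm^3) from an integer label array."""
    return {
        "GM": np.count_nonzero(labels == GM) * voxel_mm3,
        "WM": np.count_nonzero(labels == WM) * voxel_mm3,
    }


def add_lesions_to_wm(labels, lesion_mask, voxel_mm3):
    """Basic correction: add the lesion volume to the segmented WM volume."""
    vols = count_volumes(labels, voxel_mm3)
    vols["WM"] += np.count_nonzero(lesion_mask) * voxel_mm3
    return vols


def relabel_lesions_as_wm(labels, lesion_mask, voxel_mm3):
    """Assumed alternative: force every lesion voxel to WM, then count."""
    return count_volumes(np.where(lesion_mask, WM, labels), voxel_mm3)


# Toy data: voxels 5 and 6 are WM lesions. The classifier labels both as GM
# and also mislabels voxel 4 (normal-appearing WM) as GM.
truth = np.array([GM, GM, GM, WM, WM, WM, WM, WM])
segmented = np.array([GM, GM, GM, WM, GM, GM, GM, WM])
lesion_mask = np.array([0, 0, 0, 0, 0, 1, 1, 0], dtype=bool)
voxel_mm3 = 1.0  # unit voxel volume keeps the arithmetic easy to follow

true_vols = count_volumes(truth, voxel_mm3)
naive_vols = add_lesions_to_wm(segmented, lesion_mask, voxel_mm3)
relabeled_vols = relabel_lesions_as_wm(segmented, lesion_mask, voxel_mm3)

print("truth    :", true_vols)
print("naive    :", naive_vols)
print("relabeled:", relabeled_vols)
```

In this toy case the basic correction leaves GM at twice its true volume, because the lesion voxels labeled as GM are never removed from the GM count. Relabeling the lesion voxels removes that part of the error, but the mislabeled non-lesion WM voxel still inflates GM, which is the kind of residual failure a lesion-only correction cannot address.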

In the present study, we examine the GM classification results of six automated methods used to detect GM atrophy in the brains of MS patients. These include the two most commonly reported techniques: (a) a voxel-based morphometry (VBM) approach, executed most commonly with the statistical parametric mapping (SPM) software suite (Ashburner and Friston, 2000) and (b) SIENAx (Smith et al., 2002), as well as (c) FIRST (Patenaude, 2007), (d) Freesurfer (Fischl et al., 2002), (e) a classifier publicly available from the Montreal Neurological Institute (MNI) (Zijdenbos et al., 1998), and (f) a multispectral Bayesian classifier (MBC) designed specifically for segmenting the brains of MS patients (Francis, 2004). In contrast to previous studies that focused on lesion misclassification in MS, our current work is specific to the accuracy of GM segmentations, both for cortical GM (cGM) and deep GM structures (dGM).

Given the complexity of the cerebral anatomy, combined with partial volume effects present in MRI data, it is well known that manual segmentation is difficult and time consuming. Furthermore, differences in interpretation of image intensity and contrast with respect to the anatomy can lead to significant variability in voxel labeling between readers. In order to minimize errors and reduce variability, we decided to solicit expert readers (i.e., radiologists, neuroradiologists, and neurologists) trained in manual segmentation on MRI to obtain the highest quality manual GM segmentations possible. Given that the time of these experts is limited, we were restricted to analyzing a small number of slices on a small number of subjects.

The focus of this study is to explore the validity and the variability of some of the freely available automated methods currently being used to segment GM and to estimate GM atrophy in MS. Although the assessment of the six techniques listed above is limited to three slices within the brains of three subjects, this was enough to demonstrate that (a) there is variability in GM segmentation between the different software packages; (b) this variability is quite high for deep GM structures; and (c) users must be careful when interpreting the results of automatic classification programs and when comparing results between studies.

Subject and acquisition details

Three subjects with secondary progressive MS were selected from a multicenter clinical trials dataset. The subjects were chosen at random for their low, medium, and high WM lesion loads of 2.4 cm³, 8.6 cm³, and 24 cm³, respectively. Subjects’ scans were acquired from three different centers, all at a field strength of 1.5 T, and included T1, T2, and proton density (PD)-weighted sequences with a voxel size of 0.98 × 0.98 × 3 mm³. Consistent with previous reports of GM atrophy in MS, the T1w scan was used

Manual segmentations

Inter-reader variability was assessed by examining the mean DCSs for each pair of experts for each slice. The results are presented in Table 1. Two-way analysis of variance (ANOVA) was used to test for the effects of slice location (inferior, intermediate, superior) and WM lesion load (low, medium, high) on the experts’ mean DCSs. No significant interaction was found between location and lesion load (F(4,126) = 1.85, p = 0.12), but there was a main effect for both slice (F(2,126) = 15.32, p < 0.0001)
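The pairwise comparison of readers described above can be sketched as follows. The abbreviation DCS is not expanded in this excerpt; the sketch assumes it denotes a Dice-type overlap score, 2|A ∩ B| / (|A| + |B|), computed between two binary GM masks. The function names, the reader labels, and the toy masks are hypothetical and are not taken from the study.

```python
from itertools import combinations

import numpy as np


def dice(mask_a, mask_b):
    """Dice overlap between two binary masks: 2|A and B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Convention chosen here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def pairwise_dice(masks):
    """Return the Dice score for every unordered pair of readers."""
    return {
        (r1, r2): dice(masks[r1], masks[r2])
        for r1, r2 in combinations(sorted(masks), 2)
    }


# Toy GM masks for one slice from three hypothetical readers.
masks = {
    "reader_A": np.array([1, 1, 1, 0]),
    "reader_B": np.array([1, 1, 0, 0]),
    "reader_C": np.array([0, 1, 1, 1]),
}

scores = pairwise_dice(masks)
mean_score = float(np.mean(list(scores.values())))

for pair, value in scores.items():
    print(pair, round(value, 3))
print("mean pairwise score:", round(mean_score, 3))
```

Computing one such mean per reader pair and slice would yield the kind of table that a two-way ANOVA over slice location and lesion load could then be run on; the ANOVA itself is not reproduced here.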

Discussion

This study aimed to evaluate some of the most commonly used automated techniques for measuring GM atrophy on MRI data typically acquired in clinical trials. While previous studies have touched upon possible pitfalls of some of these techniques (Chard et al., 2002a, Giorgio et al., 2008, Lee and Prohovnik, 2008, Sanfilipo et al., 2005), to the best of our knowledge, ours is the first to explore the problem at the root of every technique, namely, GM segmentation.

Undoubtedly, the accurate

Conclusion

In summary, we evaluated the GM segmentations of several commonly used automated techniques for the detection of atrophy in MS. Results demonstrate that, although the algorithms perform similarly to manual segmentations of cortical GM, severe shortcomings exist in the segmentation of deep GM structures. Such misclassifications are of particular importance in studies on MS given that their magnitude can be more than four times the annual rate of atrophy. In general, given the specificity of

Acknowledgments

This research was supported in part by the NSERC/MITACS Industrial Postgraduate Scholarship Program and the endMS society of Canada. We thank the expert readers for their manual segmentations: Tao Li, David Araujo, Xu Liu, and Ling Han. We also thank Simon Warfield for his discussion and assistance with STAPLE. STAPLE is supported in part by NIH R01 RR021885 from the National Center for Research Resources and by an award from the Neuroscience Blueprint I/C through R01 EB008015 from the National

References (47)

  • E. Portaccio et al.

    Neocortical volume decrease in relapsing-remitting multiple sclerosis with mild cognitive impairment

    J. Neurol. Sci.

    (2006)
  • A. Prinster et al.

    Grey matter loss in relapsing-remitting multiple sclerosis: a voxel-based morphometry study

    Neuroimage

    (2006)
  • M.P. Sanfilipo et al.

    The relationship between whole brain volume and disability in multiple sclerosis: a comparison of normalized gray vs. white matter with misclassification correction

    Neuroimage

    (2005)
  • J. Sastre-Garriga et al.

    Brain volumetry counterparts of cognitive impairment in patients with multiple sclerosis

    J. Neurol. Sci.

    (2009)
  • S.M. Smith et al.

    Accurate, robust, and automated longitudinal and cross-sectional brain change analysis

    Neuroimage

    (2002)
  • M.P. Amato et al.

    Association of neocortical volume changes with cognitive deterioration in relapsing-remitting multiple sclerosis

    Arch. Neurol.

    (2007)
  • V.M. Anderson et al.

    Magnetic resonance imaging measures of brain atrophy in multiple sclerosis

    J. Magn. Reson. Imaging

    (2006)
  • R. Antulov et al.

    Gender-related differences in MS: a study of conventional and nonconventional MRI measures

    Mult. Scler.

    (2009)
  • B. Audoin et al.

    Voxel-based analysis of MTR images: a method to locate gray matter abnormalities in patients at the earliest stage of multiple sclerosis

    J. Magn. Reson. Imaging

    (2004)
  • B. Audoin et al.

    Localization of grey matter atrophy in early RRMS: a longitudinal study

    J. Neurol.

    (2006)
  • M. Battaglini et al.

    [P705] Influence of white matter lesions in the measurement of MRI-derived brain volumes

  • L. Bo et al.

    Lack of correlation between cortical demyelination and white matter pathologic changes in multiple sclerosis

    Arch. Neurol.

    (2007)
  • D.T. Chard et al.

    Brain atrophy in clinically early relapsing-remitting multiple sclerosis

    Brain

    (2002)