Elsevier

NeuroImage

Volume 50, Issue 2, 1 April 2010, Pages 516-523

Robust atrophy rate measurement in Alzheimer's disease using multi-site serial MRI: Tissue-specific intensity normalization and parameter selection

https://doi.org/10.1016/j.neuroimage.2009.12.059

Abstract

We describe an improved method of measuring brain atrophy rates from serial MRI for multi-site imaging studies of Alzheimer's disease (AD). The method (referred to as KN-BSI) improves an existing brain atrophy measurement technique, the boundary shift integral (classic-BSI), by performing tissue-specific intensity normalization and parameter selection. We applied KN-BSI to measure brain atrophy rates of 200 normal and 141 AD subjects using baseline and 1-year MRI scans downloaded from the Alzheimer's Disease Neuroimaging Initiative database. Baseline and repeat images were reviewed as pairs by expert raters and given quality scores. Including all image pairs, regardless of quality score, mean KN-BSI atrophy rates were 0.09% higher (95% CI 0.03% to 0.16%, p = 0.007) than classic-BSI rates in controls and 0.07% higher (95% CI −0.01% to 0.16%, p = 0.07) in ADs. The SD of the KN-BSI rates was 22% lower (15% to 29%, p < 0.001) in controls and 13% lower (6% to 20%, p = 0.001) in ADs, compared to classic-BSI. Using these results, the estimated sample size (needed per treatment arm) for a hypothetical trial of a treatment for AD (80% power, 5% significance to detect a 25% reduction in atrophy rate) would be reduced from 120 to 81 (a 32% reduction, 95% CI = 18% to 45%, p < 0.001) when using KN-BSI instead of classic-BSI. We concluded that KN-BSI offers more robust brain atrophy measurement than classic-BSI and substantially reduces sample sizes needed in clinical trials.

Introduction

Large multi-site clinical studies provide a powerful way to understand diseases and their treatments. In recent years, neuroimaging outcomes have increasingly been incorporated into such studies (Horn and Toga, 2009; Barkhof et al., 2009). However, information is often lacking about the robustness and variability of these outcomes in a multi-site setting. The Alzheimer's Disease Neuroimaging Initiative (ADNI) was established partly to address this issue. ADNI includes subjects from over 50 sites across the U.S. and Canada, and its aims include testing the ability of serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological and imaging markers, and clinical and neuropsychological assessments to measure progression in mild cognitive impairment (MCI) and early Alzheimer's disease (AD) (Mueller et al., 2005).

The use of images from different sites and scanners brings particular challenges for image analysis algorithms, with the potential to lose sensitivity and introduce systematic errors (Stonnington et al., 2008). Increased variability in the outcome measure leads to a corresponding loss of power to detect treatment effects. For longitudinal studies the stability of image acquisition is critical but may be compromised in several ways. For MRI, variability in the outcome may be affected by: (1) image intensity variation due to subject-specific noise, noise in the electronics, and imaging gradient non-linearities (Sled et al., 1998; Lewis and Fox, 2004); (2) variability in distortion fields due to differences in subject positioning (Jovicich et al., 2006); (3) voxel size variation due to drift in the strength of the applied read-out gradient (i.e. calibration drift) (Clarkson et al., 2009); (4) imaging protocol differences between scanners and between baseline and repeat scans (due to scanner hardware and software changes during the study) (Preboske et al., 2006); and (5) differences in scanner calibration and quality control procedures (Whitwell et al., 2004). Although much effort has been put into addressing these problems, e.g. intensity inhomogeneity correction (Sled et al., 1998), distortion field correction (Jovicich et al., 2006), and voxel size correction based on a geometric phantom (Gunter et al., 2006) or image registration (Clarkson et al., 2009), intensity and geometric distortion artifacts and contrast differences still exist in the images. These errors interact in a complex manner and affect the results from different image analysis algorithms in a large multi-site clinical study. Images are often reviewed by expert raters as part of the quality control in clinical studies, so that those with unacceptable errors or artifacts can be excluded from subsequent analysis. However, the exclusion of images (and hence subjects) decreases the statistical power of the study and, more importantly, may introduce bias if the outcome values for the excluded images differ systematically from those included.

The aim of this paper is to increase the robustness and reproducibility of brain atrophy measurement in multi-site image studies. The boundary shift integral (BSI) is a semi-automated measure of regional and global cerebral atrophy rates from serial MRI which uses intra-subject image registration to give higher precision than is typically possible with manual measures (Freeborough and Fox, 1997). The BSI has been used to assess atrophy progression in clinical trials in AD (Fox et al., 2005), and in a number of natural history studies in a range of neurological disorders, including AD (Ridha et al., 2006; Freeborough and Fox, 1997), frontotemporal dementia (Chan et al., 2001), multiple sclerosis (Anderson et al., 2007) and Huntington's disease (Henley et al., 2006). The BSI estimates the changes in cerebral volume using differences in voxel intensities between two serial MRI volume scans at the boundary region of the brain. In order to accurately measure brain atrophy using BSI, the intensity of the same tissue in the baseline and repeat scans should be as similar as possible. The classic BSI technique employs intensity normalization between baseline and repeat images by dividing the intensity on each scan by the mean intensity of the interior region of the brain (consisting mainly of white matter). Where there is the possibility of tissue contrast changes over time this is not an ideal approach, because it does not take into account the intensity changes of individual tissue types in the brain, namely cerebrospinal fluid (CSF), gray matter (GM) and white matter (WM), relative to each other. Furthermore, an intensity window parameter must be chosen in the calculation of BSI, in order to correctly capture the intensity transitions associated with the brain boundary. The optimal value is largely dependent on the signal-to-noise ratio (SNR) and the image intensity of CSF and GM. Existing protocols make use of a single BSI intensity window for all the images from all the imaging sites; however, images acquired from different sites may have different tissue contrasts and SNRs, and therefore different optimal BSI intensity windows. Ideally the choice of that optimal window should be automated and unbiased, and based upon the intrinsic tissue contrast and SNR in the image pairs of a particular subject produced by a particular scanner and acquisition protocol.
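To illustrate the two ingredients discussed above (normalization by the mean interior-brain intensity and a fixed intensity window), the following minimal Python sketch computes a boundary shift integral on a toy one-dimensional intensity profile. It assumes the clipped-difference form of the BSI commonly attributed to Freeborough and Fox (1997); the function names, the window (0.3, 0.7), the voxel volume and the toy data are hypothetical and are not taken from this paper.

```python
def clip(value, low, high):
    """Clamp an intensity to the window [low, high]."""
    return max(low, min(high, value))


def normalize_by_interior_mean(intensities, interior_mask):
    """Classic-style normalization: divide every voxel by the mean
    intensity of the interior brain region (assumed mostly white matter)."""
    interior = [v for v, inside in zip(intensities, interior_mask) if inside]
    mean_interior = sum(interior) / len(interior)
    return [v / mean_interior for v in intensities]


def boundary_shift_integral(baseline, repeat, boundary_mask, window, voxel_volume):
    """Sum the clipped baseline-minus-repeat differences over boundary voxels
    and convert to a volume by dividing by the window width.
    Positive values indicate tissue loss under this sign convention."""
    low, high = window
    total = 0.0
    for b, r, on_boundary in zip(baseline, repeat, boundary_mask):
        if on_boundary:
            total += clip(b, low, high) - clip(r, low, high)
    return voxel_volume * total / (high - low)


# Toy profile (hypothetical): CSF is about 0.2 and brain tissue about 1.0
# after normalization. In the repeat scan the boundary has moved inward
# by one voxel.
baseline = [0.2, 0.2, 0.9, 1.0, 1.0]
repeat = [0.2, 0.2, 0.2, 1.0, 1.0]
boundary = [True, True, True, True, False]

bsi = boundary_shift_integral(baseline, repeat, boundary,
                              window=(0.3, 0.7), voxel_volume=1.0)
print(f"Estimated volume loss: {bsi:.2f} voxel volumes")
```

In this toy case the single voxel that changes from tissue to CSF spans the whole window, so the integral recovers a loss of one voxel volume. If the window is set too wide or too narrow for the actual CSF and GM intensities, the same shift is under- or over-counted, which is why a per-image window choice matters.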

Few papers have explicitly addressed the problems of MR image intensity normalization and standardization. Nyúl and Udupa used a two-step approach to standardize MR image intensity to a standard intensity scale, so that specific tissue types have a similar intensity (Nyúl and Udupa, 1999). The first step (‘training step’) involved finding the parameters of the standardizing transform from a set of images. The second step (‘transformation step’) applied the learnt parameters to transform the intensity of a new image into the standardized histogram. Madabhushi and Udupa later used scale-space concepts to accurately identify principal regions used for the training step (Madabhushi and Udupa, 2006). Christensen reported the use of even-ordered derivatives of the image histogram to determine a single global scaling factor between two images (Christensen, 2003). The model of a single global scaling factor is the same as the model of intensity normalization in the classic-BSI. Weisenfeld and Warfield proposed the use of Kullback-Leibler divergence to match the intensity distribution of two images (Weisenfeld and Warfield, 2004). Since disease progression in AD will cause changes in the histogram model (changes in the relative heights and spread of the CSF/GM/WM peaks) in the repeat image, the method proposed by Weisenfeld and Warfield may introduce bias in the BSI.
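The two-step idea (learn a standard scale from a set of images, then map each new image onto it) can be sketched as follows. This is a simplified illustration in the spirit of a landmark-based scheme, not the published procedure of Nyúl and Udupa: the choice of the 1st, 50th and 99th percentiles as landmarks, the plain averaging of landmarks in the training step, and all function names are assumptions made for the example.

```python
LANDMARK_PERCENTILES = (1, 50, 99)  # assumed landmarks, for illustration only


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def landmarks(image):
    """Histogram landmarks of one image (a flat list of intensities)."""
    ordered = sorted(image)
    return [percentile(ordered, q) for q in LANDMARK_PERCENTILES]


def train_standard_scale(images):
    """Training step: average each landmark over the training images."""
    per_image = [landmarks(img) for img in images]
    return [sum(column) / len(column) for column in zip(*per_image)]


def transform(image, standard):
    """Transformation step: piecewise-linear map that sends the image's own
    landmarks onto the standard landmarks (end segments are extrapolated)."""
    src = landmarks(image)
    out = []
    for v in image:
        i = 0
        while i < len(src) - 2 and v > src[i + 1]:
            i += 1
        span = src[i + 1] - src[i]
        t = (v - src[i]) / span if span else 0.0
        out.append(standard[i] + t * (standard[i + 1] - standard[i]))
    return out


# Two toy "images" with the same content but a twofold intensity difference.
image_a = [float(v) for v in range(101)]
image_b = [2.0 * v for v in image_a]

standard = train_standard_scale([image_a, image_b])
std_a = transform(image_a, standard)
std_b = transform(image_b, standard)
print(std_a[50], std_b[50])  # the two medians now coincide
```

Because such a mapping forces the histogram landmarks of every image to agree, a genuine change in the histogram between baseline and repeat scans would be partly normalized away, which is the concern raised above for distribution-matching methods when they are applied to serial scans of progressing disease.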

Many image processing algorithms have a set of customizable parameters that allow users to adapt the algorithms to specific problems (e.g. biological and image quality variability) (Fennema-Notestine et al., 2006; Popovic et al., 2006). However, in a clinical trial setting, it is desirable that the image analysis is standardized (in terms of procedures and parameters), repeatable and reproducible (in terms of small intra-rater and inter-rater variabilities) (Schuster, 2007), and increasingly, regulations require that the procedure for choosing parameters be defined in advance for the trial.

In this paper, we describe two improvements for the BSI that address differences in tissue contrast and SNR over time and between scanners, namely robust intensity normalization and automatic parameter selection based on the intrinsic tissue contrast of the MR images. The aim thereby was to increase the robustness and reproducibility of the BSI in multi-site image studies. We used the large ADNI dataset to assess whether, and by how much, these modifications may reduce variability in measurements of atrophy rates and consequently reduce estimated sample sizes for a randomized trial of a putative disease-modification therapy for AD.
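The link between measurement variability and sample size can be made concrete with the standard normal-approximation formula for comparing two means, n = 2(z_{1-α/2} + z_{power})² σ² / Δ² per arm, where Δ is the treatment effect to be detected. The sketch below applies it to the trial design quoted in the abstract (80% power, 5% two-sided significance, 25% reduction in atrophy rate). That the published estimates were obtained with exactly this formula is an assumption, and the mean and SD values used here are hypothetical placeholders, not results from this study.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(mean_rate, sd_rate, reduction=0.25, power=0.80, alpha=0.05):
    """Subjects needed per arm to detect a proportional reduction in the
    mean atrophy rate, using the two-sample normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    effect = reduction * mean_rate                 # absolute difference to detect
    return ceil(2 * (z_alpha + z_power) ** 2 * sd_rate ** 2 / effect ** 2)


# Hypothetical atrophy rates in %/year (placeholders, not study results).
n_reference = sample_size_per_arm(mean_rate=1.5, sd_rate=1.0)
n_lower_sd = sample_size_per_arm(mean_rate=1.5, sd_rate=0.87)  # SD 13% lower
print(n_reference, n_lower_sd)
```

Since n scales with (σ/μ)², a modest reduction in the SD of the measured rates, or a modest increase in the mean rate, translates into a proportionally larger saving in the number of subjects required.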


Methods and materials

In this section, we describe the image data, the method of computing BSI based on normalization using interior brain regions and manual selection of intensity window (referred to as ‘classic-BSI’), the improved method of computing BSI (referred to as ‘KN-BSI’), and the methods of comparison between classic-BSI and KN-BSI.

Qualitative analysis

After reviewing the 341 normalized image pairs following standard image registration and intensity normalization (classic-BSI image processing procedures), 289 (120 AD, 169 controls) image pairs (85%) were found to have image quality scores 1–3, and 52 (21 AD, 31 controls) image pairs (15%) were found to have image quality score 4. The percentages of images with quality score 4 were similar in AD subjects and controls (15% AD vs 16% controls). Fig. 2 shows an example of the intensity

Conclusions and discussion

We have described a method of brain atrophy measurement from serial MR imaging that addresses the problem of differences in tissue contrast and SNR over time and between scanners. The method involves tissue-specific intensity normalization to improve consistency over time, and automated BSI parameter selection based on image-specific brain boundary contrast to improve consistency between scanners. The method was applied to over 300 baseline and 1-year volumetric MR image pairs acquired in a

Acknowledgments

The authors would like to thank Josephine Barnes at the Dementia Research Centre, and Derek L.G. Hill and David M. Cash at IXICO for helpful discussions. We would also like to thank all the image analysts (Melanie Blair, Magda Sokolska, Elizabeth Gordon, Raivo Kittus, Laila Ahsan, Kate MacDonald) and the research associates (Casper Nielsen and Ian Malone) in the Dementia Research Centre for their help in the study. The implementation of KN-BSI uses the Insight Segmentation and Registration

References (35)

  • Whitwell, J.L. et al. Using nine degrees-of-freedom registration to correct for changes in voxel size in serial MRI studies. Magn. Reson. Imaging (Sep 2004)
  • Anderson, V.M. et al. Cerebral atrophy measurement in clinically isolated syndromes and relapsing remitting multiple sclerosis: a comparison of registration-based methods. J. Neuroimaging (Jan 2007)
  • Barkhof, F. et al. Imaging outcomes for neuroprotection and repair in multiple sclerosis trials. Nat. Rev. Neurol. (May 2009)
  • Chan, D. et al. Rates of global and regional cerebral atrophy in AD and frontotemporal dementia. Neurology (Nov 2001)
  • Evans, M. et al. Volume changes in Alzheimer's disease and mild cognitive impairment: cognitive associations. Eur. Radiol. (Sep 2009)
  • Fennema-Notestine, C. et al. Quantitative evaluation of automated skull-stripping methods applied to contemporary and legacy images: effects of diagnosis, bias correction, and slice location. Hum. Brain Mapp. (Feb 2006)
  • Fox, N.C. et al. Effects of Aβ immunization (AN1792) on MRI measures of cerebral volume in Alzheimer disease. Neurology (May 2005)
    1 Equal senior author.

    2 Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (www.loni.ucla.edu/ADNI). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators is available at www.loni.ucla.edu/ADNI/Collaboration/ADNI_Citatation.shtml.
