The Contributions of MRI-Based Measures of Gray Matter, White Matter Hyperintensity, and White Matter Integrity to Late-Life Cognition

BACKGROUND AND PURPOSE: GM volume, WMH volume, and FA are each associated with cognition; however, few studies have examined whether these 3 different types of MR imaging measurements exert independent or additive effects on cognitive performance. To determine the extent of their contributions to cognitive performance, we explored the independent and additive contributions of GM atrophy, white matter injury, and white matter integrity to cognition in elderly patients.
MATERIALS AND METHODS: Two hundred nine elderly patients participated in the study: 97 were CN adults, 65 had MCI, and 47 had dementia. We measured GM on T1-weighted MR imaging, WMH on FLAIR, and FA on DTI, along with psychometrically matched measures of 4 domains of cognitive performance: semantic memory, episodic memory, executive function, and spatial abilities.
RESULTS: As expected, patients with dementia performed significantly more poorly in all 4 cognitive domains, whereas patients with MCI generally performed less poorly than patients with dementia, though considerable overlap in performance was present across groups. GM, FA, and WMH each differed significantly between diagnostic groups and were associated with cognitive measures. In multivariate models that included all 3 MR imaging measures (GM, WMH, and FA), GM volume was the strongest determinant of cognitive performance.
CONCLUSIONS: These results strongly suggest that MR imaging measures of GM are more closely associated with cognitive function than WM measures across a broad range of cognitive and functional impairment.

MR imaging-based brain volumetric measurements are widely used in clinical studies of aging, mild cognitive impairment, and dementia, particularly Alzheimer disease. Multiple reports confirm that global GM loss is significantly associated with advancing age and the onset and progression of AD. 1 The extent of GM loss in AD is associated with cognitive performance. 1,2 In addition, increased WMH burden on FLAIR MR imaging has been observed in both MCI 3,4 and clinical AD, 3,5 and is associated with diminished episodic memory and executive function. 6,7 More recently, studies of white matter microstructural integrity using FA derived from DTI 8,9 have shown that FA in the temporal lobe, posterior cingulum bundle, and fornix is reduced in dementia 10,11 and MCI, 12 while FA in the corpus callosum may be reduced in CN elders. 13 Reduced FA in these tracts is also associated with poor memory performance. 14 Prior studies, however, have generally failed to assess whether different types of MR imaging measurements exert independent or additive effects on cognitive performance. Addressing this issue could have important implications for understanding the biologic underpinnings of age-associated cognitive decline, MCI, and dementia. In particular, because GM loss probably reflects injury to the neuronal soma and dendritic arborizations, whereas WMH and FA reflect injury to axonal tracts, and both aspects of brain structure are critical to higher-order cognition, each of the MR imaging measurements has the potential to be independently associated with loss of cognitive abilities. However, it is also possible that 1 of these forms of brain injury has a greater impact on brain function and thus has the greatest biologic impact.
In this study, we measured GM, WMH, and FA in a diverse population of 209 community-dwelling elders and assessed their independent strengths of association with psychometrically matched measures of semantic memory, episodic memory, executive function, and spatial ability. We combined these 3 MR imaging measures into unified models to examine how strongly individual biologic substrates, as measured by MR imaging, are associated with cognitive function.

Participants
Participants were recruited through both memory clinic referrals and community outreach. Participants included patients with complaints of cognitive problems and cognitively normal controls. To be included in the study, participants had to be older than 60 years. Exclusion criteria were limited to unstable major medical illness, major primary psychiatric disorder (history of schizophrenia, bipolar disorder, or recurrent major depression), and substance abuse or dependence in the last 5 years. Each of the participants signed informed consent approved by the University of California, Davis, institutional review board. The sample consisted of 209 participants. Most participants (n = 138; 66.0%) were recruited from the community using protocols designed to enhance demographic diversity, 15 whereas 71 participants (34.0%) were recruited from memory clinics.

Clinical Evaluation
Each participant received a multidisciplinary clinical evaluation through the University of California, Davis, Alzheimer Disease Center, which included a detailed medical history, physical examination, and neurologic examination. All participants received a standardized neuropsychological test battery that was distinct from the outcome measures in our analyses.
Each participant was diagnosed at a consensus conference by a clinical team. The study included 97 CN participants, 65 participants diagnosed with MCI, and 47 diagnosed with dementia. Among the patients with dementia, 79.5% were diagnosed with probable AD, 11.4% were diagnosed with AD mixed with cerebrovascular dementia, 4.5% were diagnosed with Lewy body dementia or mixed Lewy body dementia and AD, 2.3% were diagnosed with cerebrovascular dementia, and 2.3% were diagnosed with frontotemporal lobar degeneration. Dementia was diagnosed using the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, revised criteria 16 for dementia, modified to exclude the requirement of memory impairment. Underlying etiology was determined according to standardized criteria and methods. MCI was diagnosed using a modified version of the Petersen criteria, which did not require cognitive complaints. 17,18 Participants were considered cognitively normal if they had no clinically significant cognitive impairment.

MR Imaging Acquisition
Brain imaging was obtained at the University of California, Davis, Imaging Research Center (Sacramento, California) on a 1.5T Signa Horizon LX Echospeed system (GE Healthcare, Milwaukee, Wisconsin) and an Eclipse machine (Philips Healthcare, Andover, Massachusetts) at the Veterans Affairs Medical Center, Northern California Health Care System (Martinez, California). 19,20 Brain volume measures, including GM and TCV, were obtained from a T1-weighted fast-spoiled gradient recalled echo with TE = 2.9 ms, TR = 9 ms, and a flip angle = 15°. Voxel size was 0.977 × 0.977 × 1.5 mm; WMH was determined from FLAIR imaging with TE = 144 ms, TR = 11,000 ms, TI = 2250 ms, and a voxel size of 0.859 × 0.859 × 3 mm; FA was determined from DTI with TE = 90.4 ms, TR = 8000 ms, flip angle = 90°, and a voxel size of 1.875 × 1.875 × 5 mm. A rigorous protocol ensured the validity of the MR imaging measures across differing scanners and analysts. ICCs between new analysts, all previously trained analysts, and the neurologist were required to be above 0.95 for both TCV and WMH. Within-subject between-scanner agreement in TCV and WMH was strong (ICCs = 0.96 and 0.89, respectively). Within-subject agreement on the same scanner was also strong (ICCs = 0.97 and 0.99, respectively). 21

Four-Tissue Image Segmentation
Segmentation of GM, WM, and CSF was performed in native space T1-weighted images by an in-house computer program using a Bayesian maximum likelihood expectation-maximization algorithm. 22 Tissue probabilities at each voxel were based on a combination of Gaussian intensity distributions and a Markov random field component for modeling tissue configurations within voxel neighborhoods. 23 Two in-house enhancements included 1) automatic initialization via a high-dimensional image warp in which tissue probability maps in a template space were fitted to the native T1 images, and 2) edge detection to encourage homogeneous tissue labels within homogeneous image regions. The segmentation of WMH was determined from subject FLAIR images using a semiautomated approach, as described previously. 21 Briefly, nonbrain elements were manually removed from FLAIR images by operator-guided tracing of the dura mater in the cranial vault. The resulting corrected image was modeled as a mixture of 2 Gaussian probability functions, with the segmentation threshold determined at the minimum probability between these 2 distributions, followed by a single Gaussian distribution fitted to the image data using an a priori threshold of 3.5 SD in pixel intensity above the mean to identify WMH. Intrarater and interrater reliability of these methods are high and have been published previously. 24
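The final thresholding step of the WMH segmentation can be illustrated with a minimal sketch. It covers only the a priori cutoff of 3.5 SD above the mean; the manual dura tracing and the 2-Gaussian mixture step are not reproduced. The function names are hypothetical, the image is assumed to be a flat list of brain-voxel intensities, and the single Gaussian fit is assumed to be summarized by the sample mean and population SD. This is not the in-house program used in the study.

```python
from statistics import mean, pstdev


def wmh_threshold(intensities, n_sd=3.5):
    """Return the intensity cutoff n_sd standard deviations above the mean.

    Assumption: the fitted single Gaussian is summarized by the sample
    mean and the population standard deviation of the voxel intensities.
    """
    return mean(intensities) + n_sd * pstdev(intensities)


def label_wmh(intensities, n_sd=3.5):
    """Flag voxels whose intensity exceeds the cutoff as candidate WMH."""
    cutoff = wmh_threshold(intensities, n_sd)
    return [value > cutoff for value in intensities]
```

Counting the flagged voxels and multiplying by the FLAIR voxel volume would then give a WMH volume estimate under these assumptions.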

Tissue Volumes and FA within ROIs
Linear alignment, followed by high-dimensional image warping, was performed to register native T1-weighted images to an MDT, as described previously. 25 These steps were then reversed to map hand-traced ROIs from the space of the MDT back into native space. Global GM and WMH volumes were calculated by counting voxels within those labels throughout the brain. ICCs between hand-traced brain volumes and this automatic method are greater than 0.90.
A WM probability map in the MDT space was created by labeling the WM voxels in each T1-weighted scan, 26 transforming the resulting WM masks to MDT space, and averaging the masks across the population. Thresholding this WM average map provided a binary WM mask in MDT space.
An average-young-adult FA map was also created in MDT space, as previously described, 8 to provide normative FA values for comparison with the elderly participants in the present study. This map was made by transforming the FA images of 15 healthy young adults to MDT space (mean age = 24.1 ± 3.1 years, 60.0% male) and taking the FA average at each voxel.

FA Ratio Values
FA values were indexed against normative values to account for inherent variability in FA that is caused by the intrinsic organization of WM tracts. 8 The WM mask described here was used to remove non-WM voxels from the individual FA images, and FA at each voxel was divided by the corresponding entry in the young mean FA map to express FA as a percentage of the FA value that would be expected at that voxel in a healthy young person. In the resulting FA ratio map, a voxel with a value lower than 1 indicates that the subject exhibits FA that is reduced compared with the young group independent of intrinsic local WM organization. Finally, after applying WM tract region-of-interest masks 27 to each individual FA ratio map, we computed the means of FA ratio values within fornix, CC, cingulum, SLF-FP, and UNC ROIs for each subject (Fig 1).
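The FA ratio computation described above can be sketched as follows. This is an illustrative simplification, not the pipeline used in the study: images are assumed to be flattened into equal-length lists already registered to the same space, and the function names and the use of `None` for masked-out voxels are hypothetical choices.

```python
def fa_ratio_map(subject_fa, young_mean_fa, wm_mask):
    """Divide subject FA by the young-adult mean FA at each WM voxel.

    Voxels outside the WM mask (or with a zero reference value) are
    set to None so they are excluded from later averaging.
    """
    ratios = []
    for fa, ref, in_wm in zip(subject_fa, young_mean_fa, wm_mask):
        if in_wm and ref > 0:
            ratios.append(fa / ref)
        else:
            ratios.append(None)
    return ratios


def roi_mean_ratio(ratios, roi_mask):
    """Mean FA ratio over the voxels of one tract ROI (None if empty)."""
    values = [r for r, in_roi in zip(ratios, roi_mask)
              if in_roi and r is not None]
    return sum(values) / len(values) if values else None
```

A ROI mean below 1 would indicate FA that is, on average, reduced relative to the young reference group within that tract.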

Outcome Measures
Neuropsychological Measures. Spanish and English Neuropsychological Assessment Scales were used to measure 4 specific domains of cognitive functioning. 28,29 A subset of Spanish and English Neuropsychological Assessment Scales were combined using methods from item response theory to create composite measures: The semantic memory measure was based on object naming and picture association tests, the episodic memory measure was based on word list learning tests, and the spatial ability measure was based on pattern recognition and spatial localization tests. A composite executive function score was created from a set of fluency and working memory measures that have been previously developed using the same methods as the original Spanish and English Neuropsychological Assessment Scales. 28
Statistical Analyses. WMH and GM volumes were divided by intracranial volume to avoid confounding effects of head size. WMH volume was also log transformed to better approximate a normal distribution for analysis, as previously described. 19 Analyses of variance were used to detect diagnostic group differences in demographics, cognitive performance, and MR imaging measures. Tukey-Kramer adjustment was used for post hoc analyses. P values smaller than .05 were considered statistically significant. Multiple regression models were used to examine the associations between MR imaging measures and cognitive performance, while adjusting for the potential effects of sex, education, and recruitment source. Subject age was not used as a covariate in these adjusted models because previous studies indicate that controlling for age may attenuate the associations between MR imaging variables and cognitive function in this sample. 21,28 Because there were 4 outcomes, we used Bonferroni correction for multiple comparisons and set the α level to .05/4. P values smaller than this α level were considered significant. All statistical analyses were performed in JMP 8 (SAS Institute, Cary, North Carolina).
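The volume normalization and the Bonferroni threshold described in the statistical analyses amount to simple arithmetic, sketched below. The function names are hypothetical, and the natural logarithm is an assumption because the base of the log transform is not stated. The analyses themselves were run in JMP 8, not with this code.

```python
import math


def normalize_by_icv(volume, icv):
    """Express a tissue volume as a fraction of intracranial volume."""
    return volume / icv


def log_wmh(wmh_volume, icv):
    """Log-transform the ICV-normalized WMH volume.

    Assumption: natural log. A WMH volume of zero has no logarithm and
    would need separate handling, which is not described in the text.
    """
    return math.log(normalize_by_icv(wmh_volume, icv))


def bonferroni_alpha(alpha=0.05, n_outcomes=4):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_outcomes


def is_significant(p_value, alpha=0.05, n_outcomes=4):
    """True when p_value falls below the corrected alpha level."""
    return p_value < bonferroni_alpha(alpha, n_outcomes)
```

With 4 cognitive outcomes the corrected level is .05/4 = .0125, so a result with P = .02 would pass the uncorrected .05 criterion but not the corrected one.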

Diagnostic Group Differences
This study included 209 participants (average age 74.5 ± 7.8 years). Sample sizes, demographic characteristics, composite cognitive scores, and MR imaging measures are summarized by diagnostic group in Table 1. Age (P = .13) and education (P = .09) were not significantly different among the 3 diagnostic groups.

Cognitive Performance
Cognitive performance differed significantly across diagnostic groups (all P < .001) (Table 1 and Fig 2), though significant overlap in performance between groups was found. Patients with dementia, as expected, had significantly poorer cognitive performance than both CN (all P < .001) and MCI participants (semantic, P = .02; spatial, P = .04; episodic, P < .001; executive, P = .01). MCI participants also scored significantly worse than CN participants in episodic memory and executive function (all P < .001) but not in spatial ability (P = .25) or semantic memory (P = .08).

Brain MR Imaging Measures
Among MR imaging measures, the GM volume differences between diagnostic groups were most pronounced (CN/Demented and MCI/Demented, P < .001; CN/MCI, P = .008). Global WMH volume was significantly higher in the group with dementia compared with controls (P = .04). FA-CC (P = .002), FA-Cingulum (P = .01), FA-Fornix (P = .002), and FA-UNC (P = .02) were each significantly lower in the group with dementia compared with the CN group; FA-CC was significantly lower in dementia compared with MCI (P = .05); and FA-Fornix was significantly lower in MCI compared with the CN group (P = .01). However, FA-SLF-FP did not differ significantly across diagnostic groups (Table 1 and Fig 2).

Individual MR Imaging Measures and Cognitive Performance
Diagnostic groups showed considerable heterogeneity and overlap in both cognitive performance and MR imaging measures (Fig 2). Therefore, regression models controlling for sex, education, and recruitment source were used to assess the associations between continuous cognitive and MR imaging measures independent of clinical diagnosis. Models including individual MR imaging predictors suggested that poor performance in each cognitive domain was associated with smaller GM volume (semantic, b = 6.14, P = .001; episodic, b = 10.87; executive, b = 6.95; spatial, b = 7.65; all P < .001). Poorer episodic memory performance was associated with greater WMH volume (b = −.16, P = .006). Lower FA-Cingulum was associated with poor episodic memory (b = 1.80, P = .004) and executive function (b = 1.32, P = .005), and lower FA-Fornix was associated with poor episodic memory (b = 2.78, P < .001), executive function (b = 1.71, P = .003), and spatial ability (b = 2.10, P = .005). Associations between FA-SLF-FP or FA-UNC and cognitive measures were not statistically significant (Table 2).

Correlations among MR Imaging Measures
Before including multiple MR imaging predictors in multiple regression models of cognitive performance, we used Pearson correlations to assess whether collinearities among the MR imaging measures could influence their strengths of association with the cognitive outcomes.
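For reference, the Pearson correlation between two MR imaging measures can be computed as in the minimal sketch below. The function name and the list-based inputs are hypothetical, and this is not the JMP procedure used in the study. Values of r close to 1 or −1 between two predictors would signal the kind of collinearity that can destabilize coefficient estimates in a joint model.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```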

Multiple MR Imaging Measures and Cognitive Performance
Because GM, WMH, FA-Cingulum, and FA-Fornix were each significantly associated with cognitive performance, all 3 measures (GM, WMH, and FA) were added into multiple regression models predicting each cognitive outcome with different models for each FA measure: FA-Fornix and FA-Cingulum. These models also included sex, education, and recruitment source as additional covariates (Table 3). In the models with GM, WMH, and FA-Fornix, smaller GM was still associated with poorer semantic memory (b = 5.89, P = .004), episodic memory (b = 9.13, P < .001), executive function (b = 5.95, P < .001), and spatial ability (b = 6.39, P = .004), while none of the associations between these cognitive measures and FA-Fornix or WMH were statistically significant. Similarly, in the model with GM, WMH, and FA-Cingulum, reduced GM was associated with poorer cognitive performance (semantic, b = 6.26, P = .002; episodic, b = 9.50, P < .001; executive, b = 6.0, P < .001; spatial, b = 6.86, P = .002), while WMH and FA-Cingulum were not significantly associated with any cognitive measures.
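The structure of these multivariate models, in which one cognitive outcome is regressed on several predictors at once, can be illustrated with a small ordinary least squares sketch. This is an assumption-laden illustration rather than the JMP 8 analysis: the function name is hypothetical, the fit uses the normal equations with Gauss-Jordan elimination, and no standard errors or P values are computed. Each row would hold one participant's predictors (for example GM, WMH, an FA measure, and coded covariates).

```python
def ols_fit(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per participant; an intercept
    column is added automatically. Returns [intercept, b1, b2, ...].
    Perfectly collinear predictors make X'X singular and the fit fails.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]
```

Each returned slope corresponds to a b coefficient of the kind reported above: the expected change in the cognitive score per unit change in one predictor with the other predictors held fixed.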

Discussion
MR imaging measures of GM, WMH volumes, and FA are each independently associated with cognition. 30-32 However, few studies have examined whether these 3 different types of MR imaging measurements exert independent or additive effects on cognitive performance. While GM atrophy, white matter integrity, and white matter injury were each individually associated with cognitive performance in our study, only GM atrophy was associated with cognition in a multivariate model that included all 3 measures. Thus, while longitudinal measurements of the 3 brain injury measures and cognition are needed to fully understand their coevolution over time, our cross-sectional findings support the view that GM atrophy may be the most relevant MR imaging measurement of cognitive ability. We speculate that GM atrophy is more tightly tied to cognitive performance because it most specifically represents neuronal injury, and neuronal injury appears to be the most important biologic process associated with cognition. We do not wish, however, to imply that injury to white matter is not important. In fact, multiple studies show that neurobehavioral deficits arise both from cortical lesions and injury to fiber tracts that link cortical and subcortical regions into distributed networks. 33 However, WMH and reduced FA represent injury to fiber tracts that is probably incomplete (ie, injured but not transected axons), thereby impeding, but not preventing, neuronal transmission via those tracts. Furthermore, because any pair of regions may be connected directly or indirectly via multiple fiber pathways, 33,34 injury to any individual pathway may alter or make inefficient, but not completely prevent, communication between functional cortical systems. Conversely, GM atrophy probably reflects injury to neuronal soma, dendrites, and, most importantly, synapses, leading to primary impairments in the nodes of the neural networks. The importance of GM atrophy in our study is further strengthened by the fact that most of the cognitively impaired participants exhibited clinical phenotypes suggesting the presence of AD pathology, which results in specific injury to pyramidal cell neurons in cortical layers III and IV responsible for cortical-cortical connectivity. 35,36 Again, these results should not be interpreted as suggesting that white matter integrity has no direct bearing on late-life cognition; indeed, WMHs have been shown to impact cognitive function in the absence of severe GM atrophy, 37 and white matter injury may in fact exacerbate the gray matter atrophy that seems to drive cognitive losses. 3,8,32 However, in a population spanning a range of both GM atrophy and WM injury, our data suggest that GM atrophy may represent the final common pathway to cognitive decline. The veracity of our findings is bolstered by a number of additional observations.
First, echoing previous studies, we found significant univariate associations between GM, WM measures, and cognitive performance, 30 particularly for performance in the executive and episodic memory domains. 6 In addition, we found significant differences in cognitive performance across all the cognitive domains when comparing CN participants with those with MCI and participants with MCI with those with dementia, similar to prior studies. 18,38 Reductions in GM and WM across syndromes have been similarly shown. 33,39-42 We believe similarities between our study and prior observations support the primary finding of differing strengths of association between cognition and GM, WMH, and FA.
Our study is not without limitations, however. For example, this analysis is entirely cross-sectional and therefore cannot address causality or temporal evolution. It is possible, for example, that white matter lesions may have independent effects on cognition early in the course of brain injury when clinical impairment is absent or only mild 37,43,44 but are overwhelmed in the setting of cortical brain injury through the AD process, as suggested in pathologic studies. 45 In addition, this study included participants with cognitive decline of various etiologies, though most were AD associated. This may limit our ability to investigate relations between AD-related brain injury and cognition, but the more diverse cohort would have been expected to increase the effects of WMH and FA, because many of the non-AD subjects would be expected to exhibit vascular disease. In future studies, we may restrict analysis to patients with amnestic MCI and AD to further study the specific MR imaging changes and cognitive decline associated with AD.

Conclusions
These results strongly suggest that MR imaging measures of GM associate more closely with cognitive function than white matter measures do across a broad range of cognitive and functional impairment. We believe our study is among the first to comprehensively assess the relationship between various MR imaging measures of gray and white matter injury and cognitive performance. From our results, we conclude that GM injury is the most important of these measures to cognitive performance among a group of older participants with various degrees of cognitive impairment. Future research will evaluate possible effects of regional differences in GM volume on longitudinal cognition and risk for transitions from normal cognitive ability to cognitive impairment.