Evaluation of Common Structural Brain Changes in Aging and Alzheimer Disease with the Use of an MRI-Based Brain Atrophy and Lesion Index: A Comparison Between T1WI and T2WI at 1.5T and 3T

This study assesses the usefulness of a set of established structural findings in Alzheimer disease with various MRI sequences at 2 different field strengths in 127 subjects. Scores of atrophy and lesion burden were reliable across sequences and field strengths; they were lowest in individuals without cognitive impairment and highest in those with Alzheimer disease, and they correlated with age, cognitive performance, and amyloid-β test results. Although the results were slightly better at 3T, the authors concluded that scores were reliable even at 1.5T.

BACKGROUND AND PURPOSE: The Brain Atrophy and Lesion Index combines several common, aging-related structural brain changes and has been validated for high-field MR imaging. In this study, we evaluated measurement properties of the Brain Atrophy and Lesion Index by use of T1WI and T2WI at 1.5T and 3T MR imaging to comprehensively assess the usefulness of the lower field-strength testing.

MATERIALS AND METHODS: Data were obtained from the Alzheimer's Disease Neuroimaging Initiative. Images of subjects (n = 127) who had T1WI and T2WI at both 3T and 1.5T on the same day were evaluated, applying the Brain Atrophy and Lesion Index rating. Criterion and construct validity and interrater agreement were tested for each field strength and image type.

RESULTS: Regarding reliability, the intraclass correlation coefficients for the Brain Atrophy and Lesion Index score were consistently high (>0.81) across image type and field strength. Regarding construct validity, the Brain Atrophy and Lesion Index score differed among diagnostic groups, being lowest in people without cognitive impairment and highest in those with Alzheimer disease (F > 5.14; P < .007). Brain Atrophy and Lesion Index scores correlated with age (r > 0.37, P < .001) and cognitive performance (r > 0.38, P < .001) and were associated with a positive amyloid-β test (F > 3.96, P < .050). The T1WI and T2WI Brain Atrophy and Lesion Index scores were correlated (r > 0.93, P < .001), with the T2WI scores slightly greater than the T1WI scores (F > 4.25, P < .041). Regarding criterion validation of the 1.5T images, the 1.5T scores were highly correlated with the 3T Brain Atrophy and Lesion Index scores (r > 0.93, P < .001).

CONCLUSIONS: The higher field and T2WI more sensitively detect subtle changes, particularly in the deep white matter and perivascular spaces. Even so, 1.5T Brain Atrophy and Lesion Index scores are similar to those obtained by use of 3T images. The Brain Atrophy and Lesion Index may have use in quantifying the impact of dementia on brain structures.

Aging involves multiple structural changes in the brain that can have an additive effect on cognition. [1][2][3] Such common brain changes include global atrophy, white matter injury, small-vessel ischemia, and microhemorrhages. [4][5][6][7][8] These changes are more frequent and more severe in neurodegenerative and neurovascular conditions, such as Alzheimer disease (AD), than in healthy aging. [9][10][11] To collectively evaluate multiple common brain changes and their additive effects on brain function, a semi-quantitative rating scale, the Brain Atrophy and Lesion Index (BALI), has been validated. 12,13 The BALI assesses global atrophy and lesions in the supratentorial and the infratentorial compartments, including lesions in the gray matter (eg, cortical infarcts) and dilated perivascular spaces in the subcortical white matter as well as lesions in the periventricular regions, deep white matter, basal ganglia, and the surrounding regions. 12,13 Through the use of different datasets, for example, the Alzheimer's Disease Neuroimaging Initiative (ADNI), 14 the BALI has been used to distinguish AD from healthy aging 15 and to evaluate the dynamics of brain structural changes with aging. 16 To date, BALI has been applied only to MR imaging acquired at 3T and 4T to exploit the higher SNR. 17 Although high-field systems are becoming the mainstream in research and clinical settings, large amounts of data have been collected at 1.5T. Generalizing the BALI to 1.5T MR imaging therefore has potential value in understanding brain aging.
Our goal was to test the measurement properties of the BALI at both 1.5T and 3T MR imaging, for both T1WI and T2WI. Accordingly, we compared BALI scores derived from T1WI and T2WI at both 3T and 1.5T and tested the relationship of the score with age, cognitive test scores, AD and mild cognitive impairment (MCI) diagnosis, and AD biomarkers. Our specific objectives were to validate BALI in 1.5T MR imaging by investigating 1) its criterion validity, that is, how well brain images acquired at 1.5T can be used to capture various structural changes in aging, and 2) whether T1WI and T2WI can both be used in the evaluation of global brain changes at 1.5T.

Data
Data used in the preparation of this article were obtained from the ADNI database (adni.loni.ucla.edu). The ADNI was launched in 2003 with a primary goal to test whether serial MR imaging, PET, other biologic markers, and clinical and neuropsychological assessments can be combined to measure the progression of MCI and early AD. The initial goal of the ADNI was to recruit 800 subjects, but the ADNI has been followed by ADNI-GO and ADNI-2. To date, these 3 protocols have recruited more than 1500 research participants, ages 55-90 years, consisting of cognitively normal older individuals, people with early or late MCI, and people with early AD. 14 For this secondary analysis, data from ADNI participants who had T1WI and T2WI at both 1.5T and 3T on the same day (n = 135) were retrieved. Subjects who had at least 1 image with severe artifacts were excluded (n = 8). A set of the 4 images from each of the remaining subjects (AD = 37; MCI = 45; healthy control subjects [HC] = 45) was analyzed (ie, the first-time, same-day scans, so that each set represents unique individuals). At both 3T and 1.5T, the T1WI scans used 3D MPRAGE (TR/TE = 2300-3000/3-4 ms; flip angle = 8-9°; section thickness = 1.2 mm; 256 reconstructed axial sections), whereas the T2WI scans used a 2D FSE/TSE (TR/TE = 3000-4000/96-103 ms; flip angle = 90° or 150°; section thickness = 3 mm; 48 axial sections). 18 Clinical assessment data were also obtained, including the Mini-Mental State Examination, Clinical Dementia Rating Scale, and the Alzheimer's Disease Assessment Scale-cognitive subscale. 19 The clinical assessments had been completed within 14 days of the MR imaging scans. Diagnostic categorization (AD, MCI, and HC) had been made by ADNI site physicians in accordance with the National Institute of Neurologic and Communicative Disorders and Stroke/Alzheimer Disease and Related Disorders Association (NINCDS/ADRDA) criteria and reviewed by ADNI clinical monitors. 
In addition, for subjects in whom CSF biomarkers were tested, the baseline amyloid-β 1-42 peptide (Aβ1-42), phospho-tau, and total tau protein data were obtained (AD = 21, MCI = 28, HC = 23), and their values were dichotomized (positive/negative tests). 20

Evaluation of the BALI
The BALI is a semi-quantitative measure, adapted from several existing scales that assess localized structural changes. 12,13 Index variables integrate information from several sources (in the present report, types and locations of structural lesions) and are well suited to evaluating change at a system level. 21 Changes in 7 categories were evaluated by use of BALI (Fig 1; On-line Table 1). These included gray matter lesions and subcortical dilated perivascular spaces (GM-SV), deep white matter lesions (DWM), periventricular white matter lesions, lesions in the basal ganglia and surrounding areas (including the caudate, putamen, globus pallidus, thalamus, and internal capsule), lesions in the infratentorial compartments (including the cerebellum and the brain stem), and global atrophy. In addition, an "other findings" category was included to record other possible changes such as neoplasm, trauma, idiopathic normal-pressure hydrocephalus, focal asymmetry, and deformity, each of which, in our experience, is sometimes seen in older adults, even though no subjects in this sample showed a change in the "other findings" category. A value of 0-3 was assigned to each category on the basis of the severity of change, with a higher score meaning greater severity (Fig 1; On-line Table 1; On-line Figs 1-4). Values of 4 and 5 were also used to capture more severe changes in the DWM and global atrophy categories, without a ceiling effect (On-line Figs 1-4). The BALI total score was calculated as the sum of the subscores of all 7 categories, with a possible maximum of 25.
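The scoring rule described above can be illustrated with a minimal sketch. It assumes the category ranges given in the text (0-3 for most categories, 0-5 for DWM and global atrophy); the dictionary name `BALI_MAX`, the function name `bali_total`, the short category labels, and the example subscores are hypothetical and are not part of the published rating materials.

```python
# Maximum subscore per BALI category, taken from the description in the text:
# most categories are rated 0-3; DWM and global atrophy (GA) extend to 5.
# The short labels used as keys are illustrative only.
BALI_MAX = {
    "GM-SV": 3,   # gray matter lesions and subcortical dilated perivascular spaces
    "DWM": 5,     # deep white matter lesions
    "PV": 3,      # periventricular white matter lesions
    "BG": 3,      # lesions in the basal ganglia and surrounding areas
    "IT": 3,      # lesions in the infratentorial regions
    "GA": 5,      # global atrophy
    "other": 3,   # other findings
}


def bali_total(subscores):
    """Return the sum of the 7 category subscores after checking their ranges."""
    if set(subscores) != set(BALI_MAX):
        raise ValueError("expected exactly the 7 BALI categories")
    for name, value in subscores.items():
        if not isinstance(value, int) or not 0 <= value <= BALI_MAX[name]:
            raise ValueError(f"{name} subscore {value!r} is outside 0-{BALI_MAX[name]}")
    return sum(subscores.values())


# Hypothetical subject, not taken from the study data.
example = {"GM-SV": 1, "DWM": 2, "PV": 1, "BG": 0, "IT": 0, "GA": 3, "other": 0}
print(bali_total(example))  # 7
```

The per-category maxima sum to 25, which matches the possible maximum stated in the text.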
Five certified neuroradiologists, each with many years of experience in brain MR imaging evaluation, performed the image evaluation. The raters were trained with the BALI rating method chiefly through studying the rating schema and examples and discussing selected cases. Images were displayed using MRIcron (http://www.nitrc.org/projects/mricron/). Each rater performed ratings independently, blinded to the information concerning the subject demographics, diagnosis, cognitive test results, and imaging field strength. T1WI and T2WI were assessed separately on different days to minimize possible recall bias. For both T1WI and T2WI, the images were rated in random order. Only axial images were used.

Analysis
We tested interrater agreement to evaluate reliability. To test construct validity, we correlated each of the 4 sets of measures (both field strengths and both image types) with age, cognitive test scores, and biomarkers. Criterion validity refers to comparison with a reference standard 22; in the present study, we used the 3T images as the reference standard and correlated 1.5T images against them. Given that all lesions are less common and less severe in healthy aging people, compared with people with dementia, analyses are presented in relation to cognitive diagnosis. To evaluate the reliability of BALI, interrater agreement was assessed by use of the intraclass correlation coefficient for the BALI total scores (interval variable), with intraclass correlation coefficient values compared by use of Fisher Z tests, and by use of the Cohen κ coefficient for the categoric subscores. A 2-way random model was used, with both subject-sample and rater as random factors. The agreement rate was assessed independently for each combination of image type and field strength, that is, 1.5T T1WI, 1.5T T2WI, 3T T1WI, and 3T T2WI. As commonly done, the agreement was examined by use of a random subsampling of 20% of subjects among 3 raters and generalized with 5 raters by use of 9% random subsampling. 23 Demographic characteristics across diagnostic groups were examined by use of the Kruskal-Wallis nonparametric test for interval data (eg, age) and the χ² test for categoric data (eg, sex). Comparisons of the mean BALI total scores and the subscores between and within different groups, by diagnosis or biomarker, were conducted by use of ANOVA and the Kruskal-Wallis nonparametric tests, respectively. The interrelations of BALI total score between 3T and 1.5T and between T1WI and T2WI were examined by use of correlation analyses. Relationships between BALI total score and age/cognitive tests were examined by use of regression analyses. 
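As an illustration of the 2-way random model mentioned above, the following sketch computes an intraclass correlation coefficient from a subjects-by-raters table. It assumes the single-rater, absolute-agreement form, often written ICC(2,1) in the Shrout and Fleiss notation; the text does not state which form was used, and the function name `icc_2_1` and the toy ratings are hypothetical. The study's analyses were run in PASW, not with this code.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one score per rater in each row.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical BALI total scores for 5 subjects rated by 3 raters.
toy = [
    [6, 7, 6],
    [10, 11, 9],
    [3, 3, 4],
    [14, 13, 14],
    [8, 9, 8],
]
print(round(icc_2_1(toy), 3))
```

Because this form penalizes systematic offsets between raters as well as random disagreement, it is a conservative choice; a consistency-type coefficient would be computed differently.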
Performance of the BALI scores in identifying individuals with different diagnoses was evaluated by use of the area under the curve of receiver operating characteristic analysis. All analyses were performed with the use of PASW 17 (IBM, Armonk, New
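The area under the receiver operating characteristic curve can be read as the probability that a randomly chosen subject from one diagnostic group has a higher score than a randomly chosen subject from the other group. The sketch below computes it in that pairwise form, which is equivalent to the Mann-Whitney U statistic divided by the number of pairs. The function name `auc` and the scores are hypothetical and serve only to illustrate the calculation; they are not the study data or the software used by the authors.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical BALI total scores for two small groups.
ad_scores = [12, 9, 14, 10]
hc_scores = [7, 9, 6, 11]
print(auc(ad_scores, hc_scores))  # 0.84375
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.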

RESULTS
There was no diagnostic group difference in age (Table 1). Subjects with MCI were more likely to be men. As expected, people with AD had significantly lower cognitive testing scores compared with those with cognitively healthy aging, with MCI showing an intermediate level on average. Subjects with AD also had levels of education lower than those in the HC or MCI groups. Significant differences were also present in the AD biomarkers (Table 1).
Considering reliability, the intraclass correlation coefficient indicated at least strong agreement, with a value of 0.81 (CI = 0.67-0.94) for 1.5T T1WI, 0.86 (CI = 0.70-0.95) for 1.5T T2WI, 0.89 (CI = 0.71-0.96) for 3T T1WI, and 0.88 (CI = 0.73-0.96) for 3T T2WI (Fisher Z = 0.05-0.29, P > .770), indicating no significant difference in agreement rates between BALI scores based on different image types and field strengths. The κ coefficients for the BALI categories were moderate to substantial, ranging from 0.45 for DWM by use of T1WI at 1.5T to 0.76 for lesions in the infratentorial regions by use of T1WI at 3T (Fig 2).
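For the categoric subscores, agreement beyond chance can be sketched with the Cohen κ for a pair of raters. This is a minimal illustration under the assumption of unweighted κ between 2 raters; how the pairwise values were combined across 3 raters is not described in the text, and the function name `cohen_kappa` and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every subject.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical DWM subscores (0-5) from two raters for 8 subjects.
a = [0, 1, 1, 2, 0, 3, 1, 0]
b = [0, 1, 2, 2, 0, 3, 1, 1]
print(round(cohen_kappa(a, b), 3))  # 0.652
```

In this toy example the raters agree on 6 of 8 subjects (0.75), chance agreement is 18/64, and κ is about 0.65, which would fall in the range conventionally described as substantial.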
Multiple structural changes were commonly present in each diagnostic group. Regardless of field strength and imaging type, on average, subjects with AD showed the highest values of the BALI total scores, followed by those with MCI (On-line Table 2). Significant differences in BALI total score were found among diagnostic groups in T1WI and T2WI at both 1.5T and 3T (F > 5.14; P < .007). Similar differences by diagnosis existed in the global atrophy subscores (χ² > 14.38, P < .001), whereas other subscores also showed the trend. Within each diagnostic group, the 3T T2WI-based BALI total score was the highest, followed by 1.5T T2WI, 3T T1WI, and 1.5T T1WI. A significant difference in the total score was found between T2WI and T1WI (F > 4.25, P < .041) and marginally between 3T and 1.5T (F < 3.23, P < .074), without interaction (F < 0.02, P > .898). Similar differences between field strengths and image types were observed for GM-SV (ie, 3T > 1.5T; T2WI > T1WI), whereas the subscores for lesions in the infratentorial regions, DWM, periventricular white matter lesions, and lesions in the basal ganglia and surrounding areas showed the same tendency without significant differences (On-line Table 2).
The 3T-based and 1.5T-based BALI total scores were correlated for both T1WI (r = 0.94, P < .001) and T2WI (r = 0.93, P < .001; Fig 3A, -B), as were the T1WI- and T2WI-based scores (r = 0.93, P < .001 at 3T; r = 0.94, P < .001 at 1.5T; Fig 3C, -D). The BALI total score increased significantly with age, regardless of image type and field strength (regression coefficients r = 0.37-0.40; Table 2). A higher BALI total score also consistently correlated with cognitive testing scores (r ≥ 0.42 for Mini-Mental State Examination, r ≥ 0.38 for Alzheimer's Disease Assessment Scale-cognitive subscale; Table 2) and identified individuals with AD versus HC with an accuracy of 0.71 ± 0.06 (P < .007; 70% sensitivity, 68% specificity). The BALI scores also differed significantly between the amyloid-β-negative (n = 25) and amyloid-β-positive (n = 47) groups, especially at the higher field (accuracy = 0.64 ± 0.07 to 0.66 ± 0.07; F > 3.96, P < .050).

FIG 2. Interrater agreement for the BALI rating. Images were rated by 3 raters independently by use of a 20% randomly selected subsample. The interrater agreement was calculated for the total scores (interval data) by use of the intraclass correlation coefficient, whereas that for the category subscores (categoric data) used the Cohen κ. IT indicates lesions in the infratentorial regions; BG, lesions in the basal ganglia and surrounding areas; PV, periventricular white matter lesions; GM-SV, gray matter lesions and subcortical dilated perivascular spaces; GA, global atrophy; DWM, deep white matter lesions. Bars from dark to light gray: 3T T1WI, 1.5T T1WI, 3T T2WI, and 1.5T T2WI. Error bars indicate standard deviation of the mean value in a 2-way random model, with subject-sample and rater as random factors.

DISCUSSION
In the present study, we compared BALI scores on the basis of the ADNI MR imaging data that were concurrently acquired at 1.5T and 3T. Evaluations were made with the use of T1WI and T2WI at each field strength. The results demonstrated that several common brain changes can be captured and summarized by the use of BALI, thereby providing a way to quantify the impact of global structural changes on brain function. BALI scores on the basis of T1WI and T2WI at 1.5T are comparable with those at 3T. Notably, the higher field images give better definition of white matter changes and perivascular spaces, as is well known. 12,17 To the best of our knowledge, this is the first report on integrating multiple structural brain changes through the use of lower-field 1.5T MR imaging. Given the wealth of 1.5T MR imaging data in clinical and research settings, extending the global assessment beyond high fields can allow more effective use of existing (and not inexpensive) neuroimaging data in the study of aging and AD dementia. This study has several strengths. It has been suggested that multiple structural brain changes often coexist in the process of aging, reflecting heterogeneous profiles. Each of these changes can individually be related to an increased, albeit small, dementia risk; however, when combined, they produce additive effects on function. 11,24,25 It has also been suggested that many changes are interrelated, for example, more severe white matter damage and vascular lesions are associated with more severe gray matter and hippocampal atrophy, 26-28 producing a combined effect on cognition, though the relationships typically are nonlinear. 29,30 By combining these changes, their overall effect can be understood more comprehensively. This argument appears to be supported by our data: the BALI total score differed significantly among diagnostic groups (ie, AD > MCI > HC), correlated closely with age and cognition, and was associated with amyloid-β status. 
This study has taken advantage of the well-established open-access ADNI protocol, 14 in which a relatively large number of subjects had concurrent standard anatomic MR imaging scans at both high and lower fields on the same day. Because the standard 1.5T and 3T T1WI and T2WI were acquired in the same individuals on the same day, the analyses were performed with maximum control of potential differences caused by time-related variations such as worsening of disease and cognition and the influence of treatment. As a result, morphologic features on MR imaging could be compared across field strengths and image types under near-optimal conditions.
The study made use of multiple raters, each experienced in clinical neuroimaging evaluation. Trained in the method, mainly through the use of the rating schema descriptions, examples, and case discussions, each rater mastered the BALI rating quickly and rated an image independently, typically within a few minutes. Such a quick and easy application can be particularly beneficial in clinical settings when evaluation time is a concern. 31 The interrater agreement of BALI scores was strong across the raters and consistent across image types and field strengths, suggesting the robustness of the BALI rating. Further research will be needed to validate whether BALI rating by non-neuroradiologists is possible. As a semi-quantitative rating scale, the BALI can be coarse relative to other measures with more precise morphometric or volumetric quantifications. 32,33 Even so, quantitative methods are usually highly dependent on image quality, a requirement that can be difficult to satisfy in multicenter studies; what is more, the high precision may not always be necessary. 31 Against this background, a quick and easy visual rating is sufficient and may even be favorable. 34 The BALI grading system has been established by adapting several validated rating scales, and the BALI itself has been validated in previous research by use of several independent datasets. Our data now suggest that it is reasonable to expand its use.
On the basis of the widely available T1WI and T2WI, BALI focuses on morphologic changes and not on their pathologic causes (which would require additional imaging sequences such as FLAIR and gradient recalled-echo/T2*). The high field strength and the T2WI showed a greater sensitivity, particularly for evaluating subtle changes of the imaging-contrast-reliant categories (eg, GM-SV and DWM), leading to slightly higher BALI total scores in these conditions. This is not surprising because subtle lesions are more conspicuous on T2WI, and higher field strength allows higher SNR and thus greater image contrast. 17 The argument appears to be supported by our data demonstrating the relationships between T2WI versus T1WI and between 3T versus 1.5T; in each case, the difference was more obvious at the relatively low level of changes (Fig 3A-D). The T2WI was more sensitive for the small lesions than was T1WI, even though the T1WI section thickness was less and hence had greater spatial resolution. Against this background, a field-strength-related difference may or may not be reflected in the interrater agreement rate. For example, a subtle change in GM-SV at 1.5T was not as easy for the raters to see, leading to lower subscores at 1.5T than at 3T, whereas the interrater agreement was not necessarily lower (Fig 1). Meanwhile, small changes in DWM were more variably seen by different raters at 1.5T than at 3T (eg, by use of T1WI), resulting in a relatively lower value of both the DWM subscore and the agreement rate at 1.5T (Fig 1). Even with these detailed differences, the BALI total scores obtained under different conditions were reliable, correlated with each other, and were related to age and cognition in the same manner, suggesting that each may be used to evaluate the global structural changes in the aging brain.
Our data must be interpreted with caution. In the present study, the mean values of the BALI rating appeared to be slightly higher than previously reported with the use of other samples, including a different sample of the ADNI dataset at baseline.
Given the demonstrated reliability of the approach, this probably reflects that the concurrent MR imaging at both field strengths studied was performed at a follow-up of up to 36 months, instead of at baseline. In consequence, structural brain changes, which represent worsening on average, would be reflected in higher BALI scores.
A few further caveats are needed in relation to how to best evaluate each category. First, dilated perivascular spaces are seen in several regions. 7 Perivascular spaces along the ventral aspect of the lentiform nuclei at the level of the anterior commissure are extremely common even in healthy people 35 and thus were not counted. Second, the "large confluent lesions" notion appeared to be broad, because "large" can vary, depending on brain structure. For example, a "large" change regarding the relatively smaller lesions in the infratentorial regions and lesions in the basal ganglia and surrounding areas may not necessarily be judged as "large" regarding the DWM, even though our data suggested only a 1-point difference between raters typically, if any. Further research will be needed to better understand whether a more detailed definition, for example, categorizing DWM, lesions in the basal ganglia and surrounding areas, and lesions in the infratentorial regions in terms of their size in millimeters can help to further improve the robustness of the rating. Third, the BALI does not distinguish between lacunar infarcts and microangiopathic white matter changes, disregarding underlying pathophysiological mechanisms. Whether assigning a category rating to lacunar infarcts specifically can improve the applicability of the BALI rating deserves a separate investigation. To maintain the quickness and ease of BALI, caution must be taken to avoid unnecessary complexity.
Finally, how best to aggregate subscores that may contribute differently to cognition is challenging. For example, the effect of white matter lesions can differ by location. 5,8 Accordingly, integrating several white matter subscores may be sensible. In this regard, introducing a weighting factor may be beneficial, as shown in a previous study that combined BALI and the medial temporal lobe atrophy to improve AD discrimination and prediction. 15 This raises the possibility of its potential application in differentiating dementia subtypes, though that proposition remains to be tested. Even so, values of weights often rely on specific outcomes and methods used, so that weighting may limit the general application of a measure. 36 Whereas this challenge has motivated the current research of our group, the merit of combining multiple brain changes can already be seen simply by summing them up.

CONCLUSIONS
Our study suggests that multiple structural changes in the aging brain have an additive effect on cognition and can be collectively evaluated by use of the BALI total score. Although high field strength and T2WI have a better sensitivity in detecting subtle changes in the deep white matter and perivascular spaces in particular, both T1WI and T2WI at 1.5T as in the ADNI protocol have good reliability in robustly capturing global brain changes.