Clinically Feasible Microstructural MRI to Quantify Cervical Spinal Cord Tissue Injury Using DTI, MT, and T2*-Weighted Imaging: Assessment of Normative Data and Reliability

Forty healthy subjects underwent T2WI, DTI, magnetization transfer, and T2*WI at 3T in <35 minutes using standard hardware and pulse sequences. Cross-sectional area, fractional anisotropy, magnetization transfer ratio, and T2*WI WM/GM signal intensity ratio were calculated. Reliable multiparametric assessment of spinal cord microstructure is possible by using clinically suitable methods. These results establish normalization procedures and pave the way for clinical studies.
BACKGROUND AND PURPOSE: DTI, magnetization transfer, T2*-weighted imaging, and cross-sectional area can quantify aspects of spinal cord microstructure. However, clinical adoption remains elusive due to complex acquisitions, cumbersome analysis, limited reliability, and wide ranges of normal values. We propose a simple multiparametric protocol with automated analysis and report normative data, analysis of confounding variables, and reliability.
MATERIALS AND METHODS: Forty healthy subjects underwent T2WI, DTI, magnetization transfer, and T2*WI at 3T in <35 minutes using standard hardware and pulse sequences. Cross-sectional area, fractional anisotropy, magnetization transfer ratio, and T2*WI WM/GM signal intensity ratio were calculated. Relationships between MR imaging metrics and age, sex, height, weight, cervical cord length, and rostrocaudal level were analyzed. Test-retest coefficient of variation measured reliability in 24 DTI, 17 magnetization transfer, and 16 T2*WI datasets. DTI with and without cardiac triggering was compared in 10 subjects.
RESULTS: T2*WI WM/GM showed lower intersubject coefficient of variation (3.5%) compared with magnetization transfer ratio (5.8%), fractional anisotropy (6.0%), and cross-sectional area (12.2%). Linear correction of cross-sectional area with cervical cord length, fractional anisotropy with age, and magnetization transfer ratio with age and height led to decreased coefficients of variation (4.8%, 5.4%, and 10.2%, respectively). Acceptable reliability was achieved for all metrics/levels (test-retest coefficient of variation < 5%), with T2*WI WM/GM comparing favorably with fractional anisotropy and magnetization transfer ratio. DTI with and without cardiac triggering showed no significant differences for fractional anisotropy and test-retest coefficient of variation.
CONCLUSIONS: Reliable multiparametric assessment of spinal cord microstructure is possible by using clinically suitable methods. These results establish normalization procedures and pave the way for clinical studies, with the potential for improving diagnostics, objectively monitoring disease progression, and predicting outcomes in spinal pathologies.

The era of quantitative MR imaging has arrived, allowing in vivo measurement of specific physical properties reflecting spinal cord (SC) microstructure and tissue damage. 1,2 Such measures have potential clinical applications, including improved diagnostic tools, objective monitoring for disease progression, and prediction of clinical outcomes. 3 However, technical challenges such as artifacts, image distortion, and achieving acceptable SNR have led to limited reliability. Specialized pulse sequences and custom hardware have advanced the field but incur costs of increased complexity and acquisition time while creating barriers to portability and clinical adoption. Furthermore, quantitative MR imaging metrics often show wide ranges of normal values and confounding relationships with subject characteristics such as age, 4-8 for which most previous studies have not accounted. 3
Among the most promising SC quantitative MR imaging techniques are DTI and magnetization transfer (MT). 1-3 These provide measures of axonal integrity and myelin quantity that correlate with functional impairment in conditions such as degenerative cervical myelopathy (DCM) 5-7,9 and MS, 3,9 albeit with limited physiologic specificity (eg, fractional anisotropy [FA] reflects both demyelination and axonal injury). 10,11 SC cross-sectional area (CSA) computed from high-resolution anatomic images can measure atrophy (eg, in MS) 12 or the degree of SC compression in DCM. 13 T2*-weighted imaging at 3T or higher field strengths offers high resolution and sharp contrast between SC WM and GM, allowing segmentation between these structures similar to that in phase-sensitive inversion recovery. 14,15 T2*WI also demonstrates hyperintensity in injured WM, 16-18 reflecting demyelination, gliosis, and increased calcium and nonheme iron concentrations. 19 T2*WI signal intensity is not an absolute quantity, so we normalize its value in WM by the average GM signal intensity in each axial section, creating a novel measure of WM injury: T2*WI WM/GM ratio. 20
We propose a multiparametric approach to cervical SC quantitative MR imaging with clinically feasible methods, including acceptable acquisition times, standard hardware/pulse sequences, and automated image analysis. Our protocol yields 4 measures of SC tissue injury (CSA, FA, MT ratio [MTR], and T2*WI WM/GM), for which this study establishes normative values in numerous ROIs. We characterize the variation of these metrics with age, sex, height, weight, cervical cord length, and rostrocaudal level and propose normalization methods. Finally, we assess test-retest reliability of FA, MTR, and T2*WI WM/GM and compare our DTI results against those with cardiac triggering.
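The per-section normalization behind the T2*WI WM/GM ratio can be illustrated with a minimal sketch. It assumes plain unweighted means over already-labeled WM and GM voxels; the function name and the voxel intensities are hypothetical, and the atlas-based, partial-volume-corrected extraction used in this study is not reproduced here.

```python
from statistics import mean


def wm_gm_ratio_per_section(wm_signal, gm_signal):
    """Mean WM signal divided by mean GM signal for one axial section."""
    gm_mean = mean(gm_signal)
    if gm_mean == 0:
        raise ValueError("mean GM signal is zero; ratio is undefined")
    return mean(wm_signal) / gm_mean


# Hypothetical voxel intensities (arbitrary scanner units) for two sections.
# The second section has exactly twice the overall signal of the first.
sections = [
    {"wm": [80.0, 82.0, 78.0], "gm": [100.0, 98.0, 102.0]},
    {"wm": [160.0, 164.0, 156.0], "gm": [200.0, 196.0, 204.0]},
]

ratios = [wm_gm_ratio_per_section(s["wm"], s["gm"]) for s in sections]
print(ratios)
```

Both hypothetical sections give the same ratio although their absolute intensities differ by a factor of 2, which is the reason a ratio is preferred over raw T2*WI signal.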

Study Design and Subjects
This study received approval from the University Health Network (Toronto, Ontario, Canada), and written informed consent was obtained from all participants. Forty-two subjects were recruited between October 2014 and December 2016 with a broad range of ages and balance between sexes. A physician (A.R.M.) assessed all subjects to rule out symptoms and signs of neurologic dysfunction, and T2WI was screened for abnormalities suggestive of multiple sclerosis, tumor, or severe cord compression. Two subjects with clinical and imaging findings of DCM were excluded from the study, leaving 40 healthy subjects for analysis. Data from 18 patients with DCM were included for analysis of test-retest reliability, and 6 patients with DCM were included in a cardiac-triggering comparison, but subjects with DCM were excluded from other analyses. 20

MR Imaging Acquisitions
MR images were acquired on a 3T clinical scanner (Signa Excite HDxt; GE Healthcare, Milwaukee, Wisconsin). The peak gradient strength was 50 mT/m and the slew rate was 150 T/m/s. A body coil was used for transmission, and the top 2 elements of a standard 8-element spine coil (Premier III Phased Array CTL; USA Instruments, Aurora, Ohio) were used for reception. Subjects were positioned head-first and supine with the head tightly padded to prevent movement and the neck flexed to straighten the cervical SC.
The MR imaging protocol was developed on the basis of methods previously used by one of the authors (J.C.-A.). 16,17,21 T2WIs used sagittal FIESTA-cycled phases with 0.8-mm³ isotropic resolution covering the brain stem to T4. DTI, MT, and T2*WI had 13 axial sections positioned perpendicular to the spinal cord (at C3), covering C1-C7 by using a variable gap, alternating between the mid-vertebral body and the intervertebral disc. Parameters for each sequence are listed in Table 1. DTI used a spin-echo single-shot EPI sequence with an 80 × 80 mm² FOV to minimize susceptibility distortions, anterior/posterior saturation bands to achieve outer volume suppression, and no cardiac triggering. Second-order localized shimming was performed before DTI by positioning a VOI encompassing the SC from C1-C7. T2*WIs used the multiecho recombined gradient-echo sequence, with 3 echoes that are magnitude-reconstructed and combined by using a sum-of-squares algorithm. 18 Each session required 30-35 minutes, including subject positioning, section prescription, prescanning, and shimming. Test-retest reliability was assessed by removing the subject from the scanner and repositioning before rescanning. This was performed in a subset of subjects (DTI: 17 healthy, 9 with DCM; MT: 13 healthy, 4 with DCM; T2*WI: 5 healthy, 11 with DCM) extemporaneously, depending on scanner availability and subject willingness. Reliability was not assessed for SC CSA measurement due to time constraints.
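As an illustration of the sum-of-squares echo combination mentioned for the T2*WI sequence, the following minimal sketch combines the magnitude values of one voxel across echoes. It assumes the common root-sum-of-squares form; the vendor's actual reconstruction (including any scaling or weighting) is not described in the text, and the function name and sample values are hypothetical.

```python
import math


def combine_echoes_sum_of_squares(echo_magnitudes):
    """Combine per-echo magnitudes for one voxel as sqrt(sum of squares)."""
    return math.sqrt(sum(m * m for m in echo_magnitudes))


# Hypothetical magnitudes of a single voxel at 3 echo times (arbitrary units).
voxel_echoes = [3.0, 4.0, 12.0]
combined = combine_echoes_sum_of_squares(voxel_echoes)
print(combined)  # 13.0
```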
A comparison of DTI with and without cardiac triggering was also performed in 10 subjects (4 healthy, 6 with DCM). Cardiac-triggered DTI was performed with pulse oximetry triggering, trigger delay of 310 ms, window of 250 ms, and TR = 7 R-R intervals. Two acquisitions were performed that were analyzed individually for test-retest coefficient of variation (TRCOV) and then concatenated and averaged for comparison with nontriggered DTI.

Image Analysis Techniques
Imaging data were analyzed by using the Spinal Cord Toolbox, Version 2.3 (SCT; https://www.nitrc.org/projects/sct/). 22 Each axial image was visually inspected by 1 rater (A.R.M.) and excluded if low signal or artifacts (motion, aliasing) were present. SC segmentation was automatically performed by using native T2WIs and T2*WIs, the mean diffusivity map for DTI, and the MT image with a prepulse. Segmentation errors were resolved by providing seed points for automatic segmentation or manual editing. Images were nonlinearly registered to the MNI-Poly-AMU template/atlas in SCT. 23 T2WIs were used to automatically calculate cervical cord length (from the top of C1 to the bottom of the C7 vertebral levels) and SC CSA. DTI was motion-corrected with regularized registration, and diffusion tensors were calculated with outlier rejection by using the RESTORE (robust estimation of tensors by outlier rejection) method. 24 MT images with and without prepulses were coregistered, and MTR was computed. T2*WI data were further analyzed with automatic segmentation of GM and WM, 25 which was used to refine the registration of T2*WI to the template. FA, MTR, and T2*WI WM/GM ratios were extracted from various ROIs by using the SCT probabilistic atlas with automatic correction for partial volume effects by using the maximum a posteriori method. 26 ROIs included the SC, WM, and GM and the left/right lateral corticospinal tract, fasciculus cuneatus, fasciculus gracilis, and spinal lemniscus in each axial section (Fig 1). Metrics were averaged at rostral (C1-C3), middle (C4-5) or maximally compressed (MCL, subjects with DCM), and caudal (C6-C7) levels.
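The MTR computation from the coregistered MT images can be sketched as follows. The text does not state the formula, so this assumes the conventional definition, MTR = 100 × (S0 − SMT) / S0, where S0 is the signal without the prepulse and SMT is the signal with it. The function name and signal values are hypothetical, and this is not the SCT implementation.

```python
def magnetization_transfer_ratio(signal_without_prepulse, signal_with_prepulse):
    """Conventional MTR in percent: 100 * (S0 - S_MT) / S0 (assumed definition)."""
    if signal_without_prepulse <= 0:
        raise ValueError("signal without prepulse must be positive")
    difference = signal_without_prepulse - signal_with_prepulse
    return 100.0 * difference / signal_without_prepulse


# Hypothetical mean ROI signals (arbitrary units) after coregistration.
mtr_percent = magnetization_transfer_ratio(1000.0, 480.0)
print(mtr_percent)  # 52.0
```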

Statistical Analysis
Statistical analysis was performed with R statistical and computing software, Version 3.3 (http://www.r-project.org/). Normative data were summarized with mean, SD, and intersubject coefficient of variation. Relationships between MR imaging metrics (averaged from C1-C7) and patient characteristics (age, sex, height, weight, cervical cord length) were assessed with Pearson correlation coefficients and backward stepwise linear regression to determine significant independent relationships and their coefficients. Differences by rostrocaudal level were assessed with ANOVA. If differences were found, we calculated Spearman coefficients (between mean values and numbered levels) to identify monotonic relationships. To determine whether nonlinear relationships were present, we performed a likelihood ratio test on linear regression models with and without a 5-knot restricted cubic spline. Paired t tests compared WM and GM differences, and ANOVA was used to identify differences among individual WM tracts (averaged bilaterally). Reliability was assessed by using test-retest coefficient of variation, and differences between healthy subjects and those with DCM were assessed with Welch t tests, as were pairwise comparisons between techniques at each rostrocaudal level. Statistical significance was set to P = .05 and was not corrected for multiple comparisons due to the exploratory nature of this study.

Subject Characteristics
Characteristics of 40 healthy subjects and 18 with DCM included in this study are listed in Table 2.

Image Acquisition
Acceptable image quality was achieved in all subjects and techniques. For DTI, 27 of 520 axial images (5.2%) were excluded due to artifacts or poor signal. For MT and T2*WI, 6 (1.2%) and 4 (0.8%) sections were excluded due to artifacts, respectively.

Automated Analysis
Automated segmentation was frequently successful, with manual editing required in 8 T2WI datasets (20%), 14 MT datasets (35%), 4 DTI datasets (10%), and 20 T2*WI datasets (50%). Manual segmentation editing was usually restricted to a small number of sections and required <5 minutes per dataset. Automatic registration to the template and data extraction were successful in all cases.

Cardiac Triggering in DTI
FA did not differ significantly among DTI acquisitions with and without cardiac triggering, though triggering showed a trend toward higher FA at MCL (0.558 versus 0.514, P = .06) and caudal (0.562 versus 0.534, P = .07) levels (Table 5). No significant differences in TRCOV were observed, though cardiac-triggered DTI provided approximately 1% lower TRCOV at all levels.

Summary of Findings
This study establishes a multiparametric MR imaging protocol and analysis framework to assess the microstructure of the entire cervical SC by using simple methods that are feasible for clinical adoption, requiring only 20 minutes of acquisition [...] MCLs and caudal levels, likely related to distorted anatomy, increased partial volume effects, increased susceptibility artifacts, and less accurate registration to the SCT template. However, these differences were not significant, and pooled reliability results were all considered acceptable (TRCOV < 5%). Our clinically feasible multiparametric approach provides 4 unique quantitative measures in multiple ROIs that reflect aspects of macrostructure and microstructure, with the benefit that these measures cross-validate each other to overcome the limitations (reliability, intersubject variability, sensitivity to pathology) of each individual technique. We anticipate that this multivariate approach can accurately characterize tissue injury in various SC pathologies, which could enable quantitative MR imaging of the SC to achieve clinical translation in the near future.

Normalization for Confounding Factors
It is essential that quantitative readouts reflect pathologic changes and eliminate confounding effects as much as possible to move toward clinical use of SC quantitative MR imaging. In keeping with prior reports, significant relationships were found between age and FA 5,7,8 and MTR, 8 but not CSA. 8,23 However, we also identified univariate relationships between MR imaging metrics and sex, height, weight, and cervical cord length, for which we are not aware of previous reports. The relationship between CSA and cervical cord length likely indicates that CSA is related to overall body size because height and weight also showed positive (nonsignificant) correlations. It is unclear why MTR decreases with height, but weak negative trends were also seen with weight and cervical cord length, suggesting that MTR (reflecting myelin density) is negatively related to overall body size. However, no relationship was present between MTR and CSA in a post hoc test (r = 0.01, P = .94). Strong relationships were also found among all 4 metrics and the rostrocaudal level, with the CSA, FA, and MTR showing nonlinearity (Fig 3). CSA increased between the C3 and C6 vertebral levels, reflecting the cervical enlargement that contains increased GM for C5-T1 neurologic levels, and our CSA measurements were highly similar to those in previous reports. 32,33 WM FA peaked at C2 and locally at C7, where the orientations of axons are almost purely rostrocaudal. In contrast, decreases were seen at C1 (likely due to decussation of corticospinal fibers) and in the cervical enlargement (where a fraction of axons turn and form synapses within the GM). The T2*WI WM/GM ratio was nearly invariant from C1 to C6 but increased at C7, likely due to increased susceptibility artifacts from the lungs, decreased SNR, and respiratory motion.
We suggest a normalization scheme in which CSA, FA, and MTR are linearly corrected for relationships (cervical cord length, age, and age/height, respectively) and all metrics are converted to z scores per rostrocaudal level, as proposed by Uda et al 4 for DTI metrics. Although normalization procedures add complexity to data postprocessing, these methods facilitate fair comparisons, decrease nuisance variability, and produce more accurate biomarkers of SC tissue injury.
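The normalization scheme suggested above can be sketched for a single metric and covariate (FA corrected for age at one rostrocaudal level). This is a minimal illustration under stated assumptions: an ordinary least-squares line fitted to the normative group, correction to the mean normative age as the reference, and z scores computed against the corrected normative values at that level. All function names and data are hypothetical and are not the study's regression coefficients.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correct_for_covariate(value, covariate, slope, reference):
    """Shift a value to what it would be at the reference covariate."""
    return value - slope * (covariate - reference)


def z_score(value, normative_values):
    """Z score of a value against normative values at the same level."""
    return (value - mean(normative_values)) / stdev(normative_values)


# Hypothetical normative data at one rostrocaudal level.
ages = [20.0, 40.0, 60.0, 80.0]
fa = [0.71, 0.67, 0.67, 0.63]

slope, intercept = fit_line(ages, fa)
reference_age = mean(ages)
fa_corrected = [
    correct_for_covariate(v, a, slope, reference_age) for v, a in zip(fa, ages)
]

# A hypothetical new subject, aged 70 years, with FA of 0.62 at this level.
new_fa = correct_for_covariate(0.62, 70.0, slope, reference_age)
print(round(z_score(new_fa, fa_corrected), 2))
```

In this toy dataset the age correction reduces the spread of the normative values, which mirrors the decrease in intersubject coefficient of variation reported after linear correction. A metric with 2 covariates (such as MTR with age and height) would need a multiple regression rather than this single-covariate fit.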

Quantitative MR Imaging Techniques: Specificity, Accuracy, Feasibility
The rapidly evolving field of quantitative MR imaging includes a rich array of acquisition techniques, including strict quantitative methods that attempt to measure a specific physical property, such as quantitative MT, longitudinal relaxation rate, and apparent transverse relaxation rate mapping. 27,34,35 However, such techniques are inherently complex and require specialized pulse sequences, while typically requiring lengthy scan times. Furthermore, these methods face challenges in achieving acceptable SNR and reliability, particularly in the SC, which is considerably more difficult to image than the brain due to magnetic field inhomogeneity and physiologic motion. Similarly, reduced FOV DTI has become available, offering increased SNR and reduced distortions but often requiring increased acquisition times and involving proprietary pulse sequences. 31 Our protocol purposefully used standard sequences available from all major MR imaging vendors, making it an attractive approach for multicenter studies and clinical use. A recent study comparing reduced FOV with outer volume suppression for cervical SC DTI found only minimal differences in reliability (intersubject coefficient of variation: reduced FOV = 3.98% versus outer volume suppression = 4.59%). 31 Unfortunately, this study did not report P values for these comparisons, and it did not assess intrasubject reliability, but the findings suggest that outer volume suppression provides acceptable reliability.

Cardiac-Triggered DTI
Previous research suggests that cardiac triggering reduces variance in diffusion time-series by acquiring data during the quiescent phase of cardiac-related SC motion. 36 However, to our knowledge, no studies have directly compared the test-retest reliability of SC DTI acquisitions with and without cardiac triggering, particularly in the context of multiple acquisitions and outlier rejection during postprocessing. Our pilot data in 10 subjects suggest roughly equivalent results with and without triggering, though trends toward higher FA and lower TRCOV (approximately 1%) were observed with triggering. Further investigation is needed, but the ungated acquisition used in this study is validated by its acceptable reliability. This simpler approach avoids difficulties with triggering such as variable TR and cardiac irregularities (arrhythmias, tachycardia) that are more common in older or critically ill patients.

Limitations
Further studies with larger sample sizes would allow greater accuracy for normative data, influences of confounding variables, and differences in DTI with and without cardiac triggering. The normative data are specific to our methodology, and cross-site and cross-vendor validation is required. Our use of automated analysis aimed to reduce bias, but manual editing of segmentations was frequently required. Other DTI metrics were not analyzed because of an a priori decision to focus on FA, given its consistent results in previous studies. 3 Our test-retest reliability experiment does not account for scanner drift, but this is unlikely to be a large source of error because the 2 metrics are ratios rather than absolute signal-intensity values. Neurologically intact subjects with mild SC compression were considered healthy subjects; these changes are evident in 8%-26% of asymptomatic individuals. 32,37 Moreover, we think that the spectrum of "normal" includes this subgroup, but previous studies have excluded such subjects.

CONCLUSIONS
Reliable multiparametric assessment of the SC microstructure is possible with standard hardware, acceptable acquisition times, and automated analysis that provide high-fidelity readouts of tissue injury from numerous ROIs. Normalization procedures can be implemented to mitigate confounding effects such as age, height, cervical cord length, and rostrocaudal level, producing more meaningful quantitative metrics. Our clinically suited approach paves the way for translational studies to evaluate potential uses such as improved diagnostics, monitoring of disease progression, and prediction of outcomes.

a Paired t tests were used to compare FA values extracted from WM at rostral (C1-C3), midcervical (C4-5, healthy subjects), or MCL (subjects with DCM), and caudal (C6-C7) levels between no triggering vs triggering in 10 subjects (4 healthy, 6 with DCM). Welch t tests were used to compare test-retest coefficient of variation between no triggering (n = 26) and triggering (n = 10).
b Trends (P < .10).