Choice of Diffusion Tensor Estimation Approach Affects Fiber Tractography of the Fornix in Preterm Brain

BACKGROUND AND PURPOSE: Neonatal DTI enables quantitative assessment of microstructural brain properties. Although its use is increasing, it is not widely known that vast differences in tractography results can occur, depending on the diffusion tensor estimation methodology used. Current clinical work appears to be insufficiently focused on data quality and processing of neonatal DTI. To raise awareness about this important processing step, we investigated tractography reconstructions of the fornix with the use of several estimation techniques. We hypothesized that the method of tensor estimation significantly affects DTI tractography results.
MATERIALS AND METHODS: Twenty-eight DTI scans of infants born <29 weeks of gestation, acquired at 30-week postmenstrual age and without intracranial injury observed, were prospectively collected. Four diffusion tensor estimation methods were applied: 1) linear least squares; 2) weighted linear least squares; 3) nonlinear least squares, and 4) robust estimation of tensors by outlier rejection. Quality of DTI data and tractography results were evaluated for each method.
RESULTS: With nonlinear least squares and robust estimation of tensors by outlier rejection, significantly lower mean fractional anisotropy values were obtained than with linear least squares and weighted linear least squares. Visualized quality of tract reconstruction was significantly higher by use of robust estimation of tensors by outlier rejection and correlated with quality of DTI data.
CONCLUSIONS: Quality assessment and choice of processing methodology have considerable impact on neonatal DTI analysis. Dedicated acquisition, quality assessment, and advanced processing of neonatal DTI data must be ensured before performing clinical analyses, such as associating microstructural brain properties with patient outcome.

DTI enables in vivo assessment of WM microstructure and has become essential for quantification of brain abnormalities because it has been suggested to provide early biomarkers of neurodevelopment. 1 Fiber tractography has the unique property to delineate specific WM pathways and is rapidly gaining in popularity because it may reveal substantial insights into disturbed brain connectivity and functionality of infants born preterm. 2-4 There are many technical issues that may complicate the analysis of DTI data, including scanner type, hardware setup, acquisition parameters, and processing methodology. 5,6 In addition, DTI applied in preterm infants is especially challenging because of specific clinical factors, such as the increased risk of subject motion, hemodynamic vulnerability, smaller head sizes, and higher heart and breathing rates compared with healthy adults. Therefore, before associations between tractography results and neurodevelopmental outcome can be established, it is of paramount importance that acquisition and processing of DTI data are performed with the highest standards possible. 7,8 For example, different algorithms to estimate the diffusion tensor have been developed. These methods differ considerably in processing speed and dealing with data outliers. For instance, the linear least squares (LLS) method is widely used to estimate diffusion parameters but may lead to inaccuracy as it incorrectly assumes that data outliers are homogeneously distributed. Furthermore, there seems to be no consensus on how to practically define and handle data outliers. Awareness of these matters is essential because improper use may lead to inaccuracy, especially if data are compared when different estimation methods have been used. Unfortunately, however, most studies using preterm brain DTI data have hardly focused on these important aspects, calling for a thorough investigation.
In the present study, trajectories of the fornix were reconstructed with fiber tractography for 28 preterm infants and compared when different diffusion tensor estimation approaches were used. Our hypothesis was that the chosen tensor estimation methodology significantly affects results of fiber tractography. This would demonstrate that an informed choice of diffusion tensor estimation is crucial for a reliable tractography analysis, which is especially relevant when artifact-sensitive DTI data of the preterm brain are involved.

MATERIALS AND METHODS
This study was approved by the institutional review board. Written informed parental consent was obtained for all subjects.

Patients
Between February 2011 and December 2012, preterm infants born before a gestational age of 29 weeks were recruited prospectively. MR imaging data were acquired at a postmenstrual age of 30 weeks (29 4/7 to 30 4/7 weeks). To avoid unnecessary data heterogeneity, infants with evidence of intracranial injury (intraventricular or cerebellar hemorrhage, WM abnormalities) observed with conventional MR imaging (see T1WI and T2WI protocols below) were excluded. Of the 217 eligible infants, 36 died before 30-week postmenstrual age; in 82 infants, the MR imaging scan could not be performed at 30-week postmenstrual age because of hemodynamic instability or logistic circumstances; and informed parental consent was not obtained for 20 infants. Of the remaining 79 infants, 36 had intracranial abnormalities, and 15 others were excluded from further analysis because different DTI acquisition settings were applied. This eventually resulted in 28 usable datasets.
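The inclusion flow described above can be checked with a few lines of arithmetic. This is only a bookkeeping sketch; the variable names are ours and are not part of the study protocol.

```python
# Cohort flow as reported in the Patients paragraph (variable names are illustrative).
eligible = 217
died_before_scan = 36      # died before 30-week postmenstrual age
scan_not_performed = 82    # hemodynamic instability or logistic circumstances
no_consent = 20            # informed parental consent not obtained

scanned = eligible - died_before_scan - scan_not_performed - no_consent

intracranial_abnormalities = 36
different_dti_settings = 15

usable = scanned - intracranial_abnormalities - different_dti_settings

print(f"infants remaining after the first exclusion step: {scanned}")
print(f"usable datasets: {usable}")
```

Both intermediate totals agree with the numbers given in the text (79 remaining infants and 28 usable datasets).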

MR Imaging
MR imaging procedures were carried out according to protocol 9: all infants were accompanied by trained staff only and were positioned in an MR imaging-compatible incubator (Lammers Medical Technology, Luebeck, Germany) that provided controlled temperature and humidity, MR-compatible pulse oximetry, and MR-compatible ventilation. Moldable earplugs and neonatal earmuffs protected the infants from auditory noise; no sedation was given.

DTI Data Processing
DTI data were analyzed with the use of ExploreDTI (http://www.exploredti.com) 10 version 4.8.3. The diffusion-weighted images were first corrected for eddy currents, EPI distortion, and patient movement. 11,12 The diffusion tensor was then estimated according to 4 different methods: 1) LLS; 2) weighted linear least squares (WLLS); 3) nonlinear least squares (NLLS), and 4) robust estimation of tensors by outlier rejection (RESTORE). 13-15 Next, whole-brain tractography was performed for all datasets with the following parameters: fractional anisotropy (FA) threshold: 0.08; fiber length range: 15-500 mm; angle threshold: 30°; and step size: 1 mm. 16 Without loss of generality of this work, a single WM structure was investigated. Because of its important relation to cognition 17,18 and high tracking reproducibility as the result of its unique shape, we performed tractography of the fornix. ROI placement was performed on the color-coded FA maps 19: 1) 1 "OR" ROI was placed in the axial plane at the level of the bilateral columns of the fornix, above the mammillary bodies; 2) 2 "AND" ROIs were placed: in the coronal plane to encompass the corpus of the fornix and in the axial plane to include both crura of the fornix in the same section where the "OR" ROI was placed, and 3) 2 "NOT" ROIs were placed in the sagittal plane laterally to the seed region to exclude fibers from the anterior commissure 20 (Fig 1). For each subject, tractography was repeated for each method of tensor estimation while the same subject-specific ROIs were used.
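To make the difference between the first 2 estimation methods concrete, the following is a minimal single-voxel sketch of LLS and WLLS fitting on the log-transformed signal, followed by computation of FA. It is not the ExploreDTI implementation. The b-value of 800 s/mm², the 6 gradient directions, the toy tensor, and all function names are assumptions chosen for illustration, and the WLLS weights use the LLS-predicted signal, which is one common choice among several.

```python
import numpy as np

def design_matrix(bvals, bvecs):
    """Rows map [ln S0, Dxx, Dyy, Dzz, Dxy, Dxz, Dyz] to ln S."""
    b = np.asarray(bvals, dtype=float)
    g = np.asarray(bvecs, dtype=float)
    gx, gy, gz = g[:, 0], g[:, 1], g[:, 2]
    return np.column_stack([
        np.ones_like(b),
        -b * gx**2, -b * gy**2, -b * gz**2,
        -2 * b * gx * gy, -2 * b * gx * gz, -2 * b * gy * gz,
    ])

def fit_lls(signal, X):
    """Ordinary least squares on ln(S): every measurement gets equal weight."""
    beta, *_ = np.linalg.lstsq(X, np.log(signal), rcond=None)
    return beta

def fit_wlls(signal, X):
    """Weighted least squares: rows are scaled by the LLS-predicted signal,
    because the variance of ln(S) grows as the signal gets smaller."""
    w = np.exp(X @ fit_lls(signal, X))
    beta, *_ = np.linalg.lstsq(X * w[:, None], w * np.log(signal), rcond=None)
    return beta

def tensor_from_beta(beta):
    dxx, dyy, dzz, dxy, dxz, dyz = beta[1:]
    return np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])

def fractional_anisotropy(D):
    ev = np.linalg.eigvalsh(D)  # eigenvalues of the symmetric tensor
    return float(np.sqrt(1.5 * np.sum((ev - ev.mean()) ** 2) / np.sum(ev ** 2)))

# Toy acquisition (assumed): one b=0 image and 6 directions at b=800 s/mm^2.
s = 1.0 / np.sqrt(2.0)
bvals = np.array([0.0] + [800.0] * 6)
bvecs = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [s, s, 0], [s, 0, s], [0, s, s]])
X = design_matrix(bvals, bvecs)

# Noise-free signal from an assumed anisotropic tensor (units: mm^2/s).
D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
beta_true = np.array([np.log(1000.0), 1.7e-3, 0.3e-3, 0.3e-3, 0.0, 0.0, 0.0])
signal = np.exp(X @ beta_true)

fa_true = fractional_anisotropy(D_true)
fa_lls = fractional_anisotropy(tensor_from_beta(fit_lls(signal, X)))
fa_wlls = fractional_anisotropy(tensor_from_beta(fit_wlls(signal, X)))
print(fa_true, fa_lls, fa_wlls)
```

On noise-free data both fits recover the same tensor; the methods only diverge once noise or artifacts are present, which is the situation examined in this study.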

Data Analysis
Quality of the diffusion-weighted images was assessed with the outlier profiles of each dataset, after diffusion tensor estimation by use of RESTORE (Fig 2). The mean percentage of outliers per dataset was calculated by averaging the percentage of artifacted voxels across the diffusion gradient orientations. 6 In addition, tract parameters, including mean FA, mean diffusivity, mean fiber trajectory length (in mm), and number of fiber trajectories were computed for each dataset.
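The per-dataset outlier summary described above amounts to a simple average over gradient orientations. The sketch below assumes that the number of voxels flagged as outliers is already available for each diffusion-weighted volume; the function name and the example counts are hypothetical.

```python
def mean_outlier_percentage(outlier_counts, voxels_per_volume):
    """Average, over gradient orientations, of the percentage of flagged voxels."""
    percentages = [100.0 * n / voxels_per_volume for n in outlier_counts]
    return sum(percentages) / len(percentages)

# Hypothetical dataset: 4 gradient orientations, 1000 brain voxels per volume.
counts = [100, 150, 125, 125]
pct = mean_outlier_percentage(counts, 1000)

# The 10% boundary is the illustrative cutoff used in this study, not a recommended threshold.
group = ">10% outliers" if pct > 10.0 else "<10% outliers"
print(pct, group)
```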
The quality of tractography was visually and systematically evaluated (Table) by 2 authors independently. Both reviewers were blinded to the method of tensor estimation. The final score was the average of both total scores and ranged from 0-10 (Table). Statistical analysis was performed by use of SPSS version 20.0.0.1 (IBM, Armonk, New York). Intraclass correlations between both observers were calculated by use of a 2-way mixed model. Coefficients <0 were considered as no agreement; 0-0.20 as slight; 0.21-0.40 as fair; 0.41-0.60 as moderate; 0.61-0.80 as substantial; and 0.81-1 as almost perfect agreement. 21 The Spearman correlation coefficient was used to investigate the correlation between the visualized quality of tract reconstruction and the mean outlier percentage per dataset. Independent-sample t tests served to test differences in visualized quality of tract reconstruction between data with <10% and >10% data outliers. Repeated-measures ANOVA served to test differences in tract parameters between the diffusion tensor estimation methods. Difference in variability of tract parameters between the estimation techniques was tested with the Levene test for equality of variances. A value of P < .05 (2-sided) was considered statistically significant.
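The scoring and agreement conventions above can be written down compactly. The sketch below only encodes the averaging rule and the agreement bands quoted in the text; it does not compute the intraclass correlation itself, and the function names and example reviewer scores are hypothetical.

```python
def final_score(total_reviewer_a, total_reviewer_b):
    """Final quality score: average of both reviewers' total scores (0-10 scale)."""
    return (total_reviewer_a + total_reviewer_b) / 2.0

def agreement_label(icc):
    """Map an intraclass correlation coefficient to the bands used in the text."""
    if icc < 0:
        return "no agreement"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if icc <= upper:
            return label
    raise ValueError("coefficient must not exceed 1")

print(final_score(7, 9))       # hypothetical reviewer totals
print(agreement_label(0.87))   # coefficient reported in the Results
```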

RESULTS
Descriptive Statistics
Twenty-eight infants (15 boys) were included in this study. Mean gestational age and birth weight were 27.7 weeks (SD: 1.1 weeks) and 1053 g (SD: 256 g), respectively. Mean postmenstrual age at image acquisition was 30.0 weeks (SD: 0.3 weeks). The mean percentage of outliers per dataset was 10.1% (SD: 1.3%).

Outlier Evaluation
Figures 3 and 4 are characteristic representations of data with poor and good quality, respectively. Intraclass correlation between observers showed excellent agreement with high significance (intraclass correlation coefficient: 0.87; 95% confidence interval: 0.82-0.91; P < .01). Although there was some overlap among the methods of tensor estimation, visualized quality of reconstruction of the fornix was significantly higher with the use of the RESTORE algorithm, particularly in datasets with a high percentage (>10%, n = 13) of data outliers (Fig 5).

Tract Parameters
The impact of the diffusion tensor estimation method used on tract parameters is shown in Fig 6. There was a significant difference in mean FA value through the use of different diffusion tensor estimation algorithms. Significantly lower mean FA values were obtained with NLLS and RESTORE than with LLS and WLLS. Furthermore, application of the RESTORE approach resulted in the lowest standard deviation of the mean FA value; LLS: 0.059; WLLS: 0.054; NLLS: 0.052; RESTORE: 0.051 (repeated-measures ANOVA, P < .05). Although not statistically significant, there was a trend toward an increased number of fiber trajectories in the following order of tensor estimation approaches: LLS, WLLS, NLLS, and RESTORE (Spearman correlation coefficient: 0.10; P = .32).
With the use of LLS, variability of mean FA values was significantly higher in datasets with more than 10% outliers compared with datasets with less than 10% outliers. With WLLS, NLLS, or RESTORE, there was no difference in variability of mean FA values with regard to the quality of the diffusion-weighted images (Fig 7).

DISCUSSION
This study emphasizes the paramount importance of quality assessment and dedicated use of processing methodology of neonatal DTI data before performing analysis. With our work, we demonstrated that 1) tract parameters are significantly affected by the chosen tensor estimation method and are estimated more reliably if data outliers are handled carefully; 2) robust estimation of the diffusion tensor results in significantly improved visualized quality of fiber reconstruction; 3) the mean percentage of data outliers of the diffusion-weighted images correlates significantly to visualized quality of tract reconstruction; and 4) data outliers are common and significantly affect subsequent DTI analysis if they are not taken into account.
Although the incidence of destructive types of brain injury with subsequent serious deficits is decreasing, preterm infants remain at considerable risk to develop cognitive and socio-emotional disabilities that persist into adolescence. 22,23 Advanced MR imaging techniques, such as DTI, have already provided important valuable insights into WM microstructural properties of these "subtle" types of brain injury. 1 Still, the neuropathologic correlates show high variation and inconsistency, 23 and their workings are not completely understood. 24 Moreover, preterm infants are vulnerable to respiratory and hemodynamic instability and movement artifacts. 7,9 Therefore, acquisition and processing of DTI data must be handled with dedicated care, which is essential to avoid misinterpretation. This is appropriately described in technical DTI reports, 5-7,25,26 but, paradoxically, such work receives little attention in clinical research. This could be because of their emphasis on clinical results, or, more importantly, because of the lack of awareness that the choice of diffusion tensor estimation approach may affect the subsequent reconstruction of fiber pathways.
FIG 3.
Impact of diffusion tensor estimation method on tract reconstruction of poor-quality DTI data. Characteristic representations illustrate the effect of tensor estimation methodology on reconstruction of the fornix with high percentage of data outliers (>10%). Note that reconstruction is not possible with the use of the linear least squares (LLS) and weighted linear least-squares (WLLS) methods and appears to be slightly possible with nonlinear least squares (NLLS) but is very well performed if the robust estimation of tensors by outlier rejection (RESTORE) approach is used.

As shown in the present study, the choice of tensor estimation algorithms can significantly affect DTI tractography results, which may complicate the interpretation of specific findings. In this context, study populations can only be compared reliably when identical processing pipelines have been applied, necessitating the use of standardized guidelines before drawing conclusions with regard to outcome. Thus, strategies to limit image corruption should be incorporated into setups to acquire neonatal DTI data 27; this includes 1) prevention of motion by comforting the infant and promoting natural sleep 28; 2) adjustment of parameter settings, by shortening diffusion time, applying stronger gradients, or by use of lower b-values 7; 3) oversampling gradient-sensitizing directions and removing corrupted diffusion-weighted images 6; and 4) applying more advanced tensor estimation methods. 5 Because diffusion tensor estimation techniques differ considerably in principle, speed, and accuracy, 29 awareness of the benefits and pitfalls is essential: the LLS method is fast and mostly used but assumes that errors are identically distributed, which can result in inaccurate estimation of the tensor. 30 The WLLS method is slightly slower but provides more accurate results because it considers errors to be heterogeneously distributed. 31 NLLS iteratively minimizes errors and results in more reliable estimation but needs considerably longer processing time and may get stuck in local optima during optimization. 5,14 The RESTORE approach automatically detects and removes outliers before tensor estimation. This avoids manual and subjective identification of corrupted diffusion-weighted images and appears to be particularly valuable for data with frequent motion corruption. 13,26 In summary, the reliability of DTI analyses is drastically improved when handling data outliers in an appropriate way. However, additional research is needed to determine what types of data processing can reliably be performed without affecting data quality. This report presses the need for careful data handling because corrupted data can significantly affect the final result. Although this will require longer processing time and perhaps the need to remove datasets completely, it will probably decrease the spread in the final analysis and therefore improve statistical significance and reduce sample size.
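The principle of rejecting outliers before the final fit can be illustrated with a deliberately simplified single-voxel sketch. It is not the published RESTORE algorithm: the nonlinear and iteratively reweighted fits are replaced here by plain LLS on the log signal, and measurements are simply dropped when their signal residual exceeds 3 times an assumed noise level. The acquisition (6 directions repeated 5 times, b = 800 s/mm²), the noise level, the simulated signal dropout, and all names are assumptions made for this toy example.

```python
import numpy as np

def design_matrix(bvals, bvecs):
    """Rows map [ln S0, Dxx, Dyy, Dzz, Dxy, Dxz, Dyz] to ln S."""
    b = np.asarray(bvals, dtype=float)
    g = np.asarray(bvecs, dtype=float)
    gx, gy, gz = g[:, 0], g[:, 1], g[:, 2]
    return np.column_stack([
        np.ones_like(b),
        -b * gx**2, -b * gy**2, -b * gz**2,
        -2 * b * gx * gy, -2 * b * gx * gz, -2 * b * gy * gz,
    ])

def fit_lls(signal, X):
    beta, *_ = np.linalg.lstsq(X, np.log(signal), rcond=None)
    return beta

def fit_rejecting_outliers(signal, X, sigma, max_iter=10):
    """Simplified stand-in for outlier rejection: refit with LLS after dropping
    measurements whose signal residual exceeds 3 * sigma, until the set is stable."""
    signal = np.asarray(signal, dtype=float)
    keep = np.ones(signal.size, dtype=bool)
    for _ in range(max_iter):
        beta = fit_lls(signal[keep], X[keep])
        residual = signal - np.exp(X @ beta)
        new_keep = np.abs(residual) <= 3.0 * sigma
        # Stop if nothing changed or too few measurements would remain.
        if new_keep.sum() < X.shape[1] or np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return fit_lls(signal[keep], X[keep]), keep  # final fit on retained data

def fractional_anisotropy(beta):
    dxx, dyy, dzz, dxy, dxz, dyz = beta[1:]
    D = np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])
    ev = np.linalg.eigvalsh(D)
    return float(np.sqrt(1.5 * np.sum((ev - ev.mean()) ** 2) / np.sum(ev ** 2)))

# Toy acquisition (assumed): 5 b=0 images, 6 directions repeated 5 times.
s = 1.0 / np.sqrt(2.0)
dirs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [s, s, 0], [s, 0, s], [0, s, s]])
bvals = np.array([0.0] * 5 + [800.0] * 30)
bvecs = np.vstack([np.zeros((5, 3)), np.tile(dirs, (5, 1))])
X = design_matrix(bvals, bvecs)

beta_true = np.array([np.log(1000.0), 1.7e-3, 0.3e-3, 0.3e-3, 0.0, 0.0, 0.0])
signal = np.exp(X @ beta_true)
signal[5] *= 0.4  # simulated signal dropout in one diffusion-weighted image

fa_true = fractional_anisotropy(beta_true)
fa_lls = fractional_anisotropy(fit_lls(signal, X))
beta_robust, keep = fit_rejecting_outliers(signal, X, sigma=20.0)
fa_robust = fractional_anisotropy(beta_robust)
print(fa_true, fa_lls, fa_robust, int(keep.sum()))
```

In this toy case a single corrupted measurement biases the FA obtained with the plain LLS fit, whereas the fit that discards the flagged measurement recovers the simulated value. Real data contain noise in every measurement, so the behavior of an actual RESTORE implementation should not be inferred from this sketch.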
Limitations of this study are important and must be addressed. First, only datasets without evidence of injury were used, and this may have resulted in a selection bias. Although this policy provided a homogeneous study population, we did not investigate the impact of processing datasets with brain injury. Second, we applied an arbitrary boundary to define "good-quality" and "poor-quality" data: 10% data outliers. We used this threshold solely to illustrate the impact of poor data quality on DTI analysis; hence, we do not suggest that this 10% level should be used as a threshold for future studies to define poor data quality.

FIG 5.
Impact of DTI data quality on tract reconstruction of the fornix. Quality of the reconstructed fornix was significantly higher by use of the robust estimation of tensors by outlier rejection technique; this was particularly evident for datasets with high percentages of outliers in the diffusion-weighted images (>10%).

FIG 6.
Impact of diffusion tensor estimation method on tract parameters. Tract parameters, such as fractional anisotropy (FA) (A), mean diffusivity (B), mean fiber trajectory length (C), and number of fiber trajectories (D) were affected by the tensor estimation method; mean FA value was significantly lower with use of the nonlinear least squares and robust estimation of tensors by outlier rejection techniques (paired sample t test, P < .05).

FIG 7.
Impact of data quality on variability of tract parameters. Diffusion-weighted images with high outlier percentages (>10%) resulted in a significantly increased variability of mean fractional anisotropy values compared with data with fewer data outliers (<10%) if linear least squares was used (Levene test for equality of variances, P < .05). With application of robust estimation of tensors by outlier rejection, there was no difference in variability with regard to data quality.