Abstract
BACKGROUND AND PURPOSE: Brain volumetrics have historically been obtained from MR imaging data. However, advances in CT, along with refined publicly available software packages, may support tissue-level segmentations of clinical CT images. Here, brain volumetrics obtained by applying two publicly available software packages to paired CT-MR data are compared.
MATERIALS AND METHODS: In a group of patients (n = 69; 35 men) who underwent both MR imaging and CT brain scans within 12 months of one another, brain tissue was segmented into WM, GM, and CSF compartments using 2 publicly available software packages: Statistical Parametric Mapping and FMRIB Software Library. A subset of patients with repeat imaging sessions was used to assess the repeatability of each segmentation. Regression analysis and Bland-Altman limits of agreement were used to determine the level of agreement between segmented volumes.
RESULTS: Regression analysis showed good agreement between volumes derived from MR images versus those from CT. The correlation coefficients between the 2 methods were 0.93 and 0.98 for Statistical Parametric Mapping and FMRIB Software Library, respectively. Differences between global volumes were significant (P < .05) for all volumes compared within a given segmentation pipeline. WM bias was 36% (SD, 38%) and 18% (SD, 18%) for Statistical Parametric Mapping and FMRIB Software Library, respectively, and 10% (SD, 30%) and 6% (SD, 20%) for GM (bias ± limits of agreement), with CT overestimating WM and underestimating GM compared with MR imaging. Repeatability was good for all segmentations, with coefficients of variation of <10% for all volumes.
CONCLUSIONS: The repeatability of CT segmentations using publicly available software is good, with good correlation with MR imaging. With careful study design and acknowledgment of measurement biases, CT may be a viable alternative to MR imaging in certain settings.
ABBREVIATIONS:
- BV: brain volume
- CNR: contrast-to-noise ratio
- DARTEL: Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra
- ICV: intracranial volume
- LoA: limits of agreement
- SPM: Statistical Parametric Mapping
Detailed analysis of brain volumetric data has been a topic of major interest during the past several decades. Abnormalities of global brain volume (BV) have been identified in multiple sclerosis,1 amyotrophic lateral sclerosis,2 and age-related dementia,3 to name just a few examples. Beyond these clinical populations with known brain atrophy, more subtle differences in cortical anatomy have been studied as they relate to typical and atypical developmental processes4 and to individual differences in personality traits.5
The exquisite soft-tissue contrast of MR imaging has given rise to a number of publicly available software platforms that enable tissue segmentation and statistical analysis of imaging data, with MR imaging thus firmly establishing itself as the criterion standard imaging technique for brain volumetric analysis.6-8 However, MR imaging does have several important limitations: Obtaining high-quality images in the presence of increasingly common implants remains challenging, MR imaging is costly, and long scan times, combined with poor subject compliance, lead to higher rates of artifact-corrupted images unsuitable for analysis. Compared with MR imaging, CT is much more affordable and available for both patients and imaging departments. It is also less subject to motion artifacts due to its acquisition speed. Historically, prospective neuroimaging research studies using CT have been difficult to justify due to the unavoidable use of ionizing radiation and the limited data obtainable. However, given the nature of CT as a first-line diagnostic tool, the CT images available for analysis from existing clinical data, on both a patient and population level, greatly outnumber the available MR images. Therefore, CT appears highly suitable for in vivo study of the brain, and retrospective analysis of CT images that already exist within electronic health records may serve as a useful platform for discovery.
Group-level analysis of brain imaging data for research purposes involves several preprocessing steps, including segmentation of different tissue types (GM, WM, CSF). Any successful tissue segmentation relies on a sufficient contrast-to-noise ratio (CNR) between ≥2 tissues to discriminate tissue boundaries. Supported by the relatively strong contrast between parenchyma and CSF and between parenchyma and bone, segmentation of total brain volumes from CT images has been accomplished.9,10 However, brain tissue segmentation into GM and WM was considered nonviable for many years due to the low CNR between those tissues. The steady improvement in CT technology and image quality has recently led to several groups taking a second look at segmenting CT images using such diverse approaches as intensity-thresholding,11,12 atlas-based,13-15 and learning-based methods.16
Despite this growing interest in CT image analysis and segmentation, validation remains challenging and rare; the administration of ionizing radiation to a volunteer cohort is difficult to justify on ethical grounds, and, to our knowledge, there has been only 1 report of a paired, within-subject comparison of volumes segmented from CT against those segmented from MR imaging. In that study, SPM12 (http://www.fil.ion.ucl.ac.uk/spm/software/spm12) was adapted for CT segmentation, and the results were compared with MR imaging segmentations obtained from FreeSurfer (http://surfer.nmr.mgh.harvard.edu). The brain volumes derived from the 2 modalities were found to be in good agreement, though the total number of patients with paired data was small (10 and 25 patients in 2 study arms).13 Like Statistical Parametric Mapping (SPM), FMRIB Software Library (FSL)17 is a commonly used, open-source software library that includes an image-processing toolbox designed for analysis of MR images of the brain (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL). Despite being developed for MR imaging, FSL has been used to obtain volumes from brain-extracted head CT images.18,19 The goal of the current study was to compare volumes obtained from paired CT and MR imaging data in a broad patient cohort using these 2 publicly available software packages.
MATERIALS AND METHODS
Subjects
This study was reviewed and approved by the Geisinger institutional review board. Data were identified for subjects who had undergone CT and high-resolution MR imaging within 12 months of one another. High-resolution MR imaging sequences included MPRAGE, echo-spoiled gradient echo, and fast-spoiled gradient recalled. Patients were referred for stroke, hemorrhage, aneurysm, and tumor. Subjects were excluded if they were scanned with a nonroutine protocol (ie, a pediatric protocol).
Data Acquisition
All noncontrast head CTs were acquired in axial or helical mode at 120–140 kV(peak) with modulated tube current (minimum, 50 mA; maximum, 290 mA), from the foramen magnum through the vertex, with a standard 512 × 512 matrix and a 24-cm FOV at a 5.0-mm section thickness (Online Supplemental Data). MR images varied more considerably across scanners, but generally, scans were acquired as inversion-prepared 3D fast gradient recalled-echo sequences, with in-plane resolutions of 0.8–1.0 mm and enough 0.8- to 1.2-mm axial slices to cover the entire brain (Online Supplemental Data).
Image Processing and Analysis
Preprocessing an MR Image File.
MR images were converted from DICOM to NIfTI format. Files were then visually inspected to identify any artifacts or gross abnormalities that would prevent accurate processing. These images (n = 7) were removed from the pipeline and excluded from further processing. We then completed standard preprocessing steps using Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra (DARTEL in SPM) or FAST (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/fast)20, described below.
Preprocessing CT Image Files.
A series of steps were completed on CT images to adjust image parameters to facilitate image segmentation. Upper and lower threshold limits were first applied to the image using fslmaths (https://open.win.ox.ac.uk/pages/fslcourse/practicals/intro3/index.html) functions in the FSL software package (upper limit, 100; lower limit, −15). These thresholds were chosen after some preliminary experiments and were generally found to sufficiently retain tissue distinctions and boundaries, while eliminating much of the skull and extraneous noise. The origin point for each scan was adjusted to the anterior commissure using SPM. If the image quality was poor, too “grainy,” or there was a sizable morphologic obstruction that made it difficult to identify the anterior commissure, the image file was excluded from further analysis (n = 35). Images were then run through the FSL Brain Extraction Tool (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/BET; BET) with a fractional intensity threshold of 0.01 to remove any skull and nonbrain tissue that remained after the threshold adjustment.21 After BET, all files were visually inspected to identify any artifacts or gross abnormalities that would prevent accurate voxel-based morphometry processing. These images were also removed (n = 49) from the pipeline and excluded from further processing. The remaining 260 images were run through the voxel-based morphometry or FAST pipelines, outlined below.
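The intensity windowing step can be illustrated with a minimal sketch. It assumes that voxels outside the −15 to 100 window are set to zero, which mirrors the behavior of the fslmaths -thr and -uthr options; the text does not state how out-of-range voxels were handled, so this is an assumption. The function name and sample values are hypothetical, and a flat list stands in for the image volume.

```python
def apply_hu_window(voxels, lower=-15.0, upper=100.0, fill=0.0):
    """Keep voxels inside [lower, upper]; replace all others with `fill`.

    Assumption: out-of-range voxels are zeroed rather than clipped.
    """
    return [v if lower <= v <= upper else fill for v in voxels]


# Hypothetical intensities: air, fat, soft tissue, and bone.
sample = [-1000.0, -20.0, -15.0, 30.0, 45.0, 100.0, 400.0]
windowed = apply_hu_window(sample)
print(windowed)
```

Air and bone fall outside the window and are removed, while soft-tissue values pass through unchanged.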
Voxel-Based Morphometry Pipeline
Global GM, WM, CSF, BV (BV = GM + WM), and total intracranial volumes (ICVs = GM + WM + CSF) were estimated for CT and MR images using the DARTEL toolbox in SPM12.22 First, GM, WM, and CSF segmented images were generated in NIfTI format in native space for both MR imaging and CT image files. In each segmented image, the numeric value of each voxel was an estimation of the fraction of the volume of the voxel representing the corresponding tissue type, ranging between 0 and 1. To compute a specific tissue volume, we discarded voxels with values of <0.2 (ie, noisy voxels) from the segmented image; the remaining voxels were summed and multiplied by the volume of a voxel in milliliters to represent whole-brain tissue volume.23 Normalization was then accomplished using DARTEL. The procedure first rigidly transforms each subject’s GM and WM segmented images to Montreal Neurological Institute space and then uses the files for all subjects to create a mean GM template in Montreal Neurological Institute space using an iterative procedure.3 This procedure also yields subject-specific flow-field files that encode the deformation that occurs during the template creation. Then, the native space GM file, the flow-field file, and the mean GM template are processed together to generate a subject-specific normalized-modulated GM image in Montreal Neurological Institute space. The modulation procedure ensures that the whole-brain and regional GM volumes are preserved (relative to the native GM image) postnormalization.
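The native-space volume calculation described above (discard fractions below 0.2, sum the remainder, multiply by the voxel volume) can be sketched as follows. This is an illustration rather than the pipeline code: the function name, the flat-list representation of the tissue maps, and the sample values are all hypothetical.

```python
def tissue_volume_ml(fractions, voxel_dims_mm, min_fraction=0.2):
    """Sum tissue fractions >= min_fraction and convert to milliliters."""
    dx, dy, dz = voxel_dims_mm
    voxel_ml = dx * dy * dz / 1000.0  # 1 mL = 1000 mm^3
    return sum(f for f in fractions if f >= min_fraction) * voxel_ml


# Hypothetical tissue-fraction maps on a 1-mm isotropic grid.
dims = (1.0, 1.0, 1.0)
gm = tissue_volume_ml([0.1, 0.5, 0.9, 1.0, 0.15], dims)
wm = tissue_volume_ml([0.8, 0.3, 0.05, 0.0, 0.7], dims)
csf = tissue_volume_ml([0.1, 0.2, 0.05, 0.0, 0.15], dims)

bv = gm + wm          # BV = GM + WM
icv = gm + wm + csf   # ICV = GM + WM + CSF
print(gm, wm, csf, bv, icv)
```

Because the voxel dimensions are passed explicitly, the same function applies to the anisotropic CT grid (for example, a 5-mm section thickness) without modification.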
To determine the optimal threshold for the normalized-modulated image, we used a custom procedure. This procedure systematically varies the threshold applied to the normalized-modulated image and, at each threshold value, computes the GM volume by following the same steps used for estimating the native space GM volume. The threshold value at which the absolute difference between the native space and normalized-modulated GM volumes is minimized is then chosen as the optimal threshold. In the normalized-modulated file, all voxels with a value below this optimal threshold are set to zero.
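A minimal sketch of such a threshold search is given below, assuming a simple scan over a fixed list of candidate thresholds. The candidate grid, function names, and sample values are hypothetical and are not taken from the custom procedure itself.

```python
def volume_above(values, threshold, voxel_ml):
    """Volume from voxels at or above `threshold`, in milliliters."""
    return sum(v for v in values if v >= threshold) * voxel_ml


def find_optimal_threshold(normalized_values, native_volume_ml,
                           voxel_ml, candidates):
    """Return the candidate threshold whose volume is closest to native."""
    return min(
        candidates,
        key=lambda t: abs(
            volume_above(normalized_values, t, voxel_ml) - native_volume_ml
        ),
    )


# Hypothetical normalized-modulated voxel values and native-space volume.
normalized = [0.05, 0.1, 0.3, 0.6, 0.9]
best = find_optimal_threshold(
    normalized, native_volume_ml=1.8, voxel_ml=1.0,
    candidates=[0.0, 0.1, 0.2, 0.5],
)
print(best)
```

When several candidates give the same difference, this version keeps the first one encountered; a finer candidate grid reduces the chance of such ties.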
FAST Pipeline
The use of FAST for CT segmentation has been described elsewhere.18 Briefly, the T1-weighted image–type setting was used, and the number of segmentation classes was set to 3. The Markov random field, iterations, and bias field smoothing values were set to 0.1, 4, and 20.0, respectively, with partial volume segmentation output. Volumes were extracted using the fslstats function (https://open.win.ox.ac.uk/pages/fslcourse/practicals/intro3/index.html).
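For orientation, the settings above correspond to the following FAST command-line options, sketched here as a Python helper that only assembles the argument list. The file names are hypothetical. The volume calculation assumes the commonly used recipe of multiplying the mean partial volume fraction by the nonzero-voxel volume reported by fslstats with the -M and -V options; the text does not specify which fslstats options were used.

```python
def build_fast_command(input_image, output_basename):
    """Assemble a FAST call: T1-type image, 3 classes, MRF 0.1,
    4 bias-field iterations, 20.0-mm bias-field smoothing."""
    return [
        "fast",
        "-t", "1",     # image type: T1-weighted
        "-n", "3",     # number of tissue classes
        "-H", "0.1",   # Markov random field parameter
        "-I", "4",     # bias-field removal iterations
        "-l", "20.0",  # bias-field smoothing extent
        "-o", output_basename,
        input_image,
    ]


def pve_volume_ml(fslstats_output):
    """Parse '<mean> <n_voxels> <volume_mm3>' from `fslstats <pve> -M -V`
    and return mean fraction * volume, converted to milliliters."""
    mean_fraction, _n_voxels, volume_mm3 = (
        float(x) for x in fslstats_output.split()
    )
    return mean_fraction * volume_mm3 / 1000.0


cmd = build_fast_command("ct_brain.nii.gz", "ct_brain_fast")  # hypothetical
print(" ".join(cmd))
print(pve_volume_ml("0.5 1000 2000.0"))
```

The argument list can be passed to subprocess.run on a system with FSL installed; it is not executed here.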
Data Pruning
We additionally ran a cluster analysis using R statistical and computing software (http://www.r-project.org) to identify any outliers, which were then excluded from the final data set. The analysis identified 16 images with GM, WM, and/or CSF volumes that fell outside the expected range. The initial data pull included images from 8 CT and 7 MR imaging scanners. To reduce possible variation due to scanner make/manufacturer, data were included only from resources that had scanned at least 10 patients, reducing analysis to 3 CT and 3 MR imaging scanners. After we removed the inappropriate protocols, failed analysis, outliers, and low resource counts, the final data set included 69 distinct individuals with successfully segmented paired CT/MR imaging data from both FSL and SPM.
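The scanner-count criterion can be illustrated with a short sketch. The record layout (patient identifier, scanner identifier) and the function name are hypothetical, and the example uses a lower cutoff than the 10 patients applied in the study so that it fits in a few lines.

```python
from collections import defaultdict


def keep_well_sampled_scanners(records, min_patients=10):
    """Keep records only from scanners with >= min_patients distinct patients."""
    patients_by_scanner = defaultdict(set)
    for patient_id, scanner_id in records:
        patients_by_scanner[scanner_id].add(patient_id)
    keep = {
        scanner for scanner, patients in patients_by_scanner.items()
        if len(patients) >= min_patients
    }
    return [rec for rec in records if rec[1] in keep]


# Hypothetical records; cutoff lowered to 3 for illustration.
records = [("p1", "CT-A"), ("p2", "CT-A"), ("p3", "CT-A"), ("p4", "CT-B")]
kept = keep_well_sampled_scanners(records, min_patients=3)
print(kept)
```

Counting distinct patients rather than images prevents a scanner from qualifying solely through repeat scans of a few individuals.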
Statistics
Data are presented as median (interquartile range) or mean (SD) as appropriate. To determine whether scanner-specific parameters globally affect segmentation results, we compared ICV, BV, and GM and WM volumes across each MR imaging–CT scanner combination using 1-way ANOVA with Tukey post hoc testing between groups. To ensure that no bias was introduced from MR image contrast, we compared segmentation results between contrast-enhanced MR images and those without contrast. Several patients in the final cohort had repeat scans (8 for MR imaging, 165 for CT). Using these, we compared the repeatability of segmentation results using the coefficient of variation. Regression analysis and Bland-Altman limits of agreement were used to determine the level of agreement between segmentation measures derived from the 2 imaging modalities. When >1 image was available for a subject (CT or MR imaging), 1 image was chosen at random for comparison. T tests were used to compare average volumes within each pipeline. For all tests, P < .05 was considered significant.
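The two agreement measures used here can be sketched as follows, under the usual definitions: the coefficient of variation as the sample SD divided by the mean, and Bland-Altman limits of agreement as the mean difference ± 1.96 SD of the differences. The direction of the difference (CT minus MR imaging) and the sample volumes are assumptions made for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample SD divided by the mean, expressed as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def bland_altman(ct, mr):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Differences are taken as CT minus MR (an assumption).
    """
    diffs = [c - m for c, m in zip(ct, mr)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical repeat volumes (mL) from one subject.
cov = coefficient_of_variation([100.0, 102.0, 98.0])

# Hypothetical paired volumes (mL) from four subjects.
ct_ml = [1210.0, 1180.0, 1250.0, 1300.0]
mr_ml = [1150.0, 1140.0, 1200.0, 1220.0]
bias, loa_low, loa_high = bland_altman(ct_ml, mr_ml)
print(cov, bias, loa_low, loa_high)
```

A positive bias in this convention indicates that CT yields larger volumes than MR imaging for the compartment in question.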
RESULTS
A typical segmentation result from paired MR imaging–CT imaging from a 41-year-old patient is shown in Fig 1. As expected, the segmentation of the MR image is of high quality, with good gray, white, and CSF differentiation. In comparison, while tissue differentiation is readily apparent in the segmentation arising from the CT image, it is of lower visual quality, given the lower SNR and CNR of the base CT image.
Fig 1. Brain segmentation results. Brain-extracted MR imaging and CT (left) from a 41-year-old male patient at approximately the same section locations, with corresponding segmentation results from FSL and SPM on the right. Visually, both FSL and SPM perform well on MR imaging data, with SPM exhibiting a smoother appearance for segmentation results compared with FSL. Colors in segmentations represent GM (blue), WM (green), and CSF (red).
Assessment of Interscanner Variability and Repeatability
By 1-way ANOVA, no differences in average ICV, BV, and GM or WM volume were found between any set of scanners (Online Supplemental Data). Repeat measurements from the same subject had low variability (coefficient of variation of <10%) across all volumes and pipelines (Table 1). All further analysis used pooled data from all scanners.
Table 1: Coefficient of variation of global volumes derived from MR imaging and CT
Volume Comparisons between MR Imaging and CT
Regression analysis showed good agreement between volumes derived from MR images versus those derived from CT. The correlation coefficient between the 2 methods was 0.93 for the SPM pipeline and 0.98 for the FSL pipeline (Fig 2). However, differences in global volumes were significant for all volumes compared within a given segmentation pipeline (Table 2).
Fig 2. The regression lines between all segmentation volumes (ICV, BV, GM volume, WM volume, CSF) extracted from MR imaging and CT using the SPM pipeline (left) and FSL (right). The SPM regression line has a slope of 0.97 (95% CI, 0.93–1.00, significantly different from identity, P < .05) and an intercept of 115 (95% CI, 90–141; r = 0.93). The FSL regression line has a slope of 0.93 (95% CI, 0.91–0.95, significantly different from identity, P < .05) and an intercept of 31 (95% CI, 15–46; r = 0.98).
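The slope, intercept, and correlation coefficient of such a regression can be computed with ordinary least squares, as in the sketch below. Placing MR imaging volumes on the x-axis is an assumption, the function name is hypothetical, and the sample data are constructed to lie exactly on a line rather than drawn from the study.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical volumes (mL) lying exactly on the line CT = 0.9 * MR + 50.
mr_volumes = [600.0, 800.0, 1000.0, 1200.0]
ct_volumes = [0.9 * v + 50.0 for v in mr_volumes]
slope, intercept, r = linear_fit(mr_volumes, ct_volumes)
print(slope, intercept, r)
```

A slope below 1 combined with a positive intercept, as in this constructed example, means the two modalities diverge most at the extremes of the volume range even when the correlation is high.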
Table 2: Mean global volumes (mL) extracted from paired MR imaging/CT images
Bland-Altman analysis of the results from SPM (Fig 3) showed that statistically significant biases were present for ICV (16% bias, limits of agreement [LoA] ± 24%), BV (6% bias, LoA ± 20%), GM (10% bias, LoA ± 30%), and WM (36% bias, LoA ± 38%), with CT overestimating ICV, BV, and WM, and underestimating GM compared with MR imaging. Bland-Altman analysis of the results from FSL (Fig 4) showed generally more favorable agreement between modalities. Statistically significant biases were present for ICV (3% bias, LoA ± 14%), BV (6% bias, LoA ± 16%), GM (6% bias, LoA ± 20%), and WM (18% bias, LoA ± 18%), with CT overestimating ICV, BV, and WM, and underestimating GM compared with MR imaging.
Fig 3. Bland-Altman plots for global volumes derived from MR imaging and CT using the SPM pipeline. Low biases were observed for BV (70 mL; 6%) and GM (60 mL; 10%), with all biases observed being significant (P < .05).
Fig 4. Bland-Altman plots for global volumes derived from MR imaging and CT using the FSL pipeline. Low biases were observed for ICV (41 mL; 3%), BV (72 mL; 6%), and GM (31 mL; 6%), with all biases observed being significant (P < .05).
DISCUSSION
In this study, the MR imaging–CT segmentations produced by FSL were in closer agreement than those produced by SPM. In the 1 other study reporting paired MR imaging–CT data (which used SPM), volumes were found to agree to within 5%.13 In our hands, FSL volumes agreed to within 3%–18% and SPM volumes agreed to within 6%–36%, with WM being the outlier in both cases. Three important differences between their work and ours should be noted. First, while the CTs in that study were segmented with a modified SPM pipeline, MR images were segmented using another program, FreeSurfer. Second, that study was prospective in nature and had tight control over cohort inclusion and imaging parameters. By including multiple scanners for each modality with a broad range of clinical protocols in the current study, more relevant metrics for population-level retrospective studies can be obtained. Finally, the CT data included in that study were closer to isotropic resolution than the images included here, affecting deformation and registration quality. Investigating the limits of agreement with regard to spatial resolution is a topic for future work.
Repeatability of CT segmentations using these methods is good (coefficient of variation of <10% for all volumes). Reproducibility studies of brain segmentations generated from MR images have produced coefficients of variation in the range of 0.2%–5.2%, depending on the tissue compartment and software package used.24,25 The repeatability metrics reported here compare favorably with these values, especially considering the similar results using MR imaging data within the same pipeline.
BVs are known to decrease with age, with an increasing fraction of ICV taken up by CSF across the decades. Because images were acquired within 1 year of one another and because all comparisons were pair-wise in the main analysis, no age-related correction factors were used in this study.
The population cohort used for this study was not healthy, having been referred for multiple head scans using different modalities for indications ranging from cancer to trauma. Therefore, close alignment with published values for the various volumes and volume fractions reported here should not be anticipated. For reference, a recent review of the literature found BV ranges of 700–1300 mL, GM ranges of 400–700 mL, and WM ranges of 300–625 mL.26
One important limitation discovered in applying these segmentation tools, which were developed for MR imaging, to CT data is the high failure rate of the various steps of the processing pipelines. Of 181 patients initially identified, 128 and 125 were successfully analyzed by SPM and FSL, respectively. Of those, 105 were successfully analyzed by both, with the final cohort of 69 included in this study reached after further pruning of outliers and of data obtained from resources with low scan counts. This high failure rate (∼30%) was largely driven by the brain-extraction step, which often failed by removing large portions of the frontal lobe from CT images or by removing small, somewhat spherical regions of the temporal or parietal lobes. Future developments of CT-specific algorithms for brain extraction are likely to improve this failure rate.
Several factors may explain the differences observed between volumes obtained from CT versus MR imaging in the present study. First is the drastically reduced CNR of a CT brain image compared with its MR imaging counterpart. WM and GM are typically separated by only 5–10 HU on a CT image, with typical noise levels; this feature translates to CNRs in the range of 1–2. An MR image, by comparison, may have a CNR on the order of 10–15, when comparing white and gray matter. The other major factor differentiating MR imaging data from CT is the drastically different resolutions of the 2 modalities. High-resolution MR imaging data, as is typically used for segmentations such as these, are acquired around 1 mm isotropic. Routine brain CT has a high in-plane resolution; however, the through-plane section thickness is typically much larger (the routine CT brain protocol used at our institution has a 5-mm section thickness), causing partial volume averaging of tissue radiodensities in the CT image, further blurring the differences between tissue types.
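The CNR figures quoted above follow from the usual definition of the absolute difference in mean signal divided by the noise SD. The sketch below uses illustrative values: a 7-HU separation between GM and WM falls within the 5–10 HU range given in the text, while the 5-HU noise level is an assumption, because the text refers only to "typical noise levels."

```python
def contrast_to_noise(mean_a, mean_b, noise_sd):
    """CNR between two tissues: |mean_a - mean_b| / noise SD."""
    return abs(mean_a - mean_b) / noise_sd


# Illustrative CT values in HU (assumed, not measured in this study).
ct_cnr = contrast_to_noise(mean_a=35.0, mean_b=28.0, noise_sd=5.0)
print(ct_cnr)
```

Under these assumptions the CT CNR is between 1 and 2, roughly an order of magnitude below the 10–15 cited for MR imaging.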
CONCLUSIONS
MR imaging will undoubtedly remain the criterion standard for brain tissue segmentation and volumetric analysis. Quantitative assessment of volumes is dependent on many variables including imaging technique and segmentation software. However, our study shows that general trends do emerge and that certain volumetric classes can be estimated with a reasonable level of certainty. With careful study design, the convenience, affordability, and availability of CT data should be considered for large, population-based studies of brain volumetrics. Given that CT images are captured much more frequently than MRIs in clinical care, continued improvement of algorithms for estimating brain tissue volumes from CT could have a profound impact for population studies that use existing electronic health records.
Footnotes
Disclosure forms provided by the authors are available with the full text and PDF of this article at www.ajnr.org.
- Received August 2, 2021.
- Accepted after revision November 5, 2021.
- © 2022 by American Journal of Neuroradiology