Mesial Temporal Sclerosis: Accuracy of NeuroQuant versus Neuroradiologist

BACKGROUND AND PURPOSE: We sought to compare the accuracy of a volumetric fully automated computer assessment of hippocampal volume asymmetry versus neuroradiologists' interpretations of the temporal lobes for mesial temporal sclerosis. Detecting mesial temporal sclerosis (MTS) is important for the evaluation of patients with temporal lobe epilepsy as it often guides surgical intervention. One feature of MTS is hippocampal volume loss.
MATERIALS AND METHODS: Electronic medical record and researcher reports of scans of patients with proved mesial temporal sclerosis were compared with volumetric assessment with an FDA-approved software package, NeuroQuant, for detection of mesial temporal sclerosis in 63 patients. The degree of volumetric asymmetry was analyzed to determine the neuroradiologists' threshold for detecting right-left asymmetry in temporal lobe volumes.
RESULTS: Thirty-six patients had left-lateralized MTS, 25 had right-lateralized MTS, and 2 had bilateral MTS. The estimated accuracy of the neuroradiologist was 72.6% with a κ statistic of 0.512 (95% CI, 0.315–0.710 [moderate agreement, P < 3 × 10^−6]), whereas the estimated accuracy of NeuroQuant was 79.4% with a κ statistic of 0.588 (95% CI, 0.388–0.787 [moderate agreement, P < 2 × 10^−6]). This discrepancy in accuracy was not statistically significant. When at least a 5%–10% volume discrepancy between temporal lobes was present, the neuroradiologists detected it 75%–80% of the time.
CONCLUSIONS: As a stand-alone fully automated software program that can process temporal lobe volume in 5–10 minutes, NeuroQuant compares favorably with trained neuroradiologists in predicting the side of mesial temporal sclerosis. Neuroradiologists can often detect even small temporal lobe volumetric changes visually.

Temporal lobe epilepsy represents the most common type of partial complex epilepsy in adulthood. 1 There are 2 forms of temporal lobe epilepsy: a common form with mesial temporal lobe symptoms and a rarer form with lateral temporal lobe symptoms. 2 Mesial temporal sclerosis (MTS) is the most common pathologic entity encountered in epilepsy surgery series. 1 Its histologic confirmation is a major predictive factor for postoperative seizure control. 3 Sclerosis of the hippocampus progresses with time as a consequence of seizures, a cause of seizures, or both. 4 It most commonly manifests pathologically as gliosis and volume loss. Clinically, epileptiform electroencephalography activity lateralizes to the temporal lobe on the ipsilateral side of MTS.
Multiple structural and functional imaging modalities are available to diagnose MTS and to guide surgical treatment of medically intractable seizures. 5 Most clinical MR imaging studies are sufficient to detect gross hippocampal atrophy changes; however, early changes of hippocampal atrophy may be overlooked by even experienced radiologists because of their subtlety. 6 According to Spencer et al, 7 computerized volumetric measurement of the hippocampus improves the assessment of patients with temporal lobe epilepsy and adds sensitivity and specificity to the clinical visual evaluation. Others believe that visual inspection alone is sufficient to accurately detect hippocampal sclerosis. 8 The degree of disproportionate hippocampal volume inequality required for visual detection of the hippocampal atrophy in MTS has not yet been determined. Because there are other nonvolumetric findings that suggest MTS (eg, signal-intensity changes, blurring of gray-white borders, malrotation), it would seem that merely assessing volume, even if highly reliable, would not be sufficient for adequately assessing patients with temporal lobe epilepsy.
Manual hippocampal volumetry has been the standard technique for assessing hippocampal volume loss in MTS, Alzheimer disease, and other disorders in the research realm. 9 However, such manual quantification of temporal lobe volume is cumbersome, time-consuming, not easily reimbursed, and requires extensive training. Until recently, there was no FDA-approved means for providing computer-based, semiautomated, or automated hippocampal volumetry. NeuroQuant (CorTechs Labs, San Diego, California) is a software package that is FDA-cleared for marketing (510[k] K061855) and is now commercially available. Its value in the clinical setting has not been extensively reviewed since FDA approval. The technique for parcellating brain regions and assessing volumetry has been previously outlined by Brewer et al 10 in 2009 in the American Journal of Neuroradiology. The steps involved include sequence-checking to ensure that appropriate high-resolution and contrast image parameters are performed, correction for field/gradient inhomogeneities, removal of the overlying calvaria, alignment to the probabilistic atlas of stereotypical anatomy, and segmented volumetry of predetermined anatomic areas derived from multiple subjects of multiple age groups (Brewer et al). 10-13 We sought to assess the value of NeuroQuant volumetry of the temporal lobe in patients with MTS with the added goal of trying to determine the difference in right and left hippocampal volumes that can be detected by experienced neuroradiologists in the clinical assessment of patients with temporal lobe epilepsy. We hypothesized that because radiologists assess the hippocampi for other imaging findings above and beyond volume loss alone, the neuroradiologist's determination of the correct side of the MTS would be more accurate than basic NeuroQuant volumetry.

MATERIALS AND METHODS
This retrospective study was reviewed and approved by the institutional review board of the School of Medicine and the Committee for the Protection of Human Subjects. Due to the retrospective nature of the study, informed patient consent was not required for the review of medical records and radiographic examinations, and the study was deemed to be Health Insurance Portability and Accountability Act-compliant. The radiology information system database was surveyed for the term "mesial temporal sclerosis" during a 53-month period (between January 2009 and May 2013) to find patients who had MR imaging studies. We included 46 healthy control research subjects from the same period who had NeuroQuant-compatible equivalent MR pulse sequences to assess normal variations in hippocampal volumes. The search yielded an initial sample of 85 patients with MTS (mean age, 34.9 ± 15.9 years) matched with the 46 control subjects (mean age, 26.7 ± 15.7 years). Of the 85 subjects, 61 patients who had electroencephalography, clinical, or pathologic findings that localized to the right or left side and 2 patients who had them localized bilaterally had MR pulse sequences in a high-resolution 3D dataset that could be analyzed by NeuroQuant. The 63 patients for analysis included 35 women and 28 men with a mean age of 35.0 ± 16.2 years, similar to the initial cohort and control subjects (no statistical difference in age or sex). Because of the emphasis on asymmetry, the 2 patients with bilateral MTS were removed from a second analysis, leaving 61 subjects as a second dataset. The patients had symptoms of partial complex seizures in 56 cases, intractable seizures in 5 cases, and generalized seizures in 2 cases. Patients with temporal lobe seizures due to a cause other than MTS (eg, tumors, strokes, congenital anomalies) were excluded.
All patients had a dedicated epilepsy protocol that included sagittal T1-weighted images, axial diffusion-weighted images with ADC mapping, axial T2-weighted, axial T2 FLAIR, coronal T2-weighted, coronal T2 FLAIR, coronal 3D spoiled gradient-echo T1-weighted scans, and coronal thin-section T1-weighted scans obtained specifically through the temporal lobes to determine hippocampal volumes and identify any cortical dysplasia. Postcontrast T1 images were added in some cases. Thirty-seven patients were scanned on a 3T scanner, and 26, on a 1.5T scanner.
The NeuroQuant analysis was based on a sagittal 3D volumetric MPRAGE pulse sequence with the following parameters: TR, 2300-2400 ms; TI, 900-1000 ms; TE set to minimum; flip angle, 8°; FOV, 24 cm; section thickness, 1.2 mm, for 170 sections. The control subjects and patients with MTS were scanned with identical volumetric pulse sequences. Scans were sent via the PACS to an Apple Mac Mini computer (Cupertino, California) with NeuroQuant installed. NeuroQuant takes the high-resolution 3D T1-weighted, sagittal non-contrast-enhanced MR imaging data, autoroutes them as input with no user intervention, and returns a new full-volume spatially corrected and anatomically labeled dataset along with 2 printable patient reports containing the absolute and relative volumes of the hippocampus, temporal horn, and other structures in DICOM-compliant format (Fig 1). This process, from the time sent from the PACS to creation of the report, typically takes between 5 and 10 minutes, and the report appears in the PACS as 2 additional "Morphometry Results" series.
A clinical neuroradiologist reviewed all MR images at the time of the patient's initial assessment, and a study neuroradiologist reviewed them retrospectively. Twelve neuroradiologists with a range of experience between 2 and 30 years interpreted the clinical studies. The clinical reports were reviewed from the electronic medical record after the study neuroradiologist gave an opinion as to the side of the MTS. In instances in which the research neuroradiologist's interpretation differed from the clinical prospective reading in the electronic medical record, a third neuroradiologist with 25 years of experience provided a third opinion to break the impasse. This occurred in 5 of 63 cases. No attempt was made to parse the data on the basis of individual results of neuroradiology faculty members.
For the criterion standard, patients were classified as having left, right, or bilateral MTS on the basis of electroencephalography recordings, histopathologic findings of surgical specimens, and clinical determination reviewed in the electronic medical record. The proof of diagnosis included pathologic specimens in 25 (of 63, 39.7%) cases and localizing electroencephalography in patients without an operation in 38 (of 63, 60.3%) cases.
The difference in volume between the right and left hippocampi also was assessed to determine the sensitivity of the clinical neuroradiologist's visual assessment of the hippocampal volume. We specifically looked at the electronic medical record reports to determine whether the neuroradiologist commented on one hippocampus being larger than the other. An attribution of hippocampal volume loss to one side in the electronic medical record was compared with the criterion standard for the side of abnormality and also the raw data analysis from the NeuroQuant assessment.
Because previous literature by Pedraza et al 14 had indicated that the left hippocampus tends to be slightly smaller than the right hippocampus in healthy adults by an average of 2.7% and Woolard and Heckers had shown a 4.4% difference in raw volumes of the hippocampus (mean left volume, 3352 mm³; mean right volume, 3504 mm³), 15 we performed our data analysis before and after adjustment of volumetry based on analysis of our 46 control subjects, to account for this natural asymmetry. Although Rogers et al 16 and Woolard and Heckers 15 had suggested that this was largely due to asymmetry in the anterior hippocampus (6.3% in the study of Rogers et al), we adjusted for overall hippocampal measures as produced by the NeuroQuant output.
Basic descriptive statistics were computed to summarize characteristics of our patient sample. Accuracy between 2 classifiers was estimated as the empiric proportion of cases on which the classifiers agreed, and the Wilson score interval was computed as a confidence set. The Cohen κ statistic was computed as an additional measure of classifier agreement, and a corresponding confidence interval was calculated. The fmsb (Functions for Medical Statistics Book) package for the R statistical computing software (http://www.r-project.org) was used for this purpose. To test the null hypothesis that the neuroradiologist's accuracy is the same for each side of lateralization, we constructed bootstrap confidence intervals for the relative accuracy and inverted them to yield a P value. Comparisons between neuroradiologist and NeuroQuant accuracy were similarly performed.
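The agreement measures named above can be illustrated with a minimal Python sketch. This is not the fmsb code used in the study: the function names are hypothetical, the Wilson interval is shown without continuity correction (an assumption, because the variant used is not stated), and the confidence interval for κ is omitted.

```python
import math
from collections import Counter


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a proportion (no continuity correction).

    z defaults to the two-sided 95% normal quantile.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def cohen_kappa(rater_a, rater_b):
    """Cohen kappa for two equal-length lists of categorical labels.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the marginal label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels (L = left, R = right), not study data.
reference = ["L", "L", "R", "R"]
reader = ["L", "R", "R", "R"]
kappa = cohen_kappa(reference, reader)
low, high = wilson_interval(49, 63)
```

In this sketch, accuracy is the proportion of cases on which a classifier matches the criterion standard, and κ discounts the agreement expected by chance from the marginal frequencies of each label.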

Asymmetry in Healthy Controls
Using data from the 46 healthy controls in our study, we observed a very slight asymmetry in volumes between the 2 hippocampi (right larger than left), as reported in the literature. 10 Figure 2 provides a depiction of the distribution of observed relative asymmetry indices, defined as (right volume − left volume)/left volume. The NeuroQuant mean asymmetry index among healthy controls was estimated to be 2.1% (95% CI, −2.1% to 6.2%) in our study; this asymmetry was not significantly different from zero.
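As a small illustration of the relative asymmetry index defined above, the following Python sketch computes the index for one subject and its mean over a group. The function names and the volumes are hypothetical and are not taken from the study data.

```python
def asymmetry_index(right_volume, left_volume):
    """Relative asymmetry index: (right - left) / left.

    Positive values mean the right hippocampus is the larger one.
    """
    if left_volume <= 0:
        raise ValueError("left_volume must be positive")
    return (right_volume - left_volume) / left_volume


def mean_asymmetry_index(volume_pairs):
    """Mean index over an iterable of (right_volume, left_volume) pairs."""
    indices = [asymmetry_index(r, l) for r, l in volume_pairs]
    return sum(indices) / len(indices)


# Hypothetical control volumes in mL (right, left).
controls = [(4.0, 3.8), (3.9, 3.9), (3.7, 3.8)]
group_mean = mean_asymmetry_index(controls)
```

Because the denominator is the left volume, the index is not symmetric under swapping sides; an index normalized by the mean of both volumes would be a different (also common) convention.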

NeuroQuant Versus Neuroradiologist
On the basis of the criterion standard of pathologic specimens and definitive electroencephalography readings concurrent with clinical impressions, 36 patients had left-lateralized sclerosis, 25 had right-lateralized sclerosis, and 2 had bilateral sclerosis.
In 12 of the 63 cases, the neuroradiologist rated the hippocampi as symmetric but based the MTS diagnosis on findings other than volumetry (signal intensity on FLAIR, morphology of hippocampus (HC), fornix-mammillary body asymmetry, and so forth). In 8/12 (66.7%) of these cases, the radiologist was correct. Three of the 4 incorrect cases were ones in which the MTS was incorrectly suggested as bilateral by the radiologist.
Of these 12 cases in which the hippocampi were deemed to be symmetric but the radiologist identified MTS on a nonvolumetric basis, the radiologist agreed with NeuroQuant (which did quantitative hippocampus analysis) and correctly identified the side in 6 cases, agreed with NeuroQuant but both were incorrect in 2 cases, and disagreed with NeuroQuant in 4 cases. Of these 4 cases, the radiologist was correct in 2 and NeuroQuant was correct in 2. Overall accuracy when not using visual volumetric differences by the radiologist was 8/12 (66.7%). NeuroQuant had the same accuracy (8/12) in these 12 patients.
Classifications according to the neuroradiologist and NeuroQuant were discordant in 27.9% of all cases (17 of 61; 95% CI, 16.4%-39.3%). NeuroQuant was correct in 58.8% of such cases (10 of 17; 95% CI, 35.0%-82.4%). Of the 10 cases in which NeuroQuant was correct and the radiologist was wrong, we found that the radiologist relied on volume to pick the (wrong) side of MTS in 9 cases. Of the 7 cases in which NeuroQuant was wrong and the radiologist was correct, the radiologist based his or her determination on findings other than the visual assessment of volume in 2/7 cases.
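A bootstrap confidence interval for a proportion such as the discordance rate can be sketched in Python as follows. The percentile method, the number of resamples, and the seed are assumptions for illustration; the text does not state which bootstrap variant was used, and the function name is hypothetical.

```python
import random


def bootstrap_proportion_ci(flags, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 flags."""
    rng = random.Random(seed)
    n = len(flags)
    # Resample cases with replacement and record each resampled proportion.
    stats = sorted(
        sum(rng.choices(flags, k=n)) / n for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# 17 discordant cases out of 61, as reported above.
discordant = [1] * 17 + [0] * 44
point_estimate = sum(discordant) / len(discordant)
lower, upper = bootstrap_proportion_ci(discordant)
```

With only 61 cases, each resampled proportion is a multiple of 1/61, so the interval endpoints move in discrete steps.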
Because of the slight asymmetry demonstrated in our control subjects, we re-evaluated the data after they were reclassified as showing left-sided MTS if the relative asymmetry index was <2.1% and as right-sided MTS if it was >2.1%. Under this classification rule, NeuroQuant had an estimated accuracy of 77.8% (95% CI, 65.2%-86.9%) and a κ statistic of 0.56 (95% CI, 0.35-0.76), indicating moderate agreement with the criterion standard (P < 3 × 10^−5). This finding suggests no added benefit from correcting for the inherent left-to-right volume asymmetry in the brain. When we repeated this analysis excluding the 2 bilateral MTS cases, the accuracy of NeuroQuant was only marginally improved but was still worse than the performance obtained without accounting for a natural asymmetry.

Neuroradiologist Threshold for Detecting Volumetric Differences
To better understand whether there is a threshold in asymmetry in delineating cases in which an experienced radiologist might be able to detect a volumetric difference between one hippocampus versus another, we performed a visual inspection of the available data, including classifications according to the criterion standard and the volumes calculated by the neuroradiologist and NeuroQuant (Fig 3). No such threshold was identified. In cases in which NeuroQuant yielded an index of asymmetry of at least 5%, 10%, or 20%, the neuroradiologist had an estimated accuracy of 75.0%, 78.9%, and 84.2%, respectively.
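The stratified accuracies above can be reproduced in outline with a short Python sketch that restricts the cases to those whose absolute asymmetry index meets a cutoff. The data layout, function name, and example cases are hypothetical; the sketch assumes the magnitude of the index is what was thresholded.

```python
def accuracy_above_threshold(cases, threshold):
    """Reader accuracy among cases with |asymmetry index| >= threshold.

    cases: iterable of (asymmetry_index, reader_was_correct) pairs.
    Returns None when no case meets the threshold.
    """
    selected = [correct for index, correct in cases if abs(index) >= threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)


# Hypothetical cases: (asymmetry index, reader matched the criterion standard).
cases = [(0.03, False), (0.06, True), (-0.12, True), (0.25, False)]
for cutoff in (0.05, 0.10, 0.20):
    print(cutoff, accuracy_above_threshold(cases, cutoff))
```

A sharp rise in accuracy between two adjacent cutoffs would point to a detection threshold; a gradual increase, as reported here, would not.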

DISCUSSION
Despite the logic that would suggest that quantitative assessment of hippocampal volumes allows a more accurate assessment of patients with temporal lobe epilepsy, several limitations have prevented its widespread implementation. 17,18 The anatomy of the hippocampus is quite intricate, with curved surfaces and layers of gray and white matter that render parcellation of the cortex from subcortical content a difficult task. The orientation of the hippocampus also does not lend itself to easy assessment in the axial plane, the plane in which neuroradiologists are often most comfortable. Additionally, because the overall volume of the hippocampus is quite small, minor manual or automated errors in calculation can lead to wide relative error bars. Manual segmentation of hippocampi has been performed for decades both for the evaluation of MTS and for assessing patients at risk for or with probable Alzheimer disease. The process is time-consuming and requires training as to the relevant anatomy. It is not necessarily conducive to the rapid workflow required of practicing neuroradiologists focused on efficiency and accuracy. Hippocampal volumetric accuracy is difficult to assess in vivo with live subjects, so most authors focus on reproducibility. To that end, Gonçalves Pereira et al 19 noted inter- and intraobserver error rates of approximately 6%-8% in the amygdala and piriform cortex. The volumes of these areas differed between controls and patients with MTS by 15%-20%. Achten et al, 20 by using a manual ray-tracing methodology, reported inter- and intraobserver variabilities that ranged between 3.6%-7.3% and 3.4%-5.6%, respectively, for various structures, suggesting good reproducibility.
Neuroradiologists may suggest a diagnosis of MTS even in the face of absent volumetric changes. The following findings may also suggest MTS: 1) hippocampal T2-weighted/FLAIR signal-intensity abnormalities, 2) loss of the crenated margin of the upper surface of the hippocampus, 3) gray matter-white matter blurring, 4) malrotation of the hippocampus, 5) ipsilateral mammillary body and fornical column volume loss, and 6) unilateral temporal horn dilation (sometimes as secondary findings of volume loss in the limbic system). 21-23 The ability to detect changes suggestive of MTS has been shown to be significantly correlated with the experience of the reader and the quality of the study. For example, Von Oertzen et al 24 have shown that the sensitivity for detection of MTS varies between 39% and 50% when comparing nonexpert and dedicated epilepsy expert readers of standard brain MRI, respectively. However, when given an epilepsy-specific MR imaging with appropriate sequences and protocols, the sensitivity of dedicated epilepsy expert readers increased to 91%, with a 4-fold improvement in accuracy when an epilepsy-specific protocol was performed and interpreted by expert readers over standard protocols read by general readers. 24
Fig 3. This plot suggests that there is no hard threshold below which the radiologist is unable to appropriately identify a case of unilateral mesial temporal sclerosis. In fact, in those cases for which the hippocampal volumes differed by >5%, 10%, and 20%, the radiologist's classification had an estimated accuracy of 75.0%, 78.9%, and 84.2%, respectively.
Automated methods for the analysis of hippocampal volumetry have recently been published in clinical journals. 13,25 However, there have been few reports using an FDA-approved solution that could be practical in a busy clinical practice. Brewer et al 10
et al confirmed the existence of a natural asymmetry, with the right hippocampus found to be larger than the left hippocampus (4.00 versus 3.82 mL, 4.6% difference); the level of asymmetry they observed differed slightly from ours. Farid et al also showed that visual inspection by radiologists was concordant with the NeuroQuant assessment in 85% of cases, whereas we had a lower agreement rate of 72.1%. 13 While NeuroQuant was correct in <60% of our discordant cases, they found quantitative analysis to be correct in 80% of their discordant cases. 13
How good are neuroradiologists at detecting volumetric changes in the hippocampi? What is the threshold for an experienced physician? We did not detect a threshold clearly delineating scenarios in which a neuroradiologist would or would not be able to detect a volumetric change. In going from a 5%-20% difference in hippocampal volume, ranging from 1.59 to 4.7 mL overall, the radiologist's estimated accuracy ranged from 75.0% to 84.2%. However, in 9 of the 10 cases in which NeuroQuant was correct and the neuroradiologists were incorrect in lateralizing the MTS, the radiologists selected the wrong hippocampus as being the smaller one. By the same token, when the neuroradiologists used findings other than volumetry to select a side of MTS, they were correct 67% (8/12) of the time, and their errors were dominated by cases incorrectly classified as bilateral MTS.
Coan et al 26 have suggested that adding T2 relaxometry to automated T1-weighted-based volumetry of the hippocampus can increase interpretation accuracy. They found that the use of combined hippocampal volumetry and T2 relaxometry increased the sensitivity to detect MR imaging signs of MTS, notably reclassifying 28% of patients read as having normal findings on visual inspection alone. While automated volumetry detected atrophy in 119 of 125 (95%) patients who were identified by radiologists as having MTS, it identified an additional 10 of 78 patients (12.8%) initially read as having normal findings by the radiologists. T2 relaxometry analysis detected hyperintense T2 signal in 103 of the 125 cases (82.4%) of radiologist-detected MTS and in 15 of 78 subjects (19.2%) whom the radiologist classified as having normal findings. Coan et al used an automatic volumetric analysis with FreeSurfer software (Version 5.1.0; http://surfer.nmr.mgh.harvard.edu) from a T1-weighted dataset. As of this publication, FreeSurfer has not been FDA-approved and is not being used clinically in our setting. While both NeuroQuant and FreeSurfer use a probabilistic atlas for labeling, the adaptation of this approach for clinical use and FDA clearance required adaptation of the probabilistic atlas and complete rebuilding of the code by using Good Manufacturing Practices by NeuroQuant. Indeed FreeSurfer and NeuroQuant often do not return identical volumetry values, despite both using a similar underlying algorithm.
Given the report that epilepsy expert readers have a sensitivity for MTS of 91%, 24 why is the accuracy reported in the current study only 71.4%? Some factors are the following: Many cases of MTS are bilateral; the cases that get sent to a quaternary care hospital in academia are often not straightforward and are the ones that are confounding the outside physicians; our study was stricter in making sure the cases were "proved" (requiring pathology for more cases); and our experience is more the "norm" for a neuroradiology academic practice.
Although the use of both 1.5T and 3T datasets in this cohort may seem to be a limitation of the study, hippocampal volume measures obtained by using 1.5T and 3T scanners have not been found to differ. 27,28 The limitations of this study include the limited percentage of cases that had histologic confirmation; nevertheless, our rate of pathologic proof is greater than that of most other published series. We also struggled with the inclusion of bilateral MTS cases in the analysis, given that we were using an asymmetry index and not an absolute classification of individual hippocampal volumes. We used the radiologist's interpretation of the MR images that not only included volumetry but also an assessment of signal intensity and morphology.

CONCLUSIONS
In assessing asymmetry in hippocampal sizes and thereby predicting the side of MTS, an automated FDA-approved volumetric analysis software package performed as well as experienced neuroradiologists who reviewed the scans for all MTS MR imaging findings. In those cases in which the radiologist and the computer analysis disagreed, NeuroQuant performed slightly better (10 versus 7 of 17). Implementing this quantitative analysis may assist neuroradiologists in their assessment of hippocampal asymmetry, even though small changes in right-to-left hippocampal volumes can be detected by the neuroradiologists.