Survival Analysis of Patients with High-Grade Gliomas Based on Data Mining of Imaging Variables

The authors compared survival prediction using machine-learning data-mining techniques versus traditional histopathologic analysis in a group of 74 tumors. They looked at 55 imaging variables, extracting the most important ones and comparing these with standard histology. The most important variables were: extent of resection, mass effects, volume of enhancing tumor, maximum T2 intensity, and mean trace intensity in non-enhancing or edematous surrounding areas. The accuracy of the data-mining method was 85% and based on Kaplan-Meier curves this predicted survival better than histology. BACKGROUND AND PURPOSE: The prediction of prognosis in HGGs is poor in the majority of patients. Our aim was to test whether multivariate prediction models constructed by machine-learning methods provide a more accurate predictor of prognosis in HGGs than histopathologic classification. The prediction of survival was based on DTI and rCBV measurements as an adjunct to conventional imaging. MATERIALS AND METHODS: The relationship of survival to 55 variables, including clinical parameters (age, sex), categoric or continuous tumor descriptors (eg, tumor location, extent of resection, multifocality, edema), and imaging characteristics in ROIs, was analyzed in a multivariate fashion by using data-mining techniques. A variable selection method was applied to identify the overall most important variables. The analysis was performed on 74 HGGs (18 anaplastic gliomas WHO grades III/IV and 56 GBMs or gliosarcomas WHO grades IV/IV). RESULTS: Five variables were identified as the most significant, including the extent of resection, mass effect, volume of enhancing tumor, maximum B0 intensity, and mean trace intensity in the nonenhancing/edematous region. These variables were used to construct a prediction model based on a J48 classification tree. The average classification accuracy, assessed by cross-validation, was 85.1%. Kaplan-Meier survival curves showed that the constructed prediction model classified malignant gliomas in a manner that better correlates with clinical outcome than standard histopathology. CONCLUSIONS: Prediction models based on data-mining algorithms can provide a more accurate predictor of prognosis in malignant gliomas than histopathologic classification alone.

B rain neoplasms are considered one of the most lethal and difficult-to-treat forms of cancer. The prognosis of brain cancer varies on the basis of the type of neoplasm and is poor for the most malignant types, with a median survival of approximately 3 years for anaplastic astrocytoma and 1 year for glioblastoma. 1 A better prognosis based on routine studies, including mostly MR imaging, may help treatment planning. Tumor grade, imaging properties, and clinical data have been studied by many groups in relationship to survival. 2,3 A study including 34 patients with gliomas showed that rCBV values in astrocytomas are predictive for recurrence and 1-year survival and may be more accurate than histopathologic grading. 2 This was not observed for tumors with oligodendroglial components. 2 A larger study including 189 HGGs and LGGs showed that patients with a high rCBV have a significantly more rapid time to progression than patients who have gliomas with a low rCBV. 4 HGGs have been studied by Mills et al, 5 who examined the direct relationship between the contrast transfer coefficient and length of survival in 27 adult patients. CBV related directly to histologic grade but provided no independent prognostic information over and above that provided by grade. 5 A different small study 6 showed that CBV is a stronger predictor of survival than enhancement. Also it has been shown in 28 patients with GBM that the volume after surgery, but before pre-external beam radiation therapy of the metabolic abnormality, and the nADC value within the T2 hyperintense region may be valuable in predicting outcome. 7 On the other hand, a previous study monitoring evolution of 41 GBMs showed that perfusion and diffusion techniques cannot be used to anticipate tumor progression and that classic T1 and T2 imaging remain the criterion standard. 8 Besides advanced MR imaging characteristics, clinical characteristics and tumor descriptors were correlated with survival time. A multivariate analysis of 416 patients with GBM 3 identified 5 independent predictors of survival: age, Karnofsky Performance Scale score, extent of resection, and the degree of necrosis and enhancement on preoperative MR imaging studies.
In this work, we investigate whether patient characteristics, as well as DTI and rCBV measurements in HGGs, can serve as an adjunct to conventional imaging characteristics in predicting overall survival. Specifically, we combined clinical findings, tumor pathology descriptors, and imaging characteristics from 2 ROIs and explored data-mining techniques 9 for assessing total survival time after first-time treatment (resection). Data mining primarily assists in the analysis of large quantities of data, aiming to extract interesting patterns. This has been shown to be helpful in tissue segmentation, 10 variable selection, 11 and tumor classification. 11 In this study, a segmentation method based on multiparametric MR imaging intensities was used for ROI segmentation, requiring less user interaction than most reported studies based on ROI analysis. The combined set of variables was reduced by a variable-selection method, and a classification tree was constructed by using the reduced set of significant variables.
The method was applied in a population of 67 patients with HGGs (histologically diagnosed according to the WHO system) from June 2006 to December 2007. Seven patients had multiple 2 tumors, which were considered independent masses, resulting in 74 samples in total (18 anaplastic gliomas grade III, 55 GBMs grade IV, and 1 gliosarcoma grade IV). The robustness and accuracy of the proposed classification scheme were assessed by cross-validation of the data.

Materials and Methods
A computer-assisted classification scheme combining conventional MR imaging, perfusion MR imaging, and DTI was developed and used for survival analysis. Data analysis was performed in several stages, as shown in Fig 1. First, the data were preprocessed, and ROIs were semiautomatically segmented. Then, ROI-based analysis was performed to extract imaging characteristics from rCBV and DTI. The imaging characteristics were combined with clinical findings to retrieve a complete set of diagnostic variables describing the pathology. Subsequently, due to the substantial number of variables, datamining procedures were used to discern relevant patterns of relationships within the dataset and eliminate irrelevant variables, as will be described in "Variable Selection and Classification." Last, once a final set of variables was found, the datasets were classified by using a decision tree algorithm. The calculation of the imaging characteristics was performed by using Matlab (MathWorks, Natick, Massachusetts), whereas variable selection and classification were implemented in the WEKA platform. 12

Data
We examined patients with HGGs who underwent preoperative conventional and advanced MR imaging (perfusion, DTI).
LGGs were not analyzed due to the small number of cases. The patient demographics are shown in Table 1. Exclusion criteria were any surgery, radiation therapy, or chemotherapy of a brain tumor before inclusion in the study as well as lack of histopathologic diagnoses, missing imaging data (pre-or postoperational), or presence of artifacts. Treatment was not used to exclude patients. All patients were treated under the same protocol. After resection, they received chemotherapy/radiation, including bevacizumab (Temodar). All patients underwent biopsy or surgical resection of the tumor with histopathologic diagnosis based on WHO criteria. Postoperative scans were analyzed to determine the extent of resection. The study was approved by the institutional review board and was compliant with the Health Insurance Portability and Accountability Act.
Also, after an initial loading dose of 3-mL gadodiamide and a 5-minute delay, T2*-weighted dynamic susceptibility perfusion MR imaging was performed by using a gradient-echo EPI acquisition during bolus injection of 12-mL gadodiamide. Twenty sections were ac-  Pipeline of the computer-based methodology for prediction of survival (long/short ϭ more/less than 18 months). Conventional MR imaging was used to visually rank 11 variables characterizing the tumor, whereas DTI and rCBV were used to extract ROI-based imaging attributes. quired with the following parameters: TR/TE, 2000/45 ms; matrix size, 128 ϫ 128; pixel spacing, 1.7 ϫ 1.7 mm; section thickness, 3.0 mm; flip angle, 90°. DTI data were acquired by using a single-shot spin-echo echo-planar sequence with parallel imaging by using generalized autocalibrating partially parallel acquisition and an acceleration factor of 2. Sequence parameters were the following: TR/TE, 5000/86 ms; NEX, 3; FOV, 22 ϫ 22 cm 2 ; section thickness, 3 mm; matrix size, 128 ϫ 128; 40 sections covering the whole brain with an acquisition time of 8 minutes. The diffusion-weighting gradients were applied in 30 isotropically distributed directions by using a b-value of 1000 s/mm 2 .

Preprocessing of DTI and Perfusion
DTI provides a powerful imaging tool for macroscopically studying the neuronal organization, by capturing microscopic water diffusion characteristics along different orientations. DTI models these local diffusion characteristics through a 3 ϫ 3 positive definite symmetric matrix, known as a diffusion tensor. A diffusion tensor may equivalently be represented through an orthonormal basis formed by its eigenvectors and the corresponding eigenvalues. Several quantities can be extracted from the eigenvalues and eigenvectors for characterization of tissue structure. In this study, we used the B0 and also calculated, with our own software, the voxelwise FA map and the trace of the diffusion tensor. FA represents the amount of deviation from isotropic diffusion. It assumes values in the range of 0 -1, with values close to zero denoting isotropic diffusion and values close to 1 representing highly anisotropic diffusion. Trace is the sum of the eigenvalues and is a more suitable feature when one is interested in the magnitude of water diffusivity at a voxel.
Furthermore, rCBV maps were generated off-line from perfusion MR imaging by calculating the change in relaxation rate by using the equation ⌬R2*(t)ϭϪln(S(t)/S 0 ) / TE, where S(t) is the signal intensity at time t and S 0 is the baseline signal intensity, and then by integrating under the ⌬R2*(t) curve over time points corresponding to the first pass of the contrast bolus. The B0, FA, trace, and rCBV maps were rigidly registered to the T1 postcontrast image to be in the space of the ROIs.

ROI Segmentation
The DTI and rCBV imaging characteristics were extracted from only 2 ROIs, including the enhancing neoplastic tissue and all hypointense tissue (neoplastic, edematous, and necrotic). The ROI extraction was performed by an automatic segmentation method combining all conventional MR images into a machine-learning method. 10 This method segments the brain into 6 tissue types (enhancing neoplastic, nonenhancing neoplastic, edema, white matter, gray matter, and CSF) by using a previously trained segmentation model. The segmentation model was trained by means of support vector machines by using samples drawn by an expert for the tumorous tissue types and automatically extracted by FAST 13 (a K-means segmentation algorithm provided by the Functional MRI of the Brain Software Library, http://www.fmrib.ox.ac.uk/fsl/) for the healthy tissue types. Postprocessing was applied to refine the segmentation of the pathologic tissue, while the normal tissue was not used further in this study. Specifically, small isolated tumor or edema clusters, located far from the main tumor bulk, were removed. Also, regions in falx, which are often wrongly segmented as enhancing tumor due to the vivid enhancement they show on the T1 contrast-enhanced images, were corrected. This was performed by warping an atlas with presegmented falx to the subject image. Any region segmented as enhancing tumor overlap-ping with the warped falx was replaced by CSF. Due to the uncertainty in differentiating nonenhancing neoplastic tissue from edematous tissue, these 2 ROIs were merged into 1.
Finally, an expert visually inspected the 2 extracted ROIs, denoted as ET and NET, and manually corrected these wherever necessary by examining the rCBV and DTI maps. The manual correction is rarely needed and does not require a lot of expertise due to the definition of a single ROI for all NET. A segmentation example is shown in Fig 2.

Variables
We calculated and analyzed the relationship of 55 study variables, which included clinical findings and tumor pathology descriptors obtained by visual inspection of conventional MR imaging and also imaging characteristics calculated from DTI and rCBV maps. Variables such as molecular profile, performance status (eg, Karnofsky Performance Scale score), and therapy-associated factors such as radiation dose or the use of adjuvant therapy, were not considered. Typical variables that have been used by others include localization, mass effect, T1 contrast enhancement, T2, diffusion, and perfusion signal intensity. We have combined such nominal variables ranked by an expert with attributes extracted automatically from ROIs, to achieve a more complete representation. The scoring pattern of the variables selected in this study is shown in Table 2. Specifically, the scoring of edema was visually performed as described next. In case of enhancing tumors, the tumor lesion was extracted from the contrastenhanced T1-weighted image and subtracted from the abnormal tissue traced on FLAIR (which includes tumor and edema). In case of nonenhancing tumors, edema was traced in T1 by detecting signalintensity differences between the tumor lesion and edema. The extent of resection was rated on the basis of the postoperative scans. Undefined variables (ie, when some ROIs [ET or NET] did not exist) were marked as missing, whereas the volume of those ROIs was marked as zero. The complete set of variables was used to predict survival.

Definition of Survival Time
For the purpose of this study, the preoperative images acquired at the time of the first operation were considered as the baseline and were used to monitor the evolution of the disease. Overall survival was evaluated from the baseline to death or, for cases that were not followed until death (eg, living patients), from the baseline to the time of last available follow-up. A time threshold of 18 months was defined to differentiate the patients into 2 groups, those with short-or long-term survival. Living patients with last follow-up earlier than the examined period (18 months) were excluded from the study because their vital status at the end of the period was unknown. The study of survival was performed as a binary classification problem rather than a regression problem to include living patients in the dataset, for whom the actual survival time was unknown at the time of the analysis. The threshold of 18 months represents patients with survival that is longer than the median survival time of HGGs and was chosen to divide the patient population into an approximately equal number of cases in both groups (37 cases in each group). Thus, the definition of long-term survival here is different from that in other studies, [14][15][16][17][18] where the long-term survivors are those who survive Ն3 years after the initial diagnosis.

Variable Selection and Classification
Variable or feature selection is an important processing step, during which a subset of the most informative variables is chosen and then used for the final analysis. By discarding the redundant variables, one can construct more cost-effective predictors with improved performance. 19 We tested several variable selection algorithms and selected the one with overall best classification performance. The selected method searched over the variables following the "scatter search" algorithm 20 and defined the predictive value of each subset of variables by using a "wrapper approach." 21 Scatter search operates on a group of subsets of variables, which constitute good solutions, in respect to special criteria such as diversity. The subsets are linearly combined, and a local search procedure is applied to update the initial group and incorporate good solutions. On each subset of variables, the wrapper builds a classifier by applying an induction learning algorithm. The variables subset with the highest classification accuracy, estimated by cross-validation, is selected. These steps are repeated until a stopping condition is met.
Classification of the datasets into short-and long-term survivors is performed with a J48 classification tree (http://www.opentox.org/ dev/documentation/components/j48). 22 J48 produces decision trees from a set of labeled training data by examining the normalized information gain (difference in entropy) that results from choosing a variable for splitting the data into smaller subsets. To make the decision, the variable with the highest normalized information gain is used.

Univariate Statistical Analysis of the Image-Intensity Characteristics
A 2-tailed t test was performed on each image-intensity characteristic in the segmented ROIs between short-and longterm survivors by assuming that the data came from normal distributions with unknown but equal variances. There was no significant difference (at significance level ␣ ϭ .05) in all variables (mean, minimum, maximum, peak of histogram, variance) extracted from the rCBV maps from either of the segmented ROIs (ET and NET). In contrast, the mean B0 and FA intensity values showed significant differences in both regions (P Ͻ .01 for ET and P Ͻ .04 for NET), whereas the mean trace value was significantly different only in NET. Similarly, the peak of histogram of FA and trace showed significant differences (P Ͻ .022 in both regions in FA, and P ϭ .01 in NET in trace), whereas the minimum-intensity value showed marginally significant differences (P ϭ .047) only in NET in FA. The variance and the maximum value did not demonstrate significant differences in any of the images and regions. The regionwise comparison of the mean intensities is shown in Fig 3. Group separation was not feasible by analyzing the variables independently. In the next section, classification results of the proposed scheme implementing machine-learning tools for variable selection and classification are presented.

Variable Selection and Classification
Multivariate analysis with variable selection was applied to classify the 2 groups of patients. The method was evaluated with leave-3-out cross-validation on the decision tree (ie, 3 samples were left out each time for testing and the rest were used for training). The results for every test set were finally averaged. The average classification accuracy (percentage of correctly classified samples) was 85.1%, and the AUC of the ROC was 0.84. Because variable selection was internal in the cross-validation process, the selected variables and the decision tree varied for each fold. To summarize the results in a single decision tree, we also applied the method on all samples together. In this case (without cross-validation), the variables selected as the most discriminative were the extent of resec-tion, mass effect, volume (in cubic millimeters) of ET, maximum B0 intensity in NET, and mean trace intensity in NET. The classification accuracy on the training set was 90.5%, and the AUC was 0.96. The histograms of these variables for shortand long-term survival did largely overlap; thus, class separation would not have been possible on the basis of univariate analysis. Also, these variables (selected by the scatter search algorithm) differed from the variables that demonstrated the most significant group differences by t test (shown in "Univariate Statistical Analysis of the Image-Intensity Characteristics").
The decision tree constructed by all training samples by  using the 5 variables selected by the variable-selection method is visualized in Fig 4. Mass effect is on top of the tree, associating significant displacement with long-term survival (mass effect likely being partially a surrogate for lower tumor diffusivity and infiltration). The rest of the cases exhibit long-term survival if the mean trace intensity in the nonenhancing region is high and either the enhancing part is small and/or it is large preoperationally but is then completely resected and also the maximum B0 intensity in the nonenhancing region is high. Similarly, short survival is associated with the rest of the cases having small or moderate mass effect, low mean trace intensity in the nonenhancing region, and enhancing volume Ͻ4 cm 3 or enhancing volume Ͼ4 cm 3 and not completely resected, or completely resected but with low maximum B0 intensity in the nonenhancing region.

Kaplan-Meier Curves
It is well accepted that prognosis depends on histopathology or tumor grade (though recent studies on HGGs suggest that the association might not be significant 23 ). We compared the predictive ability of tumor grade with the power of the model (which is not based on histopathology). The prediction model was applied in a leave-1-out cross-validation scheme by using the 5 variables previously described. The survival curves for each case are shown in Fig 5. When tumors were classified according to histopathology (grade 3 versus grade 4), the survival of patients was not significantly different (P ϭ .17), whereas class distinctions according to the prediction model were significantly associated with survival outcome (P Ͻ 0.00001).

Discussion
In this study, we examined whether prediction models based on machine-learning algorithms provide a more accurate predictor of prognosis in malignant gliomas than does histopathologic classification alone. Furthermore, we investigated whether DTI and rCBV measurements in HGGs may serve as an adjunct to conventional imaging in predicting overall survival. The explored variables included clinical data, descriptive tumor characteristics as they appear on conventional MR imaging, and imaging properties in selective regions of advanced MR imaging. The computer-based method consisted of 2 main steps: First, the most informative variables were identi-fied; then, a decision tree algorithm was applied to differentiate short-and long-term survival.
Two ROIs were selected that included the enhancing neoplastic tissue and all hypointense tissue (neoplastic, edematous, and necrotic). The possibly heterogeneous hypointense tissue was examined together into a single ROI to simplify the data-preparation steps and also avoid segmentation errors because differentiation between edema and tumor infiltration is often ambiguous. The assumption is that the tissue heterogeneity will be captured by the extracted statistical characteristics (mean, minimum, maximum, histogram peak, and variance) of the imaging profile (eg, a low minimum value indicates the presence of necrosis).
Five variables were selected as the most informative, providing the highest classification accuracy, namely the volume of enhancing tissue, mass effect, extent of resection, and B0 and trace values in the nonenhancing/edematous ROIs. Thus, the results support our hypothesis that DTI contains information on HGG prognosis similar to its potential in evaluating cellularity and grading of gliomas. 24 This information seems to be mostly contained in the nonenhancing or edematous area. The volume of enhancing tissue, as segmented from the T1 contrast-enhanced image, was also an important predictor, as observed in similar studies investigating MR imaging characteristics of brain tumors. 3,11,25 The extent of resection was also selected as an important variable and was characterized by a binary decision function, incomplete or complete resection. Similarly in the study of Lacroix et al, 3 univariate and multivariate analyses showed that resection of Ն98% of the tumor volume was associated with a significant survival advantage. On the contrary, the extent of resection was not a statistically meaningful predictor of survival in the study of Pope et al. 26 The mass effect was placed on top of the decision tree in this study and related to survival in an opposite way than might be expected. Specifically, the data show that most the cases with significant mass effect (Ͼ5 mm) were associated with long survival time. A potential reason could be that aggressive gliomas, which usually have the worst prognosis, tend to be more diffuse and infiltrative, thereby causing relatively lower mass effect. This hypothesis, however, remains to be confirmed in future prospective studies. In previous studies, 3,23 mass effect did not show any association with survival when multivariate analysis was applied, whereas univariate Cox regression dem- onstrated statistical significance for a large mass effect (Ͼ10 mm). 23 ROI-based characteristics in rCBV were not among the top-ranked variables, whereas rCBV statistics have been associated with brain tumor type in previous work 11 and with histologic changes such as angiogenesis. 27 Direct comparison with such studies, however, should not be performed because tumor classification 11 and tumor growth 27 are different targets from survival prediction, which was the aim in this work.

Conclusions
Kaplan-Meier survival curves demonstrated that the constructed prediction model provides a more accurate predictor of prognosis in high-grade gliomas than pathologic classification. Further investigation with a larger patient population and histologic validation will be necessary to determine the robustness of the selected variables in predicting survival and to increase understanding of the morphologic and functional characteristics of brain tumors. In the future we would also like to include in the analysis therapy-associated factors, such as radiation dose or adjuvant chemotherapy, as well as genotypic information because it has been shown to be an important factor for clinical outcome in certain brain tumor types. [28][29][30]