Texture Analysis in Cerebral Gliomas: A Review of the Literature

SUMMARY: Texture analysis is a continuously evolving, noninvasive radiomics technique to quantify macroscopic tissue heterogeneity indirectly linked to microscopic tissue heterogeneity beyond human visual perception. In recent years, systemic oncologic applications of texture analysis have been increasingly explored. Here we discuss the basic concepts and methodologies of texture analysis, along with a review of various MR imaging texture analysis applications in glioma imaging. We also discuss MR imaging texture analysis limitations and the technical challenges that impede its widespread clinical implementation. With continued advancement in computational processing, MR imaging texture analysis could potentially develop into a valuable clinical tool in routine oncologic imaging.

Gliomas are central nervous system tumors of glial origin, with glioblastoma being the most common and aggressive subtype, having a median survival of 14.5 months and 10% survival at 5 years. 1 Despite advanced imaging, accurate noninvasive prediction of glioma grade, survival, molecular status, and treatment response remains challenging. Brain biopsy remains the reference standard for histologic and genetic classification, but it is invasive and costly. 2 Additionally, the inherently high molecular heterogeneity in gliomas may decrease the accuracy and prognostic value of stereotactic biopsy diagnosis. Moreover, despite stereotactic biopsy, the pathologic diagnosis may remain inconclusive in about 7%-15% of patients. 3,4 This scenario necessitates preoperative identification of imaging surrogates to accurately assess global tumor heterogeneity and predict glioma grade, genetic milieu, and survival. 5 Even though multiparametric MR imaging features show significant agreement in terms of morphologic features, some of which are strongly associated with poor survival, the accuracy of these imaging variables to predict genetic heterogeneity and prognosis is rather limited. 6,7 Similarly, advanced MR imaging techniques such as diffusion, perfusion, and MR spectroscopy have also been beneficial, but with modest success. 8 Texture analysis (TA) is a noninvasive method to quantify macroscopic tissue heterogeneity indirectly linked to microscopic tissue heterogeneity. Recently, MR imaging texture analysis (MRTA)-based studies have shown promise in predicting glioma grade, survival, molecular status, and response assessment. However, despite the continued work, consensus on the clinical role of MRTA remains elusive. Here we review the basic concepts behind MRTA, its applications in glioma imaging, its limitations, current challenges, and potential future directions.

MRTA: Concepts and Methodology
Texture, according to Merriam-Webster.com, is defined as "something composed of closely interwoven elements," just as the structure formed by threads of a fabric identifies its character. 9 Similarly, an image texture is a representation of pixel intensities, their distribution, and their interrelationships, which may or may not be discernible to the human eye. TA noninvasively measures tumor heterogeneity (through parameters like kurtosis, entropy, and pixel distribution that potentially correlate with cellular density, angiogenesis, and necrosis) and may better predict tumor biology. 10,11 The workflow of MRTA is represented in the Figure. Acquisition parameters such as magnet strength, spatial resolution, signal-to-noise ratio, and different pulse sequences may influence MRTA features. 12 Most interesting, however, variations in these parameters can provide supplementary texture information, an added advantage of MR imaging over other imaging modalities. 13 Postacquisition, an image undergoes preprocessing, which generally involves segmentation, image interpolation, intensity normalization, gray-level reduction, magnetic field inhomogeneity correction, and filtration. Performance of all these steps except segmentation is not a requirement for TA but helps enhance texture features and maintains uniformity and standardization.
Preprocessing steps can be performed on both open-source and commercially available software. The first step, segmentation, involves drawing an ROI manually or automatically either on a single 2D slice (or multiple slices) or a 3D-VOI. 14 Next, to improve matrix resolution, interpolation is applied, in which images are remapped to isotropic spacing to standardize the TA in all 3 directions. Furthermore, interpolation transforms the image into a higher matrix size and improves texture classification. 15 Different MR imaging sequences have different ranges of intensities for the same image. This variation is addressed through intensity normalization, which extends the gray-level distribution of each MR image to the whole value range (0-255). It enhances the contrast between the tumor and background tissues and is achieved by remapping the brightness to the minimum and maximum values in the histogram, by using the mean ± 3 SD, or by using the histogram range between the first and the 99th percentiles of the gray-scale image.
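As an illustration of the intensity normalization step, the following is a minimal, dependency-free Python sketch of two of the strategies mentioned above (mean ± 3 SD and the 1st-99th percentile window). The function names, the flat-list ROI representation, and the nearest-rank percentile rule are assumptions made for this example and are not taken from any particular MRTA software.

```python
import statistics

def _rescale(pixels, lo, hi, levels):
    """Clip intensities to [lo, hi] and map them linearly onto 0..levels-1."""
    if hi <= lo:  # degenerate window (e.g., constant ROI)
        return [0] * len(pixels)
    scale = (levels - 1) / (hi - lo)
    return [round((min(max(v, lo), hi) - lo) * scale) for v in pixels]

def normalize_mean_3sd(pixels, levels=256):
    """Normalize using the window mean - 3 SD .. mean + 3 SD."""
    mu = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)
    return _rescale(pixels, mu - 3 * sd, mu + 3 * sd, levels)

def normalize_percentile(pixels, levels=256, p_lo=1, p_hi=99):
    """Normalize using the 1st-99th percentile window (nearest-rank rule)."""
    ordered = sorted(pixels)
    last = len(ordered) - 1
    lo = ordered[round(last * p_lo / 100)]
    hi = ordered[round(last * p_hi / 100)]
    return _rescale(pixels, lo, hi, levels)

# Hypothetical ROI: two tissue intensities plus one extreme outlier.
roi = [100] * 50 + [200] * 50 + [10000]
by_percentile = normalize_percentile(roi)
by_mean_sd = normalize_mean_3sd(roi)
```

In this toy ROI, the percentile window ignores the single outlier, so the two tissue intensities are spread over the full 0-255 range, whereas the mean ± 3 SD window is widened by the outlier. Which strategy is preferable for a given dataset is a design choice that the sketch does not settle.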
The undesirable effect of magnetic susceptibility on image texture can also be mitigated by use of the filtration process. 16 Filtration can also be applied to derive new maps that individually extract and enhance subtle features otherwise lost while analyzing the original conventional image; that is, it converts an image into different anatomic scales ranging from 2 mm (fine features) through 3-5 mm (medium features) to 6 mm (coarse features). Furthermore, gray-level reduction is essential in the computation of gray-level matrices because TA can be computed on 16, 32, 64, 128, or 256 levels, whereas actual MR imaging ranges up to 1024 levels. 16 Because increasing the number of gray levels makes the matrices computationally intensive without an added advantage, gray-level matrices are generally computed at 5 or 6 bits per pixel. 17 Feature extraction is the next step and includes agnostic and semantic features. Semantic features include shape, necrosis, vascularity, location, and spiculation, and these can be quantified as well. Hundreds of features can be computed from available MRTA software. 18 To overcome the issue of redundancy and overfitting that may be seen with multiple extracted features, several classifier models (Fisher coefficient, principal component analysis [PCA], linear/nonlinear discriminant analysis, regression models, support vector machine [SVM] with recursive feature elimination, artificial neural network, and random forest classifiers) are used, as well as statistical methods to reduce the false discovery rate. These models extract the features that have the best discriminative power. 19 Alternatively, unsupervised deep learning models can also be used to agnostically generate discriminating texture features. This obviates generating thousands of random texture features and subsequent optimal feature selection as described above.
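The gray-level reduction step can be sketched as a simple requantization. The example below is an assumption-laden illustration in plain Python: it uses equal-width binning over the observed intensity range, which is one common choice but not necessarily the one used by any given MRTA package, and the function name is hypothetical.

```python
def reduce_gray_levels(pixels, bits=5):
    """Requantize intensities into 2**bits equal-width bins (0..2**bits - 1).

    bits=5 gives 32 levels and bits=6 gives 64 levels, matching the
    5 or 6 bits per pixel mentioned in the text.
    """
    levels = 2 ** bits
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # constant ROI: everything falls into bin 0
        return [0] * len(pixels)
    width = (hi - lo) / levels
    # The maximum intensity would land in bin `levels`, so cap it.
    return [min(int((v - lo) / width), levels - 1) for v in pixels]

# Hypothetical ROI spanning 1024 raw gray levels.
raw = list(range(1024))
reduced_32 = reduce_gray_levels(raw, bits=5)
reduced_64 = reduce_gray_levels(raw, bits=6)
```

With 32 levels, a co-occurrence matrix has 32 x 32 = 1024 cells instead of the 1024 x 1024 cells needed for the raw range, which is the computational saving the paragraph refers to.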

Types of TA
At present, statistical-, structural-, transform-, and spectral-based TAs are the most common agnostic methods used. Statistical-based TA depends on the pixel values, distribution, and spatial interrelationship in the defined ROI. 20 First-order statistical TA is a histogram representation of image intensities in a predefined ROI and calculates mean, median, percentile, SD, skewness, entropy, uniformity, and kurtosis. Mean is a measure of central tendency (average brightness), SD depicts dispersion from the mean, skewness reflects asymmetry of the histogram, kurtosis depicts the pointedness of the histogram (visual contrast), and entropy reflects the irregularity of the image-intensity distribution. The more heterogeneous the tumor, the higher the entropy is. 20 Second-order or higher order statistical TA quantifies the image pattern on the basis of the spatial relationship or co-occurrence of the pixel value. It consists of several methods, including the 2 most common ones: gray-level co-occurrence matrix (GLCM) and gray-level run-length matrix (GLRLM). The GLCM measures the frequency of pixel pair distribution at a predefined distance, 21 usually measured in 4 directions (0°, 45°, 90°, and 135°) for 2D and in 13 directions for 3D. 14 GLCM features include homogeneity, inverse difference moment (IDM), dissimilarity, correlation, energy, and entropy. GLRLM observes the run of a specific pixel value over a chosen direction and consists of gray-level nonuniformity, run-length nonuniformity, short-run emphasis, and long-run emphasis. Both GLCM and GLRLM are calculated in different directions and averaged to make them rotationally invariant. GLCM may be measured over different pixel distances (for example, from 1 to 5), and similarly, GLRLM is computed over different run lengths to compute different texture features from the same ROI. GLCM and GLRLM over short distance and run provide fine texture, and over longer distance and run provide coarse texture.
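To make the first-order and GLCM definitions above concrete, the following is a minimal Python sketch using only the standard library. It assumes a 2D ROI stored as a list of rows of integer gray levels that have already been reduced to a small number of levels. The feature formulas follow commonly used Haralick-style conventions (for example, energy as the sum of squared probabilities and kurtosis reported as excess kurtosis); individual software packages may define them slightly differently, and all function names here are hypothetical.

```python
import math
from collections import Counter

def first_order_features(pixels):
    """Histogram (first-order) features of a flat list of integer gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in pixels) / n)
    probs = [count / n for count in Counter(pixels).values()]
    feats = {
        "mean": mean,
        "sd": sd,
        "entropy": -sum(p * math.log2(p) for p in probs),
        "uniformity": sum(p * p for p in probs),
        "skewness": 0.0,
        "kurtosis": 0.0,
    }
    if sd > 0:
        feats["skewness"] = sum((v - mean) ** 3 for v in pixels) / (n * sd ** 3)
        # Excess kurtosis: 0 for a normal distribution.
        feats["kurtosis"] = sum((v - mean) ** 4 for v in pixels) / (n * sd ** 4) - 3.0
    return feats

# Pixel offsets (dx, dy) for the 4 in-plane directions at distance 1.
DIRECTIONS_2D = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("ROI too small for this offset")
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Energy, entropy, dissimilarity, and IDM of a normalized GLCM."""
    feats = {"energy": 0.0, "entropy": 0.0, "dissimilarity": 0.0, "idm": 0.0}
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if pij == 0:
                continue
            feats["energy"] += pij * pij
            feats["entropy"] -= pij * math.log2(pij)
            feats["dissimilarity"] += pij * abs(i - j)
            feats["idm"] += pij / (1 + (i - j) ** 2)
    return feats

def averaged_glcm_features(image, levels, distance=1):
    """Average GLCM features over the 4 directions for rotational invariance."""
    per_direction = [
        glcm_features(glcm(image, dx * distance, dy * distance, levels))
        for dx, dy in DIRECTIONS_2D.values()
    ]
    return {
        name: sum(f[name] for f in per_direction) / len(per_direction)
        for name in per_direction[0]
    }

# Small 4-level test image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(image, 1, 0, levels=4)
texture = averaged_glcm_features(image, levels=4)
histogram = first_order_features([v for row in image for v in row])
```

Increasing the `distance` argument corresponds to measuring the GLCM over longer pixel distances, which, as noted above, shifts the features from fine toward coarse texture. The sketch covers 2D only; the 13-direction 3D case and the GLRLM are omitted for brevity.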
Different texture features can also be calculated in statistical methods by application of filters such as bandpass or nonorthogonal wavelet transform, which allow extraction of fine (≤2 mm), medium (3-5 mm), and coarse texture (>6 mm) using different filter values. 22 Local binary patterns have high discriminative power and calculate the pixel value by comparing it with neighboring pixels and then assigning a binary value. Other higher order statistics include busyness, coarseness, and contrast, which calculate the spatial relationship among ≥3 gray-level pixel values. 16 Structural (model-based) methods such as fractal analysis provide information about the self-similarity of the objects. These are computationally intensive and less preferred. Spectral methods include wavelet, Gabor, and Fourier transforms and are based on transforming the spatial information of the image into spatial frequencies. 22 In general, the first- and second-order statistical methods are used most commonly. First-order statistical methods provide global information, and second-order statistical methods provide additional information regarding the transition among pixel values. An important consideration is that 2 different tumors may have similar distribution of intensities but may differ in their spatial interrelationship; thus, histogram TA may be limited in such a setting. Second-order statistical TA may be preferable, especially for markedly heterogeneous tumors. 23,24
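The limitation of histogram TA noted above can be shown with a toy example. In the following Python sketch, 2 synthetic binary "images" (hypothetical data, not from any study) have identical intensity histograms but different spatial arrangements; a second-order feature (IDM computed from horizontally adjacent pixel pairs at distance 1) separates them, whereas any first-order feature cannot.

```python
from collections import Counter

def histogram(image):
    """Count of each gray level in a 2D image (list of rows)."""
    return Counter(v for row in image for v in row)

def horizontal_idm(image):
    """Inverse difference moment over horizontally adjacent pixel pairs."""
    acc, pairs = 0.0, 0
    for row in image:
        for a, b in zip(row, row[1:]):
            acc += 1 / (1 + (a - b) ** 2)
            pairs += 1
    return acc / pairs

# Two homogeneous halves versus finely alternating stripes.
blocks = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
stripes = [[0, 1, 0, 1, 0, 1, 0, 1] for _ in range(8)]

same_histogram = histogram(blocks) == histogram(stripes)
idm_blocks = horizontal_idm(blocks)    # mostly identical neighbors
idm_stripes = horizontal_idm(stripes)  # every neighbor pair differs
```

Both images contain 32 pixels of each gray level, so mean, SD, skewness, kurtosis, and histogram entropy are identical, while the IDM is higher for the blocked image than for the striped one.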

MRTA Applications in Glioma Imaging
MRTA applications in gliomas are an active area of research, and multiple reports have shown promising results (On-line Tables 1-4). For the sake of simplicity, we have condensed various studies into 4 broad categories: MRTA for glioma grading, predicting survival, glioma radiogenomics, and a miscellaneous category of studies differentiating gliomas from other CNS tumors and assessing treatment changes.

Glioma Grading
The World Health Organization classifies gliomas as low grade (I and II) and high grade (III and IV). 25 Pretherapy determination of glioma grade can help optimize treatment strategy and predict therapeutic response, prognosis, and survival. 10,26 On-line Table 1 summarizes prior studies evaluating MRTA for glioma grading. 2,10,27-32 Some of these are briefly discussed below. In general, most studies used either ADC maps, T1-contrast-enhanced (CE) MR imaging, or a multiparametric technique along with a transform statistical (filtration-histogram) technique or purely statistical (first- or second-order) TA. Despite variabilities in TA software, entropy values of the ADC maps consistently showed promising results for differentiating low-grade gliomas (LGGs) from high-grade gliomas (HGGs). Skogen et al 29 performed histogram-based TA in 95 patients using CE-MR images and found SD parameters at a fine texture highly significant (area under the curve [AUC], 0.910) in distinguishing LGGs from HGGs. Tian et al 30 performed multiparametric TA in 153 patients (grades II-IV) using an SVM classifier model. They reported 98% accuracy of MRTA features for glioma grading. They also observed that while multiparametric TA performed better in comparison with single-sequence TA, T1-CE was the best single sequence. Xie et al 31 observed that entropy (AUC = 0.885) and IDM (AUC = 0.901) of model-free and a dynamic contrast-enhanced MR imaging-based model were able to differentiate grade III from grade IV and grade II from grade III gliomas.

Glioma Survival Analysis
Prior studies have used features such as age, extent of resection, degree of necrosis, Karnofsky scores, and enhancing tumor size as prognostic predictors. 33 On-line Table 2 summarizes MRTA studies predicting survival in gliomas. 34-44 As with the grading studies, these studies also had considerable heterogeneity in terms of methodologies and classifier models. Notably, most studies found CE-MR imaging sequences to be the most useful for predicting survival. Yang et al, 36 for example, noted that even though several texture parameters predicted 12-month survival, CE-MR imaging sequences were the most accurate. They also mentioned that single-image features or MR images may not suffice because different combinations of image features and sequences are predictive for different tasks. Another multiparametric study by Kickingereder et al 40 in 119 patients using supervised PCA predicted progression-free and overall survival after extracting 11 second-order texture features. The MRTA features outperformed clinical and radiologic risk models in predicting prognosis. Another multiparametric MRTA study by Upadhaya et al 35 in 40 patients extracted the top 5 texture features from CE-MR images with an accuracy of 83% in predicting 15-month survival. Liu et al 44 (n = 119) also noted the best survival prediction on CE-MR imaging sequences (AUC, 0.791; accuracy, 80.7%). They also discovered that texture features derived from CE-MR imaging were comparable with features derived from a combination of multiple sequences.

Glioma Radiogenomics
The 2016 World Health Organization classification update of gliomas incorporates genetic information for diagnosis. Radiogenomics refers to the relationship between imaging phenotypes and genomics that might allow improved decision-making and consequently improved patient outcomes. 5 Established glioma biomarkers include isocitrate dehydrogenase (IDH), 1p/19q-codeletion, and methylguanine methyltransferase status. Immunohistochemistry combined with genome sequencing is a standard method for identifying glioma mutations. 45,46 Many studies have correlated multiparametric imaging features with glioma mutations and, to date, have shown greater success for IDH status compared with other mutations. Currently, standard glioblastoma therapy does not include mutation-specific treatment based on molecular status. 47 Multiple ongoing clinical trials are, however, evaluating targeted treatments in gliomas. 1 On-line Table 3 55 also showed that the joint variable derived from T1WI, T2WI, and CE-MR imaging histograms and GLCM features can be used for precise detection of IDH1-mutated gliomas. TA using B0 and fractional anisotropy maps has also shown a high accuracy of 95% in predicting IDH status. 50 Bahrami et al 51 reported greater FLAIR tissue heterogeneity and lower edge contrast in IDH wild-type gliomas compared with IDH mutants. Jakola et al 53 also reported greater accuracy for predicting IDH mutation using 3D-FLAIR. IDH-mutant 1p/19q-codeleted gliomas also have shown similar results compared with a 1p/19q-intact group and an unmethylated group. Shofty et al 52 used retrospective data from various MR imaging scanners with variable parameters. Despite the considerable data heterogeneity, they successfully predicted 1p/19q codeletion and discriminated LGGs on the basis of 1p/19q-codeletion status with an accuracy of 87% by extracting the top 39 texture features, mostly from CE-MR imaging and T2WI.
Li et al 56-58 accurately predicted alpha thalassemia/mental retardation syndrome, epidermal growth factor receptor, and p53 status in patients with LGG on T2WI. In general, the second-order TA on CE-MR imaging and FLAIR images mostly contributed to the high accuracy for predicting genomic status.

Miscellaneous Applications
Glioblastoma imaging features may overlap with those of primary central nervous system lymphomas (PCNSLs) and metastases, rendering a noninvasive distinction truly challenging. 59 Recent MRTA studies, however, have shown promising results in differentiating glioblastomas from PCNSLs and metastases (On-line Table 4). 59-63,66 Kunimatsu et al 60,61 differentiated glioblastomas from PCNSLs with 75% accuracy by selecting the top 4 best-performing texture features from CE-MR images. Xiao et al 62 found skewness and kurtosis to be the best first-order features on CE-MR imaging in a similar population. Suh et al 59 reported 90% accuracy of radiomics-based machine learning algorithms compared with visual analysis by 3 readers in differentiating PCNSLs from glioblastomas. Similarly, Alcaide-Leon et al 63 showed superiority of the SVM classifier over human evaluation. Overall, better results were found using CE-MR imaging and machine-classification models. Dynamic histogram analysis is a novel technique using histogram-based texture parameter analysis on a time-series of dynamic susceptibility contrast MR imaging. Dynamic texture parameter analysis is a further extension of dynamic histogram analysis that analyzes a larger set of time-dependent texture maps from dynamic susceptibility contrast-enhanced series. 64,65 By using dynamic texture parameter analysis, Verma et al 66 extracted texture features from the earliest contrast phase of dynamic susceptibility contrast-enhanced perfusion maps and differentiated glioblastomas from PCNSLs. Skogen et al 67 used MRTA on DTI-derived fractional anisotropy and ADC maps and reported significantly higher heterogeneity in peritumoral edema of glioblastomas compared with metastases.
Assessment of the therapeutic response using only the Response Assessment in Neuro-Oncology criteria, which are based solely on 2D size and enhancement, may be challenging. 43 Recently, Ismail et al 68 (n = 105) extracted the 2 most discriminative 3D shape features of the enhancing tumor on CE-MR imaging, FLAIR, and T2WI and noted that 3D shape features could distinguish pseudoprogression from true progression. TA may provide useful prognostic information regarding progression and survival in such a patient population. Grossmann et al 41 found that "information correlation," a GLCM parameter, had a significantly higher score in patients on bevacizumab surviving beyond 3 months. Bahrami et al 43 reported that lower edge contrast of the FLAIR signal of gliomas correlated with poor survival after bevacizumab.
Despite the heterogeneity of the data and software, most studies demonstrate the robustness of the MRTA and its clinical transferability for diagnostic use. Second-order statistical TA showed promising results in most studies. Entropy also appears to be a key feature. Quite possibly, multisequence-based MRTA may have higher accuracy. However, it may be time-consuming, and not all advanced sequences are widely available. Performing MRTA on commonly available CE-MR imaging as well as T2-weighted/FLAIR sequences may be optimal for standardization, given the wide availability and promise shown in early studies. In studies involving LGGs, it may be better to perform MRTA on T2-weighted/FLAIR sequences because they better identify the tumor. On the other hand, CE-MR imaging appears to be the single best sequence in glioblastoma, as mentioned in a study by Liu et al. 44

Challenges and Future Directions
Despite the advantages, widespread clinical implementation of MRTA is still limited, mostly due to nonuniformity and lack of standardization and quantification processes. The real challenge lies in the reproducibility and repeatability of these studies. Multiple studies used indigenous MRTA software, likely with varying algorithms. Thus, studies differ not only in image acquisition but also MRTA methodologies.
The other important aspect is use of a cancer imaging data base, which may suffice for conventional multiparametric assessment but nevertheless has considerable heterogeneity in sequences, protocols, and vendors. This is not confined just to the cancer data base but is a practical consideration for any multicenter study.
The impact of acquisition parameters on MRTA has been addressed in multiple studies. Ford et al, 69 using a digital 3D phantom, concluded that multiple texture features vary considerably between T1-weighted images (spin-echo, gradient echo, gradient recalled-echo, and inversion recovery) and T1 maps. They also noted that TR/TE variations on T1WI and T2WI affect texture features. Another phantom-based study by Buch et al 12 assessed the effect of magnet strength, flip angles, number of excitations, and different scanner platforms and concluded that some texture features are more robust (for example, except for histogram-related median, entropy, and GLCM contrast, all other histogram, GLCM, GLRLM, gray-level gradient matrix, and Laws features did not differ significantly across flip angles) and some are more susceptible to acquisition parameters (all Laws features were significantly different for different magnetic field strengths). Yang et al 70 found that different reconstruction algorithms, noise levels, and parallel imaging acceleration factors can influence texture parameters. Texture features are also affected by the number of coil elements, coil arrangement, and k-space sampling. 71 Rapid k-space sampling techniques can reduce SNR, thus affecting TA, especially histogram intensity-based features. 12,71,72 The inclusion of preprocessing steps may also affect texture features. Mayerhoefer et al 15 found zero-filling interpolation with an interpolation factor of 4 to be optimal for improving texture performance. Both Waugh et al 72 and Mayerhoefer et al 13 found spatial resolution to be the most important factor affecting MRTA and that variability in TR/TE, sampling bandwidth, and number of excitations is not significant at higher resolution. However, Molina et al 73 found that several GLCM and GLRLM texture features computed on 3D segmentation of brain gliomas were not robust over different spatial resolution/matrix size and gray-level ranges. They found entropy to be the most robust feature. For intensity normalization, Collewet et al 74 found mean ± 3 SDs to be the optimal strategy. Partial volume artifacts can be corrected by iterative optimal thresholding algorithms. 12
In terms of analysis, the choice between analyzing multiple-versus-few sequences for MRTA also needs to be addressed. Spin-echo sequences are often acquired routinely in suspected brain tumors, while advanced imaging may not be routinely performed, especially on the index scan. TA based on conventional sequences seems more practical in terms of generalizability, with T1-CE-based TA being optimal.
3D-TA appears more accurate than 2D, given the high spatial resolution of the acquired data. Similarly, results based on a VOI analysis appear more reliable than those based on a single slice (also a prominent limitation of multiple prior studies). 75 However, more studies are needed to further establish better accuracy of 3D-MRTA and justify the additional time and effort. 14 All these factors re-emphasize the need for standardization of MR imaging protocols, including uniform postprocessing techniques, to allow a more valid, multiple-institution comparison of MRTA results.
Challenges in processing include the inhomogeneity of MRTA software, which may be commercial, open-source, or developed in-house. The superiority of one over the other remains speculative at best. 76 Future studies should assess the comparability and accuracy of results across multiple types of software, especially in terms of clinical outcomes, survival, and radiomic parameters, to help with standardization. Finally, adequate training of radiologists is also required for consistent evaluation and implementation in routine workflow.
Another factor is the problem of the "huge data" that need sorting to prevent redundancy. Several classifier models exist to accurately predict the optimal texture feature. However, there is no consensus as to whether one is superior to the others. Artificial intelligence may be helpful in this case, both in feature selection and building prediction models.
Additionally, even though MRTA has shown potential in neuroimaging, certain valid criticisms of this technique should also be acknowledged. One major criticism of MRTA is that it is not hypothesis-driven. In some ways, MRTA is essentially correlating different mathematic computations with various imaging and clinical parameters to see what is statistically significant. This is, however, problematic for 2 main reasons: First, there is no intuitive reason why mathematic variables would make physiologic sense. Whether these significant relationships are merely chance findings secondary to overfitting (see next paragraph) or reflect as-yet unexplored physiologic correlates currently remains unclear. Most interesting, some prior studies have shown correlations between CT texture parameters and histologic markers such as CD34 and Ki-67, findings that may support some tissue-level basis for texture parameters. 11 These, however, remain to be fully determined and validated.
The other major limitation is the problem of overfitting, which can occur when the number of independent parameters being analyzed is larger than the number of data points/sample size. Generally, it is recommended that the sample size be 5-10 times the number of analyzed variables, which is often not the case, especially with studies using a smaller sample size. This issue could be addressed through either larger datasets or analysis of only a few preselected robust variables. Another way to avoid overfitting is to split the data into 3 mutually exclusive sets, one each for training, testing, and finally validation.
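The 2 safeguards described above (limiting the number of variables relative to sample size and splitting the data into 3 mutually exclusive sets) can be sketched in a few lines of Python. The 60/20/20 split, the fixed random seed, and the function names are assumptions chosen for illustration; the text does not prescribe specific proportions.

```python
import random

def max_features_for(sample_size, cases_per_feature=10):
    """Largest number of variables allowed by the 5-10 cases-per-variable rule."""
    return sample_size // cases_per_feature

def split_three_ways(case_ids, percent=(60, 20, 20), seed=0):
    """Shuffle cases and split them into disjoint training, testing, and
    validation sets. `percent` is an illustrative assumption."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n_train = len(ids) * percent[0] // 100
    n_test = len(ids) * percent[1] // 100
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]  # remainder
    return train, test, validation

# Hypothetical cohort of 100 patients identified by index.
train, test, validation = split_three_ways(range(100))
strict_limit = max_features_for(100, cases_per_feature=10)
lenient_limit = max_features_for(100, cases_per_feature=5)
```

For a hypothetical cohort of 100 patients, the rule of thumb would cap the analysis at 10 to 20 variables, far fewer than the hundreds of texture features that MRTA software can generate, which is why feature selection or preselection of robust variables is needed.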
Finally, the role of MRTA should also be evaluated in the context of deep learning and neural networks. Even though unsupervised deep learning can identify features on its own and needs neither manual input (thereby reducing interobserver bias in ROI selection) nor feature selection, 76 deep learning methods require higher processing power and considerable high-quality ground truth data. The insatiable appetite of deep learning for large quantities of labeled training data (which are both expensive and difficult to produce) is another limitation of the deep learning approach. 77 MRTA, on the other hand, is less data-hungry. Additionally, the internal algorithm feature vectors in unsupervised deep learning may not always be apparent (black box), while TA features can be explained more easily. However, ROI selection bias among observers can influence MRTA results and should be addressed prospectively. 75 Nevertheless, the 2 techniques may be complementary in terms of optimal feature selection (in deep learning) and ease of use for wider applicability (for MRTA), thereby providing optimal output without substantial changes to the clinical workflow.

CONCLUSIONS
MRTA has shown promising results in various glioma-related applications. The inclusion of tumor heterogeneity as a radiology-reporting variable appears to break with the notion of radiology being only diagnostic or qualitative and signals a shift toward prognostic value as an imaging biomarker for precision/personalized medicine. However, before widespread clinical applicability, prospective validation of accuracy, selection of robust sequences, interinstitutional congruity of results, and selection of the best possible technique need to be addressed. Last, development of automated segmentation tools with incorporation of machine learning is essential to expedite feature extraction and analysis, thus saving time and reducing the additional burden on the radiologist.