Neuroimaging-Based Classification Algorithm for Predicting 1p/19q-Codeletion Status in IDH-Mutant Lower Grade Gliomas

BACKGROUND AND PURPOSE: Isocitrate dehydrogenase (IDH)-mutant lower grade gliomas are classified as oligodendrogliomas or diffuse astrocytomas based on 1p/19q-codeletion status. We aimed to test and validate neuroradiologists' performances in predicting the codeletion status of IDH-mutant lower grade gliomas based on simple neuroimaging metrics.
MATERIALS AND METHODS: One hundred two IDH-mutant lower grade gliomas with preoperative MR imaging and known 1p/19q status from The Cancer Genome Atlas composed a training dataset. Two neuroradiologists in consensus analyzed the training dataset for various imaging features: tumor texture, margins, cortical infiltration, T2-FLAIR mismatch, tumor cyst, T2* susceptibility, hydrocephalus, midline shift, maximum dimension, primary lobe, necrosis, enhancement, edema, and gliomatosis. Statistical analysis of the training data produced a multivariate classification model for codeletion prediction based on a subset of MR imaging features and patient age. To validate the classification model, 2 different independent neuroradiologists analyzed a separate cohort of 106 institutional IDH-mutant lower grade gliomas.
RESULTS: Training dataset analysis produced a 2-step classification algorithm with 86.3% codeletion prediction accuracy, based on the following: 1) the presence of the T2-FLAIR mismatch sign, which was 100% predictive of noncodeleted lower grade gliomas (n = 21); and 2) a logistic regression model based on texture, patient age, T2* susceptibility, primary lobe, and hydrocephalus. Independent validation of the classification algorithm rendered codeletion prediction accuracies of 81.1% and 79.2% in 2 independent readers. The metrics used in the algorithm were associated with moderate-substantial interreader agreement (κ = 0.56–0.79).
CONCLUSIONS: We have validated a classification algorithm based on simple, reproducible neuroimaging metrics and patient age that demonstrates a moderate prediction accuracy of 1p/19q-codeletion status among IDH-mutant lower grade gliomas.

The revised World Health Organization (WHO) 2016 classification of diffuse gliomas integrates isocitrate dehydrogenase (IDH) gene status and whole-arm codeletion of chromosome arms 1p and 19q with histologic findings to classify grades II and III diffuse lower grade gliomas (LGGs). 1,2 More than 80% of LGGs are IDH-mutant; of those, 37%-50% carry the 1p/19q codeletion. 3,4 The 1p/19q-codeleted IDH-mutant LGGs (oligodendrogliomas; IDHmut-Codel) show better overall survival compared with noncodeleted IDH-mutant LGGs (astrocytomas; IDHmut-Noncodel) and are more sensitive to adjuvant chemotherapy with procarbazine, lomustine, and vincristine. [5][6][7] The integration of genomic data in the updated WHO classification of LGGs has accelerated efforts to noninvasively predict genetic signatures of diffuse gliomas using neuroimaging techniques. While numerous studies have identified neuroimaging features that correlate with 1p/19q-codeletion status in LGG subtypes, many were performed before the 2016 WHO update, affecting their patient-selection process (ie, no accounting for IDH status). Moreover, these studies have applied variable imaging metrics and neuroimaging analysis, rendering it difficult to assess the relative and combined diagnostic performance of the various neuroimaging metrics reported to be associated with 1p/19q-codeletion status. Finally, simple metrics extrinsic to the glioma (eg, hydrocephalus, midline shift) have not been tested. The purpose of our study was to test and validate the combined accuracy of simple neuroimaging features to predict 1p/19q-codeletion status among cohorts of IDH-mutant LGGs.

MATERIALS AND METHODS
This was a Health Insurance Portability and Accountability Act-compliant retrospective study conducted with the University of Virginia Health System institutional review board approval.
The study consisted of 2 phases. First, analysis of a training dataset yielded a multivariate classification algorithm for predicting 1p/19q-codeletion status. Second, the classification algorithm was validated using a separate dataset of cases and separate neuroradiologist readers.

Training Dataset Analysis
The cases composing the training dataset were accrued from The Cancer Imaging Archive (TCIA), an LGG on-line data base. 29 The TCIA data base houses MR imaging data for 199 LGGs, with molecular data (including IDH and 1p/19q statuses) available through The Cancer Genome Atlas (TCGA). The inclusion criteria were the following: 1) LGG with histopathologic assessment and grade, 2) LGG with an IDH mutation, 3) LGG with known 1p/19q-codeletion status, and 4) preoperative MR imaging (or MR imaging after a small-needle biopsy) containing imaging sequences relevant to the below-described neuroimaging classification. IDH wild-type gliomas (n = 42), cases with incomplete molecular/pathologic data (n = 2), and cases with insufficient MR imaging data (n = 53) were excluded from the study, rendering 102 IDH-mutant LGGs included in the training dataset.
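The case accounting above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the counts are those reported in the text, while the variable names and dictionary keys are illustrative labels rather than terms defined by TCIA or TCGA.

```python
# Cohort bookkeeping for the training dataset (counts taken from the text).
TCIA_LGG_TOTAL = 199

# Exclusion labels are illustrative names chosen for this sketch.
exclusions = {
    "IDH wild-type": 42,
    "incomplete molecular/pathologic data": 2,
    "insufficient MR imaging data": 53,
}

# Cases remaining after all exclusions are applied.
included = TCIA_LGG_TOTAL - sum(exclusions.values())
print(f"Included IDH-mutant LGGs: {included}")
```

The result matches the 102 IDH-mutant LGGs reported for the training dataset.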
Two neuroradiologists, with 5 and 13 years of experience, blinded to the 1p/19q-codeletion status, analyzed the MR images from the training dataset in consensus. They measured 14 neuroimaging metrics: 1) primary lobe: yes/no centered on frontal lobe; 2) texture: more or less than 75% of the tumor showing homogeneous signal intensity on T1WI/T2WI; 3) margins: more or less than 75% of the tumor showing sharp/circumscribed margins; 4) T2-FLAIR mismatch sign: the presence or absence of complete/near-complete hyperintense signal on T2WI and relatively hypointense signal on FLAIR except for a hyperintense peripheral rim 19; 5) T2* susceptibility blooming: present or absent; 6) contrast enhancement: present or absent; 7) cysts: present or absent; 8) necrosis: present or absent; 9) maximum tumor diameter (centimeters); 10) cortical infiltration: present or absent; 11) peritumoral edema: present or absent; 12) gliomatosis: yes/no involvement of ≥3 lobes; 13) midline shift (centimeters); and 14) hydrocephalus: present or absent. Figures 1 and 2 show the characteristic imaging appearance of IDHmut-Noncodel and IDHmut-Codel LGGs, respectively, including a description of several of the above imaging metrics. Univariate and multivariate logistic regression analysis of the MR imaging characteristics and patient age for predicting the 1p/19q-codeletion status was undertaken. On the basis of these results, a classification algorithm for 1p/19q prediction was developed.

Validation Analysis
To validate the classification algorithm developed with the training dataset, 2 new neuroradiologists analyzed a separate institutional cohort of IDH-mutant LGGs. The same selection criteria used for the training dataset were applied, and 106 IDH-mutant LGGs consecutively accrued from an institutional neuro-oncology/neuroradiology data base between 2010 and 2017 composed the validation cohort. The neuroradiologists (reader A with 3 years of experience, reader B with 19 years of experience), blinded to the 1p/19q-codeletion status, independently reviewed the MR images of these cases. The readers analyzed the MR imaging metrics relevant to the classification model with the same criteria used for the training dataset. Interreader agreement for the neuroimaging metrics and independent reader performance in predicting 1p/19q-codeletion status were determined.
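Interreader agreement on a categoric metric is commonly summarized with the unweighted Cohen κ. The following is a minimal sketch of that computation, not the software used in the study; the function name and the toy ratings are hypothetical and serve only to illustrate the formula κ = (observed agreement − chance agreement) / (1 − chance agreement).

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reader_a: Sequence[Hashable], reader_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen kappa for two readers rating the same cases."""
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("Both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed proportion of cases on which the readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's marginal frequencies.
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = feature present, 0 = absent) for 10 cases.
toy_reader_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
toy_reader_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
toy_kappa = cohen_kappa(toy_reader_a, toy_reader_b)
print(f"kappa = {toy_kappa:.2f}")
```

For these toy ratings, the readers agree on 8 of 10 cases and chance agreement is 0.52, so κ is approximately 0.58.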

Neuropathology
For the TCIA/TCGA cohort, histopathologic assignment and molecular classification were derived from supplemental material in Ceccarelli et al, 30 in 2016, and included somatic mutation analysis of IDH1 or IDH2 from whole-exome sequencing and codeletion of chromosome arms 1p and 19q from the SNP Array 6.0 (Affymetrix, Santa Clara, California).
For the validation cohort, the IDH and 1p/19q statuses were retrieved from the electronic medical record. Both markers were tested in the Clinical Laboratory Improvement Amendments-certified molecular pathology laboratory at our institution. IDH mutation status was first determined by immunohistochemistry using an IDH1 R132H mutant-specific clinically validated antibody (DIA-H09; Dianova GmbH, Hamburg, Germany). 31,32 In immunohistochemistry cases negative for IDH1 R132H mutation, IDH mutation status was assessed by the clinically validated DNA pyrosequencing assay. 33 1p/19q status was determined by fluorescence in situ hybridization on paraffin-embedded tissue, using human probes localizing 1p, 1q, 19p, and 19q (Locus Specific Identifier 1p36/1q25 and 19p13/19q13 Dual-Color Probes; Vysis, Downers Grove, Illinois).

Statistical Analysis
The following is an abbreviated description of the statistical methodology; a full description is included in the On-line Appendix.
Training Dataset. Univariate logistic regression analysis of the 14 MR imaging characteristics and patient age for predicting 1p/19q-codeletion status was undertaken for the TCIA/TCGA-derived training dataset. Because the presence of the T2-FLAIR mismatch sign showed 100% positive predictive value (PPV) for the IDHmut-Noncodel molecular subtype (see Results below), these cases were segregated from the cohort. A multivariate logistic regression using the remaining aforementioned predictor variables was applied in a 2-step analytic process to the remaining cases in the cohort (those that either were negative for the T2-FLAIR mismatch sign or had no T2-FLAIR mismatch sign information available). First, a full model was constructed with the goal of identifying predictor variables that contribute unique information about 1p/19q-codeletion status based on a set of type III Wald χ² tests at the α = .10 threshold. Second, a reduced model was constructed using only the unique predictors of 1p/19q-codeletion status. The regression equation of the reduced multivariate model was then used to compute the predicted probability for codeletion status, and these predicted probabilities were used to derive a classification algorithm rule. The predicted probability threshold for the classification rule of the algorithm was derived by identifying the predicted probability threshold that produced the largest Youden J statistic (J = Diagnostic Sensitivity + Diagnostic Specificity − 1). 34
Validation Dataset. Interreader agreement for readers A and B was evaluated via the unweighted κ statistic. Cases that were deemed by the independent readers to be positive for the T2-FLAIR mismatch sign were included as "true-negatives" (ie, 1p/19q noncodeleted), and the training set-derived reduced multivariate logistic regression model equation was applied to the reader data in cases negative for T2-FLAIR mismatch. Cases were classified as either 1p/19q codeleted or noncodeleted based on whether the predicted probability was greater than or equal to, or less than, the established predicted probability threshold of the classification algorithm, respectively. The overall diagnostic classification performance was assessed per reader.
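The threshold selection described for the training dataset can be illustrated with a short sketch. This is a plain-Python stand-in under stated assumptions, not the pROC-based analysis used in the study: the function name is hypothetical, the predicted probabilities and labels are invented toy values, and a case is called positive when its probability is greater than or equal to the candidate threshold.

```python
from typing import Sequence, Tuple


def youden_threshold(probs: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = codeleted (positive class), 0 = noncodeleted.
    A case is predicted positive when its probability is >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("Both classes must be present")
    best_threshold, best_j = 0.0, -1.0
    # Each observed probability is a candidate cutoff.
    for t in sorted(set(probs)):
        true_pos = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        true_neg = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Invented toy data for illustration only.
toy_probs = [0.10, 0.20, 0.35, 0.45, 0.60, 0.80, 0.90]
toy_labels = [0, 0, 1, 0, 1, 1, 1]
toy_cutoff, toy_j = youden_threshold(toy_probs, toy_labels)
print(f"threshold = {toy_cutoff}, J = {toy_j:.2f}")
```

On the toy data, the cutoff of 0.60 gives a sensitivity of 0.75 and a specificity of 1.00, hence J = 0.75, the largest value among the candidates.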
Statistical Software. The statistical software package Spotfire S+, Version 8.2 (TIBCO, Palo Alto, California) was used to conduct the multivariate logistic regression (MLR) analyses, and the pROC package of R (http://www.r-project.org/) was used to conduct the diagnostic classification performance analyses. 35,36

RESULTS
Training Dataset Analysis
Of the 102 patients with IDH-mutant LGGs in the training dataset, 51% were women (n = 52) and 49% were men (n = 50) (Table 1). Notably, the T2-FLAIR mismatch sign showed PPV = 100% and negative predictive value = 44% for the IDHmut-Noncodel subtype. In 100% of cases (n = 21) in which the T2-FLAIR mismatch sign was present, the glioma was 1p/19q-noncodeleted. Therefore, the cases in which the T2-FLAIR mismatch sign was present were segregated from the cohort, and a multivariate logistic regression analysis was undertaken in the remaining cases (n = 81).
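The predictive values quoted above follow the standard definitions, which the sketch below spells out. The function names are hypothetical. Only the PPV is computed from counts stated in the text (21 sign-positive cases, all noncodeleted); the counts behind the reported negative predictive value are not given here, so that function is shown without study data.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV = TP / (TP + FP): fraction of sign-positive cases with the target subtype."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("No test-positive cases")
    return true_pos / total


def negative_predictive_value(true_neg: int, false_neg: int) -> float:
    """NPV = TN / (TN + FN): fraction of sign-negative cases without the target subtype."""
    total = true_neg + false_neg
    if total == 0:
        raise ValueError("No test-negative cases")
    return true_neg / total


# Test: T2-FLAIR mismatch sign present; target: 1p/19q-noncodeleted subtype.
# All 21 sign-positive training cases were noncodeleted, so there are no false positives.
mismatch_ppv = positive_predictive_value(true_pos=21, false_pos=0)
print(f"PPV of the T2-FLAIR mismatch sign = {mismatch_ppv:.0%}")
```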
Multivariate Analyses. On the basis of a full multivariate logistic regression model analysis, tumor texture (P < .001), patient age (P = .010), T2* susceptibility blooming (P = .022), primary lobe (P = .039), and hydrocephalus (P = .052) were determined to be uniquely associated with 1p/19q-codeletion status, and these metrics were used to create a reduced multivariate logistic regression model. A predicted probability threshold of 0.40 resulted in the largest Youden J statistic for the reduced multivariate logistic regression model. Finally, a 2-step classification algorithm was created on the basis of the following: 1) the presence of the T2-FLAIR mismatch sign; and 2) a reduced multivariate logistic regression model, with application of the Youden J statistic-derived predicted probability threshold of 0.40 (Fig 3). The 2-step classification algorithm demonstrated 86.3% accuracy in predicting the 1p/19q-codeletion status among the IDH-mutant LGGs in the training dataset (Table 2).
The performance of the 2-step classification algorithm (Fig 3) was assessed using the independently collected data from readers A and B. Reader A identified the T2-FLAIR mismatch sign in 19 cases; thus, the remaining 87 cases were assessed by applying the reduced logistic regression model based on texture, patient age, T2* susceptibility blooming, primary lobe, and hydrocephalus. Reader B identified the T2-FLAIR mismatch sign in 16 cases, with the remaining 90 cases assessed by applying the reduced logistic regression model. The 2-step classification algorithm for predicting 1p/19q-codeletion status had 81.1% accuracy for reader A and 79.2% accuracy for reader B (Table 3). Notably, the T2-FLAIR mismatch sign demonstrated PPV = 100% for predicting the IDHmut-Noncodel subtype for both independent readers.
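As a consistency check on the reported accuracies, one can ask which whole-number counts of correctly classified cases reproduce each percentage at the stated cohort size. The sketch below is a back-calculation only: the helper name is hypothetical, and the resulting counts are inferred from the rounded percentages rather than taken from Tables 2 and 3.

```python
from typing import List


def consistent_counts(n_cases: int, reported_pct: float, decimals: int = 1) -> List[int]:
    """List the correct-case counts k for which 100*k/n rounds to the reported percentage."""
    return [
        k for k in range(n_cases + 1)
        if round(100 * k / n_cases, decimals) == reported_pct
    ]


# Cohort sizes and accuracies as reported in the text.
reported = {
    "training": (102, 86.3),
    "reader A": (106, 81.1),
    "reader B": (106, 79.2),
}
inferred = {label: consistent_counts(n, pct) for label, (n, pct) in reported.items()}
for label, counts in inferred.items():
    print(label, counts)
```

Each reported percentage is compatible with exactly one count (88 of 102, 86 of 106, and 84 of 106, respectively), so the accuracies are internally consistent with the cohort sizes.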

DISCUSSION
Prior studies have reported multiple morphologic imaging features in LGGs that were associated with 1p/19q-codeletion status. 8-28 IDHmut-Codel LGGs commonly localize to the frontal lobe and typically have indistinct borders, calcification, and tumor heterogeneity. IDHmut-Noncodel LGGs are more typically homogeneous and circumscribed, lack calcification, and more frequently localize to the insula and temporal lobe. Before the 2016 WHO classification update, studies assessing neuroimaging associations with 1p/19q-codeletion status frequently limited their analysis to histologically defined oligodendrogliomas or oligoastrocytomas. [8][9][10][11][12][13][14][15] The impact of excluding diffuse astrocytomas from these earlier studies is unknown. Although recent studies have adopted the 2016 WHO classification scheme, many have limited their analyses to select MR imaging sequences, select morphologic imaging features, or single-institution datasets without training/validation methodology. 20,23,26,28 Four recent studies have used training/validation methodology for analyzing imaging features associated with 1p/19q-codeletion status. 19,24,25,27 Park et al, 24 in 2018, analyzed the neuroimaging features of a discovery set of 175 LGGs and a validation set of 40 LGGs and reported that mixed restricted diffusion and pial invasion were associated with 1p/19q codeletion among IDH1-mutant LGGs. Limitations to their methodology included single-institution analysis, lack of T2-FLAIR mismatch sign assessment, and lack of IDH2 testing. Kanazawa et al, 27 in 2018, analyzed a discovery cohort (n = 45) and validation cohort (n = 52) of LGGs and found that when at least 3 of the following imaging features were present (calcification, paramagnetic susceptibility, indistinct tumor border, and cystic component), there was >95% specificity for 1p/19q codeletion. Limitations to their methodology included a lack of interreader agreement determination, mostly nonradiologist readers, lack of T2-FLAIR mismatch sign assessment, and overlap between calcification and paramagnetic susceptibility. Patel et al, 19 in 2017, assessed LGG MR imaging features in training (n = 125) and validation (n = 60) datasets and were the first to report 100% PPV of the T2-FLAIR mismatch sign for predicting the IDHmut-Noncodel molecular subtype. Limitations to their methodology included the small number of imaging metrics tested (n = 4).

FIG 3. The 2-step classification algorithm for predicting 1p/19q-codeletion status in IDH-mutant LGGs, derived from an analysis of the TCGA/TCIA training cohort. The first step of the algorithm is an assessment of the T2-FLAIR mismatch sign; when present, it indicates, with high certainty, the IDH-mutant 1p/19q-noncodeleted subtype. When the T2-FLAIR mismatch sign is absent (or unavailable), a multivariate logistic regression model based on tumor texture, patient age, T2* blooming, tumor location, and hydrocephalus is applied. The equation for the logistic regression model is the following: $\Pr(\text{codeleted}) = \exp(X\beta) / [1 + \exp(X\beta)]$, where $X\beta = -4.8834 + 2.7842 \times (\text{texture} < 75\%\ \text{homogeneous}) + 0.0587 \times (\text{patient age}) + 3.1948 \times (\text{T2* susceptibility blooming present}) + 1.6646 \times (\text{primary lobe} = \text{frontal}) - 3.4496 \times (\text{hydrocephalus} = \text{present})$. A predicted probability threshold of 0.40 was established for the logistic regression model, based on the Youden J statistic. Cases were classified as either 1p/19q codeleted or noncodeleted based on whether their predicted probability was ≥ or < 0.40, respectively.
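The 2-step algorithm and regression equation given with Fig 3 can be expressed as a short sketch. The coefficients and the 0.40 threshold are copied from the text; everything else is an assumption made for illustration: the `Case` container and function names are hypothetical, binary predictors are coded 1 when the stated condition holds and 0 otherwise, and patient age is taken to be in years. It is not a clinical tool.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Coefficients of the reduced logistic regression model, as reported with Fig 3.
INTERCEPT = -4.8834
COEF_TEXTURE_HETEROGENEOUS = 2.7842  # texture <75% homogeneous
COEF_AGE = 0.0587                    # per year of patient age (assumed unit)
COEF_T2STAR_BLOOMING = 3.1948        # T2* susceptibility blooming present
COEF_FRONTAL_LOBE = 1.6646           # primary lobe = frontal
COEF_HYDROCEPHALUS = -3.4496         # hydrocephalus present
THRESHOLD = 0.40                     # Youden J-derived predicted probability cutoff


@dataclass
class Case:
    """Hypothetical container for one reader's assessment of one glioma."""
    t2_flair_mismatch: Optional[bool]  # None when the sign cannot be assessed
    texture_heterogeneous: bool
    age_years: float
    t2star_blooming: bool
    frontal_lobe: bool
    hydrocephalus: bool


def codeletion_probability(case: Case) -> float:
    """Predicted probability of 1p/19q codeletion from the reduced model."""
    xb = (
        INTERCEPT
        + COEF_TEXTURE_HETEROGENEOUS * case.texture_heterogeneous
        + COEF_AGE * case.age_years
        + COEF_T2STAR_BLOOMING * case.t2star_blooming
        + COEF_FRONTAL_LOBE * case.frontal_lobe
        + COEF_HYDROCEPHALUS * case.hydrocephalus
    )
    # 1 / (1 + exp(-xb)) is algebraically identical to exp(xb) / (1 + exp(xb)).
    return 1.0 / (1.0 + math.exp(-xb))


def classify(case: Case) -> str:
    """Step 1: T2-FLAIR mismatch sign; step 2: logistic model with the 0.40 cutoff."""
    if case.t2_flair_mismatch:
        return "noncodeleted"
    return "codeleted" if codeletion_probability(case) >= THRESHOLD else "noncodeleted"


# Hypothetical example: 40-year-old, heterogeneous frontal tumor, no blooming or hydrocephalus.
example = Case(None, True, 40, False, True, False)
print(classify(example), round(codeletion_probability(example), 3))
```

For the hypothetical example, $X\beta = -4.8834 + 2.7842 + 0.0587 \times 40 + 1.6646 = 1.9134$, giving a predicted probability of about 0.87, which is above the 0.40 threshold.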
Notably, Broen et al, 28 in 2018, confirmed the 100% PPV for the T2-FLAIR mismatch sign in predicting IDH-mutant noncodeleted astrocytomas in a multi-institution cohort of LGGs (n = 154), though they did not use a training/validation methodology. Finally, Lasocki et al, 25 in 2018, analyzed the MR imaging features of an LGG cohort comprising 69 patients (n = 10 in the training cohort, n = 59 in the validation cohort). They found 100% PPV of >50% T2-FLAIR mismatch for lack of 1p/19q codeletion and high specificity of calcification for underlying 1p/19q codeletion. Limitations to their methodology included low cohort size (only 10 patients used for training), inclusion of cases with unknown IDH status, and single-institution analysis. None of the above-described studies used completely different readers for their training and validation analyses.
In our study, we strictly adhered to the molecular classification of diffuse LGGs defined in the 2016 WHO update and excluded cases without relevant molecular data. We excluded IDH wild-type LGGs because our aim was to determine imaging features associated with each of the 2 subgroups among IDH-mutant LGGs, as defined by their 1p/19q-codeletion status. The TCGA/TCIA data base was selected as our training dataset because it had the highest likelihood for generalizability: Cases were accrued from multiple institutions, MR imaging examinations were performed on a variety of scanners with marked variability in imaging quality, and the molecular data were reliable and comprehensive. To further explore the generalizability and reproducibility of our results, we used a large validation cohort and completely different readers for the training and validation analyses. Our neuroimaging metrics are simple, mostly binary, and can be easily deduced from routine neuroimaging sequences. Unique to our study, we assessed simple extrinsic morphologic features such as hydrocephalus and midline shift, as well as patient age. Aside from Lasocki et al, 25 no prior study assessed the utility of T2-FLAIR mismatch in a multivariate model for predicting 1p/19q codeletion.
Our classification algorithm achieved good accuracy (86.3%) for predicting the codeletion status among the TCGA/TCIA IDH-mutant LGGs, and the validation analysis showed comparable accuracy (81.1% and 79.2% for readers A and B). In addition, we revalidated the high PPV of the T2-FLAIR mismatch sign for predicting the IDHmut-Noncodel subtype (PPV = 100% in both training and validation analyses). We also report novel associations of hydrocephalus and midline shift with codeletion status. We found tumor heterogeneity, frontal lobe location, and T2* susceptibility blooming to be significant predictors of 1p/19q codeletion, concordant with prior studies. However, in contrast to prior studies, tumor margin was not a useful discriminatory feature for determining codeletion status in our study. 10,17 This could be partly due to differences in the cohorts chosen for analysis or may reflect the subjective nature of this imaging metric.
Although we did not include IDH wild-type LGGs in our analysis, prediction of IDH status is critical to a neuroimaging-based classification of LGGs. This topic has been extensively studied in recent years, including with the use of conventional neuroimaging metrics, 37,38 advanced methods such as MR spectroscopic detection of 2-hydroxyglutarate (an oncometabolite that accumulates in IDH-mutant gliomas), 39 and machine learning techniques. 40 Our work may complement neuroimaging-based methods for IDH prediction and contribute to a comprehensive prediction of molecular status in LGGs.
Our study has limitations. We followed a retrospective design, and prospective validation of our results would be desirable. The training and validation cohorts had differing frequencies of 1p/19q codeletions and WHO grades, which may have affected our results. The moderate accuracy achieved by our classification algorithm for predicting codeletion status underlines the fact that molecular testing of surgical specimens will remain the criterion standard for LGG classification in the foreseeable future. However, prediction by neuroimaging may be useful for patient counseling in the preoperative setting, in cases in which biopsy or resection is challenging or pathologic tissue is insufficient for accurate rendering of molecular results, and in cases of laboratory error or misinterpretation (eg, misinterpretation of fluorescence in situ hybridization-based 1p/19q-codeletion results in the setting of chromosomal polysomy or partial chromosome arm deletions). [41][42][43]

CONCLUSIONS
A 2-step classification algorithm based on the T2-FLAIR mismatch sign and a multivariate logistic regression model using tumor texture, patient age, T2* blooming, location, and hydrocephalus demonstrates an overall moderate prediction accuracy for 1p/19q-codeletion status in IDH-mutant LGGs. We validated the high PPV of the T2-FLAIR mismatch sign for predicting the IDHmut-Noncodel LGG subtype and report novel associations between midline shift/hydrocephalus and the IDHmut-Noncodel LGG subtype.