NCCT Markers of Intracerebral Hemorrhage Expansion Using Revised Criteria: An External Validation of Their Predictive Accuracy

BACKGROUND AND PURPOSE: Several NCCT expansion markers have been proposed to improve the prediction of hematoma expansion. We retrospectively evaluated the predictive accuracy of 9 expansion markers. MATERIALS AND METHODS: Patients admitted for intracerebral hemorrhage within 24 hours of last seen well were retrospectively included from April 2016 to April 2020. The primary outcome was revised hematoma expansion, defined as any of a ≥6-mL or ≥33% increase in intracerebral hemorrhage volume, a ≥1-mL increase in intraventricular hemorrhage volume, or de novo intraventricular hemorrhage. We assessed the predictive accuracy of expansion markers and determined their association with revised he-

Spontaneous intracerebral hemorrhage (ICH) remains a major cause of morbidity and mortality worldwide. 1 Hematoma expansion (HE) is a potentially modifiable predictor of outcome and a promising therapeutic target. 2,3 HE is most often defined as a ≥6-mL absolute or ≥33% relative increase in ICH volume on follow-up imaging performed 24-72 hours after a baseline NCCT scan (herein considered a standard hematoma expansion [sHE] definition). 4 A recent study redefined HE to include new or increasing intraventricular hemorrhage to capture HE that occurs within the ventricles. 5 This revised hematoma expansion (rHE) definition better predicts 90-day functional outcomes than the sHE definition. 5-7 The spot sign is among the most studied imaging-based predictors of HE. 3,8 However, a CTA is not routinely available in the acute setting. The spot sign only mildly improves predictive accuracy when added to other established predictors of HE. 8-10 Multiple NCCT hematoma expansion markers (EMs) have been recently developed (Fig 1) in an attempt to mitigate those limitations. 4 These EMs may reflect the cascade phenomenon that occurs during HE, in which secondary hemorrhagic foci lead to irregular margins and heterogeneous density. They are classified as shape (Barras shape, island, and satellite) and density markers (Barras density, black hole, blend, fluid level, hypodensity, and swirl). EMs are associated with sHE, with ORs ranging from 2.01 to 7.87. 11 These markers may be integrated into prediction models to select patients at higher risk of HE for more intensive monitoring and/or treatment or trials in acute ICH. 12,13 They could also be used when other predictors of HE are not available, such as time in unknown-onset ICH. 14 In a meta-analysis, the sensitivity and specificity of individual NCCT hematoma EMs for sHE varied substantially between studies. 
The Barras shape marker had the highest pooled sensitivity (68%), and the island marker had the highest pooled specificity (92%). 15 However, the authors of the meta-analysis found an important risk of bias in the included studies. Whether the predictive accuracy of NCCT hematoma EMs is generalizable to rHE and routine clinical practice remains unknown. Despite recent standardized EM definitions, most studies evaluating EMs were conducted by expert readers who took part in their development, without external validation. 4,16 Moreover, head-to-head comparisons of all 9 EMs within a single study are lacking. NCCT hematoma EMs are not currently used in routine clinical practice but could eventually help select patients at high risk of HE in future trials. To tackle these drawbacks and externally validate EMs, we designed a single-center retrospective diagnostic accuracy study to assess the predictive accuracy of individual and combined EMs for rHE. We also aimed to evaluate whether EMs could improve the prediction of rHE when added to other established predictors of HE.

MATERIALS AND METHODS
This study was performed in accordance with the Standards for Reporting Diagnostic Accuracy Study (STARD) 17 and the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) statement for observational cohort studies. 18 Our study was approved by our institutional ethics review board, including a waiver of consent for the use of deidentified patient data.

Study Population
We retrospectively identified consecutive patients with ICH 18 years of age or older who presented to the Centre Hospitalier Universitaire de l'Université de Montréal between April 2016 and April 2020. The Centre Hospitalier Universitaire de l'Université de Montréal is a high-volume comprehensive stroke center and primarily uses NCCT as initial imaging for patients presenting with acute neurologic symptoms. At our institution, patients are worked up for a secondary etiology of ICH in the absence of clinical or imaging findings that suggest small-vessel disease (ie, age, history of hypertension, leukoaraiosis, microbleeds), according to the physician's judgment. Patients were identified from all hospital departments through discharge codes in the medical archives and by query of our institution's prospectively collected stroke repository. We included patients with spontaneous ICH who were diagnosed <24 hours from symptom onset or last time seen well and had an available follow-up NCCT performed <72 hours after the initial imaging. We did not specify a minimum time between the initial and follow-up NCCT because patients may clinically deteriorate and develop HE on a follow-up CT performed shortly after the initial NCCT. Patients were excluded for the following reasons: 1) if ICH was known to be due to an underlying macrovascular cause (eg, intracranial aneurysm, arteriovenous malformation, cerebral cavernous malformation, dural arteriovenous fistula, or intracranial venous thrombosis), a brain neoplasm, trauma, or hemorrhagic transformation of a cerebral infarct; and/or 2) if patients underwent neurosurgical hematoma evacuation or external ventricular drainage before a follow-up NCCT. We used these criteria to enable adequate assessment of HE in our sample. Baseline characteristics and the time interval from symptom onset (or last seen well, if unknown) to the baseline NCCT scan were collected by chart review.
FIG 1. NCCT hematoma EMs. A, Blend sign: a relatively hypoattenuating area next to a hyperattenuating area of the hematoma, with a well-defined margin and a density difference of >18 HU between the 2 areas. B, Island sign: at least 3 scattered small hematomas all separate from the main hematoma (black arrows) or at least 4 small hematomas, some or all of which may connect with the main hematoma (dashed arrows), all visible on a single axial section. The white arrow identifies both a hypodensity (any hypodense region strictly encapsulated within the hemorrhage with any shape, size, and density) and a swirl sign (rounded, streak-like, or irregular region of hypo- or isoattenuation compared with the brain parenchyma that does not have to be encapsulated in the hematoma). C, Satellite sign: a small hematoma (diameter of <10 mm) separate from the main hemorrhage in at least 1 section and distinct from the main hematoma by a 1- to 20-mm separation (black arrows). D, Black hole sign: a hypoattenuating area with a density difference of >28 HU compared with the surrounding hematoma, which has no connection with the surface outside the hematoma (black arrow). This finding also corresponds to a hypodensity and a swirl sign. For the 2 hypodense foci labeled with white arrows, because the density difference with the hyperattenuating hematoma is <28 HU, they cannot be considered black hole signs. E, Fluid level: presence of 1 distinct hypoattenuating area (hypodense to the brain) above 1 hyperattenuating area (hyperdense to the brain), below a discrete straight line of separation (dashed line), irrespective of its density appearance. F, Barras density and shape signs are evaluated on the axial section showing the largest hematoma area and are based on a 5-point scale. Density is considered heterogeneous when there are ≥3 hypodensity foci within the hyperdense hematoma (scale of III, IV, or V). Shape is considered irregular when there are ≥2 focal hematoma margin irregularities, joined or separate from the hematoma edge (scale of III, IV, or V). Definitions from Morotti et al. 4

Image Analysis
The presence or absence of 9 EMs was evaluated by study investigators on baseline NCCT using standardized definitions. 4 Images were pseudonymized and analyzed on a dedicated research platform to blind raters to subject identity, clinical information, and follow-up images as well as the other readers' interpretations. The Barras shape and density markers (5-point ordinal scales) were dichotomized to positive (3-5) versus negative (0-2), as previously published. 4 Interobserver agreement was substantial to almost perfect for most EMs except for Barras density (reader 1 versus 2) and Barras shape (reader 1 versus 3) for which estimates were in the moderate-agreement range (Online Supplemental Data). Disagreements were resolved by adjudication of the third investigator.
Two study investigators, blinded to the double-read results of the EM assessment, measured ICH and intraventricular hemorrhage volumes on pseudonymized baseline and follow-up NCCT using semi-automated manual segmentation techniques (3D Slicer, Version 4.11; http://www.slicer.org) (Online Supplemental Data). If multiple NCCTs were available within 72 hours after the initial imaging, we selected the first available NCCT if rHE occurred and the latest available NCCT if rHE did not occur.

Outcomes
We evaluated the predictive accuracy of EMs for rHE as the primary outcome and sHE as a secondary end point. rHE was defined as any of the following: a ≥6-mL absolute or ≥33% relative increase in ICH volume, a ≥1-mL increase in intraventricular hemorrhage volume, or de novo intraventricular hemorrhage. sHE was defined as a ≥6-mL absolute or ≥33% relative increase in ICH volume. 2
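The two outcome definitions above can be expressed as a short classification rule. The following Python sketch is illustrative only and is not the study's analysis code: the function name `classify_expansion` and its arguments are hypothetical, volumes are assumed to be in mL, de novo intraventricular hemorrhage is assumed to mean a baseline intraventricular volume of 0 with any volume on follow-up, and the relative criterion is assumed not to apply when the baseline ICH volume is 0.

```python
def classify_expansion(ich_base, ich_follow, ivh_base, ivh_follow):
    """Return (sHE, rHE) as booleans from baseline and follow-up volumes in mL.

    Hypothetical helper: thresholds follow the definitions in the text;
    the handling of zero baseline volumes is an assumption.
    """
    ich_growth = ich_follow - ich_base
    # Relative growth is undefined for a zero baseline; treated as not met (assumption).
    relative_growth = ich_growth / ich_base if ich_base > 0 else 0.0

    # Standard definition: >=6-mL absolute or >=33% relative ICH growth.
    she = ich_growth >= 6.0 or relative_growth >= 0.33

    # Revised definition adds the intraventricular criteria.
    ivh_growth = (ivh_follow - ivh_base) >= 1.0
    de_novo_ivh = ivh_base == 0 and ivh_follow > 0
    rhe = she or ivh_growth or de_novo_ivh
    return she, rhe


# A 2-mL ICH increase (20%) with new intraventricular blood:
# not sHE, but rHE through the de novo criterion.
print(classify_expansion(10.0, 12.0, 0.0, 0.5))
```

Because every sHE case also satisfies rHE by construction, the revised definition can only increase the number of patients classified as having expansion.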

Statistical Analyses
The sample size was based on a convenience sample of all available data within the specified timeframe. Baseline characteristics were compared using the χ² test of independence or the Fisher exact test for categoric variables and Student t test for continuous variables. We built 7 independent a priori composite variables using a combination of markers to evaluate a potential gain in the prediction of HE: 1) any shape marker (ie, at least 1 of Barras shape, island, or satellite), 2) any density marker (ie, at least 1 of Barras density, black hole, blend, fluid level, hypodensity, or swirl), 3) any EM, 4) shape count (ie, total number of shape markers), 5) density count (ie, total number of density markers), 6) EM count (ie, total number of EMs), and 7) the previously reported "expansion-prone hematoma" (≥1 of black hole, blend, or island). 19 To evaluate the predictive accuracy of individual and combined EMs for rHE, we calculated their sensitivity, specificity, positive predictive value, negative predictive value, positive and negative likelihood ratios, diagnostic ORs, and accuracy. To determine the association between EMs and rHE, we calculated ORs with 2 sets of models using logistic regressions with maximum likelihood estimations. In our first set of models, we determined the association of each EM with rHE using univariable logistic regression. In our second set of models, we determined the added predictive value of each EM with likelihood ratio tests, used to compare a reduced model (established predictors of HE) nested into a full model (established predictors of HE plus 1 EM). 20 Established predictors of HE include antiplatelet use, anticoagulant use, ICH volume on baseline imaging, and time from symptom onset to baseline imaging, which were based on the findings of a large meta-analysis of individual patient-level data. 8
We used time from last seen well to imaging in our analyses instead of time from symptom onset to include patients with unclear symptom onset, who are also at risk for HE. 14 We used median imputation to replace missing values for last time seen well to imaging (n = 10). Otherwise, all other variables had complete data. We subsequently explored the potential incremental value of the EMs that were found to be independently associated with rHE (Online Supplemental Data). We replicated all analyses using sHE as a secondary end point. Statistical significance was defined as a P < .05, without adjustment for multiple comparisons. All analyses were performed with R Studio (Version 1.4; http://rstudio.org/download/desktop) and R statistical and computing software (Version 4.2.2; http://www.r-project.org/).
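The accuracy measures listed above all derive from the 2 × 2 table that cross-classifies a marker (present or absent) against the outcome (rHE or no rHE). The Python sketch below shows the standard formulas; it is a minimal illustration and not the R code used in the study. The function name `accuracy_metrics` is hypothetical, the counts in the usage example are invented for illustration and are not study data, and the function assumes that no cell of the table is 0 (otherwise some ratios are undefined).

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2 x 2 table.

    tp: marker present, outcome present    fp: marker present, outcome absent
    fn: marker absent, outcome present     tn: marker absent, outcome absent
    Assumes all 4 counts are > 0 (no continuity correction is applied).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                 # positive predictive value
        "npv": tn / (tn + fn),                 # negative predictive value
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,                # diagnostic OR = (tp*tn)/(fp*fn)
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Invented counts, for illustration only (not study data).
for name, value in accuracy_metrics(tp=40, fp=20, fn=10, tn=30).items():
    print(f"{name}: {value:.2f}")
```

Sensitivity, specificity, and the likelihood ratios do not depend on outcome prevalence, whereas the predictive values and accuracy do, which matters when the same marker is compared across outcome definitions with different prevalence.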

RESULTS
Demographics and Outcomes
A total of 270 patients with spontaneous ICH were identified during the study period. After we excluded 146 patients (Fig 2), the study population encompassed 124 patients, of whom 51 (41%) developed rHE and 35 (28%) developed sHE seen on follow-up CT. The median initial ICH volume was 16 mL (interquartile range: 6-37 mL). The median time from last-seen-well to initial CT was 107 minutes (interquartile range: 75-228 minutes). Twenty-six patients (21%) were treated with anticoagulants (Online Supplemental Data). A CT or MRA was performed in 100 patients (81%). A brain MR imaging and a DSA were performed in 40 (32%) and 4 patients (3%), respectively.
No marker had both high sensitivity and specificity for the prediction of rHE. The satellite marker had the highest sensitivity (78%; 95% CI, 65%-89%), while the black hole marker had the highest specificity (97%; 95% CI, 90%-100%) (Fig 3). Only the black hole marker had a positive likelihood ratio of >5, though it had a wide CI (7.16; 95% CI, 1.64-31.30). The negative likelihood ratios of individual EMs ranged from 0.49 to 1.06. All measures of predictive accuracy for rHE are available in the Online Supplemental Data.
The presence of any marker had a high sensitivity (90%; 95% CI, 79%-97%) for the prediction of rHE but low specificity (21%; 95% CI, 12%-32%). Nevertheless, with an estimated negative likelihood ratio of 0.48, the absence of any marker would only shift the probability of rHE from 41% (pretest probability, ie, prevalence) to 25% (posttest probability). Specificity for the prediction of rHE increased slightly with the number of positive EMs. However, combinations of 2-6 positive EMs had low sensitivity for rHE (6%-22%) (Fig 4).
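The shift from a 41% pretest to a 25% posttest probability follows from converting the probability to odds, multiplying by the likelihood ratio, and converting back. The short Python sketch below reproduces this arithmetic from the rounded values reported above; the function name `posttest_probability` is hypothetical.

```python
def posttest_probability(pretest, likelihood_ratio):
    """Apply a likelihood ratio to a pretest probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Negative likelihood ratio of "any marker" from the rounded
# sensitivity (90%) and specificity (21%) reported in the text.
lr_neg = (1 - 0.90) / 0.21
print(round(lr_neg, 2))                             # 0.48

# Pretest probability = prevalence of rHE in the cohort (41%).
print(round(posttest_probability(0.41, 0.48), 2))   # 0.25
```

A negative likelihood ratio close to 0.5 therefore leaves roughly 1 in 4 marker-negative patients with rHE, which is why the absence of all markers cannot be used to rule out expansion.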

DISCUSSION
Our study provides an independent assessment of 9 NCCT hematoma EMs in a single cohort, including both standard and revised definitions of HE. 5,7 The predictive performance of prevalence-insensitive metrics (sensitivity, specificity, and positive and negative likelihood ratios) did not differ between the 2 definitions. Combining EMs did not improve the predictive accuracy for rHE compared with the top-performing EMs. Only 5 of 9 EMs were associated with rHE, with ORs similar to those in a recent meta-analysis. 12 After we adjusted for established predictors of HE, only the black hole marker remained significantly associated with rHE.
In a meta-analysis of individual patient data, the strongest predictors of HE were time from symptom onset to baseline imaging, baseline ICH volume, and antithrombotic medication use. 8 A new image-based prediction biomarker for HE should ideally improve on such established predictors. In our study, only the black hole marker remained associated with rHE in the multivariate analysis, and its added value in the full prediction model was only marginally significant, with an adjusted OR that had wide confidence intervals, partially resulting from its low prevalence. However, previous studies have shown that multiple other EMs (blend, hypodensity, island) are significant predictors of sHE and rHE. 7,21 The latter studies did not include patients treated with anticoagulants, which could partially explain the difference from our findings. The impact of anticoagulation on the predictive performances of EMs is unknown, but at least 1 study suggested that EMs can be used in patients treated with anticoagulants. 22 NCCT EMs may be necessary when other established markers are unavailable, such as the time from symptom onset in patients with an unwitnessed ICH. 14 In addition to its independent association with rHE, the black hole marker had the highest specificity, positive likelihood ratio, and positive predictive value among EMs, in line with previous studies that evaluated this marker using the sHE 23 and rHE definitions. 7 High specificity is important to reduce the rate of false-positive expansion predictions in a future trial, especially if the experimental treatment is associated with serious adverse events, such as thrombotic complications. However, EMs with low sensitivity, such as the black hole marker, would miss a large proportion of patients at risk of rHE. Using the EMs with a low prevalence as an inclusion criterion in clinical trials could slow down recruitment. This issue was noticed in previous trials using the spot sign as an inclusion criterion. 10
To improve the predictive accuracy of individual EMs, we evaluated different combinations of markers. 24 Li et al 19 introduced the expansion-prone hematoma to tackle the low sensitivity associated with the most specific markers (black hole, blend, island) by allowing the presence of any of the latter. We found that the expansion-prone hematoma had lower sensitivity and specificity than previously reported, with its estimated negative likelihood ratio not in a range that is generally considered clinically useful. The lack of improvement in predictive accuracy with such EM combinations has been described in some studies. 21,25 One potential explanation is that different imaging markers capture similar information, with diminishing returns in predictive value when combined.
In our study, the swirl marker had lower specificity than previously reported. 26 Its higher than previously reported prevalence in our results may result from our inclusion of more diverse patients with ICH (ie, patients hospitalized in neurosurgery, patients treated with anticoagulants) or from a different interpretation of the marker by the raters in our study. This discrepancy raises the question of whether the predictive accuracy of EMs is generalizable to all patients with spontaneous ICH and whether the reliability of EMs is generalizable to all raters. 27 The heterogeneity of sensitivity and specificity metrics found in recent meta-analyses of EMs may partly reflect this issue. 15 Thus, despite a positive association between EMs and HE, such markers might not be robust enough to guide medical decisions in clinical practice. A potential approach to improve the accuracy and reliability of NCCT-based HE prediction could include emerging machine learning approaches, such as those from the fields of radiomics or deep learning. 28 Finally, it is possible that outcome prediction based on baseline imaging might be inherently limited. Even though EMs have a pathophysiologic rationale, the fact that CT is only a snapshot of a dynamic process might limit its potential value as a predictive biomarker, despite optimal imaging analysis. 29 Likewise, it is also possible that the predictive performances of EMs decrease as the delay after stroke onset increases, similar to the spot sign. 30

Strengths and Limitations
The main strengths of our study include appraisal of all patients with ICH for inclusion, blinded evaluation of EMs and HE, and assessment of the 9 standardized EMs. Our results should be interpreted with caution due to our relatively small sample size, which resulted in imprecise estimation of the prevalence of EMs and therefore of their predictive accuracy. Our small sample size also precluded a more comprehensive assessment of incremental prediction values, the detection of smaller associations, and the conduct of subgroup analyses. The time interval between the initial and follow-up NCCT was variable but was not associated with rHE. This study was performed at a single institution and should be repeated in a multicenter setting to mitigate potential selection biases inherent to a tertiary care center. In addition, we used a consensus evaluation of NCCT EMs in a retrospective setting, which differs from the acute setting and could impact the predictive performances. We included only patients with spontaneous ICH who did not undergo surgery before follow-up imaging, which may have potentially excluded patients with larger baseline ICH volumes. Similarly, we did not include patients who died before follow-up CT, which also may have led to a depletion of patients more likely to have HE. Because of these exclusion criteria, the prevalence of rHE may be underestimated. The median ICH volume was low (16 mL) but similar to that reported in large trials. 31, 32 We did not evaluate functional outcome due to a significant proportion of missing follow-up data, because many patients were transferred to secondary stroke centers after their follow-up imaging. The established predictors of HE included in our models were initially reported with the sHE and not the rHE definition. 8 However, the rHE definition is a refinement of the sHE definition and evaluates the same underlying process. 
We did not evaluate the interaction between time from ICH onset and the predictive performances of EMs. Finally, we performed multiple analyses without accounting for multiplicity, possibly having led to false-positive findings.

CONCLUSIONS
Most NCCT EMs were not found to be significantly associated with rHE after adjustment for established predictors of HE. No individual or combined NCCT EMs provided both the high sensitivity and specificity that would be required to identify patients at risk of HE. Larger and ideally multicenter studies are needed to further evaluate an NCCT-based approach to HE prediction before implementing these markers for decision-making in acute ICH.

Data Sharing
Data, analytic methods, and study materials will be made available to any researcher for reproducing the results or replicating the procedure. Requests to receive these materials should be sent to the corresponding author, who will maintain their availability.