Abstract
BACKGROUND AND PURPOSE: Complications from endovascular thrombectomy (EVT) can negatively affect clinical outcomes, making the development of a more precise and objective prediction model essential. This research aimed to assess the effectiveness of radiomics features derived from presurgical CT scans in predicting the prognosis post-EVT in patients with acute ischemic stroke.
MATERIALS AND METHODS: This investigation included 336 patients with acute ischemic stroke from 2 medical centers from March 2018 to March 2024. The participants were split into a training cohort of 161 patients and a validation cohort of 175 patients. Patient outcomes were rated with the mRS: 0–2 for good, 3–6 for poor. A total of 428 radiomics features were derived from intrathrombus and perithrombus regions in noncontrast CT and CTA images. Feature selection was conducted using a least absolute shrinkage and selection operator regression model. The efficacy of 8 different supervised learning models was assessed using the area under the curve (AUC) of the receiver operating characteristic curve.
RESULTS: Among all models tested in the validation cohort, the logistic regression algorithm for the combined model achieved the highest AUC (0.87; 95% CI, 0.81–0.92), outperforming other algorithms. The combined use of radiomics features from both the intrathrombus and perithrombus regions significantly enhanced diagnostic accuracy over models using features from a single region (0.81 versus 0.70, 0.77), highlighting the benefit of integrating data from both regions for improved prediction.
CONCLUSIONS: The findings suggest that a combined radiomics model based on CT serves as a potent approach to assessing the prognosis following EVT. The logistic regression model, in particular, proved to be both effective and stable, offering critical insights for the management of stroke.
ABBREVIATIONS:
- AUC = area under the curve
- EVT = endovascular thrombectomy
- KNN = k-nearest neighbors
- LASSO = least absolute shrinkage and selection operator
- LightGBM = Light Gradient-Boosting Machine
- LR = logistic regression
- MLP = multilayer perceptron
- Rad = radiomics
- RF = random forest
- SVM = support vector machine
- XGBoost = eXtreme Gradient Boosting
SUMMARY
PREVIOUS LITERATURE:
Prior studies have examined the predictive value of CT-derived thrombus radiomics in stroke, focusing mainly on intrathrombus features for predicting thrombectomy or thrombolysis outcomes. These investigations highlighted the role of CT signs and thrombus properties in prognostication but also revealed limitations due to the reliance on subjective interpretation and a singular focus on intrathrombus analysis.
KEY FINDINGS:
Our study validates a CT-based combined radiomics model using both intrathrombus and perithrombus features, with logistic regression demonstrating the highest predictive accuracy for post-EVT outcomes in stroke.
KNOWLEDGE ADVANCEMENT:
The research advances understanding by integrating perithrombus features into predictive modeling, offering a more comprehensive and objective analysis that surpasses traditional evaluations, thereby enhancing stroke-outcome predictions.
Acute ischemic stroke is a major contributor to death and disability worldwide.1 Endovascular thrombectomy (EVT) is the standard treatment recommended for patients experiencing acute anterior circulation large-vessel occlusion.2,3 However, EVT is associated with certain complications, such as intracranial hemorrhage and malignant brain edema.4–8 These complications notably reduce the probability of a positive clinical outcome and elevate the risk of death. Given the heterogeneity of functional outcomes even after a successful procedure, there is an urgent need both to identify patients who are suitable for EVT and to predict poor outcomes early. These efforts can subsequently improve patient prognosis.
In recent years, certain radiologic signs such as the high-density MCA sign, intracranial high-density areas, large ischemic cores, and mismatch of CT perfusion have been identified as predictors of clinical outcomes.9–12 Although these CT features provide valuable insight, their evaluation depends heavily on the subjective interpretation by radiologists and may not be sufficient for accurate prognosis. Thus, developing a more precise and objective prediction model is essential.
Radiomics analysis offers a quantitative approach by analyzing the variations in gray levels among pixels, allowing a detailed, high-throughput examination of imaging data that surpasses the conventional visual assessments performed by experts. This method holds promise for enhancing diagnostic accuracy. Machine learning models such as support vector machine (SVM), logistic regression (LR), and random forest (RF) have proved effective in delivering precise predictions, thus aiding health care professionals in refining stroke management and enhancing patient outcomes.13 Recent progress in CT-derived radiomics, particularly in analyzing thrombus properties, has shown potential in forecasting various clinical outcomes. Previous research has validated the effectiveness of CT-derived thrombus radiomics in determining thrombus age, composition, and origin and in predicting outcomes after thrombectomy or thrombolysis, and a CTA-based thrombus radiomics model has been used to estimate the timing of stroke onset.14–18 However, these studies have primarily focused on intrathrombus features. There remains a substantial research gap in examining perithrombus areas and their role in predicting clinical outcomes after EVT.
Consequently, this research sought to evaluate the predictive capacity of both intrathrombus and perithrombus radiomics features extracted from CT for clinical outcomes post-EVT. We also aimed to identify the most effective machine learning classifier for this purpose through rigorous statistical analysis.
MATERIALS AND METHODS
Patients
This research adhered to the guidelines of the Declaration of Helsinki and received approval from the ethics committees of the participating hospitals and was granted a waiver of informed consent. We performed a retrospective analysis on patients with acute stroke who were admitted to 2 medical centers (A: Affiliated Hospital of Nantong University, B: Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine) between March 2018 and March 2024. The inclusion criteria were the following: 1) acute stroke resulting from anterior circulation large-vessel occlusion; 2) visible thrombus-related signs on initial NCCT or CTA at admission; 3) an mRS score of <3 before stroke; 4) subsequent immediate EVT; and 5) the availability of comprehensive demographic and clinical data. Criteria for exclusion were inadequate imaging clarity due to motion or metal artifacts and incomplete clinical records. Collected clinical data encompassed age, sex, medical history (including hypertension, diabetes, hyperlipidemia, atrial fibrillation, and coronary artery disease), and NIHSS score at admission. In this study, “prognosis” is defined as the clinical outcomes observed 90 days after EVT, as gauged by the mRS. The assessment specifically targets the restitution of motor function and the frequency of major complications. At 90 days, 2 specialized stroke neurologists (J.L.W., J.X.J.) conducted a systematic evaluation of the mRS scores. Patients were categorized into 2 groups according to their mRS scores: the good outcome group with mRS scores of <3, and the poor outcome group with mRS scores ranging from 3 to 6. Patients from Center A were assigned to the training cohort, while those from Center B were allocated to the validation cohort. The patient-selection process and analytic pathway are depicted in Fig 1. For further details on the code and model files, please contact the corresponding author via e-mail.
Fig 1. Flow chart of the patient-selection process.
CT Data Acquisition and Thrombus Segmentation
The radiomics process encompassed outlining the ROI, extraction of radiomics features, feature selection, and construction of predictive models (Fig 2). NCCT and CTA were performed using 64- to 256-slice CT scanners from 2 vendors (Somatom Force, Siemens; Revolution CT and Optima CT680, GE Healthcare), with a reconstruction slice thickness between 0.63 and 1.00 mm. Before thrombus segmentation, all CT scans underwent intensity normalization, adjusting the intensity values to a 0–600 range. The images were also resampled to a uniform resolution of 1 × 1 × 1 mm to standardize voxel dimensions. Thrombus-associated ROIs were outlined using ITK-SNAP software (Version 3.6.0; http://www.itksnap.org/pmwiki/pmwiki.php) referencing DSA images with the method used in our previous study.15 After we segmented the intrathrombus regions, the perithrombus areas were automatically segmented by increasing the radius by 1 mm from the original ROIs using Python (Version 2.7.13). To assess the precision of segmentation, 1 radiologist (M.D.L.) delineated 30 randomly selected thrombi on CTA scans twice within a 2-week period, and the segmentations were independently verified by another radiologist (H.M.G.). Both readers were unaware of the patients’ clinical data during the segmentation process.
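The perithrombus step described above can be illustrated with a minimal sketch. It assumes the image has already been resampled to 1 × 1 × 1 mm, so that a 1-mm expansion equals a 1-voxel dilation, and it assumes the perithrombus region is the shell left after subtracting the original thrombus mask; the text does not state whether the authors kept or removed the thrombus voxels. The function names and the toy mask are hypothetical and are not the authors' code.

```python
from itertools import product


def dilate(mask, radius=1):
    """Return every voxel within `radius` voxels (Euclidean) of the mask.

    `mask` is a set of (x, y, z) integer voxel coordinates.
    """
    offsets = [
        (dx, dy, dz)
        for dx, dy, dz in product(range(-radius, radius + 1), repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]
    return {
        (x + dx, y + dy, z + dz)
        for (x, y, z) in mask
        for (dx, dy, dz) in offsets
    }


def perithrombus_shell(mask, radius=1):
    """Assumed definition: dilated mask minus the original thrombus mask."""
    return dilate(mask, radius) - mask


# Toy thrombus made of 2 adjacent voxels (hypothetical data).
thrombus = {(5, 5, 5), (6, 5, 5)}
shell = perithrombus_shell(thrombus, radius=1)
print(len(shell))
```

With isotropic 1-mm voxels, a radius of 1 reaches only the 6 face-adjacent neighbors of each voxel; a different voxel size would require converting millimeters to voxels per axis before dilating.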
Fig 2. Workflow of the CT-based radiomics model.
Feature Extraction and Selection
Following the delineation of intrathrombus and perithrombus regions, radiomics features were obtained via the PyRadiomics library (https://pypi.org/project/pyradiomics/). From both regions on NCCT and CTA scans, 428 features were derived in total. To place these features on a common scale, we applied z score normalization, standardizing each feature to a mean of 0 and an SD of 1 using the training cohort; the same transformation was then applied to the validation data set. Feature selection was performed on the training cohort. First, the Mann-Whitney U test was used to screen out uninformative radiomics features, keeping only those significant at P < .05. Next, to assess the interfeature relationships, we calculated the Spearman rank correlation coefficients; when features had a correlation coefficient of >0.9 with one another, only 1 of them was kept. The refined data set was then subjected to the least absolute shrinkage and selection operator (LASSO) regression model to develop a predictive radiomics signature. The λ parameter of the LASSO model was tuned by k-fold cross-validation to select features optimally.
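The normalization step can be sketched as follows. This is a minimal illustration of z score normalization as conventionally defined (subtract the mean, divide by the SD), with the mean and SD estimated on the training cohort only and then reused for the validation cohort. The feature name, the toy values, and the use of the population SD are assumptions made for the example, not details reported by the authors.

```python
from statistics import mean, pstdev


def fit_zscore(train_columns):
    """Estimate (mean, SD) per feature from the training cohort only."""
    params = {}
    for name, values in train_columns.items():
        sigma = pstdev(values)
        # Guard against constant features to avoid division by zero.
        params[name] = (mean(values), sigma if sigma > 0 else 1.0)
    return params


def apply_zscore(columns, params):
    """Standardize any cohort with the training-cohort parameters."""
    return {
        name: [(v - params[name][0]) / params[name][1] for v in values]
        for name, values in columns.items()
    }


# Hypothetical feature values for illustration only.
train = {"cta_peri_entropy": [1.0, 2.0, 3.0]}
validation = {"cta_peri_entropy": [2.0, 4.0]}

params = fit_zscore(train)
train_z = apply_zscore(train, params)
validation_z = apply_zscore(validation, params)
```

Reusing the training parameters keeps information from the validation cohort out of the preprocessing, which is why the validation values are not expected to have a mean of exactly 0.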
Classifier Model Building and Evaluation
For the selection of radiomics features, the maximal relevance and minimum redundancy method and the LASSO technique were applied sequentially. The first method ranked radiomics features with an intraclass correlation coefficient of >0.90 on the basis of their relevance-redundancy index. From this ranking, the top 10 features exhibiting the highest relevance were preserved. These chosen features were further refined through the LASSO classifier to pinpoint an optimized subset for model development. A radiomics signature was established via multiple logistic regression, using the selected features, and a radiomics score (Rad score) was computed by summing these features, each weighted by its respective coefficient.
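The Rad score described above is a linear combination of the selected features. The sketch below shows that weighted sum and, as an assumption, maps it to a probability with the logistic function, which is how a logistic regression signature is usually read. The feature names, coefficients, and intercept are hypothetical placeholders; the actual selected features and weights are reported in the Supplemental Data.

```python
import math


def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the selected features (features absent from
    `coefficients` are ignored, mirroring zero LASSO coefficients)."""
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


def poor_outcome_probability(score):
    """Assumed logistic link from Rad score to probability of mRS 3-6."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients and standardized feature values.
coefficients = {"cta_intra_glcm_contrast": 0.5, "cta_peri_entropy": -0.25}
patient = {
    "cta_intra_glcm_contrast": 4.0,
    "cta_peri_entropy": 4.0,
    "ncct_intra_mean": 1.7,  # not selected, so it does not contribute
}

score = rad_score(patient, coefficients)
probability = poor_outcome_probability(score)
print(score, round(probability, 3))
```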
To evaluate the clinical differences between patient groups with and without good outcomes, we performed both univariate and multivariate analyses. Additionally, 8 supervised machine learning algorithms were used as classifiers: RF, LR, SVM, k-nearest neighbors (KNN), Extra Trees, Light Gradient-Boosting Machine (LightGBM; https://lightgbm.readthedocs.io/en/latest/index.html), multilayer perceptron (MLP), and eXtreme Gradient Boosting (XGBoost). After feature selection through the LASSO method, the selected features were incorporated into the models, and a 5-fold cross-validation strategy was adopted to confirm the final radiomics signature. The DeLong test was used to statistically assess differences in predictive performance among the radiomics models (intrathrombus, perithrombus, and combined models). The CheckList for EvaluAtion of Radiomics research (CLEAR; https://pubmed.ncbi.nlm.nih.gov/37142815/) served as the guideline for standardized reporting in this study. The optimal classification algorithm was then used to develop a clinical prediction model that incorporates selected clinical variables.
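The classifier comparisons rest on the AUC, which equals the probability that a randomly chosen patient with a poor outcome receives a higher predicted score than a randomly chosen patient with a good outcome. The sketch below computes it directly from that definition, counting ties as one-half. The labels and scores are made-up values, and coding a poor outcome (mRS 3–6) as 1 is an assumption of this example.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = poor, 0 = good) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
print(auc(labels, scores))
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; it is also the quantity on which the DeLong test builds its variance estimate.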
Statistical Analysis
Clinical characteristics were evaluated via the t test, Mann-Whitney U test, or χ2 test as appropriate. To analyze the correlations among features, we used the Spearman rank correlation coefficient, keeping only 1 feature from any set of features with a coefficient of >0.9. The consistency of the ROI delineation was verified using the intraclass correlation coefficient, with an intraclass correlation coefficient of >0.75 indicating strong reliability. The efficacy of the predictive models post-EVT was assessed with receiver operating characteristic curve analysis, and differences between AUCs were tested with the DeLong test. A P value < .05 was considered statistically significant. To mitigate type I errors from multiple comparisons, we used false discovery rate corrections in our analysis.
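The text does not name the false discovery rate procedure that was used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up method and returns adjusted P values that can be compared with the .05 threshold. The input P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from 4 comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In this toy case only the smallest P value stays below .05 after adjustment, which shows how the correction can change which comparisons are reported as significant.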
We hereby present this article following the STARD reporting checklist (Supplemental Data).
RESULTS
Patient Characteristics
In our study, 336 patients with stroke were included on the basis of the defined criteria; 128 (38.1%) were assessed as having a poor outcome following EVT. The patients were divided into 2 cohorts: 161 from Center A formed the training cohort, and 175 from Center B formed the validation cohort. The Table provides a detailed summary of the demographic and clinical characteristics of the patients, categorized on the basis of the outcome after EVT for both the training and validation cohorts. The analysis showed no statistically significant differences in sex, age, hypertension, hyperlipidemia, diabetes, smoking habits, coronary heart disease, or NIHSS scores between the good outcome and poor outcome groups in either cohort. However, the incidence of atrial fibrillation differed significantly between the outcome groups. No statistically significant difference was observed when comparing the training and validation cohorts.
Table. Baseline demographic characteristics and clinical variables of enrolled patients
Feature Extraction and Selection
The intraclass correlation coefficient values demonstrated strong agreement (0.75–0.90) for the radiomics features. After we confirmed this consistency, all pertinent radiomics features were extracted and used to build predictive models. Ultimately, the Rad scores were formulated using 6, 12, and 15 features, with nonzero coefficients for the intrathrombus, perithrombus, and combined models, respectively, as shown in Fig 3. Detailed information on the chosen radiomics features can be found in the Supplemental Data. All selected radiomics features originated from CTA images, with no features being chosen from NCCT images.
Fig 3. Radiomics feature selection based on the LASSO algorithm and Rad score based on intrathrombus (A), perithrombus (B), and combined regions (C). Details of the selected radiomics features are in the Supplemental Data.
Performance and Comparison of Models
The performance of 8 classifiers—LR, SVM, KNN, RF, Extra Trees, XGBoost, LightGBM, and MLP—was assessed in both the training and validation cohorts. The results are detailed in the Supplemental Data and Fig 4.
Fig 4. AUCs obtained by the 8 classifiers (LR, SVM, KNN, RF, Extra Trees, XGBoost, LightGBM, and MLP) in the training (A) and validation (B) cohorts. In the validation cohort, the AUC range across models is 0.61–0.70 for intrathrombus regions, 0.74–0.87 for perithrombus regions, and 0.74–0.90 for combined regions (C).
In the evaluation of models for the intrathrombus, perithrombus, and combined regions, XGBoost consistently outperformed other algorithms in the training cohort. However, its performance declined noticeably in the validation cohort. KNN, LightGBM, and RF showed a pattern similar to that of XGBoost. LR performed slightly worse than these models in the training cohort but remained relatively stable and achieved the best performance in the validation cohort.
Within the intrathrombus models of the validation cohort, LR achieved the highest AUC of 0.70 (95% CI, 0.62–0.78), which differed significantly from that of XGBoost (P = .03), RF (P = .03), and KNN (P = .01) but not from that of LightGBM (P = .26), Extra Trees (P = .97), MLP (P = .36), or SVM (P = .13). For the perithrombus models, LR exhibited the greatest AUC of 0.80 (95% CI, 0.73–0.87), which differed significantly from that of Extra Trees (P = .002), LightGBM (P = .04), MLP (P = .03), and SVM (P = .01) but not from that of RF (P = .18), XGBoost (P = .05), or KNN (P = .60). In the combined regions, the LR model reached an AUC of 0.87 (95% CI, 0.81–0.92), which differed significantly from that of Extra Trees (P = .01), KNN (P = .02), RF (P = .01), XGBoost (P < .001), LightGBM (P < .001), MLP (P = .01), and SVM (P = .01).
Furthermore, in the validation cohort, the diagnostic capability of the LR model using the combined regions significantly surpassed that of both the intrathrombus (P < .001) and perithrombus models (P = .01). However, no significant difference in diagnostic performance was observed between the intrathrombus and perithrombus models (P = .05).
We integrated clinical parameters into the LR radiomics model but observed no significant predictive gains in the validation cohort (P > .05). An exclusively clinical LR model was also built. The receiver operating characteristic curves for all 3 models can be examined in the Supplemental Data.
DISCUSSION
In this retrospective analysis, our aim was to use radiomics features derived from intrathrombus and perithrombus regions on CT to forecast prognosis following EVT in patients with acute ischemic stroke. A substantial gap exists in the current research regarding the use of thrombus-related radiomics for predicting clinical outcomes following EVT, especially in perithrombotic areas, because some previous studies focused on recanalization rather than on clinical outcome.13,17 Our study developed and validated a radiomics model that uses features from both intrathrombus and perithrombus regions to estimate the prognosis after EVT. We used 8 different classifiers to determine which models offer robust diagnostic effectiveness and superior generalization capabilities. Among these, the LR model using combined radiomics features proved to be the most accurate in predicting outcomes.
Prior research has shown that the duration of thrombectomy and the frequency of interventions can influence the long-term outcomes in patients.19,20 There is also evidence suggesting a relationship between the structural composition of the thrombus and the number and duration of EVT procedures.21,22 Variables like the use of stent retrievers, thrombus composition, and the number of thrombectomy passes might lead to varying extents of vascular trauma in patients with acute ischemic stroke,23,24 indicating that thrombus composition could be a critical factor in forecasting prognosis post-EVT. In this study, we constructed a combined model using 15 selected radiomics features derived solely from CTA, which included 9 features from intrathrombus regions and 6 from perithrombus regions. Our analysis revealed that radiomics features from both intra- and perithrombus regions have potent predictive capabilities, particularly those from perithrombus areas. In the validation cohort, radiomics features from the perithrombus areas notably improved the prediction of prognosis post-EVT over features from intrathrombus regions alone. On the basis of prior research25,26 and actual measurements of vessel wall thickness on high-resolution MR images, we defined the perithrombus region as extending 1 mm outward from the thrombus boundary. The perithrombotic region, which includes structures like the vessel wall and perivascular fat, may offer predictive insights into the disruption of the BBB associated with a heightened risk of complications, aligning with previous findings.27,28 Given that BBB disruption is commonly observed post-EVT and is linked to an increased risk of complications,27,29 radiomics features from perithrombus regions could serve as crucial predictors of clinical outcomes post-EVT.
Using information from both regions significantly improves diagnostic performance, surpassing that of models using only intra- or perithrombus data, underscoring the value of integrating data from both regions for enhanced prediction accuracy.
The selection of classifiers is pivotal to the effectiveness of predictive models, yet there remains no universal standard guiding this choice, leading researchers to rely on personal preference and experience.30 Consequently, this study assessed and compared 8 different machine learning classifiers. The findings indicated that the LR model generalized better than the others: Although it performed slightly worse than some models in the training cohort, it achieved the highest AUC among all classifiers in the validation cohort, particularly when analyzing combined radiomics features. LR was favored over complex models for its statistical simplicity, interpretability, and robust performance in binary classification tasks, with a lower risk of overfitting.15,31,32 Given the scope and nature of our data, we believed that a parsimonious model like LR would be more appropriate. More complex models like SVM, MLP, RF, Extra Trees, XGBoost, KNN, and LightGBM, despite their ability to handle high-dimensional data and their robustness, were prone to overfitting without substantial data and careful tuning. The notable decline in the validation performance of XGBoost highlighted the overfitting risk and the need for a balance between model complexity and generalizability. The consistent validation cohort performance of LR supported its suitability and reliability for clinical use, justifying its choice on the basis of effective generalization across data sets.
In this study, except for atrial fibrillation, no clinical variables exhibited statistically significant differences between the good outcome and poor outcome groups in either the training or validation cohorts. The significant difference in the occurrence of atrial fibrillation between the outcome groups contrasts with other studies that found no significant differences.31,33,34 This inconsistency may be explained by differences in data composition and sample size between our study and those studies. To further understand these discrepancies, additional research with larger sample sizes is recommended. Adding clinical variables to the radiomics model did not demonstrably enhance predictive performance over the radiomics-only model, suggesting that prediction relied mainly on the better-performing radiomics features rather than on clinical variables.
This study has several notable limitations. First, the patient sample size is relatively small, which may impact the stability of the outcomes of the machine learning model; applying these models to larger data sets could potentially provide more robust results. Second, the thrombus-segmentation process was manually conducted, which could be time-consuming and might compromise the reliability of the results. Future research should focus on developing automated or semiautomated methods for more efficient and accurate thrombus segmentation. Third, we sourced training and validation cohorts from separate centers, and despite image calibration and cross-validation efforts, potential bias may exist. Last, even with cross-validation and regularization techniques, overfitting is a challenge in high-dimensional data like ours. Future studies need to encompass more centers and larger samples to validate our findings with external data.
CONCLUSIONS
We developed and validated a CT-based radiomics model to evaluate the prognosis following EVT in patients with acute ischemic stroke. This model could provide critical insights for clinical decision-making and outcome prediction. The analysis showed varied performance across different thrombus regions and classifiers, with models that combined features from multiple regions proving most effective. Specifically, the LR models exhibited high efficacy and stability in predicting clinical outcome.
Footnotes
This study was supported by the Jiangsu Province Capability Improvement Project through Science, Technology, and Education (Jiangsu Provincial Medical Key Discipline Cultivation Unit, JSDW202242) and the National Natural Science Foundation of China (Youth Program, Grant No. 82402364).
Minda Li and Jingxuan Jiang contributed equally to this work.
Disclosure forms provided by the authors are available with the full text and PDF of this article at www.ajnr.org.
- Received August 1, 2024.
- Accepted after revision September 30, 2024.
- © 2025 by American Journal of Neuroradiology