Automated Segmentation of Intracranial Thrombus on NCCT and CTA in Patients with Acute Ischemic Stroke Using a Coarse-to-Fine Deep Learning Model

BACKGROUND AND PURPOSE: Identifying the presence and extent of intracranial thrombi is crucial in selecting patients with acute ischemic stroke for treatment. This article aims to develop an automated approach to quantify thrombus on NCCT and CTA in patients with stroke. MATERIALS AND METHODS: A total of 499 patients with large-vessel occlusion from the Safety and Efficacy of Nerinetide in Subjects Undergoing Endovascular Thrombectomy for Stroke (ESCAPE-NA1) trial were included. All patients had thin-section NCCT and CTA images. Thrombi contoured manually were used as reference standard. A deep learning approach was developed to segment thrombi automatically. Of 499 patients, 263 and 66 patients were randomly selected to train and validate the deep learning model, respectively; the remaining 170 patients were independently used for testing. The deep learning model was quantitatively compared with the reference standard using the Dice coefficient and volumetric error. The proposed deep learning model was externally tested on 83 patients with and without large-vessel occlusion from another independent trial. RESULTS: The developed deep learning approach obtained a Dice coefficient of 70.7% (interquartile range, 58.0%-77.8%) in the internal cohort. The predicted thrombi length and volume were correlated with those of expert-contoured thrombi (r = 0.88 and 0.87, respectively; P < .001). When the derived deep learning model was applied to the external data set, the model obtained similar results in patients with large-vessel occlusion regarding the Dice coefficient (66.8%; interquartile range, 58.5%-74.6%), thrombus length (r = 0.73), and volume (r = 0.80). The model also obtained a sensitivity of 94.12% (32/34) and a specificity of 97.96% (48/49) in classifying large-vessel occlusion versus non-large-vessel occlusion. CONCLUSIONS: The proposed deep learning method can reliably detect and measure thrombi on NCCT and CTA in patients with acute ischemic stroke.

Randomized controlled trials in patients with acute ischemic stroke (AIS) have demonstrated the efficacy and safety of endovascular therapy (EVT) compared with medical therapy in patients with large-vessel occlusion (LVO). 1,2 Identifying the presence, location, and extent of thrombi on NCCT and CTA images is important when selecting patients with AIS for reperfusion therapy. Thrombus characteristics such as location, length, volume, and permeability are helpful in predicting recanalization after both thrombolysis and EVT. 3,4 Recent studies have also shown that thrombus radiomics is able to predict recanalization with IV alteplase 5 and first-attempt recanalization with thromboaspiration. 6 Accurate segmentation of thrombi on baseline imaging is the first step in assessing thrombus characteristics. Manual delineation of the thrombus is still the criterion standard in assessing thrombus characteristics in clinical practice. 7 It requires expertise in imaging interpretation and is observer-dependent. Semiautomated segmentation techniques use clinicians' input to help with this task, but the variability introduced by user input is still a concern. 8 A fully automated thrombus segmentation approach readily available in the acute setting is, therefore, desirable.
Automated segmentation of thrombi on NCCT or CTA is, however, challenging for several reasons. These include the low signal-to-noise ratio on CT-based imaging, partial volume effects, CT image artifacts, and intracranial calcification, all of which hinder accurate delineation of the thrombus. These challenges imply that the traditional model-based or thresholding-based segmentation methods may not be able to achieve accurate or acceptable results. 9-12 To the best of our knowledge, there are very few established approaches to automatically segment intracranial thrombi on CT images. This study, therefore, aims to develop and externally validate an automated thrombus-segmentation approach on NCCT and CTA images in patients with acute stroke presenting with intracranial vessel occlusion.

Study Participants
Patients were retrospectively selected from the Safety and Efficacy of Nerinetide in Subjects Undergoing Endovascular Thrombectomy for Stroke (ESCAPE-NA1) randomized controlled trial (ClinicalTrials.gov: NCT02930018). 13 Study approval was obtained from the ethics board at each site and the responsible regulatory authorities. Signed informed consent was obtained from the patients or their legally authorized representatives. Inclusion criteria for the main study were the following: 1) 18 years of age or older with LVO (intracranial ICA, MCA M1, or functional M1 [proximal occlusion of all M2 branches]), 2) NIHSS score of ≥5, 3) time from last seen well <12 hours, 4) pial collateral filling of ≥50% of the ischemic MCA territory, and 5) ASPECTS ≥5. For this study, we included only patients with available thin-section (≤2.5 mm) baseline NCCT and CTA images (single-phase or 1 phase of multiphase CTA). We excluded patients whose imaging showed the following: 1) irremediable coregistration errors (n = 13), 2) severe motion artifacts (n = 16), or 3) beam-hardening artifacts (n = 11). A total of 499 patients were included, of whom 329 were randomly selected for the training (n = 263) and internal validation (n = 66); the remaining 170 patients (independent of the derivation cohort) were used to internally test the derived model (Fig 1).
An external validation set was also used to test the generalizability of the derived DL model. This data set comprised 83 randomly chosen patients with AIS with anterior circulation occlusions from the Precise and Rapid Assessment of Collaterals Using Multi-Phase CTA in the Triage of Patients with Acute Ischemic Stroke for IV or IA Therapy (PRoVe-IT) study. 14,15 Of the 83 patients, 34 had LVO, 28 had medium-vessel occlusion (MeVO) (M2/M3/M4 segments of the MCA or A2/A3/A4 segments of the anterior cerebral artery), and 21 had no identifiable intracranial occlusion.

Reference Standard: Manual Segmentation of Thrombi
The thin-section NCCT images were automatically coregistered onto the CTA images (using the second phase if multiphase CTA was available) using rigid-body transformation (the SimpleITK packages in Python; https://pypi.org/project/SimpleITK/), 16 followed by skull-stripping. 17 An expert neuroradiologist visually inspected the registration results and performed manual corrections by using the 3D Slicer software (Version 4.1, https://www.slicer.org/) when the coregistration was suboptimal. 18 Four trained readers (3 neuroradiologists with >5 years' experience in vascular imaging and 1 vascular neurologist with >10 years of stroke imaging experience) manually segmented intracranial thrombi section-by-section on NCCT images while referring to the coregistered CTA images using ITK-SNAP (http://www.itksnap.org/). 19 Four readers were each responsible for one-fourth of the entire data set, and they were blinded to clinical information and follow-up imaging.
All 4 readers segmented a subset of 10 patients randomly selected from the internal test set. The simultaneous truth and performance level estimation (STAPLE) algorithm, an expectation-maximization method, was applied to generate a computational "golden reference standard" based on the 4 experts' segmentations, which was used to assess the variability of manual segmentations. 20

Deep Learning Model
A 2-stage coarse-to-fine thrombus segmentation neural network was proposed on the basis of the U-net architecture. 21 The proposed deep learning (DL) model used a multiscale training strategy with a deep-supervision mechanism. In particular, a channel and spatial attention block was designed to make the model focus more on the salient areas on images at different scales and obtain more conducive features. 22 The spatial attention module generates a spatial attention map using the interspatial relationship of features. Different from channel attention, spatial attention focuses on where an informative part is, which is complementary to channel attention. To compute the spatial attention, we apply average-pooling and max-pooling operations along the channel axis first and concatenate them to generate an efficient feature descriptor. 22 On the concatenated feature descriptor, we apply a convolution layer to generate a spatial attention map, which encodes where to emphasize or suppress. In this article, the patch size for spatial transform was 96 × 160 × 160 voxels. To make full use of multiscale context information, we used a scale-aware pyramid fusion module, in which 3 parallel dilated convolutions with different dilation rates were used to capture information at different scales. 23 The detailed architecture of the proposed model is shown in Fig 2, and the details around the model architecture and hyperparameter optimization are summarized as follows: The 3D kernel size for convolutions was 3 × 3 × 3. Feature numbers at each layer were 32, 64, 128, 256, and 320 (limited to 320 to ensure sufficient context aggregation). Batch normalization was used in the proposed network, which took a step toward reducing the internal covariate shift and, in doing so, dramatically accelerated the training of deep neural nets. The batch size of the networks was 2, to enable large patch sizes, and the leaky Rectified Linear Unit nonlinearity was implemented.
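The spatial attention computation described above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: it works on a 2D feature map stored as nested lists, and it assumes a 1 × 1 convolution with hypothetical weights `w_avg`, `w_max`, and `bias` followed by a sigmoid, whereas the article does not state the kernel size or activation of its attention convolution.

```python
import math


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Apply a toy spatial attention to a feature map.

    features: list of C channels, each a list of H rows of W floats.
    w_avg, w_max, bias: hypothetical 1x1 convolution parameters (assumed).
    Returns (attention_map, refined_features).
    """
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    attention = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [features[c][i][j] for c in range(n_channels)]
            avg_pooled = sum(values) / n_channels  # average-pooling along channels
            max_pooled = max(values)               # max-pooling along channels
            # 1x1 convolution over the 2-channel descriptor, then sigmoid
            logit = w_avg * avg_pooled + w_max * max_pooled + bias
            attention[i][j] = 1.0 / (1.0 + math.exp(-logit))
    # Scale every channel by the shared spatial attention map
    refined = [
        [[features[c][i][j] * attention[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_channels)
    ]
    return attention, refined


toy_features = [[[0.0, 2.0]], [[0.0, 4.0]]]  # C=2, H=1, W=2
toy_attention, toy_refined = spatial_attention(toy_features)
```

In this toy input, the second location has larger activations in both channels, so it receives a larger attention weight than the first.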
Stochastic gradient descent with Nesterov momentum (μ = 0.99) was used as the optimizer. The initial learning rate, dampening, batch size, and weight decay were 0.01, 0, 2, and 3 × 10⁻⁵, respectively. The learning rate was decayed throughout the training following $lr_{\mathrm{init}} \times (1 - \mathrm{epoch}/\mathrm{epoch}_{\max})^{0.9}$, where $lr_{\mathrm{init}}$ is 0.01 and $\mathrm{epoch}_{\max}$ is the maximum number of epochs.
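The polynomial learning-rate decay described above can be written as a short function. This is a minimal sketch; the function name is ours, and the value of 1000 epochs in the usage line is a hypothetical choice because the article does not report the maximum number of epochs.

```python
def poly_learning_rate(epoch, epoch_max, lr_init=0.01, exponent=0.9):
    """Return lr_init * (1 - epoch / epoch_max) ** exponent."""
    if not 0 <= epoch <= epoch_max:
        raise ValueError("epoch must lie between 0 and epoch_max")
    return lr_init * (1.0 - epoch / epoch_max) ** exponent


# Hypothetical schedule length (not reported in the article)
schedule = [poly_learning_rate(e, 1000) for e in (0, 500, 1000)]
```

The rate starts at the initial value of 0.01, falls monotonically, and reaches 0 at the final epoch.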
Specifically, NCCT images after skull-stripping were downsampled from the original spacing of 0.625 × 0.488 × 0.488 mm³ (257 × 456 × 436 voxels) to 1.08 × 0.844 × 0.844 mm³ (149 × 264 × 252 voxels) and passed into the coarse model to obtain maximal contextual information and localize thrombi candidates at the first coarse stage. The output of the prediction maps at the coarse stage was upsampled to the original resolution and fed into the fine stage together with the original images to obtain more detailed segmentation. In particular, the fused features of each layer at the coarse stage were upsampled and concatenated to the features of the corresponding layer at the fine stage to use both global contextual information and local detail information obtained at coarse and fine stages.
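The voxel counts quoted for the coarse stage follow from keeping the physical field of view fixed while changing the voxel spacing. The sketch below checks that arithmetic; the function name is ours, and rounding to the nearest integer is an assumption about how the grid size was chosen.

```python
def resampled_shape(shape, spacing, new_spacing):
    """Voxel counts after resampling to new_spacing with the same field of view."""
    return tuple(
        round(n_voxels * old / new)
        for n_voxels, old, new in zip(shape, spacing, new_spacing)
    )


coarse_shape = resampled_shape(
    shape=(257, 456, 436),
    spacing=(0.625, 0.488, 0.488),    # mm, original resolution
    new_spacing=(1.08, 0.844, 0.844), # mm, coarse-stage resolution
)
print(coarse_shape)  # (149, 264, 252)
```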
Additionally, 5-fold cross-validation on the training data set was performed to select the optimal model at the coarse and fine stages. Spatial augmentations (rotation, scaling, low-resolution simulation, and so forth) were applied in 3D to increase the diversity of the training data. The combined Dice and cross-entropy were used as the loss function. 24 At each level of deep supervision, the ground truth segmentation mask was correspondingly downsampled for loss computation on the basis of the size of the feature maps. The training objective was the weighted sum of the losses (L) at all resolutions, $L = W_1 L_1 + W_2 L_2 + \cdots$. Here, the weights (W) were halved with each decrease in resolution, resulting in $W_2 = 0.5W_1$, $W_3 = 0.25W_1$, and so forth, and were normalized to sum to 1.
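The deep-supervision weighting described above can be sketched as follows. The function names are ours, and the number of supervised resolution levels (3 in the usage lines) is a hypothetical value because the article does not state it.

```python
def deep_supervision_weights(n_levels):
    """Weights halved at each lower resolution, then normalized to sum to 1."""
    raw = [0.5 ** level for level in range(n_levels)]  # 1, 0.5, 0.25, ...
    total = sum(raw)
    return [w / total for w in raw]


def deep_supervision_loss(level_losses):
    """Weighted sum of per-resolution losses, highest resolution first."""
    weights = deep_supervision_weights(len(level_losses))
    return sum(w * loss for w, loss in zip(weights, level_losses))


weights = deep_supervision_weights(3)                 # hypothetical 3 levels
combined = deep_supervision_loss([0.40, 0.50, 0.60])  # made-up per-level losses
```

For 3 levels, the normalized weights are 4/7, 2/7, and 1/7, so the full-resolution output dominates the objective while coarser outputs still contribute gradient signal.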

Statistical Methods
Patient demographics, including clinical and imaging variables, were compared across the derivation and internal test sets using the χ² and Student t tests as appropriate.
The proposed segmentation method was quantitatively evaluated using the spatial overlap metric of the Dice coefficient (DC) and 2 boundary distance error metrics: average symmetric surface distance (ASSD) and 95th percentile of the Hausdorff Distance (HD95), 25 compared with the reference standard of manual (expert) segmentation. The DC is a spatial overlap index ranging from 0 to 1, where 1 indicates a perfect overlap between the reference standard and the predicted segmentation and 0 indicates no overlap. The ASSD and HD95 represent the average and 95th-percentile distance errors, respectively, between the 2 surfaces derived from the segmented and reference objects in 3D space. For ASSD and HD95, 0 mm indicates perfect segmentation. The correlations regarding thrombus length and volume between the DL-derived model and manual measurements were analyzed using the Pearson correlation and Spearman correlation as appropriate to the data distribution. 26 The correlation was considered excellent if ≥0.70 and good if between 0.5 and 0.7. 26 Absolute and relative errors of thrombus volume and length were also calculated. These metrics were all calculated in 3D space at a patient level and applied onto both the internal data set (ESCAPE-NA1) and the external test data set (PRoVe-IT) for internal and external validation, respectively. On the basis of the segmentation results, the model predictions can also be used to distinguish patients with LVO from those without LVO after localizing the segmented thrombus and thresholding thrombus volume. We further assessed the specificity and sensitivity of the DL model in classifying intracranial thrombi (LVO versus non-LVO and occlusion versus nonocclusion) in the external data set.
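As a concrete illustration of the DC, the sketch below computes it for two segmentations given as sets of voxel coordinates. The function name and the toy voxel sets are ours, and returning 1.0 when both segmentations are empty is an assumed convention that the article does not specify.

```python
def dice_coefficient(reference, prediction):
    """Dice coefficient between two segmentations given as voxel coordinates."""
    reference, prediction = set(reference), set(prediction)
    if not reference and not prediction:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = len(reference & prediction)
    return 2.0 * overlap / (len(reference) + len(prediction))


# Toy example: 3 reference voxels, 3 predicted voxels, 2 shared
expert_voxels = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
model_voxels = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
toy_dice = dice_coefficient(expert_voxels, model_voxels)  # 2 * 2 / (3 + 3)
```

Because thrombi occupy few voxels, a disagreement of even 1 or 2 voxels changes the DC noticeably, which is worth keeping in mind when comparing DC values across lesion sizes.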
All statistical analyses were performed using the SciPy package (https://scipy.org/). All P values were 2-sided, and statistical significance was defined as P < .05.

Patient Characteristics
Patient characteristics of the derivation and internal and external test data are summarized in Table 1. There were no statistically significant differences in baseline characteristics between the derivation and internal test data (all, P > .05, Table 1). The choice of the external test data, with different characteristics than the internal data, was deliberate.

Internal Validation
All patients had ICA or M1 segment MCA occlusions. Figure 3 shows 2 examples predicted by the proposed DL model compared with the reference standard. Quantitative results on the internal test data of 170 patients are shown in Table 2. The proposed DL model obtained a median DC of 70.7% (interquartile range [IQR], 58.0%-77.8%), a median ASSD of 0.38 mm (IQR, 0.24-0.77 mm), and a median HD95 of 1.31 mm (IQR, 0.79-3.83 mm). The median thrombus length measured by the DL model was 13.94 mm (IQR, 6.32-25.76 mm) and was strongly correlated with that of the expert segmented thrombi (r = 0.88, P < .001). The median difference ($d^{L}_{\mathrm{diff}}$, mm) and median absolute difference ($|d^{L}_{\mathrm{diff}}|$, mm) of thrombus length between the expert segmentation and algorithm predictions were −5.41 mm (IQR, −11.34 to −0.5 mm) and 5.79 mm (IQR, 1.79-11.37 mm), respectively. The median thrombus volume of 71.9 mm³ (IQR, 39.25-126.15 mm³) obtained by the DL model also strongly correlated with that obtained by expert segmentation (r = 0.87, P < .001). The median volume difference ($d^{V}_{\mathrm{diff}}$, mm³) and the median absolute volume difference ($|d^{V}_{\mathrm{diff}}|$, mm³) between the expert segmentation and algorithm predictions were −0.9 mm³ (IQR, −3.41 to 0.33 mm³) and 1.76 mm³ (IQR, 0.71-4.23 mm³), respectively.

External Validation
The prevalence of LVO, MeVO, and no occlusions in the external data was 34 (41%), 28 (33.7%), and 21 (25.3%), respectively. Quantitative results from the 34 patients with LVO in the external data are shown in Table 2. In these 34 patients with LVO, the proposed DL model obtained a median DC of 66.8% (IQR,

Analysis of Variability in Manual Segmentations
The median thrombus length and volume derived from the 10 patients using the STAPLE algorithm were 20

DISCUSSION
This study describes a fully automated DL model for intracranial thrombus segmentation on NCCT and CTA images in patients with AIS. This model used a coarse-to-fine DL network with multilevel and multiscale feature fusion and deep-supervision strategy. It was developed using a large data set of 329 patients and tested internally and externally for generalizability. Both internal and external validations demonstrate that the developed model can accurately detect and segment intracranial thrombi, especially in patients with LVO, compared with manual segmentation by experts.
There are no well-established methods for automated thrombus segmentation, and only a few that use semiautomated methods. 7-11 Existing methods of thrombus segmentation rely on manual or semiautomated measurements of thrombus density on NCCT. This method of density assessment by using small ROIs is prone to interobserver variability due to the heterogeneity in thrombus composition and the small size of intracranial thrombi and is sensitive to partial volume effects, image noise, and the presence of vessel wall calcification. 7 Santos et al 27 developed a semiautomated region-growing segmentation method that was limited by a low observer agreement and variability in thrombus density. Qazi et al 12 used linear regression to build statistical models to predict patient-specific optimal Hounsfield unit thresholds, which replaced a universal single Hounsfield unit threshold for thrombus segmentation favored by Riedel et al. 28 However, these thrombus density threshold-based methods are subject to image-intensity variability, and their generalizability is a concern. Lucas et al 29 proposed a cascaded neural network to segment thrombi. Unfortunately, this method was restricted to 2D images and limited to the MCA + ICA region, used fixed ROIs, and was developed using a small data set (the segmentation network was trained on only the 216 positive cases). Mojtahedi et al 30 used dual-modality U-Net-based CNNs to detect the thrombus location and then limited the search area by creating a bounding box around the detected thrombus location, which would allow the first-level prediction errors to stack up later. To the best of our knowledge, our study represents the largest data set of automated intracranial thrombus segmentation on NCCT/CTA with internal and external validation.
The proposed method can automatically segment small thrombi in 3D whole-brain NCCT images, which overcomes the limitations of segmentation methods such as intensity-based and fixed ROI annotations.
Among the 170 patients with LVO in the internal validation, the DL model failed to detect thrombus in only 10 cases. Visual inspection showed that these false-negative cases could be attributed to one or a combination of reasons: 1) small thrombi (<30 mm³) (n = 4); 2) isodense (to surrounding tissue) thrombi (n = 7); and 3) the presence of severe beam-hardening artifacts on the thrombus (n = 5). Only 7.5% of training data had such imaging characteristics in the retrospective analysis. Including more sample images with such characteristics in the training data could have improved the performance of the derived DL model.
The analyses of thrombus length and volume also show excellent correlation with expert measurements in both internal and external validation. However, the difference in thrombus length and HD95 seems to be large, possibly explained by the challenges in segmenting thrombi in curved vessels. Nonetheless, this study reports HD95 values, 3.05 (range, 1.37-8.19), similar to those reported in the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in the Netherlands (MR CLEAN) study, 5.67 (range, 4.30-7.04). 30 The median DC was 70.7% in internal validation in the LVO-only ESCAPE-NA1 study and 66.8% in the 34 LVO cases in external validation in the PRoVe-IT study. This slight difference could be because patients in the PRoVe-IT study had less severe stroke than patients in the ESCAPE-NA1 trial (NIHSS score 8 versus 16) and, therefore, less extensive thrombi. Indeed, the median thrombus volume of the patients with LVO in the PRoVe-IT data (37.51 mm³; IQR, 29.19-78.3 mm³) was much smaller than that in the internal data set of ESCAPE-NA1 (80.53 mm³; IQR, 49.92-155.39 mm³).
Moreover, the DC of the proposed model in the external validation set was 66.8%, suggesting a good agreement between the predicted and the measured thrombi. Our DC is similar to that of the algorithm developed from the MR CLEAN data, which achieved a DC of 62%. 30 Despite the smaller thrombus burden, the developed DL model obtained a high specificity in identifying the presence of thrombus in the external validation. The sensitivity of 77.42% (48/62) was comparatively low because 14 of 28 MeVO cases were identified as having no occlusions by the model, which might be because the derived DL model was trained using only LVO cases. Including more patients with distal occlusions or without occlusions in the training data could have improved the accuracy of the model in detecting small thrombi.
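The sensitivity and specificity figures quoted in this article can be reproduced from the reported counts. In the sketch below, the function names are ours; the counts come from the reported fractions (32/34 and 48/49 for LVO versus non-LVO, and 48/62 for occlusion versus nonocclusion).

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly positive cases that the model flags as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of truly negative cases that the model flags as negative."""
    return true_negatives / (true_negatives + false_positives)


# LVO versus non-LVO in the external data set (34 LVO, 49 non-LVO)
lvo_sensitivity = round(100 * sensitivity(32, 2), 2)   # 32/34
lvo_specificity = round(100 * specificity(48, 1), 2)   # 48/49

# Occlusion versus nonocclusion (34 LVO + 28 MeVO = 62 occlusions, 14 missed)
occlusion_sensitivity = round(100 * sensitivity(48, 14), 2)  # 48/62

print(lvo_sensitivity, lvo_specificity, occlusion_sensitivity)  # 94.12 97.96 77.42
```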
The results of our study have several implications in clinical practice. Automated segmentation can be used to extract radiologic thrombus characteristics, such as thrombus length and volume, which were shown to be associated with clinical outcomes and reperfusion success. 31,32 Although not included in the output of our proposed model, thrombus density and permeability are also useful in predicting clinical and angiographic outcomes, 31 thus justifying future work to automate their calculations. Physicians can use the output of our model to inform their decisions regarding bridging therapy; for instance, long and large thrombi might benefit more from adjunctive IV thrombolysis compared with smaller thrombi; however, this possibility needs to be validated in future work. 33 Furthermore, information about the precise location and length of the thrombus is useful for neurointerventionalists to plan the EVT procedure and choose the best device to achieve fast and effective reperfusion. Last, automated segmentation can be applied on big databases to extract thrombus characteristics in a faster and easier manner compared with humans and, thus, could be used to improve the design of EVT devices.
Detection and segmentation of thrombi on NCCT/CTA are tedious and time-consuming for physicians. 5,27,34,35 Improvements in image quality, better training, and systematic assessments of thrombus characteristics (parameterization and morphology) are useful to help humans improve thrombus detection on NCCT and CTA. Regardless of these strategies, detecting thrombi on NCCT and CTA continues to be challenging for humans, especially with small thrombi. Furthermore, the results using the reference standard generated by the STAPLE algorithm show the variability of manual segmentations across different raters and thus highlight the need for an automated process to standardize the extraction of thrombus characteristics.
Our study has several limitations. First, patients with unavailable thin-section NCCT and CTA images were excluded, therefore introducing selection bias; however, we chose to do so to decrease measurement error from thick-section scans. Second, 4 well-trained experts manually contoured the data for evaluation. Even though the reproducibility of manual segmentations in our experiments was acceptable, the variability introduced by cognitive biases and heuristics and image misalignment should be considered. The variability in the results is partly explained by the difference in experience and training among the 4 raters. Two raters were neuroradiologists (raters 1 and 3), one was a neuroradiology resident (rater 2), and one was a vascular neurologist (rater 4). The 2 neuroradiologists achieved the highest DCs. Moreover, intracranial thrombi are small and occur in curved vessels. Annotation of lesions that are small with curved shapes can also result in variability compared with larger-sized lesions where variability will be inherently less. Third, the internal data sets did not include MeVO occlusions, explaining the low model performance for these cases. Future studies focusing on this occlusion subgroup would improve detection and delineation of these thrombi. Fourth, the proposed model did not show good performance in small and isodense thrombi; however, we chose to keep these cases, contrary to a prior study, to increase the generalizability of our results. Including more studies with artifacts (beam-hardening and so forth) would also improve the generalizability. Fifth, the developed model can be applied only on thin-section NCCT and CTA images. The extension to a more widely used NCCT with 5-mm-thick slices should be investigated.

CONCLUSIONS
An automated method based on DL is capable of detecting and segmenting thrombi reliably, especially those causing LVOs, on NCCT and CTA images in patients with AIS. Extensive validations demonstrate the efficacy of the proposed technique compared with the reference standard (ie, manual segmentation). If translated into a clinical setting, this algorithm could help physicians in their decision-making for AIS.