Performance of Semiautomatic Assessment of Carotid Artery Stenosis on CT Angiography: Clarification of Differences with Manual Assessment

BACKGROUND AND PURPOSE: Semiautomated methods for ICA stenosis measurements have the potential to reduce interobserver variability and to speed up its analysis. In this study, we estimate the precision and accuracy of a semiautomated measurement for carotid artery stenosis degree and identify and explain differences compared with the manual method. MATERIALS AND METHODS: In this retrospective study involving 90 patients, 2 observers determined the stenosis degree twice, with both the semiautomated and the manual method. Intra- and interobserver correlations were calculated for both methods. The accuracy was estimated by comparing average semiautomated with manual measurements. The semiautomated stenosis calculations were performed using either the minimal or maximal intersection at the reference site. Individual cases with large differences in measurement were retrospectively inspected by 3 observers. RESULTS: Intra- (R = 0.93, 0.96) and interobserver (R = 0.98) correlations for the semiautomated method were excellent and exceeded the manual performance correlations (R = 0.87, 0.86). The semiautomated measurements correlated well with the manual measurements (R = 0.87), with high specificity of 96% and lower sensitivity of 63%. Large differences were caused by misinterpretations of the semiautomated method associated with calcified plaques, resulting in overestimations of the minimal diameter, underestimation of stenosis degree, and incorrect centerlines. The effect of using the minimal diameter at the reference position resulted in a small, but significant, underestimation of the stenosis degree by the semiautomated method. CONCLUSIONS: The semiautomated method showed an excellent reproducibility and good correlation with manual measurements with a high specificity and lower sensitivity for detecting a significant stenosis. Erroneous semiautomatic stenosis measurements were associated with the presence of calcium.

Atherosclerotic stenosis of the ICA may lead to neurologic symptoms and is an important risk factor for ischemic stroke. Large randomized trials determined that CEA is beneficial for recently symptomatic patients with a severe (70%-99%) stenosis. [1][2][3] In the trials with symptomatic patients, a higher degree of stenosis was associated with increased benefit from surgery. Therefore, precise assessment of the degree of stenosis is crucial for decisions on CEA. Currently, CTA is increasingly used to measure the degree of carotid artery stenosis. 4 Determining the degree of carotid stenosis on CTA, according to the NASCET method, is tedious and may lead to clinically important differences. 5,6 Reading CTA studies requires some familiarity with postprocessing techniques, such as MPR. Semiautomated methods have been developed and introduced in the market to overcome the drawbacks of these measurements. [7][8][9][10][11] The potential advantages of such a system, such as the acceleration of measurements and reduced interobserver variability, have been widely acknowledged; however, the diagnostic value has not been sufficiently determined. Several studies have shown excellent intra- and interobserver agreement, 7-13 yet the diagnostic accuracy and the cause of deviations of semiautomatic measurements have received little attention.
The aim of this study was to validate semiautomated carotid stenosis measurements by comparison with a standard manual method. [14][15][16]

Patient Selection
CTA scans of patients with a suspected carotid artery stenosis were retrospectively collected from April 2006 through December 2008. All patients who underwent CTA on a 64-section CT scanner with a 0.9-mm section thickness were included in the current analysis. Patients with previous CEA of the carotid artery and patients with a common carotid artery stenosis were excluded.

CT Protocol
CTA was performed with a 64-section scanner (Brilliance 64; Philips Healthcare, Best, the Netherlands). Eighty mL of contrast (Visipaque 320; GE Healthcare, Chalfont St. Giles, United Kingdom) was infused at 4 mL/s. Acquisition and reconstruction parameters were as follows: 120 kV tube voltage, 265 mAs effective, pitch of 0.765, section thickness of 0.9 mm, increment of 0.45 mm. The scan ranged from the aortic arch up to 3 cm above the sella turcica. The in-plane grid was 512 × 512 pixels, with a field of view ranging from 128 × 128 mm² to 217 × 217 mm², with an average of 155 × 155 mm².

Stenosis Measurements
Stenosis grading was performed by 2 neuroradiologists (C.B.M., R.v.d.B.), both with more than 10 years of experience. The observers were blinded to patient information, each other's findings, and previously collected clinical measurements. After the measurement of stenosis degree, the artery was categorized according to the standard NASCET stenosis categories 3 : minimal stenosis (0%-29%), mild stenosis (30%-49%), moderate stenosis (50%-69%), severe stenosis (70%-99%), and occlusion (100%). A level of confidence was given on a 5-point scale, with a score of 1 for an unreliable measurement and 5 for excellent image quality. The processing time of the measurements was recorded.

Manual Stenosis Measurements
Manual measurements were performed using a review workstation with MPR functionality (IMPAX v5.2; Agfa, Mortsel, Belgium), using the method of Bartlett et al. 16 The diameters were determined in a plane perpendicular to the center lumen line of the vessel. The reference ICA diameter was measured at least 2 cm distally from the site of narrowing. The first observer (C.B.M.) performed the manual measurements of all arteries and a subset of 50 arteries a second time. The second observer (R.v.d.B.) measured a subset of 48 arteries.

Semiautomated Stenosis Measurements
Semiautomatic measurements were performed on a dedicated workstation (Vitrea 2 version 4.1.2.0; Vital Images, Plymouth, Minnesota) using the "Carotid CT" protocol. In the vessel of interest, a seed point was placed within the ICA. The software subsequently automatically determined the center lumen line of the selected vessel. In case the software presented an incorrect centerline, the observer was able to adjust it. The number of correction steps was recorded. Subsequently, the lumen area for the selected artery was segmented and its contours were displayed on perpendicular views. The observer selected the vessel segment that contained the minimal diameter. Within this segment, the smallest cross-section, as determined by the software, was used as the diameter of the stenosis. The reference location was selected by dragging a slider along the distal ICA well beyond the site of stenosis. At this location, the minimal and maximal diameters were given. The software used the minimal diameter as the reference diameter for the stenosis calculation. All arteries were evaluated twice by both observers. To study a potential bias caused by the use of the minimal diameter for the reference diameter in the stenosis degree calculation, we calculated the degree of stenosis once more by using the maximal diameter as reference diameter for all arteries.
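The stenosis calculation described above can be illustrated with a minimal sketch. It assumes the usual NASCET-style formula, (1 - stenosis diameter / reference diameter) × 100%, and maps the result onto the categories listed under Stenosis Measurements. The function names, the clipping to 0%-100%, the handling of fractional percentages at category boundaries, and the example diameters are all hypothetical and are not taken from the workstation software.

```python
# Illustrative sketch only: names, clipping, and example values are assumptions,
# not the vendor's implementation.

def nascet_stenosis(min_diameter_mm: float, reference_diameter_mm: float) -> float:
    """Percent stenosis as (1 - d_stenosis / d_reference) * 100, clipped to [0, 100]."""
    if reference_diameter_mm <= 0:
        raise ValueError("reference diameter must be positive")
    pct = (1.0 - min_diameter_mm / reference_diameter_mm) * 100.0
    return max(0.0, min(100.0, pct))


def nascet_category(stenosis_pct: float) -> str:
    """Map a percentage onto minimal / mild / moderate / severe / occlusion."""
    if stenosis_pct >= 100.0:
        return "occlusion"
    if stenosis_pct >= 70.0:
        return "severe"
    if stenosis_pct >= 50.0:
        return "moderate"
    if stenosis_pct >= 30.0:
        return "mild"
    return "minimal"


if __name__ == "__main__":
    d_stenosis = 1.5                   # hypothetical narrowest lumen diameter (mm)
    d_ref_min, d_ref_max = 4.6, 5.2    # hypothetical min/max reference diameters (mm)
    for label, d_ref in (("minimal", d_ref_min), ("maximal", d_ref_max)):
        pct = nascet_stenosis(d_stenosis, d_ref)
        print(f"{label} reference diameter: {pct:.1f}% ({nascet_category(pct)})")
```

With these made-up diameters, the minimal reference diameter gives roughly 67% (moderate) and the maximal reference diameter roughly 71% (severe), which shows how the choice of reference diameter can move an artery across the 70% cutoff.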

Calcium Volume Measurement
Calcified plaques in CT images are known to hamper a straightforward and accurate diameter assessment. To assess the contribution of calcium to biases in the semiautomated measurement, we recorded whether calcium was present adjacent to the lumen at the site of the minimal diameter. Furthermore, the calcium volume was measured in milliliters (cc) by a single observer (L.S.) using the method described by McKinney et al 17 and Marquering et al. 18

Inter- and Intraobserver Variability
The inter- and intraobserver variability of the stenosis measurements was assessed by constructing scatterplots and calculating the Pearson correlation coefficient and its 95% CIs. Furthermore, the average difference and average absolute difference were calculated. The significance of differences between the 2 methods was determined using paired t tests. Linear weighted κ values were calculated for the NASCET categorization of the degree of stenosis. Based upon the categorization, the arteries were labeled significant or not-significant for a stenosis degree larger or smaller than a cutoff value. This categorization was performed twice, with a cutoff value of 50% and of 70%. The inter- and intraobserver variability was calculated by κ statistics of the categorized stenosis. Differences in κ values and correlation coefficients were considered significant when the hypothesis that findings for manual and semiautomated measurements were equal could be rejected with 95% certainty. The interobserver correlation was calculated using the second series of semiautomated measurements to avoid a possible learning effect. Measurements with confidence scores of 1 were excluded from the statistical analyses.
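As a rough illustration of the agreement statistics named above, the following sketch computes a Pearson correlation coefficient and a linearly weighted kappa in plain Python. It is not the authors' analysis code: the function names are hypothetical, categories are assumed to be coded as integers 0 to k-1 (for example 0 = minimal through 4 = occlusion), and confidence intervals are omitted.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired measurement series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two paired series with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_weighted_kappa(r1: Sequence[int], r2: Sequence[int], n_categories: int) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1)."""
    n, k = len(r1), n_categories
    if n != len(r2) or n == 0 or k < 2:
        raise ValueError("need paired ratings and at least 2 categories")
    # Observed joint proportions of (rating 1, rating 2).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Weighted disagreement: observed versus expected under independence.
    observed = sum(abs(i - j) / (k - 1) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(abs(i - j) / (k - 1) * row[i] * col[j] for i in range(k) for j in range(k))
    if expected == 0.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed / expected
```

Linear weights penalize a disagreement of two categories twice as heavily as a disagreement of one category, which suits ordered categories such as the NASCET classes.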

Accuracy
To estimate the bias introduced by the automated analysis, we compared semiautomated stenosis measurements with the manual measurements, using the manual measurement as reference. To reduce observer dependency and learning biases in the measurements, we averaged the 4 semiautomated measurements and the 3 manual measurements. Measurements classified as near occlusion or low quality were not included in the averaging.
The accuracy of the semiautomated degree of stenosis measurement was assessed using a Bland-Altman plot. Pearson correlation coefficient and average differences were calculated for the degree of stenosis, minimal diameter, and reference diameter. These calculations were performed for the average of the 4 semiautomated measurements as well as for a single run with maximal reference diameter. To investigate potential biases due to the presence of calcified plaques, we compared minimal diameter and stenosis measurements between the semiautomated and manual measurement for maximal stenosis positions with and without calcium.
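A Bland-Altman analysis of this kind can be sketched as follows. The sketch assumes the difference is taken as manual minus semiautomated (so that a positive bias corresponds to underestimation by the semiautomated method) and uses the conventional limits of agreement of bias ± 1.96 SD; neither convention is stated explicitly in the text, and the function name is hypothetical.

```python
import statistics
from typing import Dict, List, Sequence


def bland_altman(manual: Sequence[float], semiautomated: Sequence[float]) -> Dict[str, object]:
    """Return the points of a Bland-Altman plot plus bias and limits of agreement."""
    if len(manual) != len(semiautomated) or len(manual) < 2:
        raise ValueError("need two paired series with at least 2 values")
    # x-axis of the plot: mean of the two methods for each artery.
    means: List[float] = [(m + s) / 2.0 for m, s in zip(manual, semiautomated)]
    # y-axis of the plot: manual minus semiautomated (assumed sign convention).
    diffs: List[float] = [m - s for m, s in zip(manual, semiautomated)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return {
        "means": means,
        "differences": diffs,
        "bias": bias,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
    }
```

The returned means and differences can be passed to any plotting tool; the bias and limits are typically drawn as horizontal lines.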
Using the average manual measurements as reference, the diagnostic accuracy of the semiautomated method for determining a stenosis larger than cutoff values of 50% and 70%, and its 95% CIs, were determined.
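The diagnostic accuracy measures can be illustrated with a short sketch that dichotomizes both series at a cutoff and tabulates the result against the manual reference. The function name is hypothetical, "significant" is assumed to mean a stenosis degree at or above the cutoff, and the 95% CIs are left out because the interval method is not specified in the text.

```python
from typing import Dict, Sequence


def diagnostic_accuracy(reference_pct: Sequence[float],
                        test_pct: Sequence[float],
                        cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and accuracy at a stenosis cutoff."""
    if len(reference_pct) != len(test_pct) or not reference_pct:
        raise ValueError("need two non-empty paired series")
    tp = fp = tn = fn = 0
    for ref, test in zip(reference_pct, test_pct):
        ref_pos = ref >= cutoff    # manual reference calls the stenosis significant
        test_pos = test >= cutoff  # semiautomated method calls it significant
        if ref_pos and test_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif test_pos:
            fp += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

Calling the function once with a cutoff of 70 and once with 50 mirrors the two cutoff values used in the analysis.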

Retrospective Error Analysis
Three observers (C.B.M., R.v.d.B., and H.A.M.) retrospectively inspected all cases in which the semiautomated measurement differed more than 20% from the manual reference to determine the cause of the deviation. For a single run of semiautomated analysis, we critically evaluated the centerline to determine the number of incorrect centerlines near the site of maximum stenosis.

Results
Between April 2006 and December 2008, 180 patients underwent a CT scan for assessment of a possible carotid stenosis. Of these CT scans, 156 were performed on a 64-section CT scanner and 120 scans were reconstructed with a 0.9-mm section thickness. A total of 90 patients (mean age = 66.8; range 35-89; male-to-female ratio = 1.46) with suspected ICA stenosis were included, after exclusion due to previous carotid intervention or insufficient quality of the images.
The average processing time was 90 ± 50 and 138 ± 31 seconds for the semiautomated and manual measurements, respectively. Seventy-six percent of the semiautomated analyses required manual edits of the centerline. The number of manual edits ranged between 0 and 12, with an average of 1.6 ± 2.0 per analysis. The processing time of the semiautomated measurement was strongly correlated with the number of edits.

Inter- and Intraobserver Variability
Table 1 and Fig 1 present the inter- and intraobserver variability results. The intraobserver variability for the semiautomated stenosis assessment was excellent, with Pearson correlation coefficients of 0.93 and 0.96 for observers 1 and 2, respectively (P < .01). The interobserver agreement was also excellent for the semiautomated stenosis degree assessment (r = 0.98, P < .01). The κ values for the NASCET categorization were 0.87 and 0.82 for the intraobserver categorization, and 0.84 for the interobserver categorization. The reproducibility measurement of the detection of a stenosis degree, with a cutoff of 50%, resulted in a κ value of 0.92 and 0.85 for the intraobserver variability, and 0.88 for the interobserver variability. For the 70% cutoff point, the κ values were all above 0.90 for the semiautomated measurements.
We observed a significant difference between the semiautomated and manual measurements for the interobserver correlation and the NASCET categorization κ. The difference in intraobserver correlation was significant only for observer 2. In addition, the difference between the semiautomated and manual interobserver κ for detecting >70% stenosis was significant.

Accuracy
The correlation of the averaged semiautomated measurement with manual reference is presented in Table 2 and Fig 2, showing a good correlation with a Pearson correlation coefficient of 0.87. The average difference in stenosis degree between the 2 methods was 6.2% (± 16%), indicating that the semiautomated method underestimated the degree of stenosis compared with the manual measurement. The average difference in minimal diameter was close to zero and not significant. The average reference diameter was significantly underestimated by the semiautomated method compared with the manual measurements. Table 2 also shows the result of the semiautomated measurements in which the maximal diameter at the reference position was used in the degree of stenosis calculation, indicating that this approach corresponds better with the manual measurements. Table 3 shows the diagnostic accuracy of the semiautomated method compared with the manual reference measurements, indicating a high specificity of 96% and 93% for a 70% and 50% degree of stenosis cutoff, respectively. The sensitivity was lower, with 63% and 72% for a 70% and 50% degree of stenosis cutoff, respectively. The diagnostic accuracy of the semiautomated measurements, using the maximum diameter at the reference position, was higher for all diagnostic accuracy measurements.

Calcifications
There were 46 (30%) arteries with calcifications adjacent to the lumen at the site of the minimal diameter. The difference in minimal diameter between the semiautomated method and the reference minimal diameter is presented in Table 4 and illustrated in a Bland-Altman plot in Fig 3. This figure shows that there is a large spread around zero, but that for average diameters between 1.5 and 4 mm, the semiautomated method overestimated the minimal diameter, especially for arteries with calcifications. The semiautomated minimal diameter measurement was, on average, 0.4 ± 0.6 mm larger when a calcified plaque was adjacent to the lumen and 0.2 ± 0.9 mm smaller without calcifications present. The 2 approaches to study the influence of calcium show similar results.

Retrospective Inspection
Retrospective analysis of individual cases with large differences revealed that deviations were caused by the semiautomated method due to either incorrect centerlines or lumen segmentation errors because of the presence of calcifications.
In total, 56 of 354 (16%) centerlines were considered incorrect, mainly due to running through calcifications. Figs 4 and 5 show examples in which the centerline runs along the calcifications and skips part of a very tortuous artery. Such deviations were generally accepted in all 4 semiautomatic stenosis measurement runs. In several cases, the semiautomated method overestimated the minimal lumen due to the presence of calcifications (Fig 6).

Discussion
In this retrospective study involving 180 carotid arteries, the semiautomated method demonstrated excellent inter- and intraobserver variability in the ICA stenosis degree measurement and excelled over the manual method. This is in line with previous reports evaluating similar semiautomated methods. 7,9,10,12 The accuracy of detecting a significant stenosis compared with manual measurements was good. However, for several cases, the semiautomatic measurements resulted in large (and consistent) deviations due to incorrect centerline detection and incorrect lumen segmentation. These deviations were mainly caused by the presence of calcified plaques.
For patient selection for carotid endarterectomy, an accurate and reproducible measurement of the degree of ICA stenosis is of paramount importance. 1,3 CTA has been shown to be a reliable noninvasive imaging technique in the estimation of degree of stenosis. 4 Several automated software solutions have become available, with the potential to ease stenosis measurement, shorten the evaluation time, improve precision, and reduce interobserver variations.
The accuracy of a semiautomated method has not sufficiently been tested: Zhang et al 7 compared an automated stenosis-degree method with and without additional manual interaction with rotational DSA for 31 patients. The correlation of a semiautomated stenosis assessment with MRA of 56 ICAs was performed by Hackländer et al. 13 Scherl et al 12 validated their approach on a small set of 10 ICAs. The accuracy of a semiautomated method was assessed by the comparison with consensus reading using axial and curved multiplanar plane reformatting images of 46 patients by Bucek et al. 9 Wintermark et al 10 used a larger patient group of 125 patients for their validation, but because most of the patients in this study had no neurologic indications, a large part of their patient group showed no significant stenosis. White et al 11 assessed the reproducibility of a semiautomated method in 81 ICAs.
In our study, we compared the semiautomated measurement with the manual measurement as reference. In particular, the specificity was good but the sensitivity was rather low, with values around 65%-75%. The combined accuracy was 87% and 82% for detecting a significant stenosis with a cutoff value of 70% and 50%, respectively. This means that, based upon the semiautomated method, a different treatment would have been chosen for approximately 15% of the patients. In most cases the semiautomated method underestimates the stenosis degree.
There are several explanations of why a semiautomated method results in improved reproducibility. First, the manual generation of a plane perpendicular to the course of the vessel is tedious, requiring the manual generation of perpendicular MPRs, which is a cause of variation. The generation of a central lumen line with the semiautomated method is straightforward and insensitive to the positioning of control points by the user. This results in a consistent generation of a straightened vessel view, with perpendicular planes with little variation. Furthermore, due to partial volume effects, a high-contrast edge, such as the boundary between lumen and vessel wall, is imaged as a smooth transition with diminishing intensity. For such a smooth transition, it is difficult to pinpoint which gray value represents the lumen boundary. The lumen boundary assessment depends on the observer's interpretation, scanner settings, and window-level settings. An automated method follows well-defined rules and is insensitive to these variations, resulting in a more reproducible result. Finally, in general, automated methods give an instant overview of the vessel dimensions along its course. This eases the selection of the site of minimal diameter and reduces the variation occurring in manual methods, in which the site of maximal stenosis has to be estimated first.
The retrospective inspection of arteries with large differences showed that the semiautomated method was frequently incorrect due to erroneous centerline generation and artifacts caused by calcifications at the site of maximal stenosis. Remarkably, some erroneous semiautomated analyses were repeatedly made. A number of incorrect centerlines were accepted for all 4 semiautomated measurement series. This resulted in a consistent, but incorrect, measurement. Apparently, the semiautomated method tempts an observer to accept the proposed measurements as true and makes the radiologist less "aware."
The excellent observer agreement therefore also has a downside: The semiautomated measurement leads to less critical evaluation of the image data. In this study, the semiautomated method resulted in different clinical decisions for approximately 15% of the patients. If a semiautomated method becomes standard in a clinical setting, additional training will be required so that observers are more vigilant in detecting these errors.
In previous studies on carotid stenosis assessment with CTA, calcifications have been found responsible for hampering the measurement of degree of stenosis. 19 In several cases, we have seen that presence of calcifications resulted in incorrect lumen contours by the software, leading to an overestimation of the minimal diameter. For many cases, it was observed that the presence of calcium resulted in an overestimation of the minimal diameter and thus underestimation of the degree of stenosis. However, the quantitative analysis indicated that this difference is not significant.
Table note: The values are given for average semiautomated measurements and for the corrected reference diameter. All differences between these 2 approaches were not significant. NPV indicates negative predictive value; PPV, positive predictive value.
Table note: All differences between the semiautomated and manual measurement were statistically significant.
Variations of the reference diameter, as used in the stenosis-degree calculation, lead to differences in the degree of stenosis. The NASCET criterion requires a diameter at the reference position that is measured in the same obliquity as the minimal diameter. This approach is difficult to perform in the manual measurement, because MPRs perpendicular to the course of the artery are used, and in the semiautomated method, in which only the minimal and maximal diameters are given. When the lumen area is not perfectly circular, by definition, the area has no diameter, and we can only have intersections rather than diameters. This means that intersections at different positions differ in length. To our knowledge, there is no definition of the reference "diameter" for a luminal cross-section that is not perfectly circular. Because the reference position should be taken at a healthy part of the vessel, it is expected that there is not much variation of the cross-sections.
However, our study shows that the choice of reference diameter has a significant effect on the stenosis measurement. The semiautomated method uses the minimal cross-section of the lumen area as the reference diameter. This was significantly smaller than the reference diameters obtained by the manual method. The semiautomated stenosis measurement, in which the maximum intersection at the reference position was used for the degree of stenosis calculation, agreed better with manual measurements. Using either an average diameter or an intersection in the same obliquity as the minimal diameter measurement may be more accurate alternatives; however, the study of an optimal reference measurement is beyond the scope of this paper.
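The direction of this effect follows from the stenosis formula itself. As a brief sketch, assuming the NASCET-style definition and using symbols introduced here only for illustration (d_s for the diameter at the stenosis, d_r^min and d_r^max for the minimal and maximal intersections at the reference position):

```latex
S(d_r) = \left(1 - \frac{d_s}{d_r}\right) \times 100\%,
\qquad
S\!\left(d_r^{\max}\right) - S\!\left(d_r^{\min}\right)
  = 100\% \cdot d_s \left(\frac{1}{d_r^{\min}} - \frac{1}{d_r^{\max}}\right) \;\ge\; 0
\quad \text{for } 0 < d_r^{\min} \le d_r^{\max}.
```

For the same stenosis diameter, the minimal reference intersection therefore can never give a higher degree of stenosis than the maximal one, and the gap grows with the noncircularity of the reference cross-section.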
Stenosis assessment with the semiautomated method is more convenient and approximately 1.5 times faster than the manual method. In some cases, the automatically generated centerline was incorrectly positioned inside calcifications or passed through the external carotid artery. In these cases, the centerline had to be manually corrected, resulting in an increase of analysis time.
The semiautomated method for stenosis measurement may provide a useful approach in a clinical setting; however, as previously pointed out by Zhang et al, 7 manual correction is still required. Furthermore, the software user should be aware of potential errors due to misinterpretation of the software. We are still far from a fully automated method.
We have validated the semiautomated stenosis degree measurement with manual measurement of CTA data as described by Bartlett et al. 16 This stenosis measurement is comparable to the original NASCET criteria, as applied to DSA in the symptomatic carotid surgery trials. Currently, it is well accepted that 3D images by CTA provide more information on the morphology of the stenosis than conventional DSA, and it has been shown that DSA does not always reveal the narrowest residual lumen. 20 Therefore, it is likely that CTA reveals a more precise estimate of the actual degree of stenosis. However, its relation with DSA-based trial results becomes more remote. DSA may not be the "gold standard" with respect to state-of-the-art imaging, but it remains the standard of reference with respect to clinical decision-making based upon what we know from the trials. To separate the contribution of the semiautomated measurement on accuracy, we therefore chose to compare these results with a reference standard based upon the manual measurements.

Conclusions
In this study, a semiautomated ICA stenosis-degree measurement in CTA showed an excellent reproducibility. There was a good correlation of the semiautomated measurements with the manual measurements. For detecting a significant stenosis, the semiautomated method had a high specificity and low sensitivity. The overall diagnostic accuracy was 87% and 82% for cutoff values of 70% and 50% for a significant stenosis. Erroneous measurements of the semiautomated method were associated with presence of calcium near the site of maximal stenosis.

Main Results Summarized
• The intra- and interobserver variability was significantly lower for the semiautomated method than for the manual method.
• The semiautomated method correlated well with the manual method (r = 0.87) but underestimated the degree of stenosis by 6.2% on average.
• The semiautomated measurements overestimated the minimal diameter near calcifications and underestimated the minimal diameter in the absence of calcifications.
• 16% of the used centerlines were incorrect, mainly due to the presence of calcifications.
• The automated method was more convenient to use and approximately 1.5 times as fast as the manual method.
• This semiautomated stenosis approach for the measurement of the degree of stenosis is very promising but not yet suitable for incautious use in daily clinical practice.
Disclosures: Lucas Smagge-Research Support (including provision of equipment or materials): Vital Images, Details: We were able to borrow a laptop from Vital Images with Vitrea Software on it to perform measurements; the laptop was returned to Vital Images after measurements were completed. Hugo A. Gratama van Andel-Other Financial Relationships: Employee of MILabs, Details: During my PhD research at the AMC I contributed to this study. Currently I am employed by MILabs, a company that develops small animal scanners (PET CT SPECT).