Ultrasound-Based Diagnostic Classification for Solid and Partially Cystic Thyroid Nodules

BACKGROUND AND PURPOSE: The ability of US to differentiate benign thyroid nodules from malignant ones is still a matter of debate. The aim of this study was to assess the diagnostic efficacy of a US-based classification system for solid and partially cystic thyroid nodules (PCTNs) through a prospectively designed study. MATERIALS AND METHODS: We studied 1289 thyroid nodules in 1036 patients who underwent thyroid US, US-FNA, and thyroid surgery. Each thyroid nodule was prospectively classified into 1 of 5 diagnostic categories following real-time US examination: benign, probably benign, borderline, possibly malignant, and malignant. Solid nodules were classified by using all 5 categories, and PCTNs were classified by using all except the borderline category. We calculated the diagnostic efficacy of thyroid US by comparing US diagnoses with histopathologic results of surgically resected thyroid nodules. RESULTS: One thousand fifty-five solid nodules and 234 PCTNs were prospectively classified as benign (n = 435 and 179), probably benign (n = 213 and 25), borderline (n = 94 and 0), possibly malignant (n = 115 and 15), and malignant (n = 198 and 15), respectively. Of these 1289 nodules, 505 were surgically resected and confirmed by pathology (191 benign and 314 malignant nodules); 44 of the resected solid nodules were in the borderline category. For solid nodules and PCTNs, the sensitivity, specificity, positive and negative predictive values, and accuracy of US diagnosis were 86.1% and 66.7%, 90.0% and 88.9%, 94.3% and 75.0%, 77.3% and 84.2%, and 87.5% and 81.5%, respectively, based on 505 surgical specimens and excluding the 44 solid borderline nodules. CONCLUSIONS: Our US-based classification system can provide helpful guidance for the management of thyroid nodules.

High-resolution thyroid US has been widely used in the evaluation of thyroid nodules, resulting in the establishment and general acceptance of certain characteristics that mark benign and malignant thyroid nodules. 1-10 On the basis of previous studies, US findings suggestive of a solid malignant nodule include marked hypoechogenicity, spiculated margin, microcalcifications, and a taller-than-wide shape. 1-8,10,11 Malignant US findings of a partially cystic thyroid nodule (PCTN) are considered different from those of a solid nodule and include an eccentric configuration with an acute angle, microcalcifications, macrolobulation or irregularity of the free margin, perinodular infiltration, and a centripetal vascularity in the pedicle. 9,17 Several researchers have suggested that associated cervical lymphadenopathy with intranodal cystic components or microcalcifications should be added to this list as one of the malignant US features of both solid nodules and PCTNs. 13,17-19 This study is not the first to attempt to use a US-based diagnosis or classification as the sole method to assess or manage thyroid nodules. 1,4,17 To the best of our knowledge, however, no other study has used separate classification systems for solid nodules and PCTNs. In this study, we assessed the efficacy and feasibility of using a US-based classification system as a diagnostic technique to predict whether solid or partially cystic thyroid nodules are malignant or benign.

Patients
From January 2008 to December 2009, 1036 patients (876 women and 160 men; mean age, 49.0 ± 12.0 years) who underwent thyroid US were enrolled in our study. Each underwent US-FNA for ≥1 thyroid nodule ≥5 mm in the largest diameter. We obtained informed written consent from all patients, and the study was approved by the institutional review board.

Thyroid US
Real-time thyroid US was performed by an experienced radiologist by using a high-resolution sonographic instrument (iU 22; Philips Healthcare, Bothell, Washington) equipped with a 12- to 5-MHz linear probe. Thyroid nodules were placed in 1 of 2 categories: solid thyroid nodules, defined as purely solid or as predominantly solid with any cystic component accounting for <10% of the total volume; and PCTNs, defined as thyroid nodules with a cystic component accounting for ≥10% of the total volume.
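The 10% volume threshold above can be written as a one-line rule. The following is a minimal sketch, not the authors' software: the function name `nodule_type` and the representation of the cystic component as a fraction between 0 and 1 are assumptions made for illustration, and the text does not state how a purely cystic nodule would be handled.

```python
def nodule_type(cystic_fraction: float) -> str:
    """Return 'solid' or 'PCTN' from the cystic share of nodule volume (0.0 to 1.0).

    Hypothetical helper: the name and the fractional input are illustrative only.
    """
    if not 0.0 <= cystic_fraction <= 1.0:
        raise ValueError("cystic_fraction must be between 0 and 1")
    # A cystic component below 10% of total volume counts as solid;
    # 10% or more makes the nodule partially cystic (PCTN).
    return "solid" if cystic_fraction < 0.10 else "PCTN"


if __name__ == "__main__":
    for fraction in (0.0, 0.05, 0.10, 0.40):
        print(f"cystic fraction {fraction:.2f}: {nodule_type(fraction)}")
```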
For solid thyroid nodules, the US features that we used to indicate benignancy included an ovoid shape, isoechogenicity, a smooth margin, and peripheral vascularity. The US features of solid thyroid nodules that we termed borderline included hypoechogenicity; centrally predominant vascularity; and macrocalcifications, such as eggshell calcification and intranodular macrocalcifications, the latter defined as a nodule with macrocalcifications diffusely scattered over one-third of the entire nodule volume. Solid thyroid nodules diagnosed as malignant were characterized by marked hypoechogenicity, a spiculated margin, microcalcifications, taller-than-wide shape, and associated cervical lymphadenopathy with intranodal cystic components or microcalcifications. For PCTNs, the US features of a benign nodule included a configuration that was either concentric or eccentric with a blunt angle, a smooth free margin, peripheral or no vascularity, a spongiform appearance or daughter cysts in the solid component, intranodular comet-tail artifacts, and isoechogenicity. The US features of a malignant PCTN included an eccentric configuration of the main solid or cystic component with an acute angle, microcalcifications, macrolobulation or irregularity of the free margin, perinodular infiltration, a centripetal vascularity in the pedicle, and associated cervical lymphadenopathy showing intranodal cystic components or microcalcifications. The associated cervical lymphadenopathy criterion was applied only to a dominant thyroid nodule, defined as the thyroid nodule most likely to be malignant among all the nodules in a given patient on thyroid US.

Nodule Classification
Different criteria for US-based diagnosis of thyroid nodules were defined for the type of nodule (solid or partially cystic) (Fig 1). The US-based diagnostic criteria for solid thyroid nodules were as follows:

US-FNA and Cytologic Analysis
US-FNA was performed immediately after thyroid US examination by the same radiologist. All 1036 patients underwent US-FNA; collectively, they had 1289 nodules (nodule size range, 0.5-9.8 cm; mean size, 1.5 cm). We used the criterion recommended by the Korean Society of Thyroid Radiology to determine which nodules were eligible for US-FNA examination (largest diameter, ≥5 mm) rather than the American Thyroid Association guidelines (largest diameter, ≥10 mm). 13 For each sample, smears were prepared on 4-6 slides, fixed in 95% ethanol, and examined after Papanicolaou staining. For PCTNs, cytologic analysis was performed on the remaining aspirate in the syringe.
The cytologic analysis was categorized as follows: 1) inadequate (Bethesda class I), 2) benign (Bethesda class II), 3) indeterminate (Bethesda class III), 4) follicular neoplasm (Bethesda class IV), 5) suspicious for malignancy (Bethesda class V), and 6) positive for malignancy (Bethesda class VI). The cytologic results were deemed inadequate when fewer than 6 clusters of thyroid follicular cells containing no identifiable colloid were observed in a preparation. Nodular goiter, nodular goiter with hyperplastic nodules, colloid nodules, cyst contents with or without benign follicular cells, and lymphocytic thyroiditis were classified as benign cytology. Cytology was termed "indeterminate" in the specimens with atypical cells or follicular cells of undetermined significance. Cellular specimens with abundant follicular cells arranged in a microfollicular pattern with little or no colloid or cellular specimens with a predominant Hurthle cell population were reported as follicular neoplasms. Specimens were considered suspicious for malignancy if they demonstrated features of a malignant neoplasm that were quantitatively or qualitatively insufficient to make a definite diagnosis of malignancy. Specimens showing abundant cells with malignant cytologic features were positive for malignancy.

Statistical Analysis
Thyroid nodules that were diagnosed as benign or probably benign were classified as negative (benign), while those that were diagnosed as possibly malignant or malignant were classified as positive (malignant). We calculated the sensitivity, specificity, positive and negative predictive values, and the accuracy of US diagnoses for solid nodules and PCTNs by using histopathologic results as the reference standard. The analysis was then repeated with the borderline solid nodules handled in 3 ways (excluded, reclassified as benign, or reclassified as malignant), and the results were compared by using either the McNemar or the χ² test. A P value < .05 was considered statistically significant. Data analyses were performed by using the Statistical Package for the Social Sciences for Windows (Version 17.0; SPSS, Chicago, Illinois).
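The binarization and the 5 diagnostic indices described above follow directly from a 2 × 2 table of US diagnosis against histopathology. The sketch below is illustrative only and is not the SPSS procedure used in the study; the names `binarize` and `Confusion` are hypothetical. The example counts (6, 3, 2, 16) are an assumption: they were chosen because they reproduce the percentages reported for PCTNs in this article, and they are not taken from the article's tables.

```python
from dataclasses import dataclass

# US categories treated as test-negative and test-positive, as described in the text.
NEGATIVE = {"benign", "probably benign"}
POSITIVE = {"possibly malignant", "malignant"}


def binarize(us_category: str):
    """Map a US category to True (positive), False (negative), or None (borderline)."""
    if us_category in POSITIVE:
        return True
    if us_category in NEGATIVE:
        return False
    if us_category == "borderline":
        return None  # handled separately: excluded or reclassified
    raise ValueError(f"unknown US category: {us_category!r}")


@dataclass
class Confusion:
    """2 x 2 table of US diagnosis (test) against histopathology (reference)."""
    tp: int  # US positive, malignant on histopathology
    fn: int  # US negative, malignant on histopathology
    fp: int  # US positive, benign on histopathology
    tn: int  # US negative, benign on histopathology

    def metrics(self) -> dict:
        """Return the 5 indices as percentages rounded to 1 decimal place."""
        total = self.tp + self.fn + self.fp + self.tn
        return {
            "sensitivity": round(100 * self.tp / (self.tp + self.fn), 1),
            "specificity": round(100 * self.tn / (self.tn + self.fp), 1),
            "ppv": round(100 * self.tp / (self.tp + self.fp), 1),
            "npv": round(100 * self.tn / (self.tn + self.fn), 1),
            "accuracy": round(100 * (self.tp + self.tn) / total, 1),
        }


if __name__ == "__main__":
    # Assumed counts, consistent with the PCTN percentages reported in this article.
    pctn = Confusion(tp=6, fn=3, fp=2, tn=16)
    print(pctn.metrics())
```

Keeping the borderline category as `None` rather than forcing it into either class makes the 3 handling options (exclude, count as benign, count as malignant) an explicit choice by the caller.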
After excluding 44 of the resected solid nodules that were diagnosed as borderline, the sensitivity, specificity, positive and negative predictive values, and accuracy of US diagnoses in differentiating malignant from benign nodules were 86.0%, 90.0%, 93.1%, 78.6%, and 86.8%, respectively. Furthermore, after we divided the nonborderline resected nodules into solid and PCTN categories, 451 solid nodules were included in the analysis. The sensitivity, specificity, positive and negative predictive values, and accuracy of US diagnoses in differentiating malignant from benign nodules were 86.1%, 90.0%, 94.3%, 77.3%, and 87.5%, respectively, among solid nodules and 66.7%, 88.9%, 75.0%, 84.2%, and 81.5%, respectively, among the PCTNs (Table 2). The diagnostic efficacy of thyroid US when borderline nodules were excluded did not differ significantly from that when the same nodules were reclassified as malignant (P = .389, McNemar test), but it significantly differed from that obtained when the same nodules with a borderline US diagnosis were reclassified as benign (P = .001, McNemar test). The diagnostic indices of individual US diagnostic classes for the resected nodules are shown in Table 3. The diagnostic accuracy for nodules in SN-US class I, SN-US class II, SN-US class V, PCTN-US class I, and PCTN-US class IV was significantly higher than that for nodules in other US classes.
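For readers unfamiliar with the McNemar test cited above, the following is a minimal sketch of its exact (binomial) form, which depends only on the 2 discordant-pair counts. It is an illustration under stated assumptions: the function name `mcnemar_exact` is hypothetical, the example counts are invented for demonstration, and the article does not state whether the exact or the χ² form of the test was used or how the pairs were constructed.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant-pair counts.

    b: pairs correct under condition A but wrong under condition B
    c: pairs wrong under condition A but correct under condition B
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, each discordant pair falls either way
    # with probability 1/2, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts, for demonstration only.
    print(f"P = {mcnemar_exact(1, 9):.3f}")
```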
The US features of the 44 surgically resected nodules that were classified as borderline by US are listed in Table 4. We found a high incidence of hypoechogenicity (14/44, 31.8%) and eggshell calcification (23/44, 52.3%), as well as a high malignancy rate (65.9%, 29/44). The US features of the 23 nodules with eggshell calcification were retrospectively reviewed. Two eggshell nodules simultaneously showed 2 subfindings: 1 nodule showed interrupted eggshell and thick hypoechoic outer rim and the other nodule showed thickening of eggshell and a thick hypoechoic outer rim.

Discussion
The US criteria that we used in this study for assessment of solid nodules were based on a number of previous studies, 1-8 but different US criteria for PCTNs were predicated on the theory that a malignant PCTN originates from the wall of a thyroid cyst. 20 Some US features that are associated with malignancy in a solid thyroid nodule include marked hypoechogenicity, a spiculated margin, microcalcifications, taller-than-wide shape, and associated cervical lymphadenopathy with intranodal cystic components or microcalcifications, 1-10,18,19 whereas those associated with malignant PCTNs include an eccentric configuration with an acute angle, microcalcifications within a solid component, macrolobulated or irregular free margin of the solid component, perinodular infiltration, a centripetal vascularity in the pedicle, and associated cervical lymphadenopathy with intranodal cystic components or microcalcifications. 9,17-19 In addition, the possibility of malignancy or benignancy for thyroid nodules is considered to be related to the number of malignant or benign US features. 17,21,22 In particular, Kwak et al 21 examined the risk stratification of thyroid malignancy by using US features similar to those used in our study; their study showed that the risk of malignancy increased with an increase in the number of suspicious US features.
Kim et al 1 proposed that US features predicting malignant solid nodules include marked hypoechogenicity, microcalcifications, microlobulated margin, and taller-than-wide shape. They reported that the sensitivity, specificity, positive and negative predictive values, and accuracy of US diagnosis were 93.8%, 66.0%, 56.1%, 95.9%, and 74.8%, respectively, when cytopathologic diagnoses of 155 solid nodules were used as a reference standard. Our results demonstrated that the sensitivity, specificity, positive and negative predictive values, and accuracy of US diagnosis by using only histopathologic diagnoses of 451 solid nodules were 86.1%, 90.0%, 94.3%, 77.3%, and 87.5%, respectively, which suggests that the diagnostic accuracy of US is high. In addition, all diagnostic indices were high regardless of exclusion or inclusion of borderline nodules. We suppose that the high diagnostic efficacy, regardless of the category of the borderline nodules, was related to the high malignancy rate of resected borderline nodules.
Recently, Horvath et al 22 showed that a US-based reporting system improved patient management and cost-effectiveness by helping avoid unnecessary FNA. Only a small percentage of their results (21.0%, 230/1097) were surgically confirmed, however, with histologic analysis by US-guided fine-needle aspiration biopsy being used as the reference standard in most cases. Furthermore, their diagnosis was based on retrospectively reviewed US features from limited US images, rather than real-time US. In contrast, we used real-time US as the diagnostic tool and histopathologic results of resected thyroid nodules as our reference standard. In addition, solid thyroid nodules were prospectively classified into 1 of 5 categories (SN-US classes I-V), and PCTNs were grouped into 1 of 4 categories according to their US features (PCTN-US classes I-IV).
Previously, we demonstrated in prospective studies that a US-based classification with 5 US categories for solid thyroid nodules and 4 US categories for PCTNs has high diagnostic efficacy. 17,23 The present study, however, assessed the diagnostic accuracy of US for thyroid nodules on a larger scale, correlated US diagnoses with histopathologic results, and applied different US classification systems to solid nodules and PCTNs. The diagnostic accuracies for nodules in SN-US class I, SN-US class II, SN-US class III, SN-US class V, PCTN-US class I, and PCTN-US class IV are considerably higher than those for SN-US class IV, PCTN-US class II, and PCTN-US class III. In addition, all of the specificities of individual US diagnostic classes for solid nodules and PCTNs were high, but all of the sensitivities were relatively low. The usefulness of hypoechogenicity, macrocalcifications, and centrally predominant vascularity as characteristics that predict malignancy for solid nodules has been debated. 2-8,11-13,24,25 Nonetheless, we used these characteristics to classify nodules as borderline. The nodules that we classified as borderline had a high incidence of malignancy (65.9%, 29/44). Of 44 borderline nodules, 23 had eggshell calcification, and they showed a high incidence of malignancy (60.9%, 14/23). On the basis of retrospective US image analysis, the 14 malignant nodules with an eggshell calcification had either an interrupted eggshell (10/14, 71.4%) or a thick hypoechoic outer rim (7/14, 50%), which concurs with studies that suggest that these 2 findings predict malignancy in eggshell-calcified nodules. 24,25 Nevertheless, large-scale studies are needed to correctly predict the risk of malignancy in borderline nodules because of the high possibility that the 50 nonresected borderline nodules in this study were benign.
There are several limitations to our study. First, 784 thyroid nodules were not surgically confirmed and were not included in the calculation of the diagnostic efficacy of US diagnosis. In addition, 62 nodules with suspicious cytology were not surgically removed due to patient loss and/or patient refusal of thyroid surgery. Furthermore, there was a high incidence of papillary thyroid carcinoma (95.2%, 299/314), which may introduce a bias because generally accepted malignant US features are not helpful for diagnosing follicular carcinoma. 26,27 Finally, all US evaluations were performed by a single radiologist. Therefore, large-scale multicenter studies are recommended to ensure reproducibility.