Prospective Validation of Two 4D-CT–Based Scoring Systems for Prediction of Multigland Disease in Primary Hyperparathyroidism

BACKGROUND AND PURPOSE: Patients with multigland primary hyperparathyroidism are at higher risk for missed lesions on imaging and failed parathyroidectomy. The purpose of this study was to prospectively validate the ability of previously derived predictive score systems, the composite multigland disease score, and the multiphase multidetector contrast-enhanced CT (4D-CT) composite multigland disease score, to identify patients with a high likelihood of multigland disease. MATERIALS AND METHODS: This was a prospective study of 71 patients with primary hyperparathyroidism who underwent 4D-CT and successful parathyroidectomy. The size and number of lesions identified on 4D-CT, serum calcium levels, and parathyroid hormone levels were collected. A composite multigland disease score was calculated from 4D-CT imaging findings and the Wisconsin Index (the product of the serum calcium and parathyroid hormone levels). A 4D-CT multigland disease score was obtained by using the CT data alone. RESULTS: Twenty-eight patients with multigland disease were compared with 43 patients with single-gland disease. Patients with multigland disease had a significantly smaller lesion size (P < .01) and a higher likelihood of having either ≥2 or 0 lesions identified on 4D-CT (P < .01). Composite multigland disease scores of ≥4, ≥5, and 6 had specificities of 72%, 86%, and 100% for multigland disease, respectively. 4D-CT multigland disease scores of ≥3 and 4 had specificities of 74% and 88%. CONCLUSIONS: Predictive scoring systems based on 4D-CT data, with or without laboratory data, were able to identify a subgroup of patients with a high likelihood of multigland disease in a prospectively accrued population of patients with primary hyperparathyroidism. These scoring systems can aid in surgical planning.

Primary hyperparathyroidism is characterized by excessive parathyroid hormone production resulting in hypercalcemia. Surgical resection remains the only definitive cure. In recent years, preoperative localization of the abnormal parathyroid gland has been routinely performed and plays an integral part in operative guidance. [1][2][3] Most cases of primary hyperparathyroidism are caused by a single parathyroid adenoma; however, 10%-30% of patients are known to have multigland disease (MGD). 4,5 These patients pose a considerable challenge in preoperative localization because the performance of technetium Tc99m sestamibi scanning and sonography is inferior in patients with MGD compared with patients with single-gland disease (SGD). 6,7 The decision to perform minimally invasive parathyroidectomy or 4-gland exploration relies heavily on preoperative imaging findings. Accurate preoperative identification of abnormal parathyroid lesions promotes effective operative planning and patient counseling, and can prevent a failed operation. 3,[8][9][10] For example, reliable preoperative determination of SGD allows a surgeon to perform minimally invasive unilateral parathyroidectomy, which minimizes the incision length and reduces the risk of bilateral recurrent laryngeal nerve injury.
Multiphase multidetector contrast-enhanced CT (4D-CT) has emerged in recent years as a new technique of preoperative localization of an abnormal parathyroid gland. 4D-CT has been shown to successfully localize abnormal parathyroid glands that have been missed by scintigraphy and sonography. Furthermore, multiple studies have shown the superior sensitivity of 4D-CT in identifying abnormal parathyroid glands in patients with both SGD and MGD. [6][7][8] However, the sensitivity of 4D-CT for MGD remains low at 32%-53%, compared with its sensitivity for SGD (88%-93%). [6][7][8] To improve identification of patients with a high likelihood of MGD, Sepahdari et al 11 developed a composite MGD score based on 4D-CT imaging and biochemical data derived from a retrospective review of 155 patients. Variables included in this scoring system were the following: 1) the size of the largest lesion identified on 4D-CT, 2) the number of lesions identified on 4D-CT, and 3) the Wisconsin Index (product of serum calcium and parathyroid hormone levels). Scores of ≥4, ≥5, and 6 had specificities of 81%, 93%, and 98%, respectively, for the identification of MGD. 11 Although promising, this MGD scoring system was derived solely from a retrospective review and therefore was limited by the biases inherent to its study design. The aim of the current study was to evaluate the performance of the MGD scoring system for predicting MGD by applying it to a prospectively accrued patient population.

MATERIALS AND METHODS
Study Subjects
Institutional review board approval was obtained for this study, which was performed with a waiver of informed consent and a waiver of Health Insurance Portability and Accountability Act authorization. All subjects having 4D-CT for primary hyperparathyroidism were accrued in a prospective manner during the 12 months between January 2014 and January 2015 in a single academic institution (University of California Los Angeles). Subjects were identified at the time of the initial scan. Clinical data and CT-derived data were recorded. Periodic chart review was then performed to track clinical outcomes. Subjects who went on to have successful parathyroidectomy, defined as an intraoperative parathyroid hormone (PTH) drop of 50% and/or at least 6 months of postoperative eucalcemia, were included for further analysis.

4D-CT Technique
Imaging was performed on either a 64-detector row scanner (Somatom Definition; Siemens, Erlangen, Germany) or a 256-detector row scanner (Somatom Definition Flash; Siemens). Scanning included noncontrast, arterial phase, and delayed-phase images from the hard palate to the carina. The parameters for all 3 phases were the following: section thickness, 0.6 mm; tube rotation time, 0.5 seconds; pitch factor, 1; FOV, 24 cm; 120 kVp; 230 reference mAs with automated tube current modulation (CARE Dose4D; Siemens). Arterial phase images were obtained 25 seconds following the initiation of a 100- to 120-mL IV bolus of iohexol, 350 mg of iodine/mL, injected through either a 20- or 22-ga antecubital catheter at either 4 or 3 mL/s. The delayed phase was acquired 30 seconds after the arterial phase ended. All images were reconstructed at 1-mm section thickness in the axial, coronal, and sagittal planes and reviewed in the PACS.

Lesion Localization
All parathyroid lesions were classified as correctly or incorrectly localized on 4D-CT by correlating the operative notes with the original radiology reports and by using anatomic landmarks reported in both the operative and radiology reports. Lesions were described by using a system based on the anteroposterior location relative to the course of the recurrent laryngeal nerve. Lesions posterior to the expected course of the nerve were defined as superior parathyroid glands, and those anterior to the recurrent laryngeal nerve were defined as inferior parathyroid glands. The location along a superior-inferior axis was also described with the thyroid isthmus and the lower edge of the thyroid gland as landmarks. Lesions lying outside these typical locations were described as ectopic. Radiology reports were generated by a subspecialty-certified neuroradiologist with 10 years' experience in CT interpretation, including 5 years' experience in interpretation of 4D-CT. Sensitivities for lesion localization were based on these original radiology reports.

Predictors of Multigland Disease
4D-CT imaging and biochemical predictors of MGD were originally proposed by Sepahdari et al 11 on the basis of prior surgical literature. 12,13 4D-CT imaging predictors were the number of lesions identified on the original radiology report and the size of the largest lesion (maximum diameter in any plane). Laboratory data included serum calcium levels (milligram/deciliter), serum PTH levels (picogram/milliliter), and the Wisconsin Index (WIN). The WIN is the product of the serum calcium levels (milligram/deciliter) and PTH levels (picogram/milliliter) and was shown to help discriminate MGD and SGD in prior studies. 11,12 A composite MGD score was derived from variables of lesion size on 4D-CT, the number of prospectively detected lesions on 4D-CT, and the WIN. Each variable contributed up to 2 points to the MGD scores (Table 1). The cutoff values used to assign points in the score were determined in a previous study by Sepahdari et al, 11 on the basis of prior literature for lesion size and ranges of biochemical markers. Maximum lesion sizes of >13 mm, 7-13 mm, and <7 mm were assigned scores of 0, 1, and 2, respectively. A single prospectively identified candidate lesion was assigned a score of zero versus a score of 2 for multiple candidate lesions or no candidate lesion. WINs of >1600, 800-1600, and <800 were assigned scores of 0, 1, and 2, respectively. A second scoring system, the 4D-CT MGD score, was based only on the 4D-CT imaging variables of lesion size and the number of prospectively detected lesions on 4D-CT. The composite MGD score ranged from 0 to 6. The 4D-CT MGD score ranged from 0 to 4. For both scoring systems, a higher score more strongly favored MGD.
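The point assignments described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' software: the function names are hypothetical, the treatment of values that fall exactly on a cutoff (7 mm, 13 mm, 800, 1600) follows the ranges as written in the text, and because the text does not state how lesion size is scored when no candidate lesion is identified, the sketch raises an error in that case rather than guessing.

```python
from typing import Optional


def wisconsin_index(calcium_mg_dl: float, pth_pg_ml: float) -> float:
    """Wisconsin Index (WIN): serum calcium (mg/dL) times PTH (pg/mL)."""
    return calcium_mg_dl * pth_pg_ml


def size_points(max_lesion_mm: Optional[float]) -> int:
    """Points for the largest lesion: >13 mm = 0, 7-13 mm = 1, <7 mm = 2."""
    if max_lesion_mm is None:
        # Assumption: the text does not specify size points when no lesion
        # is seen, so this sketch refuses to score that case.
        raise ValueError("largest-lesion size is required for this sketch")
    if max_lesion_mm > 13:
        return 0
    if max_lesion_mm >= 7:
        return 1
    return 2


def count_points(n_lesions: int) -> int:
    """Points for lesion count: exactly 1 lesion = 0; 0 or >=2 lesions = 2."""
    return 0 if n_lesions == 1 else 2


def win_points(win: float) -> int:
    """Points for the WIN: >1600 = 0, 800-1600 = 1, <800 = 2."""
    if win > 1600:
        return 0
    if win >= 800:
        return 1
    return 2


def ct_mgd_score(n_lesions: int, max_lesion_mm: Optional[float]) -> int:
    """4D-CT MGD score (0-4): imaging variables only."""
    return size_points(max_lesion_mm) + count_points(n_lesions)


def composite_mgd_score(n_lesions: int, max_lesion_mm: Optional[float],
                        calcium_mg_dl: float, pth_pg_ml: float) -> int:
    """Composite MGD score (0-6): imaging variables plus the WIN."""
    win = wisconsin_index(calcium_mg_dl, pth_pg_ml)
    return ct_mgd_score(n_lesions, max_lesion_mm) + win_points(win)


# Hypothetical patient: 2 candidate lesions, largest 6 mm,
# calcium 10.5 mg/dL, PTH 70 pg/mL (WIN = 735).
print(ct_mgd_score(2, 6.0))                     # 4
print(composite_mgd_score(2, 6.0, 10.5, 70.0))  # 6
```

The patient values in the usage lines are invented for illustration and do not come from the study data.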

Data Analysis
The characteristics of MGD were compared with those of SGD for individual variables and the scoring system. The χ² test was used to assess differences in patients with MGD and SGD for categoric data. The Student t test was used to assess differences between MGD and SGD for continuous variables. Receiver operating characteristic analysis was performed to determine the sensitivity and specificity of each feature for predicting MGD. P < .05 was the threshold used for statistical significance for all tests.
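As a rough illustration of the receiver operating characteristic analysis, the sketch below computes sensitivity and specificity at a score cutoff and the area under the curve for an integer-valued score. The score lists are invented, the function names are hypothetical, and the statistical software actually used in the study is not stated in the text. The area is computed from the pairwise (Mann-Whitney) formulation, which equals the trapezoidal area under the empirical curve.

```python
from typing import Sequence, Tuple


def sens_spec(scores_mgd: Sequence[int], scores_sgd: Sequence[int],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= cutoff' predicts MGD."""
    true_pos = sum(s >= cutoff for s in scores_mgd)
    true_neg = sum(s < cutoff for s in scores_sgd)
    return true_pos / len(scores_mgd), true_neg / len(scores_sgd)


def roc_auc(scores_mgd: Sequence[int], scores_sgd: Sequence[int]) -> float:
    """Area under the ROC curve: the fraction of MGD/SGD patient pairs in
    which the MGD patient has the higher score, counting ties as one half."""
    wins = 0.0
    for m in scores_mgd:
        for s in scores_sgd:
            if m > s:
                wins += 1.0
            elif m == s:
                wins += 0.5
    return wins / (len(scores_mgd) * len(scores_sgd))


# Invented composite scores for 5 MGD and 5 SGD patients (not study data).
mgd = [6, 5, 4, 4, 2]
sgd = [0, 1, 2, 3, 4]

for cutoff in (4, 5, 6):
    sens, spec = sens_spec(mgd, sgd, cutoff)
    print(f"cutoff >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"AUC = {roc_auc(mgd, sgd):.2f}")
```

Raising the cutoff trades sensitivity for specificity, which is the pattern reported for the score thresholds in this study.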

RESULTS
Study Subjects
One hundred four patients with primary hyperparathyroidism were imaged with 4D-CT during the study period from January 2014 to January 2015. Twenty-nine patients had no record of parathyroidectomy and therefore were excluded due to an inability to definitively categorize them as having either SGD or MGD on the basis of operative diagnoses. Two had failed parathyroidectomy, 1 had parathyroid carcinoma, and 1 patient died of an unrelated cause before the operation. Failed parathyroidectomies were defined as a lack of persistent eucalcemia for 6 months postoperatively and likely indicate that the abnormal parathyroid adenoma responsible for hyperparathyroidism was not removed in these patients. It is therefore unclear whether these patients had a single abnormal adenoma that was missed or had MGD in which not all of the abnormal adenomas were removed, so these 2 patients were removed from analysis because we could not definitively place them in either the SGD or MGD group. The final analysis group included 71 patients (Table 2). Of these 71 patients, 28 had MGD and 43 had SGD. There were 77 abnormal glands among the 28 patients with MGD; in total, 120 abnormal glands were identified surgically across both groups. There were no statistically significant differences between the SGD and MGD groups with regard to age and sex.
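The cohort numbers above can be cross-checked with simple arithmetic. The short sketch below uses only the counts reported in this section; the one assumption, flagged in the code, is that each patient with SGD contributes exactly 1 abnormal gland, which appears to account for the figure of 120 surgically identified glands.

```python
# Counts reported in the text.
imaged = 104
excluded = {
    "no record of parathyroidectomy": 29,
    "failed parathyroidectomy": 2,
    "parathyroid carcinoma": 1,
    "died before operation": 1,
}
analyzed = imaged - sum(excluded.values())

mgd_patients, sgd_patients = 28, 43
mgd_glands = 77
# Assumption: single-gland disease means exactly 1 abnormal gland per patient.
sgd_glands = sgd_patients * 1
total_glands = mgd_glands + sgd_glands

print(analyzed)                     # 71
print(mgd_patients + sgd_patients)  # 71
print(total_glands)                 # 120
```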

Lesion Localization
Among the 28 patients with MGD, 43 of 77 abnormal glands (56%) were identified preoperatively by using 4D-CT. For 8 of 28 patients (29%), all of the abnormal glands in an individual patient could be identified preoperatively with 4D-CT alone. In 39 of the 43 patients with SGD (91%), the lesion was identified preoperatively with 4D-CT.

Imaging Findings and Biochemical Factors
Patients with MGD had a smaller mean lesion size of 7.5 mm compared with patients with SGD, 11.2 mm (P = .003).
There was also a significant difference between patients with MGD and SGD in the number of lesions identified with 4D-CT (Table 2). A single candidate lesion was identified in 63% of patients with SGD on 4D-CT compared with 21% of those with MGD (P = .003). There was no statistically significant difference in mean calcium, PTH, and the Wisconsin Index between the 2 groups (Table 2).
Although notable differences were observed between MGD and SGD with regard to gland size and the number of lesions identified on 4D-CT, these factors individually did not serve as reliable predictors of MGD. Identification of either multiple or no abnormal glands was only 63% specific for MGD and 79% sensitive (Table 3). A lesion size of ≤7 mm had a specificity of 79% for MGD with a sensitivity of 50%. A WIN of <800 had a 93% specificity for MGD but only a 25% sensitivity.
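The sensitivity and specificity of the lesion-count criterion can be reproduced from patient counts, as in the sketch below. Two of the counts are not stated directly in this section and are back-calculated assumptions: 27 patients with SGD with a single candidate lesion (63% of 43, rounded) and 6 patients with MGD with a single candidate lesion (21% of 28, rounded). The function name is hypothetical.

```python
def percent(numerator: int, denominator: int) -> int:
    """Proportion expressed as a whole-number percentage."""
    return round(100 * numerator / denominator)


mgd_patients, sgd_patients = 28, 43

# Assumption: counts back-calculated from the reported percentages.
sgd_single_lesion = 27  # about 63% of 43
mgd_single_lesion = 6   # about 21% of 28

# Test-positive means 0 or >=2 candidate lesions on 4D-CT.
true_positives = mgd_patients - mgd_single_lesion  # MGD, test-positive
true_negatives = sgd_single_lesion                 # SGD, test-negative

print(percent(true_positives, mgd_patients))  # sensitivity: 79
print(percent(true_negatives, sgd_patients))  # specificity: 63
```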

Performance of Composite MGD and 4D-CT MGD Scores
The mean composite MGD score was significantly higher among patients with MGD at 4.3 compared with those with SGD at 2.7 (P < .001). Similar findings were noted for the 4D-CT MGD score (2.9 versus 1.5, respectively; P < .001) (Table 2). Composite MGD scores of ≥4, ≥5, and 6 had high specificities of 72%, 86%, and 100%, respectively, for multigland disease (Table 4). When applied to the current prospective patient population, the calculated area under the receiver operating characteristic curve was 0.78 (Figure). The 4D-CT MGD scores of ≥3 and 4 were similarly effective in predicting MGD, with specificities of 74% and 88%, respectively (Table 4), and with an area under the receiver operating characteristic curve of 0.78 (Fig 1B).

DISCUSSION
Preoperative localization of abnormal parathyroid glands plays an integral role in surgical planning. 4D-CT has improved localization ability in both MGD and SGD compared with ultrasound and sestamibi scans, but the sensitivity of 4D-CT for MGD remains low at 32%-53% compared with its sensitivity for SGD at 88%-93%. [6][7][8] The composite MGD score and the 4D-CT MGD score systems proposed by Sepahdari et al 11 suggested that a multifactorial system using 4D-CT information, with or without laboratory information, could effectively categorize patients into subsets with a very high likelihood for MGD, a very high likelihood for SGD, or intermediate likelihood of either MGD or SGD.
The major limitation of that study is that it was based on a retrospectively derived sample, without a separate validation cohort. In our current study, we applied the composite MGD and 4D-CT MGD score systems to a prospective cohort and confirmed that these score systems could be used to predict MGD with high specificity. The retrospectively derived MGD score systems by Sepahdari et al 11 performed comparably in our prospective setting. Using a retrospective patient population, Sepahdari et al observed high specificities of 81%, 93%, and 98% in predicting multigland disease for composite MGD scores of ≥4, ≥5, and 6, respectively. In our current prospective study, similarly high specificities of 72%, 86%, and 100% were observed for composite MGD scores of ≥4, ≥5, and 6, respectively. This finding was also true for the 4D-CT score system. 4D-CT scores of ≥3 and 4 yielded specificities of 81% and 96% in the retrospective population, compared with 74% and 88% in the current prospective population. Last, areas under the receiver operating characteristic curves of MGD predictive score systems were also similar between the retrospective and prospective patient cohorts (0.82 and 0.83 for composite and 4D-CT MGD scores in the retrospective cohort compared with 0.78 for both techniques in the prospective cohort). Our prospective cohort validated the retrospectively derived system of Sepahdari et al, achieving results similar to those of the retrospective cohort.
Our study also shows that the number of candidate lesions identified on 4D-CT alone is not sufficient for distinguishing MGD from SGD. In clinical practice, radiologists may rely on detecting multiple glands as the best sign of MGD. However, we showed that identification of ≥2 lesions on 4D-CT is neither sensitive nor specific for MGD. Among patients with SGD, 12 of 43 (28%) were noted to have ≥2 suspected lesions (false-positive), and 6 of 28 (21%) patients with MGD had only 1 suspected lesion (false-negative). These findings illustrate the need for additional data regarding lesion size and biochemical information to better estimate the risk for MGD.
The composite and 4D-CT MGD score systems should serve as evidence-based foundations for communicating certainty and uncertainty in the diagnosis of abnormal parathyroid glands. They should maximize the clinically actionable information available from imaging. These scoring systems are valuable in giving objective probabilities of MGD versus SGD. In clinical practice, a radiologist can advise surgeons that a composite MGD score of ≥5 or a 4D-CT MGD score of 4 is highly likely to represent MGD. Surgeons can then use these objective data to better plan operative approaches and counsel patients accordingly. For instance, given a high preoperative probability of MGD, surgeons may decide to avoid minimally invasive unilateral parathyroidectomy, which risks failing to identify abnormal glands during the operation, and instead opt for a 4-gland exploration to identify multiple abnormal parathyroid glands.
Several limitations exist in our study. Although this study was performed in a prospective manner, it remains a single-institution study with all examinations interpreted by a single neuroradiologist, who also participated in the original retrospective study. The composite MGD predictive score system is largely dependent on the 4D-CT findings, and imaging interpretation may vary in a systematic way among different radiologists. However, previous studies have shown high concordance in 4D-CT interpretations between neuroradiologists, suggesting that assignment of MGD scores is likely to be highly reproducible. 14,15 Last, a patient population presenting to a single institution may introduce inherent bias. Thus, the applicability of the MGD predictive score system should also be prospectively validated in a multi-institution setting. Notably, the previous study by Sepahdari et al, 11 in which the MGD predictive score system was developed, was performed in 2 academic institutions with 2 different interpreting neuroradiologists. Rates of lesion identification were similar between the 2 institutions, and the MGD score system was developed on the basis of patients from both institutions.

CONCLUSIONS
The composite MGD score and 4D-CT MGD score performed well in a prospective setting, identifying patients with a high likelihood of MGD by using 4D-CT data and biochemical information. Use of these scores preoperatively to identify patients at risk for MGD may play an important role in operative planning.