Improving Imaging Diagnosis of Persistent Nodal Metastases after Definitive Therapy for Oropharyngeal Carcinoma: Specific Signs for CT and Best Performance of Combined Criteria

BACKGROUND AND PURPOSE: Criteria for detection of persistent nodal metastases in treated oropharyngeal tumors are sensitive but nonspecific, leading to unnecessary nodal dissections. Developing specific imaging criteria for persistent nodal metastases could improve diagnosis while decreasing patient morbidity. MATERIALS AND METHODS: Patients with oropharyngeal squamous cell carcinoma with nodal metastases treated by definitive radiation therapy and subsequent nodal dissection were retrospectively evaluated. One hundred thirty-eight patients had pre- and posttherapy contrast-enhanced CTs evaluated by radiologists blinded to the status of pathologically proved hemineck persistent nodal metastases. Composite scoring criteria (CSC) for CT, combined from individual parameters, were compared with radiologists' opinions, previous multiparameter criteria, and outcome data. RESULTS: New low-attenuation areas and a lack of size change (<20% decrease in cross-sectional area) were both highly specific for persistent nodal metastases (99%; P = .0004). Extranodal disease on pretherapy imaging was moderately specific (86%; P = .001). The CSC correctly placed 29 patients in a low-risk category compared with 14 by previously reported criteria and radiologist reports. With good second-rater reliability, the CSC cutoff values stratified patients at highest risk of persistent nodal metastases, thereby improving specificity while maintaining sensitivity. CONCLUSIONS: Comparing pre- and posttherapy examinations improves specificity by discriminating focal findings and size change compared with a single time point. The CSC can categorize the risk of persistent nodal metastases more accurately than previous CT methods. This finding has the potential to improve resource use and reduce surgical morbidity.

persistent nodal metastases. Unlike other sites for head and neck squamous cell carcinoma, oropharyngeal cancer prevalence is rising with increasing human papillomavirus (HPV) rates [9][10][11]; these HPV-associated cancers also show an improved response to nonsurgical treatment compared with non-HPV-associated cancers. 10,11 This trend is likely to further increase the rate of neck dissections with negative findings compared with historical series. As a result, unnecessary patient morbidity associated with postradiation neck dissection 8 is likely to increase in coming years.
With improvement in imaging modalities, there has been a change of practice from obligate nodal dissection after definitive therapy to observation for patients with complete response to treatment by clinical and imaging criteria. [12][13][14] Multiparameter contrast-enhanced CT criteria 14,15 can safely place some patients on imaging follow-up, thereby avoiding a nodal dissection with negative findings. Because of low specificity, however, many patients still undergo surgery for equivocal imaging findings, underscoring the need for refinements in posttherapy imaging criteria to more accurately define treatment response. The purpose of this study was to determine whether CT imaging features and multiparameter criteria can improve specificity while maintaining sensitivity, to safely reduce the number of node-negative dissections performed.

Clinical
After approval by our institutional review board, we used our clinical database to identify patients with nodal metastases from oropharyngeal squamous cell carcinoma treated with definitive radiation therapy, with or without chemotherapy, who underwent subsequent nodal dissection between 2000 and 2010. Pre- and posttherapy contrast-enhanced CT scans were available in 138 patients, obtained <180 days after radiation therapy completion to determine persistent, rather than recurrent, nodal metastases. CT was performed an average of 49 ± 17 days after completion of radiation therapy with only 6 CT scans not obtained between 30 and 90 days. Patients were clinically followed an average of 4.6 ± 2.0 years after dissection, with 1 perioperative mortality and 2 patients lost to follow-up before 180 days.
Whether pathologically-proved viable persistent tumor was demonstrated in each hemineck by nodal dissection was recorded, as well as the size, number, and position by nodal station. Viable tumor was determined from the pathologist report, usually from an area of non-necrotic tumor with possible mitoses.
Patients were predominately middle-aged (55 ± 9 years of age; men, 88%), with stage 3 and 4 oropharyngeal tumors, preoperative nodal metastases (stage N2A-C in 83%), and no distant metastases. Tobacco use was common (67%). The most commonly involved oropharynx sites were the base of tongue and the palatine tonsil. Concurrent chemotherapy was common (62%) in addition to definitive radiation treatment (~70 Gy), while induction chemotherapy was less common (25%). Of 138 patients, 22 (14%) were pN+ within 54 of 1958 dissected lymph nodes (3%). The reasons for nodal dissection were diverse, including persistent primary tumor and planned neck dissection, though the most common reason was concern about abnormal nodal tissue on imaging (62%).
A fellowship-trained neuroradiologist (J.D.H.) recorded the imaging parameters (see below). Most CT parameters, including all pN+ cases and a random selection of pN− cases, were independently re-evaluated by a senior member of the American Society of Neuroradiology (L.E.G., S.A., or A.J.K.). This re-evaluation was performed to test how well a practicing neuroradiologist could reproduce these methods with little training in the methodologies (<5 cases). All radiologists were blinded to the pathology results during reading.
During this study, CT technology improved so that section thickness decreased from 5-7.5 mm to our current standard of 1.25-mm-section thickness and 25-cm FOV. Most patients had a 120-mL iohexol (Omnipaque; GE Healthcare, Princeton, New Jersey) contrast bolus injected at 3-mL/s with a 90-second delay on an Excite scanner (GE Healthcare, Milwaukee, Wisconsin). Scans were displayed with a window width and level of 300 and 70.

Individual Parameters
The largest lymph node on pretreatment axial images and any node with new focal findings or growth on posttreatment images had bidimensional measurements obtained on axial pre- and posttreatment images. These measurements were used to assess the following: 1) estimation of the change in cross-sectional areas; 2) modified Response Evaluation Criteria in Solid Tumors (RECIST, Version 1.1) criteria with "complete response" defined as all lymph nodes <1 cm in the short axis 16 ; and 3) previously reported criteria, including a maximal axial diameter decrease by ≥50%. 17 The presence of focal findings and changes between pre- and posttherapy imaging for cervical lymph nodes abnormal on pretherapy CT or newly abnormal on posttherapy CT was recorded as positive, equivocal, or negative and new or old. Examples of these various parameters are given in Fig 1. Cystic nodes had water attenuation. Necrosis was defined as attenuation lower than that in muscle but higher than that in water. A working definition of "difference" in attenuation was ~20 HU. Cystic/necrotic areas were differentiated from a fatty hilum on the basis of attenuation and spatial location (ie, only a single fatty hilum presumed). Enhancement was graded as positive (more than muscle attenuation), negative (equal to or less than muscle), or equivocal. Enhancement was considered focal if there were both enhancing and nonenhancing portions in the same lymph node. Because focal enhancement is dependent on technique to differentially attenuate tissues, suboptimal examinations were defined as section thickness >2.5 mm because of volume averaging, poor enhancement of tumor and regional tissues due to bolus timing, and/or patient motion in the area of interest. Ring enhancement was defined as at least 270° of enhancement with a center of low attenuation.
Performance for ring enhancement was calculated either for optimally performed examinations only or with equivocal cases (eg, 180° to 270° of faint enhancement) counted as positive on a suboptimal study. Extranodal disease was determined on pretherapy imaging because loss of fat planes, cortical irregularity, and/or fat stranding is common after radiation. An example of an equivocal case is a 90°-180° abutment of the sternocleidomastoid muscle without an intervening fat plane or associated fat stranding. Ring enhancement was the only criterion that was generated post hoc (ie, after J.D.H. knew the pathology results).
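The size measurements described above can be illustrated with a short sketch. It assumes the cross-sectional area is estimated as the product of the two axial diameters; the text does not state the exact formula, and an ellipse estimate would give the same percent change. The function names and the example measurements are hypothetical, and the 20% threshold is the one reported in the Results.

```python
def area_estimate(long_axis_mm: float, short_axis_mm: float) -> float:
    """Bidimensional product used as a proxy for cross-sectional area (assumption)."""
    return long_axis_mm * short_axis_mm


def percent_decrease(pre: float, post: float) -> float:
    """Percent decrease from pre- to posttherapy; a negative value means growth."""
    if pre <= 0:
        raise ValueError("pretherapy measurement must be positive")
    return 100.0 * (pre - post) / pre


def lacks_size_change(pre_axes, post_axes, threshold_pct: float = 20.0) -> bool:
    """True when the estimated area decreased by less than threshold_pct (or grew)."""
    pre_area = area_estimate(*pre_axes)
    post_area = area_estimate(*post_axes)
    return percent_decrease(pre_area, post_area) < threshold_pct


# Hypothetical node: 30 x 20 mm before therapy, 28 x 19 mm after therapy.
print(lacks_size_change((30.0, 20.0), (28.0, 19.0)))
```

Because both diameters enter the product, a modest shrinkage in each axis compounds, which is one reason area change and single-diameter change can classify the same node differently.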

Multiparameter Criteria and Statistics
The single parameters were combined to determine whether this combination would improve performance. The individual nodal findings and 2 measurement sets were compiled for each patient's hemineck among the affected lymph nodes because not all of the findings for a patient were demonstrated in a single lymph node. For example, 1 lymph node may have increased in size while a different one demonstrated new necrosis. Multiple combinations of parameters were tested to yield the best performance. The new criteria were then compared with the performance of the CT criteria of Ojiri et al, 15 defining complete response as the following: 1) maximal axial dimension of any lymph node of <1.5 cm; 2) no internal focal low-attenuation or calcification; and 3) no evidence of extracapsular spread. A comparison was also made with the radiologists' original reports and the clinical outcomes. The reports were interpreted as negative, equivocal, or positive for persistent nodal metastases, with only negative reports indicating lack of persistent nodal metastasis without further diagnostic steps required.
[Figure caption, beginning missing] ... shows an enlarged lymph node with inhomogeneous enhancement. Although there is a decrease in size and a small absolute size, the remnant lymph node demonstrates new necrosis with ring enhancement (arrow). C and D, Slight increase in size but a small absolute size and nonenhancing areas that are not definitely lower attenuation than muscle (arrow). E and F, Extranodal disease with loss of fat planes and soft-tissue stranding on pretherapy (arrow in E), with residual necrosis and subtle ring enhancement (arrow in F). G and H, Loss of fat plane with the sternocleidomastoid (arrow in G) but without fat stranding is equivocal for extranodal disease. After therapy, there is a partial rim of enhancement, which is equivocal for ring enhancement (arrow in H) with persistent necrosis.
Sensitivity, specificity, and predictive values for new and previously reported single parameters and combined criteria were calculated. Fisher exact 2-tailed tests were used to detect differences between the lymph nodes from pN+ and pN− patients and the prevalence of nodal tumor between the first and second halves of the trial. Significance was defined as P = .05, with a corrected value of P = .002 for multiple correlations by Bonferroni correction. κ statistics were performed for second-rater reliability.
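As a reminder of how these performance measures are defined, the following minimal sketch computes them from a 2 x 2 table of imaging result against pathology. The counts are hypothetical and are not taken from this study. The Bonferroni helper divides the significance level by the number of comparisons; the value of 25 comparisons is an assumption chosen because it reproduces the corrected threshold of .002, and the text does not state the actual count.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # imaging positive, pathology positive (pN+)
    fp: int  # imaging positive, pathology negative (pN-)
    fn: int  # imaging negative, pathology positive (pN+)
    tn: int  # imaging negative, pathology negative (pN-)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons


# Hypothetical counts, for illustration only.
table = Confusion(tp=18, fp=30, fn=2, tn=50)
print(round(table.sensitivity(), 3), round(table.specificity(), 3))
print(round(table.ppv(), 3), round(table.npv(), 3))
print(round(bonferroni_alpha(0.05, 25), 4))  # assumed 25 comparisons
```

With a low prevalence of pN+ disease, a parameter can be highly sensitive and still have a low positive predictive value, which is the pattern the Results describe for focal abnormality.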

Single Parameter
Performance of imaging parameters and multiparameter criteria is given in the On-line Table. Focal abnormality was the most sensitive parameter (98%) but was common in lymph nodes regardless of pathology status, with positive and negative predictive values of 19% and 88%, respectively. Also sensitive (84%-88%) were ring enhancement on an optimal examination (P = .0006), previously reported criteria of ≤50% decrease in maximal axial dimension (P = .02), 17 and the presence of a low-attenuation area (old or new) on the posttreatment examination (P = .08). The absolute size did not correlate well with pN+, contrary to many current criteria, 4,15,16 and a >80% diameter decrease 4 only applied to a single pN− patient in this study.
The most specific single CT parameters were new low-attenuation necrotic/cystic areas (99%; P = .0004) and lack of size change in bidimensional cross-sectional areas (<20% decrease; 96%; P = .01), followed by extranodal disease (86%; P = .001) and new calcification (86%, not significant). Specificity for cross-sectional changes further improved (99%) if only larger pathologic-appearing lymph nodes were used to decrease measurement error. The cross-sectional area had less decrease in pN+ than in pN− patients (unpaired t test, P = .0004). While neither sensitive nor specific, the presence of focal enhancement on postoperative imaging was more common in pN− than pN+ patients (P = .001). Patients having undergone induction chemotherapy were slightly more likely to be pN+ (P = .03) and have new necrosis (P = .03) than those without induction. There was a nonsignificant trend toward increased prevalence of pN+ from 16% for the first 5 years compared with 25% for the second.

Multiparameter Criteria
Analysis of multiple combinations of parameters supported the utility of point-based composite scoring criteria. One point was given for the following: 1) any necrosis or cyst on posttherapy examination, 2) definite ring-enhancing lesion or equivocal on a suboptimal examination, 3) ≤50% decrease in the maximum axial dimension for any affected lymph nodes that were ≥1.5 cm in diameter on pretherapy CT, 4) new partial calcification, or 5) unequivocal extranodal disease on pretherapy CT. Two additional points were given if the necrosis was new (total of 3 points). An example of scoring is given in Fig 2. Cross-sectional area was not used because this would be a cumbersome calculation to perform in daily clinical practice. New calcification had the weakest correlation with nodal status but improved performance of the CSC, whereas adding lack of focal enhancement on posttherapy CT did not significantly improve performance.
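The point assignment listed above can be written out as a short sketch. The class and field names are hypothetical, and the sketch simplifies the diameter rule to a single measured node, whereas the criteria apply it to any affected node in the hemineck. It is an illustration of the scoring arithmetic, not a validated clinical tool.

```python
from dataclasses import dataclass


@dataclass
class HemineckFindings:
    necrosis_or_cyst_post: bool       # any necrosis or cyst on posttherapy CT
    necrosis_is_new: bool             # necrosis not present on pretherapy CT
    ring_enhancement: str             # "definite", "equivocal", or "none"
    suboptimal_exam: bool             # eg, section thickness > 2.5 mm
    max_diameter_decrease_pct: float  # decrease in maximum axial dimension
    pretherapy_diameter_cm: float     # diameter of the measured node
    new_partial_calcification: bool
    unequivocal_extranodal_pre: bool  # extranodal disease on pretherapy CT


def csc_score(f: HemineckFindings) -> int:
    """Composite score: 1 point per item, plus 2 extra points for new necrosis."""
    score = 0
    if f.necrosis_or_cyst_post:
        score += 1
        if f.necrosis_is_new:
            score += 2  # new necrosis totals 3 points
    ring_positive = f.ring_enhancement == "definite" or (
        f.ring_enhancement == "equivocal" and f.suboptimal_exam
    )
    if ring_positive:
        score += 1
    if f.pretherapy_diameter_cm >= 1.5 and f.max_diameter_decrease_pct <= 50.0:
        score += 1
    if f.new_partial_calcification:
        score += 1
    if f.unequivocal_extranodal_pre:
        score += 1
    return score


# Hypothetical hemineck with every item present except new calcification.
example = HemineckFindings(True, True, "definite", False, 30.0, 2.4, False, True)
print(csc_score(example))
```

Under this arithmetic the maximum possible score is 7, and a hemineck with no qualifying finding scores 0.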

Second-Reader Analysis
A second reader evaluated 73 patient examinations (53% of total available), including all pN+ cases. The same lymph nodes were chosen by both readers 88% of the time (κ = 0.87). Of these lymph nodes, size measurements were within 5 mm for 90% and 2 mm for 69% of cases. Focal-finding data had 82% agreement (κ = 0.74), with disagreement mostly due to non-CSC items such as the presence of enhancement. The agreement on the categoric data for the CSC was 90% (κ = 0.86), including ring enhancement assessed by second readers blinded to pathology. The second reader's CSC would have had 31 additional points toward the pathologic diagnosis and 42 away from it, compared with the first reader.
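Agreement between two readers on categoric findings is commonly summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below assumes that this is the statistic reported here; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same nonempty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of new necrosis in four heminecks.
reader_1 = ["pos", "pos", "neg", "neg"]
reader_2 = ["pos", "neg", "neg", "neg"]
print(cohen_kappa(reader_1, reader_2))
```

In this toy example the raters agree on 3 of 4 items (75%), but half of that agreement is expected by chance, so kappa is lower than the raw percentage.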

DISCUSSION
To summarize, the most sensitive individual CT parameters for persistent nodal metastases after definitive radiation therapy were the presence of focal findings (especially low attenuation), >50% change in maximum diameter, 17 and ring enhancement. The most specific findings were new low attenuation and a lack of decrease in the cross-sectional area (especially in larger lymph nodes), emphasizing the need for comparison studies. Because most of these patients with pathologically proved nodal dissection were referred because of abnormal posttherapy imaging findings, most previously reported criteria did not perform well, given that clear-cut negative cases were absent. Most of these patients were pN−, thus underscoring the need for better diagnostic accuracy to avoid unnecessary surgery.
Focal nodal findings were common despite the low rate of pNϩ, reducing the specificity of these findings compared with other reports. 4,14,15 Focal findings may have been common because of multiple factors including: 1) the patient selection bias, with indeterminate initial imaging findings; 2) improvement in CT spatial resolution (increased sensitivity, lower specificity); 3) HPV-associated oropharyngeal squamous cell carcinoma increasing in prevalence and commonly showing necrosis or cyst formation, 18 which may not fully resolve after treatment 19 ; 4) nonenhancing tissue in lymph nodes that may be difficult to discern from true low-attenuation necrosis, especially if small; and 5) artifacts such as beam-hardening that may mimic focal nodal changes. Additionally, homogeneous enhancement of normal nodes was a common finding postradiation, but inhomogeneous enhancement may still be helpful in identifying pathologic lymph nodes pretherapy (Fig 1A). Many of the dystrophic calcifications were in small portions of the node, which may have represented heterogeneity in the nodal response to therapy.
The proposed CSC outperformed prior CT criteria 14,15 and expert radiologists' reads, while the second-reader analysis showed that the CSC have good reproducibility for experienced readers with little training. CSC also offer the advantage of having variable performance depending on the cutoff value used, so that patients can be stratified to pN+ risk. In an era in which cost-efficient medicine is increasingly important and the incidence of HPV-related oropharyngeal cancer is expected to rise, without additional testing, the new CSC could potentially reduce cost. CSC would send 29 patients with scores of 0-1 to monitoring compared with 14 by the other methods, without a pN+ patient included. CSC also identified the highest risk patients who could proceed directly to nodal dissections. Stratification of risk also better reflects current management practice, in which options include not only nodal dissection and CT follow-up but also PET-CT 20,21 or sonography-guided biopsy 22 for the intermediate-risk groups. The prevalence of pN+ in our study increased with time, a trend suggesting that these additional imaging modalities may already be affecting patient selection for nodal dissection. Further work is needed to determine how clinical variables such as staging information, HPV status, and additional imaging modalities for intermediate-risk groups could be combined with CSC to improve diagnostic performance and further limit pN− nodal dissections.
A mnemonic for remembering the scoring criteria is "NE² Ck REaD":
NE² = NEcrosis, and remember to add 2 points for NEw NEcrosis
C = Calcification (new)
R = Ring-enhancing
E = Extranodal disease
D = Diameter decrease (50% for nodes of >1.5 cm initially)
Another potential solution for improving the specificity of posttherapy imaging is to use FDG-PET/CT. At our institution, we tend to use PET-CT as a confirmatory test: it is often performed later after treatment because of early false-positives from inflammation and infection [23][24][25] and is best used in patients at high risk for treatment failure, 20 such as those with HPV-unrelated disease. Negative findings on PET, without uptake, for noncystic nodal remnants are particularly useful in excluding pN+, though finding higher uptake value cut-offs that are both applicable and accurate is problematic. 24,26 This was our experience in the small number of patients in this cohort who had posttherapy PET. The increasing HPV-associated oropharyngeal squamous cell carcinoma prevalence with better radiation response rates 10,11 and associated cystic lymph nodes 18 will likely change the best diagnostic algorithms in coming years.
[Figure caption, beginning missing] ... No new calcifications were noted. The total score of 6 points places the patient at very high risk. The diagnosis was further confirmed with a PET-CT with a standardized uptake value of 7.1 g/mL. At dissection, pathology showed extracapsular extension (a microscopic diagnosis that is related to extranodal disease) and 2 other regional lymph nodes that were positive for nodal metastases.
[Figure caption, beginning missing] ... dark gray) and negative nodal remnants (pN−; light gray) in patients by the proposed CT composite scoring criteria. Percentages of pN+ for each scoring group are given above each bar. The criterion for the scoring system is given in the top right. Patients required both pre- and posttherapy examinations for this scoring. The overall pN+ rate was 14% for this cohort.
There are a number of limitations to this study, including the following: 1) a retrospective study during 10 years with heterogeneity in treatment, imaging technique, radiologists interpreting, and indications for nodal dissection; 2) low prevalence of persistent nodal metastasis; 3) lack of clinical factors such as HPV and p16 status available; 4) the small number of PET cases; 5) limited second-reader evaluations; and 6) limited information (nodal station and size) for pN+ node location based on pathology reports. Given this last consideration, nodal findings were compiled for the individual patient's affected hemineck rather than an individual lymph node. Additionally, the single-time-point CT scans did not demonstrate physiologic variables, as PET or perfusion CT imaging does, but did allow multiple independent parameters to be assessed from a single study. The method for determining previous CT criteria 15 scoring differed slightly from the original methodology because extranodal disease was assessed preoperatively and equivocal findings ("indeterminate because of a borderline finding and/or artifact" 15 ) were not included as being predictive of tumor because this would have further decreased the performance of the previous criteria. Further work is needed to determine whether the CSC are useful in a more generalized population with more clearly negative-appearing posttreatment imaging findings and other squamous cell carcinoma head and neck sites.

CONCLUSIONS
New focal lucency and a small decrease in the size of affected nodes are the most specific CT imaging markers of persistent nodal metastases after treatment, emphasizing the need for comparison with pretreatment imaging examinations. Our composite scoring system for CT ("NE² Ck REaD") outperformed previously reported CT criteria. Although further work is necessary, these criteria may help to determine the posttherapy risk for persistent nodal metastases and more efficiently triage an individual patient to the next step in diagnostic evaluation or therapeutic management.