Evaluating “Eee” Phonation in Multidetector CT of the Neck

BACKGROUND AND PURPOSE: Since the advent of pharyngography, “eee” phonation has been used to distend the airway during imaging. When imaging shifted to axial CT, “eee” phonation was used to delineate mucosal anatomy better. However, because patients could not phonate for the entire examination (which could take several minutes), the technique was only useful during supplemental imaging, performed after the primary acquisition through the neck. The supplemental images covered a limited area (usually the larynx or supraglottis) and, therefore, could be obtained while patients phonated. Imaging of the neck has now shifted to multidetector CT (MDCT). With a 64-detector MDCT, the entire neck can be imaged in 8 seconds. We evaluated whether “eee” phonation could be used during the entire acquisition through the neck without degrading image quality. MATERIALS AND METHODS: Forty-eight patients who performed “eee” phonation during a CT examination of the neck were compared with 96 patients scanned following a breath-hold command. All patients were scanned on the same 64-detector MDCT scanner after intravenous contrast administration. Images were acquired at a 2-mm section thickness and reconstructed at 1-mm intervals. All scanning times ranged from 5 to 7 seconds. Studies were evaluated separately by 2 neuroradiologists for image degradation due to motion. Statistical analysis was performed by using the proportional odds ratio. RESULTS: We found no significant difference in motion during phonation compared with the breath-hold technique. CONCLUSIONS: Our results indicate that “eee” phonation can be performed during an entire image acquisition through the neck, when performed with the speed of the 64-detector scanner, without increasing motion.

When evaluating the upper airway, "eee" phonation has been recommended as a dynamic breathing technique for >20 years. The technique was first applied to pharyngography and then CT. 1 During phonation, the airway distends with air, which provides a natural contrast for the mucosa. Most head-and-neck tumors are mucosal in origin (eg, squamous cell carcinoma). Earlier studies have shown that phonation can improve the detection and delineation of tumors. 2-4 In our review of published series using phonation, however, it has only been performed during supplemental CT acquisitions through the neck (usually in conjunction with a sequence acquired during quiet breathing).
When considering any dynamic breathing maneuver, radiologists have previously been constrained by scanning times on axial CT or early spiral CT scanners. With current multidetector CT (MDCT), imaging times for the neck have significantly declined. At our institution, a neck CT acquired on a 4-detector scanner takes approximately 27 seconds. On a 16-detector scanner, a neck CT averages 20 seconds. On the 64-detector scanner, a neck CT averages <8 seconds. Our study evaluated the impact of phonation on motion when performed throughout the 5-7 seconds required for imaging the neck with 64-detector MDCT.
Phonation is one of several dynamic breathing maneuvers that have been used for supplemental CT imaging of the neck. Other dynamic maneuvers include the "puffed-cheek" maneuver and a modified Valsalva technique. 2,3,5-7 The puffed-cheek maneuver provides air contrast for evaluation of lesions within the oral vestibule. The modified Valsalva technique and phonation are more broadly applicable to mucosal lesions within the pharynx and larynx. Both create air contrast for better tumor detection. When compared with quiet breathing and the modified Valsalva technique, phonation has been shown to offer improved delineation of tumors of the pharynx and larynx. 4 More recent applications of phonation during CT imaging of the neck include evaluation of vocal cord paralysis. 8 Supplemental scanning increases the radiation dose to the patient and imaging time. In the past, these drawbacks have limited the widespread use of dynamic breathing maneuvers, despite studies that supported the usefulness of these techniques. Our study evaluates "eee" phonation to determine whether phonating throughout a neck CT degrades image quality secondary to motion, when performed with 64-detector speed.

Materials and Methods
This study compared 48 consecutive "eee" phonation patients (32 men and 16 women) with 96 breath-hold control patients (64 men and 32 women). The study was performed during a time when our breathing protocol for imaging the neck with MDCT changed from breath-hold to "eee" phonation. In both the phonation and control groups, the clinical histories ranged from tumor to infection, and both inpatients and outpatients were included. All patient information was evaluated retrospectively. The 96 patients in the breath-hold control group were selected retrospectively to match the age and sex of the "eee" phonation group. Selection of the control group was blinded to all information other than age and sex. The average age was 55.69 ± 14.44 years for control patients and 55.69 ± 14.52 years for "eee" phonation patients. The proportion of sexes was the same in the 2 groups of patients: Sixty-nine percent of both groups were male; 31% were female.
Ideally, phonation patients would have been scanned performing both phonation and during breath-hold, so that an exact comparison could be made between the maneuvers. This was not the case due to the additional radiation costs that patients would have incurred. There were no additional cost or time constraints that limited our study. Patients were scanned according to the protocols accepted by our institution at the time of their scanning, with no added interruption to workflow.
All patients were scanned at Vanderbilt University Medical Center between January 2006 and October 2007. The same protocol was used for control and phonation patients. Patients were scanned on a 64-detector MDCT (Brilliance-64; Philips Medical Systems, Best, the Netherlands) with the following technique: tube current, 300 mAs; voltage, 120 kV; detector collimation, 64 × 0.625 mm; pitch, 0.891 mm/rotation; rotation time, 0.75 seconds. Scanning began at the skull base and finished at the aortic arch. Two-millimeter sections were reconstructed at 1-mm intervals and were reviewed in the axial plane. One hundred milliliters of intravenous contrast (ioversol, Optiray 320; Mallinckrodt, St. Louis, Mo; or iodixanol, Visipaque; Nycomed, Princeton, NJ) was injected with a power injector for 60 seconds before scanning. Scanning time for the "eee" phonation group and breath-hold group was calculated manually from the times recorded on the CT scanner. Twenty patients were chosen randomly from each group, and their scanning times were measured. The time range for the "eee" phonation group was 6-7 seconds with an average time of 6.35 seconds. The time range for the breath-hold group was 5-7 seconds with an average time of 6.2 seconds.
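As a rough check on the reported scanning times, the helical parameters listed above imply a table speed and a z-axis travel per scan. The following minimal sketch assumes that the reported pitch of 0.891 is the conventional dimensionless helical pitch (table travel per rotation divided by total collimation width); that reading, and the function names, are illustrative assumptions and not part of the study protocol. Overranging and scanner-specific behavior are ignored.

```python
def table_speed_mm_per_s(n_rows, row_width_mm, pitch, rotation_time_s):
    """Table travel per second for a helical scan (assumes dimensionless pitch)."""
    collimation_mm = n_rows * row_width_mm          # total beam width along z
    travel_per_rotation_mm = collimation_mm * pitch
    return travel_per_rotation_mm / rotation_time_s


def z_travel_mm(speed_mm_per_s, scan_time_s):
    """Approximate table travel during one acquisition."""
    return speed_mm_per_s * scan_time_s


# Parameters reported in the scanning technique above.
speed = table_speed_mm_per_s(n_rows=64, row_width_mm=0.625,
                             pitch=0.891, rotation_time_s=0.75)

# Average scanning times reported for the two 20-patient samples.
for label, seconds in (("phonation", 6.35), ("breath-hold", 6.2)):
    print(f"{label}: {z_travel_mm(speed, seconds):.0f} mm at {speed:.2f} mm/s")
```

Under these assumptions, the table moves at roughly 47.5 mm/s, so a scan of 6-7 seconds corresponds to about 30 cm of table travel.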
Before scanning, patients were trained by the CT technologists to perform "eee" phonation and practiced before beginning the examination. Patients were instructed to begin phonation when scanning commenced. Control patients were instructed by an automated command to "take in a breath and hold" as scanning commenced.
The CT examinations were reviewed retrospectively by 2 neuroradiologists independently; both were blinded to all clinical data and to the scanning technique performed. Images were evaluated in the axial plane by using a soft-tissue algorithm. Images were scored for motion, on a scale from 1 to 5, as follows: 1, no motion; 2, mild motion; 3, moderate motion; 4, moderately severe motion; 5, severe motion. Examiners divided scoring by airway region. The most cephalad region included the oral cavity and pharynx (from the nasopharynx superiorly to the hypopharynx inferiorly) and the supraglottis. The glottis, which includes the false and true vocal cords and the laryngeal ventricle, was evaluated separately for motion, because there was concern that phonation would increase true vocal cord motion and, therefore, degrade images at this level. 9 The most caudad region evaluated included the trachea, extending inferiorly to the thoracic inlet. These results are shown in the Table. Statistical analysis was performed as follows: Sample size was calculated for ordered categoric data. With a 1:2 recruitment ratio of "eee" phonation to the breath-hold control group and assuming a significance level of .05, at least 47 phonation patients and 94 control patients were needed to detect an odds ratio of 2.5. This odds ratio conveys the probability of having detected a larger motion score for an "eee" phonation patient relative to a control patient with 75% power. This power was computed post hoc. We collected the data first; therefore, we already had a maximum sample size and only needed to find the power corresponding to that sample size. The power of this study was between 75% and 80%.
The agreement between the motion scores from the 2 neuroradiologists was evaluated by using κ statistics. A proportional odds model was used to evaluate the association between "eee" phonation and a decrease in image quality due to motion. Each model was fitted separately for each region evaluated, yielding 3 different fits. In each region, age and sex were included as covariates. To correct for the correlated responses from the same patients, we used the Huber-White sandwich estimator to adjust the variance-covariance matrix. A P value less than .05 was considered significant. The statistical software package used to perform the analysis was R, Version 2.6.0 (R Development Core Team, 2007). 10 The study was approved by the institutional review board.
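For readers less familiar with the proportional odds model, one common textbook parameterization is sketched below. The notation is illustrative and is an assumption: Y denotes the motion score (1 to 5), the α_j are cut-point intercepts, and the β terms are regression coefficients. The exact parameterization and sign convention used in the original R analysis are not stated in the text.

```latex
\[
\operatorname{logit} P(Y \ge j \mid x)
  = \alpha_j + \beta_1\,\mathrm{phonation} + \beta_2\,\mathrm{age} + \beta_3\,\mathrm{sex},
  \qquad j = 2, \dots, 5,
\]
\[
\mathrm{OR}_{\mathrm{phonation}} = e^{\beta_1}.
\]
```

In this form, the same odds ratio applies at every cut point j, which is the proportional odds assumption. An odds ratio above 1 would indicate higher (worse) motion scores with phonation, and a confidence interval that includes 1 indicates no statistically significant difference.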

Results
Motion scores from the 2 neuroradiologists at 3 measurement sites are summarized in Fig 1. In Fig 2, axial images at the level of the glottis demonstrate the range of motion encountered with "eee" phonation and the scoring criteria applied. There was no significant difference in the distribution of motion scores between "eee" phonation patients and control patients at each of the 3 regions evaluated. Eighty-six percent of the ratings (864 total scores) were of no or only mild motion (247 of 288 scores with phonation and 493 of 576 scores with breath-hold technique). There was a trend toward less motion above the glottis (55% of phonation with no motion compared with 44% of controls), though the results did not reach statistical significance. In Fig 3, the odds ratios comparing the "eee" phonation group with the control group are given. There was no statistically significant difference in motion between the 2 techniques at any region (to be statistically significant, the confidence interval should not cross 1.0).
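The score counts quoted above can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the text (patients per group, 3 regions, 2 readers, and the counts of no or mild motion); the variable and function names are illustrative. The helper for confidence intervals simply restates the rule that an interval crossing 1.0 is not statistically significant.

```python
PATIENTS = {"phonation": 48, "breath_hold": 96}
REGIONS = 3          # above, at, and below the glottis
READERS = 2          # two neuroradiologists
NO_OR_MILD = {"phonation": 247, "breath_hold": 493}


def total_scores(n_patients):
    """Each patient receives one score per region per reader."""
    return n_patients * REGIONS * READERS


def ci_excludes_one(ci_low, ci_high):
    """True when an odds-ratio confidence interval does not cross 1.0."""
    return not (ci_low <= 1.0 <= ci_high)


totals = {group: total_scores(n) for group, n in PATIENTS.items()}
overall = sum(NO_OR_MILD.values()) / sum(totals.values())

for group in PATIENTS:
    share = NO_OR_MILD[group] / totals[group]
    print(f"{group}: {NO_OR_MILD[group]} of {totals[group]} ({share:.1%})")
print(f"overall: {overall:.1%} of {sum(totals.values())} scores")
```

The totals come to 288 and 576 scores (864 overall), and 740 of 864 scores is 85.6%, which rounds to the 86% reported.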
The agreement of the readings from the 2 neuroradiologists was evaluated by using the Cohen κ statistic with quadratic weighting. 11 The Cohen κ was 0.712 for images taken above the glottis (P < .001); 0.567, at the glottis (P < .001); and 0.398, below the glottis (P < .001). A value of zero corresponds to no more agreement than would be expected by chance alone, and a value of 1 represents total agreement. According to the Landis and Koch classification, 12 there was substantial agreement between readings from the 2 neuroradiologists for images taken above the glottis, moderate agreement for images taken at the glottis, and fair agreement for images taken below the glottis. More than 95% of scores were within 1 point of agreement. Scoring varied slightly below the glottis. This finding appears to be at least in part attributable to beam-hardening artifacts from the shoulders, which degraded image quality at the level of the thoracic inlet and made it difficult to delineate no and mild motion.
Scanning times were consistent between the phonation and control groups, ranging from 5 to 7 seconds (a 20-patient sample from the phonation group averaged 6.35 seconds; a 20-patient sample from the breath-hold group averaged 6.2 seconds).
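To make the agreement statistic concrete, the following is a minimal sketch of a quadratic-weighted Cohen κ for two readers scoring on the 1-5 motion scale, together with the Landis and Koch labels. It is a generic textbook computation written for illustration, not the R code used in the study, and the function names are hypothetical. Only the κ values reported above are passed to the classifier; no patient-level scores are reproduced.

```python
def quadratic_weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4, 5)):
    """Cohen kappa with quadratic disagreement weights for ordinal scores."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (reader A score, reader B score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) / (k - 1)) ** 2   # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both readers use one category")
    return 1.0 - weighted_observed / weighted_expected


def landis_koch(kappa):
    """Verbal agreement label from the Landis and Koch classification."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Kappa values reported in the Results.
for region, kappa in (("above the glottis", 0.712),
                      ("at the glottis", 0.567),
                      ("below the glottis", 0.398)):
    print(f"{region}: {kappa} -> {landis_koch(kappa)}")
```

Quadratic weights penalize a 2-point disagreement four times as heavily as a 1-point disagreement, which suits an ordinal scale in which most paired scores differ by no more than 1 point.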

Discussion
The results of this study indicate that "eee" phonation can be performed without increased patient motion. This advance is only possible because of the speed of a 64-detector MDCT scanner, which allows an entire scan to be acquired in <8 seconds. In prior studies, which were performed on early spiral CT scanners, patients were unable to comply with dynamic techniques. In a 1992 study involving early spiral CT in the evaluation of head and neck lesions, scanning times ranged from 24 to 36 seconds (a significant improvement over the 3-4 minutes required with the single-section scanning that preceded spiral CT). However, even with this improvement, half of the study patients were unable to perform a dynamic breathing maneuver, which was essentially a modified Valsalva technique, throughout the scanning. 3 For this reason, more recent studies have focused on supplemental scanning with dynamic maneuvers, limiting the z-axis for the supplemental scanning to the area of concern.
A study performed on a 4-detector scanner compared the modified Valsalva maneuver to "eee" phonation and quiet breathing. 4 Forty patients were initially scanned in quiet breathing with supplemental scanning during dynamic breathing maneuvers. The authors recommended "eee" phonation for supplemental scanning, finding that phonation improved tumor assessment. The study authors found that phonation was particularly useful in the hypopharynx, with improved localization of squamous cell carcinomas within the pyriform sinuses. 4 Ninety-five percent of hypopharyngeal tumors are squamous cell carcinomas, and it is not always possible to exclude a tumor when the pyriform sinus is not well distended. 13 Figure 4 demonstrates the advantage of pyriform sinus distention when evaluating tumor involvement. The evaluation of the larynx has always been limited by scanner speed because the larynx is particularly susceptible to motion artifacts from breathing and swallowing. 8 In our study, evaluation of the larynx demonstrated limited or no motion in nearly all patients with both "eee" phonation and breath-hold techniques.
Fig 1. A, Motion scores above the glottis by using "eee" phonation and breath-hold techniques. More than 80% of both groups had no or only mild motion. B, Motion scores at the glottis by using "eee" phonation and breath-hold techniques. Despite the mild vibratory motion necessary for phonation, scores did not differ significantly between phonation and breath-hold, and most patients did not have any significant motion. C, Motion scores below the glottis by using "eee" phonation and breath-hold techniques. Because scanning commenced from cranial to caudal, there was concern that patients would move as they ran out of breath. However, motion scores are best below the glottis. mod indicates moderately.
Because scans were obtained starting at the skull base and extending inferiorly to the thoracic inlet, there was concern that images acquired at the end of the scanning acquisition would be degraded by motion as patients became fatigued. This did not occur. Phonation also did not increase motion at the larynx over the breath-hold technique, even though phonating requires mild vibratory motion of the cords. With phonation, the cords tense and approach the midline, but they are not completely adducted. Kim et al 8 found that phonation (described as "hee" in their study) was more useful to evaluate vocal cord paralysis than conventional CT. In their study, radiologists improved from 81%-86% accuracy to 95% accuracy when diagnosing cord paralysis by switching from conventional CT to CT performed during phonation.
Some limitations of our study deserve mention. Ideally, each patient would have been scanned twice, once during "eee" phonation and once by using a breath-hold technique. We were not able to achieve this due to the radiation costs. We attempted to standardize patients by controlling for age and sex and including only patients scanned on the 64-detector MDCT scanner. We did not control for patient size or physical condition. Secondary analysis could be performed probing the effect of physical condition on the success of phonation, based on the clinical record. In our study, both phonation and control groups contained inpatients and outpatients, but we did not standardize this feature. There was some selection bias in the phonation group. Although the patients were consecutive phonation patients, not all patients scanned during this time were included in the study. There were patients who were scanned without phonation after the protocol had changed. These patients were not included in the study.
Fig 3. The odds ratio comparing phonation to the breath-hold technique is given above the glottis, at the glottis, and below the glottis. The estimated odds ratio is the center line, shouldered by an 80% confidence interval (darkest gray bar) and a 95% confidence interval (gray bar).
This study moves us closer to a streamlined approach to imaging the neck. Radiation exposure to patients will decrease when supplemental scanning is curtailed. Past studies have validated "eee" phonation, but only as a supplemental technique. Future research needs to be directed at whether primary scanning with "eee" phonation improves the depiction and characterization of mucosal tumors over scans obtained during quiet breathing. If phonation can be validated in this way, radiation could be minimized while maximizing our diagnostic capabilities in head and neck imaging with MDCT.

Conclusions
"Eee" phonation does not increase motion in the head and neck compared with the breath-hold technique when scanned with 64-detector MDCT speed. This is an early step toward a single acquisition with dynamic breathing maneuvers, which could streamline MDCT imaging of the head and neck.
Fig 4. A 48-year-old man with squamous cell carcinoma of the supraglottic larynx. CT was acquired during "eee" phonation. A coronal reformatted image is shown. Because the pyriform sinuses (asterisks) are distended with air, the margins of the tumor are clearly delineated (arrowheads). Because tumor involved the undersurface of the high left pyriform sinus wall, a pharyngotomy was required. The thin arrow marks the laryngeal ventricle, which is distended with air during "eee" phonation.