Abstract
BACKGROUND AND PURPOSE: Several studies have reported on the clinical utility of DWI in head and neck cancer, but none of these studies compared HASTE with EPI-DWI in patients with head and neck cancer. The aim of our study was to compare detection and delineation of primary tumors and lymph nodes by using HASTE and EPI-DWI techniques in patients with HNSCC.
MATERIALS AND METHODS: Twelve patients with HNSCC and a total of 12 primary tumors and 77 visualized lymph nodes on MR imaging underwent DWI by using both EPI-based and HASTE techniques. Interobserver agreement for detection, delineation, and ADC values of primary tumors and lymph nodes was assessed by 2 radiologists, and artifacts for both DWI techniques were described.
RESULTS: The number of lesions (primary tumors and lymph nodes) identified on pretreatment EPI-DWI was higher compared with pretreatment HASTE-DWI, with means of total lesions of 88.5 and 69.0, respectively. Delineation of lesions was also better on pretreatment EPI-DWI compared with pretreatment HASTE-DWI, with means of well-delineated lesions of 80.5 and 27.5, respectively. Both EPI- and HASTE-DWI showed good interobserver agreement between radiologists of ADC values in lesions with ICC values of 0.79 and 0.92, respectively. Intraobserver agreement for ADC values in lesions assessed with EPI- versus HASTE-DWI techniques was low, with ICC values of 0.31 and 0.42, respectively. Significant interobserver disagreement concerning detection was only seen with HASTE-DWI, and none of the DWI techniques showed significant interobserver disagreements regarding delineation. EPI-DWI was more prone to susceptibility artifacts than HASTE-DWI: Ninety-one percent of primary tumors and 16% of lymph nodes were affected by susceptibility artifacts on pretreatment EPI-DWI, whereas these artifacts were not seen on HASTE-DWI.
CONCLUSIONS: Primary tumors and lymph nodes are more easily visualized on EPI-DWI compared with HASTE-DWI. EPI-DWI has geometric distortion, however, which has a negative effect on interobserver agreement of ADC values.
ABBREVIATIONS:
- CI: confidence interval
- HASTE: half-Fourier acquired single-shot turbo spin-echo
- HNSCC: head and neck squamous cell carcinoma
- ICC: intraclass correlation coefficient
- STIR: short τ inversion recovery
HNSCC accounts for ∼5% of all malignancies.1 Head and neck cancer can be imaged with a variety of radiologic techniques, notably CT, MR imaging, and PET. DWI is an MR imaging−based technique whereby diffusion properties of water can be quantified by using the ADC. Hypercellular tissue is characterized by a low ADC, while hypocellular tissue like necrotic tumor is typically characterized by a high ADC. Because of this potential sensitivity to areas of high cellularity, DWI may be able to differentiate between recurrent or residual tumor versus radiation-induced inflammation and/or necrosis,2 a distinction that can be difficult with conventional MR imaging. Other possible clinical uses of DWI in the head and neck region are histopathologic classification of tumors and distinction of malignant-from-benign lymph nodes.3 Recently published clinical studies confirm the value of DWI to predict and assess in a timely manner the effectiveness of radiation or chemoradiation in patients with HNSCC.4,5
Measurements of ADC values depend on the position, size, and ease of delineation of the ROI that is placed within the lesion, and both intra- and interobserver reproducibility of ADC values can be problematic.6 Determining lesion borders may be complicated on DWI, making it difficult to precisely localize the lesion and to obtain reliable ADC values. More accurate ADC values can be obtained by combining DWI with contrast-enhanced T1-weighted imaging to localize the lesion and to define ROIs, as has been done in a DWI study of the breast.7 It is also unclear which DWI technique is best suited to the head and neck. DWI studies are performed either with an EPI or with a non-EPI method such as HASTE, but EPI-based techniques are most commonly used in the head and neck.2,4–8 MR images obtained with these different MR imaging techniques vary regarding contrast, SNR, and artifacts. Baltzer et al9 compared HASTE with EPI-DWI and concluded that both HASTE and EPI-DWI techniques could be useful in the breast to detect lesions and measure ADC values. DWI obtained with the EPI technique demonstrated more frequent susceptibility artifacts, resulting in geometric distortions compared with images obtained with the HASTE-DWI technique, but HASTE-DWI images showed a low SNR compared with the EPI technique.10
A comparison of visualization (detection and delineation) and calculation of interobserver agreement for both DWI techniques (HASTE and EPI) for primary tumors and regional lymph nodes in patients with head and neck cancer has not been performed previously, to our knowledge. High interobserver agreement is necessary before DWI can be routinely used in the clinic because disagreement between observers can result in dissimilar recommendations concerning treatment. The aim of this study was to compare HASTE and EPI-DWI techniques in head and neck cancer regarding detection and delineation of primary tumors and lymph nodes, interobserver agreement of ADC values, and the presence of artifacts.
Materials and Methods
Study Population
The study population consisted of 12 patients (11 men and 1 woman) with an age range from 51 to 71 years (median, 56.5 years) who were examined by MR imaging between 2006 and 2010. All patients had advanced-stage HNSCC of the tonsillar fossa (n = 5), piriform sinus (n = 3), base of the tongue (n = 3), or vallecula (n = 1), and all tumors were clinically staged T2-T4 and N2 or N3, according to the International Union Against Cancer staging system (2002).11 One patient had distant metastases. Patients with a history of head and neck carcinoma or other malignant diseases were excluded. The study was performed with the approval of our institutional review board, and informed consent was obtained from all patients.
MR Imaging Protocol
MR imaging was performed on a 1.5T MR imaging scanner (Sonata; Siemens, Erlangen, Germany) by using a head coil combined with a phased array spine and neck coil. After an axial STIR series with 7-mm sections covering the whole neck area, subsequent images were centered on the area of interest containing the primary tumor and pathologic lymph nodes. Axial images (22 sections of 4-mm section thickness and 0.4-mm gap, in-plane pixel size of 0.9 × 0.9 mm) were obtained with STIR (TR/TE/TI = 5500/26/150 ms, 2 averages) and T1-weighted spin-echo (TR/TE = 390/14 ms, 2 averages, no fat saturation) before and after injection of contrast material. Gadovist (0.1 mL/kg of gadobutrol; Bayer Schering Pharma, Berlin-Wedding, Germany), Dotarem (0.2 mL/kg of gadoteric acid; Guerbet, Aulnay-sous Bois, France), or Magnevist (0.2 mL/kg gadopentetate dimeglumine; Bayer Schering Pharma, Berlin, Germany) was intravenously administered to obtain contrast-enhanced T1-weighted imaging.
DWI with both EPI- and HASTE techniques was obtained for the same 22 sections at the same section position as the axial STIR and T1-weighted images. Parameters for EPI were the following: TR/TE = 5000/105 ms, in-plane pixel size = 2 × 2 mm, and b-values = 0, 500, and 1000 s/mm² (3 averages). Acquisition time of EPI-DWI was 1 minute 50 seconds. Parameters for HASTE were the following: TR/TE = 900/110 ms, in-plane pixel size = 1.1 × 1.1 mm, and b-values of 0 s/mm² (3 averages) and 1000 s/mm² (12 averages). Acquisition time of HASTE-DWI was 5 minutes. ADC maps of both EPI- and HASTE-DWI were calculated on-line or off-line, respectively, by using the software of the scanner.
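The scanner software used for the ADC maps is not described further. As an illustration only, the following sketch assumes the standard monoexponential signal model S(b) = S(0)·exp(−b·ADC); the function names and the synthetic signal values are hypothetical and are not taken from the study.

```python
import math


def adc_two_point(s_low, s_high, b_low=0.0, b_high=1000.0):
    """ADC in mm^2/s from signals at two b-values (s/mm^2).

    Assumes a monoexponential decay, as in a two-b-value acquisition
    (for example b = 0 and b = 1000 s/mm^2).
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


def adc_least_squares(b_values, signals):
    """ADC as the negative slope of ln(signal) versus b (least-squares fit).

    Suitable for three or more b-values (for example 0, 500, 1000 s/mm^2).
    """
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    return -sxy / sxx


# Synthetic voxel with a true ADC of 1.0e-3 mm^2/s (hypothetical value).
true_adc = 1.0e-3
b_vals = [0.0, 500.0, 1000.0]
sig = [1000.0 * math.exp(-b * true_adc) for b in b_vals]

adc_2pt = adc_two_point(sig[0], sig[2])
adc_fit = adc_least_squares(b_vals, sig)
```

With noise-free data, both estimates recover the same ADC; with real data, the two-point and fitted values can differ, which is one reason ADC values from acquisitions with different b-value schemes may not be interchangeable.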
Analysis of MR Imaging
The MR images were independently evaluated by 2 radiologists with 18 and 4 years of experience in head and neck imaging. Radiologists were blinded to clinical data, and the 2 sets of DWI were evaluated independently from each other; the time gap between assessments was at least 1 week. All visible primary tumors and lymph nodes with a minimal axial diameter ≥5 mm as measured on T1-weighted images were included.12,13 Delineation of primary tumors and lymph nodes was assessed on ADC maps by viewing, at the same time on a second computer screen, the corresponding EPI- or HASTE-DWI (b = 1000 s/mm²) images by using a 3-point scoring system: 1 = poor delineation of lesion, 2 = moderate delineation of lesion, and 3 = good delineation of lesion. Examples of the ease of delineation of primary tumors and lymph nodes on DWI are seen in Figs 1 and 2.
The primary tumors and lymph nodes were first identified on conventional MR images. Then ROIs including the entire primary tumor or each lymph node were drawn on axial ADC maps, with reference to the corresponding EPI- or HASTE-DWI (b = 1000 s/mm²) and conventional MR images. When lesions had geometric distortions, a smaller ROI was placed only in the undistorted area of the lesion. Totally cystic or necrotic lymph nodes were excluded to avoid nonvalid ADC values, and the ROI was placed in that area of the tumor or lymph node that showed contrast enhancement in the corresponding T1-weighted images.
Geometric distortion as a result of susceptibility artifacts was assessed on sections demonstrating the most representative maximal surface of primary tumors and lymph nodes on T1-weighted or STIR MR images. Geometric distortions were defined as modifications in size, contour, and/or orientation on pretreatment EPI- and HASTE-DWI compared with T1-weighted or STIR images.
Statistical Analyses
The interobserver agreement and agreement between pretreatment EPI- and HASTE-DWI ADC values were evaluated by using both ICC and Bland-Altman plots.14 ICC values can range between 0 and 1, with higher ICC values indicating stronger agreement. ICCs can be classified according to Nunnally.15 ICCs > 0.80 are reliable for basic research, and ICCs > 0.90 are necessary for essential assessments with individuals in the clinic.
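The two agreement measures named above can be sketched as follows. The ICC form used in the study is not stated, so the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)) is an assumption here. The function names are hypothetical, and the 6 × 4 rating table is a commonly used textbook example rather than data from this study.

```python
import statistics


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean difference a - b; the limits of agreement are
    bias +/- 1.96 standard deviations of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per lesion and one column per observer.

    Assumed form: two-way random effects, absolute agreement, single measurement.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-lesion mean square
    msc = ss_cols / (k - 1)               # between-observer mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Textbook-style example table: 6 subjects rated by 4 observers.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_example = icc_2_1(example)

# Hypothetical paired ADC readings by 2 observers (arbitrary units).
obs1 = [10.0, 12.0, 14.0, 16.0]
obs2 = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(obs1, obs2)
```

A systematic offset between observers lowers this absolute-agreement ICC even when the readings are perfectly correlated, which is why the bias from the Bland-Altman analysis is reported alongside it.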
κ and quadratic weighted κ values were used to define (interobserver) agreement of, respectively, detection and delineation of primary tumors and lymph nodes within and between both DWI techniques. κ values ranged from −1 to 1, and <0.00 indicated no (interobserver) agreement; 0.00–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect (interobserver) agreement.16 Significant disagreement between EPI- and HASTE-DWI for the detection and delineation of primary tumors and lymph nodes was assessed by the McNemar and Wilcoxon signed rank tests, respectively. All statistical analyses excluding weighted κ values were performed by using the Statistical Package for the Social Sciences software, Version 15.0 (SPSS, Chicago, Illinois). The software STATA 11.1 (StataCorp, College Station, Texas) was used to calculate weighted κ values to analyze (interobserver) agreement concerning the delineation of primary tumors and lymph nodes on DWI.17
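As a minimal sketch of the κ statistics described above, the code below computes Cohen κ with optional quadratic weights and maps a κ value to the agreement categories listed in the text. It does not reproduce the SPSS or STATA routines used in the study; the function names and the example ratings are hypothetical.

```python
def cohen_kappa(r1, r2, categories, weighted=False):
    """Cohen kappa for two raters; quadratic weights if weighted is True.

    categories must be given in their ordinal order (for example [1, 2, 3]
    for poor, moderate, and good delineation).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(r1)
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if not weighted:
            return 0.0 if i == j else 1.0
        return ((i - j) / (k - 1)) ** 2

    d_obs = sum(disagreement(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if d_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - d_obs / d_exp


def agreement_label(kappa):
    """Map kappa to the agreement categories listed in the text."""
    if kappa < 0:
        return "no agreement"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical detection data: 1 = lesion detected, 0 = not detected.
rater1 = [1] * 25 + [0] * 25
rater2 = [1] * 20 + [0] * 5 + [1] * 10 + [0] * 15
kappa_detect = cohen_kappa(rater1, rater2, categories=[0, 1])

# Hypothetical 3-point delineation scores with identical ratings.
scores = [1, 2, 3, 3, 2, 1]
kappa_delin = cohen_kappa(scores, scores, categories=[1, 2, 3], weighted=True)
```

Quadratic weights penalize a disagreement of 2 scoring steps (poor versus good) 4 times as heavily as a disagreement of 1 step, which suits an ordinal delineation score better than unweighted κ.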
Results
Visualization of Primary Tumors and Lymph Nodes
A total of 12 primary tumors and 77 lymph nodes of ≥5 mm (median size of lymph nodes was 10.1 mm with a range of 5–49 mm) were detected on pretreatment T1-weighted MR images. On DWI, generally a smaller number of lesions was delineated due to lower image quality (Fig 2). The number of detected primary tumors and lymph nodes (in subsequent text referred to as “lesions”) on pretreatment EPI-DWI, which were first identified on conventional MR images, was higher than that on pretreatment HASTE-DWI, with means of 88.5 and 69 lesions, respectively. Delineation of these lesions was also more reliable on pretreatment EPI-DWI: The mean number of well-delineated lesions was 80.5 on pretreatment EPI-DWI compared with a mean of 27.5 lesions on pretreatment HASTE-DWI (Table 1).
For the detection of lesions by pretreatment EPI- and HASTE-DWI, good interobserver agreement of, respectively, 99% and 90% and κ values of 0.85 and 0.74 were found. The McNemar test revealed no significant disagreement between radiologists in the detection of the number of lesions on EPI-DWI, whereas significant disagreement between radiologists regarding the detection of the number of lesions was seen on HASTE-DWI (Table 2). There was moderate interobserver agreement concerning delineation of lesions with pretreatment EPI- and HASTE-DWI of, respectively, 98% and 90% and quadratic weighted κ values of 0.52 and 0.54, along with no significant disagreements in delineation between radiologists for both DWI techniques by the Wilcoxon signed rank test (Table 3).
Agreement concerning detection of lesions between pretreatment EPI- and HASTE-DWI for each observer separately was, respectively, 84% and 75% with κ values of 0.24 and 0.20, with significant disagreements indicated by the McNemar test (Table 4). Agreement regarding delineation of lesions between pretreatment EPI- and HASTE-DWI for each observer separately was, respectively, 78% and 81% with quadratic weighted κ values of 0.12 and 0.07 and significant disagreement indicated by the Wilcoxon signed rank test (Table 5).
Interobserver Agreements of ADC Values in Lesions on EPI- and HASTE-DWI
For interobserver agreements of ADC values in lesions on pretreatment EPI- and HASTE-DWI, ICCs of 0.76 and 0.92 and associated biases of, respectively, 2.50 × 10⁻⁵ mm²/s and −0.92 × 10⁻⁵ mm²/s were found (Table 6). Corresponding Bland-Altman plots are depicted in Fig 3A, -B. Because there were more lymph nodes detected on pretreatment EPI-DWI, the interobserver agreement was recalculated for EPI-DWI by using the lesions detected on both DWI techniques. The interobserver agreement of pretreatment EPI-DWI indicated by ICC increased to 0.79. Thus, HASTE-DWI demonstrated higher interobserver agreement of ADC values in lesions compared with EPI-DWI. No association was observed between the size of lymph nodes and interobserver variation. However, 4 measurements with the highest variation (those outside the limits of agreement in the Bland-Altman plot, Fig 3A) were relatively small lymph nodes (range, 6–13 mm; mean, 10 mm).
Intraobserver Agreement between ADC Values Obtained with EPI- and HASTE Techniques
Agreements between ADC values had relatively low ICCs of 0.42 and 0.31 and, respectively, biases of −5.74 × 10⁻⁵ mm²/s and −8.62 × 10⁻⁵ mm²/s (Table 7). Corresponding Bland-Altman plots are depicted in Fig 3C, -D.
Artifacts
Susceptibility artifacts frequently caused signal-intensity loss and geometric distortion on EPI-DWI. On HASTE-DWI, no geometric distortion was observed, but signal-intensity loss at the position of the cause of the susceptibility artifacts on EPI-DWI (such as a dental filling) was seen (Fig 4).
On pretreatment EPI-DWI, 10 of 11 primary tumors (91%) showed geometric distortions within the primary tumor region, and 11 of 69 lymph nodes (16%) showed geometric distortions. Primary tumors and lymph nodes on pretreatment HASTE-DWI showed no geometric distortions as a result of susceptibility artifacts. Finally, movement artifacts caused, for instance, by swallowing were limited on both EPI- and HASTE-DWI.
Discussion
This study was performed to compare EPI-DWI and HASTE-DWI in primary tumors and lymph nodes in the head and neck region. Although EPI-DWI is used to determine ADC values throughout the body, the clinical use in the head and neck region is still limited.7 Our study confirms that both primary tumors and metastatic lymph nodes can be reliably detected on EPI-DWI. The EPI images, however, did have more geometric distortions and susceptibility artifacts than the HASTE images. Thus, although the interobserver agreement of ADC values within lesions by using EPI-DWI can be considered good with an ICC of 0.79, the interobserver agreement of HASTE-DWI is superior with an ICC of 0.92. According to the criteria of Nunnally,15 HASTE-DWI, with an ICC of >0.90, should be classified as clinically useful. Because both radiologists were initially inexperienced with evaluating DWI in patients with head and neck cancer, these results may be improved in the future. The ICC of EPI-DWI slightly increased from 0.76 to 0.79 by using only the lesions detected on both DWI techniques. This phenomenon may be explained by less precise ADC values of smaller lymph nodes with diameters <14 mm only visible on EPI-DWI. Unreliable ADC values of those lesions may be obtained as a result of susceptibility artifacts or partial volume effects with surrounding pixels.6,18
Poor agreement between ADC values in lesions determined with EPI- and HASTE-DWI can be explained by differences between these MR imaging techniques. Geometric distortions as a result of susceptibility artifacts by using the EPI technique were observed within the tumor region in a majority of tumors (91%), and likely had a large influence on ADC values. Our study suggests that ADC values in lesions of EPI-DWI and HASTE-DWI cannot be directly compared. This problem may be due to the independent ROI analysis of both scan types, which will cause differences in the exact position and size of the ROIs. However, direct comparison of ADC values within identical ROIs was not the aim of the current study.
Definition of anatomic structures may be difficult on DWI because of the reduced image quality of DWI compared with conventional MR imaging. The study of Baltzer et al9 reported a better visibility on EPI-DWI in breast lesions compared with HASTE-DWI. Our results concerning the visibility of lesions in the head and neck region were consistent with this study, though the differences regarding visibility between DWI techniques were more obvious in our study. Observer 1 in our study detected all lymph nodes and observer 2 detected 76 of 77 lymph nodes on pretreatment EPI-DWI, whereas they detected 62 and 52 of 77 lymph nodes on pretreatment HASTE-DWI, respectively. The difference in the detection of lymph nodes between those DWI techniques was explained by detection of only larger lymph nodes on HASTE-DWI. Baltzer et al9 used a classification of no, poor, good, and excellent visibility, and their results showed that 69% of lesions had excellent visibility on EPI-DWI compared with 37% of the total lesions on HASTE-DWI.
Likewise, visualization indicated by delineation of lesions in our study was more accurate on EPI-DWI. In our study, 91% of detected lesions on pretreatment EPI-DWI were described as good compared with 40% of the total amount of detected lesions on pretreatment HASTE-DWI. However, lesions were often scored as “moderately delineated” on HASTE-DWI, which is probably due to the unfamiliarity of the radiologists and blurriness of these types of images. If one considers “moderate and good” delineation together, the percentages increase to 99% for EPI-DWI and 89% for HASTE-DWI; this increase shows a much better correspondence with the high ICC of 0.92 regarding ADC values in lesions on HASTE-DWI. Future studies should confirm current findings and should preferably include interscan and intraobserver reproducibility of determined ADC values.
In contrast to HASTE-DWI, EPI-DWI is prone to susceptibility artifacts.9,19 Geometric distortions as a result of these susceptibility artifacts are the main reason why EPI-DWI performs less well than HASTE-DWI, considering interobserver agreement of ADC values in lesions. Furthermore, no movement artifacts were detected on either EPI-DWI or HASTE-DWI in this study, despite the relatively long acquisition times of HASTE-DWI. Previously, Kim et al5 mentioned the potential limitations of susceptibility artifacts within tumors of the head and neck region on EPI-DWI. Therefore, they evaluated treatment responses in patients with HNSCC only in lymph nodes and not in primary tumors.5 In our study, susceptibility artifacts were obvious on EPI-DWI. The primary tumors were more affected by geometric distortion than the lymph nodes because oropharyngeal, hypopharyngeal, and laryngeal tumors lie at air-tissue interfaces; this location may influence the interpretation of DWI.
Conclusions
Primary tumors and lymph nodes are more easily visualized on EPI-DWI compared with HASTE-DWI, probably due to the lower SNR of the latter sequence. EPI-DWI has geometric distortions, however, which have a negative effect on the interobserver agreement of ADC values in lesions. Even in this first application of HASTE-DWI in the head and neck region, it appears to be the most reproducible DWI technique. HASTE-DWI may be a promising technique for the assessment of primary and nodal disease in patients with HNSCC.
References
- Received May 14, 2011.
- Accepted after revision October 21, 2011.
- © 2012 by American Journal of Neuroradiology