Preliminary investigation into sources of uncertainty in quantitative imaging features

https://doi.org/10.1016/j.compmedimag.2015.04.006

Highlights

  • Features measured at end-of-exhale phase did not significantly differ from T20 to T90.

  • Features are highly correlated between end-of-exhale phase and average images.

  • The impact of tube voltage on features may not be statistically significant.

  • The impact of tube current modeled as Gaussian noise was statistically significant.

Abstract

Several recent studies have demonstrated the potential for quantitative imaging features to classify non-small cell lung cancer (NSCLC) patients as high or low risk. However, applying the results from one institution to another has been difficult because of variations in imaging techniques and feature measurement. Our study was designed to determine the effect of some of these sources of uncertainty on image features extracted from computed tomography (CT) images of NSCLC tumors. CT images from 20 NSCLC patients were obtained to investigate the impact of four sources of uncertainty: two region of interest (ROI) selection conditions (breathing phase and single-slice vs. whole volume) and two imaging protocol parameters (peak tube voltage and tube current). Texture values did not vary substantially with the choice of breathing phase; however, almost half (12 out of 28) of the measured textures did change significantly when measured from the average images compared to the end-of-exhale phase. Of the 28 features, 8 showed a significant variation when measured from the largest cross-sectional slice compared to the entire tumor, but 14 were correlated with the entire tumor value. While simulating a decrease in tube voltage had a negligible impact on texture features, simulating a decrease in tube current (mA) resulted in significant changes for 13 of the 23 texture values. Our results suggest that substantial variation exists when textures are measured under different conditions, and thus the development of a texture analysis standard would be beneficial for comparing features between patients and institutions.

Introduction

Lung cancer is the leading cause of cancer deaths both globally and within the United States [1]. Non-small cell lung cancer (NSCLC) accounts for 85% of all lung cancer cases [2]. Survival rates in NSCLC have remained low despite progress in imaging and treatment techniques over the past forty years [1]. This problem is compounded by the substantial variability in outcomes among patients of the same stage or risk group, which can make choosing the optimal treatment strategy for any individual patient difficult. Because of these well-known statistics, a variety of research avenues have been explored to produce novel methods of predicting patient outcome or individualizing patient treatment. One method that has recently garnered considerable attention is the use of image texture analysis to classify patients as high or low risk before they begin treatment.

Texture analysis is a computational method that assigns quantitative values to medical images of tumors. Textures are designed to assess the amount of heterogeneity or identify the patterns within a region of interest (ROI) [3], [4]. As a result, tumors with higher or lower values than a chosen threshold can be grouped and the results correlated to patient outcomes in order to predict survival. Several groups have demonstrated the usefulness of this technique in NSCLC [5], [6], [7], [8], [9], [10]. Taken together, these studies suggest image texture analysis may in the future play a large role in the identification of patients with worse prognosis or high risk factors. This could substantially alter the way we select treatment regimens for NSCLC patients and potentially increase observed survival rates.
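To make the idea of assigning quantitative values to an ROI concrete, the following is a minimal sketch in Python, not the software used in this study. It computes a few first-order intensity features (the standard deviation, skewness, and kurtosis that are named later in the discussion) from the voxels inside a binary ROI mask, and then groups tumors around a chosen threshold. The function names, the synthetic image, and the threshold value are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def first_order_features(image, roi_mask):
    """Summarize the voxel intensities that fall inside a binary ROI mask."""
    voxels = np.asarray(image, dtype=float)[np.asarray(roi_mask, dtype=bool)]
    return {
        "mean": float(voxels.mean()),
        "standard_deviation": float(voxels.std()),
        "skewness": float(stats.skew(voxels)),
        # fisher=False gives the Pearson definition (a normal distribution scores 3).
        "kurtosis": float(stats.kurtosis(voxels, fisher=False)),
    }


def split_by_threshold(values, threshold):
    """Label each tumor 'high' or 'low' relative to a chosen threshold (hypothetical helper)."""
    return ["high" if v > threshold else "low" for v in values]


# Synthetic stand-in for a CT volume and a tumor ROI; not patient data.
rng = np.random.default_rng(0)
image = rng.normal(loc=40.0, scale=15.0, size=(8, 32, 32))
roi_mask = np.zeros(image.shape, dtype=bool)
roi_mask[2:6, 8:24, 8:24] = True

features = first_order_features(image, roi_mask)
print(features)
```

Restricting `roi_mask` to a single slice instead of the whole volume changes which voxels enter the calculation, which is one way the ROI selection conditions examined in this study can alter a feature value.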

One obstacle to applying this technique in a clinical setting is the large level of uncertainty in whether features measured at different institutions can be fairly compared. Major sources of uncertainty for features measured from CT images of tumors encompass everything from imaging parameters (such as tube current, tube voltage, exposure, reconstruction algorithm, pixel dimensions, slice thickness, and CT manufacturer) to patient-specific factors (such as motion artifacts and tumor size), to variations in feature measurement (such as ROI delineation, software, feature parameters, and pre-processing filters). Identifying the impact of each of these factors is nearly impossible due to the scale of the task and the fact that a ground truth does not exist for quantitative imaging features the way it might for histology or other imaging goals. However, a few of these factors could be controlled for if we knew that they have a substantial effect on extracted texture values. For example, while many studies have used textures measured from the entire tumor volume, certain studies used only the largest cross-sectional tumor slice [6], [11]. Whether the results from one technique can be applied to tumors delineated with the alternative technique is not known. Similarly, no studies have previously examined the effect that the choice of breathing phase for ROI delineation may have on an individual patient's measurement. It is also still not well understood how the choice of imaging parameters affects the values measured. Because imaging parameters are known to vary from institution to institution, current texture studies are limited to local patients imaged with the same technique. The only study attempting to probe this question found that tube voltage variations had a larger effect on measured values than changes in tube current when textures were measured from a water phantom [12]. However, it is not known if this same relationship applies to heterogeneous images such as actual patient tumors.
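The highlights describe the impact of tube current as modeled by Gaussian noise. The sketch below shows one simple way such a simulation could be set up; it is an illustration under stated assumptions, not the authors' procedure. It assumes that image noise variance is inversely proportional to tube current (the quantum-noise-limited case) and that the noise can be approximated as uncorrelated, zero-mean Gaussian noise added in the image domain, which ignores the spatial correlation of real CT noise. The function name, the reference noise level, and the tube current values are hypothetical.

```python
import numpy as np


def simulate_lower_tube_current(image_hu, sigma_ref, current_ref_ma, current_new_ma, rng=None):
    """Add zero-mean Gaussian noise to mimic a scan acquired at a lower tube current.

    Assumes noise variance scales as 1 / tube current, so the target noise level is
    sigma_ref * sqrt(current_ref_ma / current_new_ma). Because independent variances
    add, the extra noise needed is sigma_ref * sqrt(current_ref_ma / current_new_ma - 1).
    """
    if current_new_ma <= 0 or current_new_ma > current_ref_ma:
        raise ValueError("current_new_ma must be positive and no larger than current_ref_ma")
    rng = np.random.default_rng() if rng is None else rng
    sigma_add = sigma_ref * np.sqrt(current_ref_ma / current_new_ma - 1.0)
    noise = rng.normal(loc=0.0, scale=sigma_add, size=np.shape(image_hu))
    return np.asarray(image_hu, dtype=float) + noise


# Hypothetical numbers: a uniform image with 10 HU reference noise, halving the tube current.
rng = np.random.default_rng(42)
uniform_image = np.zeros((64, 64))
noisy_image = simulate_lower_tube_current(
    uniform_image, sigma_ref=10.0, current_ref_ma=200.0, current_new_ma=100.0, rng=rng
)
print(round(float(noisy_image.std()), 2))  # spread of the added noise only
```

In practice `sigma_ref` would have to be estimated from the original image, for example from a uniform region, and texture features would then be recomputed on the noisy image and compared with the originals.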

Our study was designed to begin investigating these specific issues and to suggest guidelines for the comparison of texture values obtained under different or unknown conditions.

Section snippets

Patients

For this study, the 4D computed tomography (CT) thoracic scans of 20 patients with stage III NSCLC were obtained. These images are routinely collected as part of the radiation therapy treatment planning process. They are non-contrast images, as is routine in radiation therapy clinics. These images were selected because of the large number of published results demonstrating that image features calculated from non-contrast CT images may be used to predict patient outcomes [5], [6], [9], [11], [13].

Respiratory phase

Table 6 summarizes the results of the Wilcoxon signed rank tests after multiplicity correction comparing the texture values measured from the T50 phase images to the Tavg images and the other breathing phase images (T0–T90). Of the 28 total features, 12 were significantly different when they were measured from the Tavg versus the T50 phase images. The 12 significantly different features included at least one from every category but shape. Features measured from phases closer in time to T50 were
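The paired comparison described in this snippet can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it runs a Wilcoxon signed rank test per feature on paired per-patient values and then applies a multiplicity correction. The text does not state which correction was applied, so the Holm step-down procedure is used here purely as an assumption. The feature names and values are synthetic.

```python
import numpy as np
from scipy import stats


def holm_adjust(p_values):
    """Holm step-down adjusted p-values (assumed correction; the source does not name one)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    adjusted = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(np.argsort(p)):
        # Multiply the k-th smallest p-value by (m - k), then enforce monotonicity.
        running_max = max(running_max, min(1.0, (m - rank) * p[idx]))
        adjusted[idx] = running_max
    return adjusted


def compare_paired_features(reference, comparison, alpha=0.05):
    """Wilcoxon signed rank test per feature on paired per-patient values."""
    names = list(reference)
    raw = [stats.wilcoxon(reference[n], comparison[n]).pvalue for n in names]
    adjusted = holm_adjust(raw)
    return {
        n: {"p_raw": float(r), "p_adjusted": float(a), "significant": bool(a < alpha)}
        for n, r, a in zip(names, raw, adjusted)
    }


# Synthetic values for 20 patients and two hypothetical features.
rng = np.random.default_rng(7)
t50 = {
    "feature_a": rng.normal(50.0, 10.0, size=20),
    "feature_b": rng.normal(1.0, 0.2, size=20),
}
tavg = {
    # feature_a is shifted upward for every patient; feature_b only jitters.
    "feature_a": t50["feature_a"] + 5.0 + rng.uniform(0.0, 1.0, size=20),
    "feature_b": t50["feature_b"] + rng.normal(0.0, 0.05, size=20),
}

results = compare_paired_features(t50, tavg)
for name, row in results.items():
    print(name, row)
```

The same function could be applied to any paired condition in the study, such as each breathing phase against T50, by swapping the `comparison` dictionary.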

Respiratory phase

From our analysis, the choice of phase did not result in statistically significant differences in values. However, comparing texture values from the T50 and Tavg image sets did result in significant differences for almost half (12 of 28) of the features, including at least one from each category except for shape. We suggest either the end-of-exhale (T50) or Tavg phase be used in future studies for consistency with already published results. Several features (standard deviation, kurtosis, skewness, all of the shape

Conclusion

Several studies have suggested that image texture analysis may be a useful tool for identifying non-small cell lung cancer (NSCLC) patients at high risk for poor survival. Before this technique can be clinically implemented, the effect of different approaches to measuring texture and of different imaging parameters must be analyzed to ensure features can be reliably and consistently evaluated. This paper investigated the susceptibility of texture features to four variables and may serve as a

Conflict of interest statement

We have no conflicts of interest to disclose.

Acknowledgment

Xenia Fave is a recipient of the AAPM and RSNA Doctoral Fellowship. Molly Cook is a recipient of the AAPM Summer Undergraduate Fellowship.
