Correlation of the National Institutes of Health Patient Reported Outcomes Measurement Information System Scales and Standard Pain and Functional Outcomes in Spine Augmentation

BACKGROUND AND PURPOSE: The recently developed National Institutes of Health Patient Reported Outcomes Measurement Information System (PROMIS) initiative provides reliable and valid measures across many health domains. We correlated changes in pain-related PROMIS measures with changes in both a numeric rating scale (NRS) and the Roland-Morris Disability Index (RMDI) in patients undergoing spine augmentation. MATERIALS AND METHODS: Fifty patients, 26 women (40–91 years of age; mean, 72.6 years) and 24 men (42–78 years of age; mean, 67.5 years), were enrolled in the study. They were asked at initial presentation and at 30 days to rate the intensity of their pain in the past 24 hours by using a 0–10 pain NRS and to complete the 23-item RMDI. Study subjects also completed 3 PROMIS short forms: physical function, pain behavior, and pain interference. The Spearman correlation was used to assess the correlation between the scales. The reliable change index (RCI) × 1.96 was calculated for each measurement tool as an indicator of change. RESULTS: All instruments were responsive to detection of change during 1 month (all, P < .0001). Correlations between changes in physical function, pain interference, and pain behavior PROMIS scores and changes in RMDI scores were 0.37, 0.44, and 0.42, respectively. Direction of changes (declines versus improvements) in RMDI and other scales was the same in approximately 60% of patients. CONCLUSIONS: All measures evaluated had adequate and comparable psychometric properties. The choice of which measure to use depends on the clinical intent of the intervention.

Patient-reported pain severity represents the prime outcome in most spine augmentation studies. Various permutations of pain-reporting techniques, including the visual analog scale, ordinal NRSs, and Box Score-11 Scales have been applied to this patient population. 1 In most previous studies, substantial improvement in pain as measured by these scales following spine augmentation was found in 70%-80% of patients.
Notwithstanding the consistent positive impact on patient-reported pain severity noted in most studies, serious shortcomings remain in the assessment of pain in patients treated with vertebroplasty. One recent study demonstrated markedly different pain-severity responses in a single cohort of patients treated with vertebroplasty based simply on different qualifiers used for pain questions. These qualifiers, including "worst or best pain" or "pain at rest or with activity," yielded mean preprocedural pain severities as great as 8/10 and as low as 2/10 in the same patient population at the same interview session. Furthermore, pain-severity responses may be impacted by psychosocial and behavioral factors that vary among patients.
The NRS and RMDI have several possible shortcomings compared with PROMIS measures. While the NRS is familiar to most patients, a single item such as an NRS measuring a complex construct like pain likely will have more measurement error than multi-item scales like the RMDI or PROMIS measures; as such, the NRS may be less effective than PROMIS at detecting clinically meaningful changes. The 23-item RMDI is longer than the PROMIS measures. Furthermore, the dichotomous response scale used in the RMDI is less discriminating than the 5- or 6-level response scale in the PROMIS measures.
The PROMIS is a National Institutes of Health infrastructure project charged with using item response theory (IRT) 2-4 to develop assessments of patient-reported outcomes that are brief and maximally reliable and valid in several health domains, including those related to pain. IRT refers to a family of statistical models that describe the relationship between a person's response to an item on a questionnaire and his or her level of the construct (eg, pain, functioning) that is being measured by the questionnaire. 5 For example, a person experiencing severe pain should respond in a predictable way to questions asking about the severity of the pain. IRT models can be used to explain the relationship between the level of pain experienced by a person and the probability that he or she will answer pain questions a certain way.
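The relationship between a person's level of the construct and the probability of a given response can be illustrated with a minimal sketch. The example below uses a 2-parameter logistic model for a single yes/no item, which is a simplification: PROMIS items have multiple response levels and would require a polytomous model. The function name and the item parameters (discrimination `a`, difficulty `b`) are hypothetical and are not taken from any PROMIS calibration.

```python
import math


def prob_endorse(theta: float, a: float, b: float) -> float:
    """2-parameter logistic item response function.

    theta: person's level of the construct (eg, pain), in SD units
    a:     item discrimination (how sharply the item separates levels)
    b:     item difficulty (the level at which endorsement is 50% likely)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration.
ITEM_A = 1.5
ITEM_B = 0.5

if __name__ == "__main__":
    # The probability of endorsing the item rises with the pain level.
    for theta in (-1.0, 0.5, 2.0):
        p = prob_endorse(theta, ITEM_A, ITEM_B)
        print(f"theta={theta:+.1f}  P(endorse)={p:.2f}")
```

In this sketch, a person whose level equals the item difficulty endorses the item with probability 0.5, and the probability increases monotonically with the level of the construct.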
As such, PROMIS measures may offer alternatives to common pain scales currently in widespread clinical use. 6 The PROMIS scales most relevant to patients receiving an intervention for back pain include "physical function," "pain interference," and "pain behavior." The scores derived from PROMIS scales have been normalized on the basis of the general population.
PROMIS measures may be useful outcome measures for patients treated with vertebroplasty. However, these measures have never been validated in this patient population. The purpose of this study was to compare the psychometric properties of selected PROMIS measures with 2 commonly used outcomes measures: the pain NRS and RMDI. We hypothesized that the psychometric properties would be comparable among measures and that PROMIS measures would be suitable options for assessing outcomes in this patient population.

Materials and Methods
Our institutional review board approved this prospective Health Insurance Portability and Accountability Act-compliant study, and written consent of all study participants was obtained. Between November 2008 and November 2011, fifty-nine consecutive patients referred to the radiology department for consideration of spine augmentation were enrolled. Potential participants were considered for enrollment in the study if they had a vertebral fracture due to osteoporosis or multiple myeloma (even with metastasis) and were able to answer the questions in English. Exclusion criteria were participation in another spine augmentation trial or not being considered appropriate for spine augmentation. Nine (15.3%) of 59 could not be contacted for the follow-up assessment and were excluded from the analysis.
At the first evaluation, after the physician visit in the evaluation room and an explanation of how to answer the questions, patients were asked to complete PROMIS short forms in 3 different domains relevant to pain, including physical function, pain behavior, and pain interference. Patients who were able to complete the forms themselves did so. Patients who could not complete the forms were asked the questions in an interview style. In our practice, the average time to obtain scales is about 5 minutes for all 3 PROMIS forms and 3 minutes for the RMDI. The 10-question physical function short form is focused on the ability to perform various daily activities from self-care (eg, bathing and dressing) to vigorous physical activities (eg, running, strenuous sports; On-line Physical Functioning Form). The 6-question pain interference short form is focused on pain interfering with mental, physical, and social aspects of daily living (On-line Pain Impact Form). The 7-question pain behavior short form focuses on verbal, facial, and bodily expressions of pain (On-line Pain Behavior Form).
Responses to PROMIS questions on a given short form are combined 7 and can be reported as raw scores and/or t-scores. The t-score scale has a mean score of 50 and an SD of 10 in the general population of the United States. For example, a person who has a PROMIS pain interference score of 70 is reporting adverse pain interference 2 SDs worse than the general population average. 8 Higher t-scores indicate greater levels of the construct being measured. Thus, for pain behavior and pain interference, higher scores reflect worse pain, whereas for physical function, higher scores indicate better functioning.
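The interpretation of a t-score relative to the general population can be expressed as a short calculation. The sketch below assumes only the scale properties stated above (mean of 50, SD of 10); the function and constant names are illustrative and do not come from any PROMIS scoring software.

```python
T_MEAN = 50.0  # general-population mean on the t-score scale
T_SD = 10.0    # general-population SD on the t-score scale

# Whether a higher t-score reflects a better state, by domain (from the text).
HIGHER_IS_BETTER = {
    "physical function": True,
    "pain interference": False,
    "pain behavior": False,
}


def sds_from_population_mean(t_score: float) -> float:
    """Distance of a t-score from the population mean, in SD units."""
    return (t_score - T_MEAN) / T_SD


def describe(domain: str, t_score: float) -> str:
    """Plain-language reading of a t-score for one of the 3 domains."""
    z = sds_from_population_mean(t_score)
    if z == 0:
        return f"{domain}: at the general population average"
    better = (z > 0) == HIGHER_IS_BETTER[domain]
    direction = "better" if better else "worse"
    return f"{domain}: {abs(z):.1f} SD {direction} than the general population average"


if __name__ == "__main__":
    print(describe("pain interference", 70.0))
    print(describe("physical function", 35.0))
```

A pain interference t-score of 70 is thus read as 2 SDs worse than average, whereas a physical function t-score above 50 would be read as better than average.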
Participants also rated the intensity of their pain in the past 24 hours by using a pain NRS from 0 to 10, with 0 indicating no pain and 10 indicating the worst imaginable pain. The 23-item modified RMDI, a widely used functional outcome scale, was also administered. Study participants were telephoned 30 days after the initial assessment, at which time they were asked the same sets of questions.

Statistical Analysis
The Spearman correlation was determined to assess the correlation between RMDI outcome measures and PROMIS and NRS scores. The RCI × 1.96 was calculated for the multi-item measurement tools (ie, PROMIS and RMDI) as an indicator of change that is greater than the measurement error of the scale. 9,10 RCI is calculated as √2 × SEM. We used the internal consistency reliability (Cronbach α) of each tool 11-14 to derive the SEM. Changes exceeding 2 points on the pain NRS were considered clinically meaningful. 15
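The change threshold described above can be sketched as a short calculation. The text states only that the SEM was derived from Cronbach α; the sketch assumes the conventional formula SEM = SD × √(1 − α), which may differ from the exact derivation used in the study. The function names and the example SD and α values are hypothetical.

```python
import math


def sem_from_alpha(sd: float, alpha: float) -> float:
    """Standard error of measurement from the scale SD and Cronbach alpha.

    Assumes the conventional formula SEM = SD * sqrt(1 - alpha).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return sd * math.sqrt(1.0 - alpha)


def rci(sem: float) -> float:
    """Reliable change index as defined in the text: sqrt(2) * SEM."""
    return math.sqrt(2.0) * sem


def change_threshold(sd: float, alpha: float, z: float = 1.96) -> float:
    """Smallest change exceeding measurement error at the chosen z level."""
    return z * rci(sem_from_alpha(sd, alpha))


if __name__ == "__main__":
    # Hypothetical scale with SD = 8.0 and Cronbach alpha = 0.90.
    print(f"threshold = {change_threshold(8.0, 0.90):.2f} points")
```

Under these assumptions, a more reliable scale (α closer to 1) has a smaller SEM and therefore a smaller threshold for change that exceeds measurement error.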

Study Participants
Fifty patients met enrollment criteria and were available for follow-up at 1 month. Table 2 presents baseline and 1-month scores, change scores, and significance levels for the paired t test comparing baseline and 1-month scores. All instruments were responsive to the detection of change during 1 month (all P < .0001). Mean scores for all domains demonstrated improvement during 1 month (eg, less pain, better physical function). A decrease in the SD was seen for PROMIS physical function (6.8 to 6.1), whereas the other instruments showed increases; the largest increase in SD was seen for PROMIS pain interference (7.7 to 9.6). Correlations between either RMDI or NRS and PROMIS scores were significant in cross-sectional measurements (all P < .01). Correlations between changes with time in PROMIS scores and changes with time both in pain NRS scores and RMDI scores were significant (P < .05), except for the cross-sectional correlation between NRS and pain behavior at the baseline assessment (P = .10) and the longitudinal correlation between changes in NRS and changes in physical function (P = .15) (Figs 1 and 2).
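The Spearman correlation between change scores on 2 instruments can be illustrated as a Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with tied values given their average rank; it is not the software used in the study, the function names are hypothetical, and the change scores in the usage example are invented for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented change scores for 6 hypothetical patients.
    rmdi_change = [-6, -2, 0, -9, -4, 1]
    interference_change = [-8.1, -1.5, -2.0, -12.3, -3.0, 0.4]
    print(f"rho = {spearman(rmdi_change, interference_change):.2f}")
```

Because only ranks enter the calculation, the coefficient is insensitive to the different units and score ranges of the instruments being compared.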

Rating Pain, PROMIS T-Score, and RMDI
Direction of changes (declines versus improvements) in RMDI and PROMIS physical function, pain interference, pain behavior, and pain NRS was the same in 30 (60%), 32 (64%), 31 (62%), and 26 (52%) patients, respectively. Similarly, agreement in the direction of changes between the pain NRS and the other measurements ranged from 22 (44%) patients with physical function to 27 (54%) patients with PROMIS pain behavior. On the basis of 1.96 × RCI, patients who had changes of >4.3 points for the RMDI, 5.9 points for physical function, 6.4 points for pain interference, and 4.7 points for pain behavior were considered to have experienced meaningful improvement. Across all measures, approximately 30%-50% of patients achieved clinically significant improvement (Table 3).
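The 2 summaries reported above, agreement in the direction of change and classification of meaningful improvement, can be sketched as follows. The thresholds are the 1.96 × RCI values given in the text. Everything else is an assumption made for illustration: the function names, the recoding of all change scores so that positive values mean improvement, and the treatment of 2 unchanged scores as agreeing (the text does not state how zero change was handled).

```python
# 1.96 x RCI thresholds reported in the text, in points.
THRESHOLDS = {
    "rmdi": 4.3,
    "physical function": 5.9,
    "pain interference": 6.4,
    "pain behavior": 4.7,
}

# Only physical function is scored so that higher values are better.
HIGHER_IS_BETTER = {
    "rmdi": False,
    "physical function": True,
    "pain interference": False,
    "pain behavior": False,
}


def to_improvement(domain: str, change: float) -> float:
    """Recode a raw change (follow-up minus baseline) so positive = improved."""
    return change if HIGHER_IS_BETTER[domain] else -change


def is_meaningful_improvement(domain: str, change: float) -> bool:
    """True if the improvement exceeds the 1.96 x RCI threshold."""
    return to_improvement(domain, change) > THRESHOLDS[domain]


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def direction_agreement(improve_a, improve_b) -> float:
    """Fraction of patients whose recoded changes share a sign.

    Assumption: a pair of zero changes counts as agreement.
    """
    if len(improve_a) != len(improve_b) or not improve_a:
        raise ValueError("need two non-empty sequences of equal length")
    same = sum(1 for a, b in zip(improve_a, improve_b) if sign(a) == sign(b))
    return same / len(improve_a)


if __name__ == "__main__":
    # Invented raw changes for 4 hypothetical patients.
    rmdi = [to_improvement("rmdi", c) for c in (-5.0, 2.0, -1.0, 0.0)]
    pf = [to_improvement("physical function", c) for c in (6.5, 1.0, 3.0, 0.0)]
    print(f"agreement = {direction_agreement(rmdi, pf):.0%}")
    print(is_meaningful_improvement("rmdi", -5.0))
```

Recoding every instrument to a common improvement direction before comparing signs avoids counting a drop in RMDI and a rise in physical function as disagreement when both reflect improvement.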

Discussion
Our current study demonstrated that 3 PROMIS measures relevant to patients with fracture-related back pain were correlated with measures commonly used in clinical practice, including the pain NRS and RMDI, and that these PROMIS measures very likely are appropriate tools for assessing patient experiences of pain related to vertebral compression fractures. All measurement tools had good responsiveness in patients treated with vertebroplasty. According to this study, with strong correlation between the RMDI and PROMIS scale, nearly equivalent results between the 2 measures can be obtained with PROMIS short forms. The benefit of the PROMIS short forms is that they are able to attain good reliability and responsiveness to change with fewer questions (7-10 compared with 23 in the RMDI).
The correlation between PROMIS pain behavior and RMDI scores was weaker than that for the other 2 PROMIS scales. This finding was expected, given the content of the scales. Unlike the PROMIS behavior scale, RMDI mainly measures functional disability and does not address the emotional and behavioral effects of pain. In analyzing the direction of changes for the different scales, we determined that they were in the same direction in approximately two-thirds of patients.
Several previous studies have evaluated the PROMIS scales in focused clinical populations. Fries et al 2 studied 451 patients with chronic rheumatoid arthritis and showed a strong correlation between PROMIS physical function and the Health Assessment Questionnaire or Health Assessment Questionnaire Disability Index. A recently published article by Bajaj et al 18 reported significant correlation between PROMIS computerized adaptive testing tools and their corresponding legacy instruments for assessing health-related quality of life in patients with cirrhosis. In our current study, correlations between RMDI and PROMIS scales were stronger than those of Fries et al. We also observed good longitudinal correlation among change scores.
Clinically meaningful changes in patient-reported outcomes remain difficult to determine exactly. Yost et al 17 defined the minimally important difference (MID) in 6 PROMIS-Cancer scales, including physical function. Their recommended MID was 4-6 points for physical function and pain interference short forms (an MID for the pain behavior short form was not determined). An MID of 2-3 points for low back pain had already been established for the RMDI. 19 However, because the methodology for establishing the MID differed between the PROMIS and RMDI scales, we used 1.96 × RCI as a standard metric for evaluating change scores. Changes exceeding 1.96 × RCI would be unlikely (P < .05) to occur in the absence of actual change. 9 This frequently used indicator of change 9 was not designed to determine clinically significant cutoff points for deterioration; it focuses only on improvement as a goal of therapy and is well-accepted as a standard and liberal way of assessing longitudinal changes.
There are several limiting factors that affected our study. The instruments are not anchored to the same recall period. Specifically, when answering the RMDI and NRS, patients are instructed to report on their pain "today," whereas the context for the PROMIS pain behavior and pain interference items is the past 7 days. The PROMIS physical function scale does not specify a recall period for the items. In addition, there is no true criterion standard for detecting meaningful improvement in pain following spine augmentation; as such, our data alone cannot indicate which measure is superior for clinical use. Finally, patients were not asked about medication use, which could have affected their pain ratings.

Conclusions
All measures used in this study offered adequate and comparable psychometric properties. The choice of which measure to use depends on the clinical intent of the intervention. If alleviating pain intensity is paramount, then the pain NRS may be the ideal choice for an outcome measure, whereas for improvement of functioning and decrease in the impact of pain on other aspects of well-being, RMDI, PROMIS physical function, or PROMIS pain interference may be better suited outcome measures.