Abstract
BACKGROUND: In recent years, clinical practice guidelines have been criticized for biased interpretations of research evidence, and interventional radiology is no exception.
PURPOSE: Our aim was to evaluate the methodologic quality and transparency of reporting in systematic reviews used as evidence in interventional radiology clinical practice guidelines for neurovascular disorders from the Society of Interventional Radiology.
DATA SOURCES: Our sources were 9 neurovascular disorder clinical practice guidelines from the Society of Interventional Radiology.
STUDY SELECTION: We selected 65 systematic reviews and meta-analyses.
DATA ANALYSIS: A Measurement Tool to Assess Systematic Reviews (AMSTAR) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) tools were used to assess the methodologic quality and reporting transparency of systematic reviews. Radial plots were created on the basis of average scores for PRISMA and AMSTAR items.
DATA SYNTHESIS: On the basis of AMSTAR scores, 3 reviews (4.62%) were high quality, 28 (43.08%) were moderate quality, and 34 (52.31%) were low quality, with an average quality score of 3.66 (34.32%; minimum, 0%; maximum, 81.82%). The average PRISMA score was 18.18 (69.41%).
LIMITATIONS: We were unable to obtain previous versions for 8 reviews, 7 of which were from the Cochrane Database of Systematic Reviews.
CONCLUSIONS: The methodologic quality of systematic reviews needs to be improved. Although reporting clarity was much better than methodologic quality, it, too, has room for improvement. The methodologic quality and transparency of reporting did not vary much among clinical practice guidelines. The methods of this study can also be applied in other medical specialties to examine the quality of the studies used as evidence in their clinical practice guidelines.
ABBREVIATIONS:
- ADCA: Quality Improvement Guidelines for Adult Diagnostic Cervicocerebral Angiography
- AGREE: Appraisal of Guidelines for Research and Evaluation
- AMSTAR: A Measurement Tool to Assess Systematic Reviews
- CPG: clinical practice guideline
- CS: Clinical Expert Consensus Document on Carotid Stenting
- ECVAD: Guideline on the Management of Patients with Extracranial Carotid and Vertebral Artery Disease
- PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
- SIR: Society of Interventional Radiology
- SR: systematic review
Clinical practice guidelines (CPGs) are used by clinicians to provide patients with the most appropriate care. Under the Medicare Improvements for Patients and Providers Act of 2008, the Institute of Medicine created the Committee on Standards for Developing Trustworthy Clinical Practice Guidelines to ensure that CPGs contain “information on approaches that are objective, scientifically valid, and consistent.”1 In addition to meeting these standards, CPGs should incorporate high-quality studies, especially high-quality systematic reviews (SRs), when available.
In recent years, CPGs have been criticized for biased interpretations of research evidence, and interventional radiology is no exception. One study compared the American College of Cardiology/American Heart Association guidelines (developed in collaboration with the Society of Interventional Radiology [SIR])2 with 4 international guidelines for carotid stenosis treatment. Investigators expected recommendations across guidelines to be similar because they drew on the same literature3; however, considerable differences were found between the American College of Cardiology/American Heart Association guidelines and the 4 other guidelines in the recommendations for carotid artery stent placement and carotid endarterectomy. The investigators noted that the differences may have resulted from bias in interpreting the source literature, and they concluded that the American College of Cardiology/American Heart Association recommendations may be misleading or incorrect. A critical analysis of the underlying evidence in cases like this would help establish the strength and validity of guideline recommendations.
SRs synthesize results from similar studies to produce a pooled-effect estimate. The evidence presented in reviews provides clinicians a means to weigh the outcomes, safety, and efficacy of various procedures and to make evidence-based recommendations.4 SRs that include low-quality studies are subject to bias that may decrease the validity of the review and result in misleading conclusions.5 Some of these biases may stem from low methodologic quality, yet it has been established that guideline developers may not always take into account the methodologic quality of the SRs they reference.6–8 For example, publication bias (including only published studies in the SR) may influence the magnitude or direction of summary effect sizes, and language bias may result when only studies published in English are included in the SR.
Tools have been developed to evaluate methodologic quality and transparency in reporting of SRs. The Preferred Reporting Items for SRs and Meta-Analyses (PRISMA) checklist has been acknowledged for its use in critically appraising the reporting quality of SRs and meta-analyses even though it was originally developed for authors to improve the quality of their reviews.9 However, the quality of reporting does not necessarily equate to methodologic quality in SRs; this difference necessitates independent use of tools that assess both qualities.10
A Measurement Tool to Assess Systematic Reviews (AMSTAR) is an 11-item measure used to determine the methodologic quality of SRs.11,12 AMSTAR has been acknowledged as a valid and reliable tool with high interrater reliability, construct validity, and feasibility.13,14
Here, we evaluate the methodologic quality and transparency of reporting in SRs used as evidence in interventional radiology CPGs for neurovascular disorders from the SIR.
Materials and Methods
All search strategies, eligibility criteria, and data abstraction for this study were based on a protocol developed and piloted a priori. This study did not meet the regulatory definition of human subject research as defined in 45 CFR 46.102(d) and (f) of the Code of Federal Regulations of the Department of Health and Human Services15; therefore, it was not subject to institutional review board oversight. To adhere to best practices in reporting, we applied relevant PRISMA guidelines (checklist items 1–3, 5–11, 13, 16–18, 20, 23, 24, 26, 27)9 for SRs and “Statistical Analyses and Methods in the Published Literature” guidelines16 for reporting descriptive statistics. This study was registered on the University Hospital Medical Information Network Clinical Trial Registry (UMIN000023352) (https://upload.umin.ac.jp/cgi-open-bin/ctr_e/ctr_view.cgi?recptno=R000026909). Data and references for SRs from this study are publicly available on figshare (https://dx.doi.org/10.6084/m9.figshare.3502502.v1).
Guideline Selection and Inclusion Criteria
A priori, we used the definition for CPGs developed by the Institute of Medicine as “statements that include recommendations intended to optimize patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options.”1 We included all current interventional radiology guidelines involving neurovascular disorders published by the SIR (http://www.sirweb.org/clinical/svclines.shtml#2). Two investigators (A.B.C. and M.T.) last accessed these guidelines on July 20, 2016. A CPG was eligible if it was recognized by a national government or professional organization, contained recommendations for best practices within neurovascular interventional radiology, and was written in English. For CPGs with multiple versions, the most recent version was included. CPGs without reference lists were excluded.
Systematic Review Selection and Inclusion Criteria
All bibliographies of CPGs within the neurovascular section of the SIR CPG listings were examined for SRs. Two investigators (A.B.C. and M.T.) independently screened for studies identified as SRs or meta-analyses in the title. The selected SRs were then scrutinized for eligibility per the inclusion criteria, and disagreements regarding article inclusion were resolved by consensus. An SR was included under the following conditions: 1) it was reported as an SR, meta-analysis, or both; 2) it was reported in English; 3) it was peer-reviewed and published or currently in press; and 4) it performed a systematic search and synthesis of the available evidence.
Data Extraction and Quality Assessment
Before data abstraction, investigators were trained with detailed video tutorials. Three investigators (A.B.C., M.T., G.S.) independently abstracted data from eligible SRs with piloted forms. Following abstraction, each SR was validated by a second investigator, and disagreements were resolved by consensus. Study characteristics were obtained from each eligible SR, including the following: participant population, interventions, number of included studies, sample size across all studies, and study design of included studies. Sample size and study design were recorded as “unknown” if they were not reported or unclear. Investigators scored each SR by using the PRISMA checklist and AMSTAR tool and provided an explanation for each selected answer. Scoring was based on guidelines outlined by Liberati et al17 for the PRISMA checklist and the method described by Sharif et al,14 with recommended changes suggested by Burda et al18 for the AMSTAR tool.
PRISMA Checklist
We assessed the clarity of reporting in eligible SRs by using the PRISMA checklist. The assessment contains 27 items designed to evaluate reporting quality.9 Each checklist item was answered with “criteria met,” “criteria partially met,” or “criteria not met” on the basis of the completeness of reporting. Points were then awarded for each answer as follows: 1 point for “criteria met,” 0.5 points for “criteria partially met,” and 0 points for “criteria not met.” Specific items assessed with the PRISMA checklist are in On-line Table 1.
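In concrete terms, the point scheme works as in the short Python sketch below. This is our illustration, not the authors' code; the response strings simply mirror the 3 answer categories.

```python
# Illustrative sketch of the PRISMA point scheme described above
# (not the authors' code); response strings mirror the answer categories.
PRISMA_POINTS = {
    "criteria met": 1.0,
    "criteria partially met": 0.5,
    "criteria not met": 0.0,
}

def prisma_total(responses):
    """Sum points over the 27 checklist responses."""
    return sum(PRISMA_POINTS[r] for r in responses)

# Example: 20 items fully met and 4 partially met -> 22.0 of a possible 27
print(prisma_total(["criteria met"] * 20
                   + ["criteria partially met"] * 4
                   + ["criteria not met"] * 3))
```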
AMSTAR Tool
We used AMSTAR instead of R-AMSTAR (the revised version) because AMSTAR is more easily applied; moreover, R-AMSTAR has been criticized for inherent subjectivity and repetitiveness.19
We applied the revisions recommended by Burda et al18 to AMSTAR. These changes focus on improving validity, reliability, and usability in assessing methodologic quality and include changes to the order of items, the wording of items and instructions, and the focus of original items 7, 8, and 11. These recommendations also address aspects noted to be problematic in numerous studies and improve specificity to methodologic quality over the quality of reporting or risk of bias.18,19 However, we did not include the additional item described by Burda et al, because subgroup analyses are not applicable to all SRs and meta-analyses and the added item complicates scoring of the tool. Additional instructions were provided to investigators when the modified instructions were unclear.
Each item was answered with “criteria met,” “criteria partially met,” “criteria not met,” or “not applicable.” The answer “not applicable” was available only for item 10 (concerning publication bias) and was selected if the SR included fewer than 10 primary studies, because funnel plot methods lack the power to detect true asymmetry when the number of primary studies is <10. Points were then awarded for each answer as follows: 1 point for “criteria met” and 0 points for all other answers. Specific items assessed with the AMSTAR tool are in On-line Table 2.
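Expressed as code, the rule reads as in the following sketch (again ours, not the authors'); note how the “not applicable” answer shrinks the set of scorable items.

```python
# Illustrative sketch of the AMSTAR scoring rules above (our reading, not
# the authors' code): only "criteria met" earns a point, and item 10 may be
# "not applicable" when fewer than 10 primary studies are included.
def amstar_total(responses):
    """responses: the 11 item answers; returns (points, applicable items)."""
    applicable = [r for r in responses if r != "not applicable"]
    points = sum(1 for r in applicable if r == "criteria met")
    return points, len(applicable)

# Example: 4 items met, item 10 not applicable -> (4, 10)
answers = (["criteria met"] * 4 + ["criteria not met"] * 5
           + ["not applicable"] + ["criteria partially met"])
print(amstar_total(answers))
```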
Data Analysis
Total PRISMA and AMSTAR scores for each SR were recorded in separate data sheets. Scores were converted to percentages by dividing the total score by the number of applicable items. If items 16 and 23 of the PRISMA checklist returned a response of “no additional analyses were performed,” these items were omitted from the calculation and the score was divided by 25 instead of 27. Similarly, the AMSTAR score was divided by 10 instead of 11 if the SR was rated “not applicable” on item 10 (publication bias). The AMSTAR percentage score for each SR was used to classify its methodologic quality; we used this adjusted percentage scale rather than an integer scale to avoid inadvertently lowering the methodologic quality assessment of SRs for which publication bias assessment was not applicable.12 A percentage of 0%–33% was classified as low quality; 34%–66%, as moderate quality; and 67%–100%, as high quality.
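The percentage conversion and quality bands can be combined into a single helper, as in this hedged sketch; the band boundaries follow the text, and everything else is illustrative.

```python
# Sketch of the adjusted percentage scale and quality bands described above;
# the boundaries (0%-33% low, 34%-66% moderate, 67%-100% high) follow the
# text, and the function itself is our illustration.
def classify(points, n_applicable):
    pct = 100.0 * points / n_applicable  # denominator excludes N/A items
    if pct <= 33:
        return pct, "low quality"
    if pct <= 66:
        return pct, "moderate quality"
    return pct, "high quality"

# Example: 4 of 10 applicable AMSTAR items met -> (40.0, 'moderate quality')
print(classify(4, 10))
```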
SRs were grouped by the CPG bibliography in which they were found. Any CPG that cited >5 SRs was further analyzed, and per-item averages for PRISMA and AMSTAR were calculated. Unlike the per-SR percentage scores, the denominators for these per-item averages were not adjusted; items 16 and 23 of PRISMA and item 10 of AMSTAR were given scores of 0 for responses of “no additional analyses were performed” and “not applicable,” respectively.
Radial plots were created on the basis of the average score for individual PRISMA and AMSTAR items to clearly exhibit the strengths and weaknesses of each CPG. The radial plots were created by using Excel (Microsoft, Redmond, Washington).
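The published plots were built in Excel; for a scripted alternative, a closed radial (polar) plot of per-item averages might look like the following matplotlib sketch, in which the item labels and score values are placeholders rather than the study data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Radial plot of average per-item scores, one spoke per checklist item.
# Labels and values are placeholders, not the study data.
labels = ["Item %d" % i for i in range(1, 12)]   # e.g., the 11 AMSTAR items
scores = [0.9, 0.2, 0.5, 0.4, 0.1, 0.8, 0.3, 0.6, 0.4, 0.2, 0.5]

angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False)
angles = np.concatenate([angles, angles[:1]])    # repeat the first point
values = np.array(scores + scores[:1])           # to close the polygon

ax = plt.subplot(projection="polar")
ax.plot(angles, values)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels, fontsize=8)
ax.set_ylim(0, 1)  # a line near the perimeter means items were usually met
plt.show()
```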
Results
Our search of the reference lists of 15 SIR CPGs for neurovascular disorders yielded 91 SRs from 9 eligible CPGs (Fig 1). Seven reviews did not contain the terms “systematic review” or “meta-analysis” in the title. During full-text screening, we excluded 8 reviews: 5 were previous versions of Cochrane SRs that have since been updated, and 3 did not have the full text available. Characteristics of the 65 included reviews are shown in On-line Table 3. A list of CPGs and of the included and excluded reviews is available on figshare (https://dx.doi.org/10.6084/m9.figshare.3502502.v1).
Fig 1. PRISMA flow diagram.
A summary of PRISMA and AMSTAR percentage scores is given in On-line Table 4. AMSTAR percentage scores indicated that 3 reviews (4.62%) had high methodologic quality, 28 (43.08%) had moderate methodologic quality, and 34 (52.31%) had low methodologic quality, with an average quality score of 3.66 (average percentage score, 34.32%; minimum, 0%; maximum, 81.82%) (On-line Tables 4 and 5). Of the 13 SRs in the Quality Improvement Guidelines for Adult Diagnostic Cervicocerebral Angiography (ADCA),20 2 were high quality, 7 were moderate quality, and 4 were low quality (average score, 5.15; average percentage, 47.97%; minimum, 18.18%; maximum, 81.82%). Of the 35 SRs in the Guideline on the Management of Patients with Extracranial Carotid and Vertebral Artery Disease (ECVAD),2 none were high quality, 16 were moderate quality, and 19 were low quality (average score, 3.40; average percentage, 31.79%; minimum, 0%; maximum, 63.64%). Of the 14 SRs in the Clinical Expert Consensus Document on Carotid Stenting (CS),21 none were high quality, 6 were moderate quality, and 8 were low quality (average score, 3.21; average percentage, 30.13%; minimum, 9.09%; maximum, 50.00%). The average PRISMA score for all SRs was 18.18 (69.41%). PRISMA scores were strongly correlated with AMSTAR scores (r = 0.73).
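A coefficient like this can be computed in one line from the per-SR percentage scores. The sketch below assumes a Pearson coefficient (the text reports only r) and uses placeholder arrays rather than the study data, which are available on figshare.

```python
import numpy as np

# Pearson correlation between per-SR PRISMA and AMSTAR percentage scores.
# Placeholder values; the actual per-SR scores are on figshare.
prisma_pct = np.array([69.4, 81.5, 40.7, 74.1, 55.6])
amstar_pct = np.array([36.4, 63.6, 9.1, 45.5, 27.3])
r = np.corrcoef(prisma_pct, amstar_pct)[0, 1]
print(round(r, 2))
```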
The average score data for AMSTAR and PRISMA items per CPG are shown in Figs 2–7. For interpretation, lines near the perimeter indicate higher performance on that item; a perfect score on all items would form a circle around the perimeter of the plot.
Fig 2. Average PRISMA scores, ADCA. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items. SOE indicates Summary of Evidence; AA, Additional Analyses; ROB, Risk of Bias; SOR, Synthesis of Results; IS, Individual Studies; SC, Study Characteristics; PB, Publication Bias; Obj, Objectives; SS, Study Selection; Conc, Conclusions.
Fig 3. Average AMSTAR scores, ADCA. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items. COI indicates conflicts of interest; Lit, literature; Pub., publication; Charac., Characteristics; Comp, Comprehensive.
Fig 4. Average PRISMA scores, ECVAD. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items.
Fig 5. Average AMSTAR scores, ECVAD. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items.
Fig 6. Average PRISMA scores, CS. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items.
Fig 7. Average AMSTAR scores, CS. Refer to On-line Tables 1 and 2 for PRISMA and AMSTAR items.
Discussion
Main Findings
More than half of the SRs received a low methodologic quality score, and <5% received a high score. These scores indicate a deficiency in the methodologic quality of the SRs cited in the SIR CPGs. Furthermore, some SRs cited as evidence for specific recommendations in a CPG received AMSTAR scores as low as 0% (On-line Table 4). PRISMA scores were higher for all SRs, with an average percentage score of 69.41%, indicating that SRs reported information more completely than their methodologic quality would suggest. The radial plots reveal some consistency among items scored with the AMSTAR criteria. Items 2 (comprehensive literature search) and 6 (inclusion of study characteristics) were the most consistently met items. Items 9 (data synthesis) and 10 (publication bias assessment) met the criteria more often in the ADCA than in the other CPGs. The solid line tends to lie close to the center of the AMSTAR radial plot for all CPGs, indicating poor methodologic quality; this is not surprising given that the average AMSTAR score for all SRs was 34.32%, at the low end of moderate quality on our scale. The PRISMA radial plots are much more varied. The only items consistently near the center of the plots are item 5 (review protocol) and item 27 (sources of funding). On average, the ADCA had a solid line farther from the center than the other CPGs, implying greater transparency in reporting. Most notably, the PRISMA radial plots for the ECVAD and the CS are nearly identical, which may be because the 2 guidelines share 8 of the same SRs.
Similar to our findings, the quality of SRs has been found to be low to moderate in fields such as orthopedics,22 urology,23 pulmonology,24,25 gastroenterology,26 neurology,27 psychiatry,10 gynecology,28 and orthodontics.29 Common deficits that lowered quality scores were failure to assess publication bias, failure to declare conflicts of interest, and failure to provide an a priori protocol.
Implications of Results
To our knowledge, this is the first study to review the quality of reporting and the methodologic quality of SRs that support CPGs in radiology. Studies have reviewed the quality of SRs in various fields of medicine, but few have examined how the quality of the reviews affects the quality of the CPGs they support. Our review indicates that the quality of SRs is not taken into consideration in the development of CPGs in radiology; guideline developers may not acknowledge the importance of the quality of the SRs that support their recommendations.
In addition to PRISMA and AMSTAR, a variety of tools evaluate the quality of, or risk of bias in, CPGs and SRs. Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) is another systematic approach; it separates the assessment of the quality of evidence from the making of recommendations and notes that the quality of evidence is not the only factor influencing the strength of recommendations.30 The Appraisal of Guidelines for Research and Evaluation instruments (AGREE and AGREE II) may be applied to evaluate the quality of CPGs.31 The newly developed Risk of Bias in Systematic Reviews (ROBIS) tool may be used for reviews involving interventions, etiology, diagnosis, and prognosis.32 Composite use of these tools can help guideline developers and can give their target audiences a means of ensuring that recommendations rest on sufficient evidence from high-quality SRs. Thorough use of these tools can be time-intensive, but a general familiarity with them will help practitioners assess the strength of evidence in guidelines.
Guideline recommendations may adhere to the best available evidence regardless of the year that evidence was published; however, new research and advances in imaging techniques will likely require periodic updates to recommendations.33 Brook et al34 evaluated the validity of the 2010 American College of Radiology guidelines for incidental pancreatic findings on CT after the 2016 guideline from the American Gastroenterological Association was released and concluded that the American College of Radiology recommendations needed to be re-evaluated.34 In addition, a review of CPGs from the US Agency for Healthcare Research and Quality found that more than three-quarters of the guidelines needed updating and suggested that guidelines be reassessed every 3 years.35 Furthermore, institutions that develop guidelines should incorporate a protocol to improve their guideline-updating process.36 The SIR guidelines we evaluated may require an update or, at minimum, a statement explaining their continued validity. Of the 15 neurovascular disorder guideline documents we evaluated, 11 were more than 6 years old, and 4 of those were published in 2003. Moreover, 3 of the 15 documents, published in 2013,37 2011,2 and 2007,21 stated that revisions would be made as needed, but there is no indication of any available updates. For practitioners to continue to act in the best interest of their patients, organizations must remain vigilant to ensure that the CPGs they produce stay up-to-date.
Strengths and Limitations
Several features of our study design strengthen the evidence supporting our conclusions. The criteria provided by PRISMA and AMSTAR are designed to cover standard methods of reporting as well as methodologic study design in their assessment of an SR, so articles written before the inception of these tools should, in principle, still meet most of the criteria. Another strength is that 3 independent investigators had to reach consensus on quality scoring, which ensured consistent rating of each SR. Finally, the methods of our study are highly reproducible and can be applied to other sources of CPGs.
Our study has limitations. We were unable to obtain the full text of 8 reviews; these reviews would have increased our sample size and improved our analysis of overall quality. Our study also used only PRISMA and AMSTAR to assess transparency of reporting and methodologic quality; different quality assessment tools could have yielded different results. Additionally, we searched only 15 CPGs and limited our scope to SIR guidelines for neurovascular disorders. Future research is warranted to broaden the scope of this study to guidelines from other areas of radiology.
Implications for Research Practice
To improve the methodologic quality of future research, authors can follow, and explicitly report adherence to, the PRISMA and AMSTAR guidelines; prospectively register the SR in a database such as PROSPERO (https://www.crd.york.ac.uk/PROSPERO/); and develop a priori protocols. Journals can help ensure that the SRs they publish are of higher methodologic quality by updating their submission guidelines to require, or at minimum recommend, that authors submit a PRISMA checklist at the time of submission. Peer reviewers may use the PRISMA checklist when completing their reviews and recommend revision of components that are not adequately addressed. These efforts would be positive first steps toward improving the quality of SRs.
Conclusions
The methodologic quality of SRs needs to be improved. Although reporting clarity was much better than methodologic quality, it, too, has room for improvement. The methodologic quality and transparency of reporting did not vary much among CPGs. The methods of this study can also be applied in other medical specialties to examine the quality of the studies used as evidence in their CPGs.
Received August 2, 2016; accepted after revision November 22, 2016.
© 2017 by American Journal of Neuroradiology