Back to the Tower of Babel: Comparing Outcomes from Aneurysm Trials

Most therapies within the field of interventional neuroradiology lack robust evidence. While several schemas describing the various tiers of "evidence" exist, most reserve the term "level 1 evidence" for that obtained in the context of randomized controlled trials (RCTs). Some scales reserve level 1 for therapies vetted in multiple RCTs and then subjected to formal meta-analysis.
During the past several years, numerous RCTs have been performed comparing bare platinum coils with "modified coils," including the HydroCoil Endovascular Aneurysm Occlusion and Packing Study (HELPS), 1 the Cerecyte Coil Trial (CCT), [2][3][4] and the Matrix and Platinum Science (MAPS) trial. 5 These relatively large studies with clearly defined, prospective end points would seem perfectly aligned to provide our field with this latter type of level 1 evidence, especially if pooled data were analyzed in a formal meta-analysis. Alas, the design and reporting of these studies likely will render it difficult or impossible to carry out such an analysis.

Questioning the Research Questions
Each of the 3 modified coil RCTs mentioned above compared the efficacy of new technologies (modified coils) with a similar control group (bare platinum coils) in the treatment of the same disease (ruptured and unruptured intracranial aneurysms). The most relevant question to be answered in these well-conducted RCTs is "What were the primary outcomes of the trials and were there any differences in outcomes between the treatment and control groups?" When examining the results of CCT, HELPS, and MAPS, we would expect that the primary outcomes of the studies were the same or at least similar. Disappointingly, however, this was not the case (Table).
In HELPS, 1 the primary outcome was composite in nature, meaning that either one or another outcome would define success or failure. The first portion of this composite end point was defined as a "major angiographic recurrence" at 18 months. A recurrence was considered major if the aneurysm could "theoretically" be re-treated. The second portion of this composite end point was related to deaths and morbidity that resulted in failure to obtain follow-up. Major angiographic recurrence did not necessarily mean that an aneurysm was re-treated; indeed, actual retreatment rates were approximately one-tenth of the "theoretically re-treatable" rates in both groups. In HELPS, the primary outcome was met in 36% of the control group and 29% of the modified coil group (P = .13). The rate of procedure-related morbidity and mortality resulting in no angiographic follow-up was minimal and did not differ significantly between the 2 groups; as such, the imaging findings represented the major driver of outcomes. The rate of "major angiographic recurrence" was slightly lower in the HydroCoil group than in the control group, 24% versus 34%, respectively (P = .049). Overall, the re-treatment rate for both groups was 3%, with no statistically significant difference between them. Notably, HELPS did demonstrate a difference in composite outcome for ruptured intracranial aneurysms treated with HydroCoil over bare platinum coils.
In MAPS, 5 the primary end point was "target aneurysm recurrence" (TAR) at 12 months. This composite outcome comprised target aneurysm re-intervention, aneurysm rebleeding, or death from unknown cause. Outcomes for the primary end point were similar between groups (14.6% for control, 13.3% for treatment). The re-intervention rate for both groups was approximately 10%, and rupture and death rates were similarly very low. Angiographic occlusion rates, a secondary outcome, were also similar between the 2 groups. However, the occlusion score in MAPS was fundamentally different from that in HELPS. Specifically, MAPS used the modified (3-point) Raymond scale (complete, near-complete, or incomplete occlusion) comparing initial with follow-up angiography rather than the "major recurrence" scale used in HELPS. As such, there likely is no relevance in comparing the angiographic outcomes between HELPS and MAPS. Most important, however, there was a strong association between the initial modified Raymond scale and TAR rates, indicating that angiographic occlusion was a clinically relevant finding.
Last, looking at CCT, 2-4 we find that a "positive" primary outcome was the following: 1) complete angiographic occlusion at 6 months, or 2) aneurysm improvement, or 3) no change in angiographic appearance. The use of the modified Raymond scale or the "major" recurrence terminology was not part of these primary outcomes. The rate of the primary outcome between the 2 groups was similar (54% versus 59% for control and Cerecyte [Micrus Endovascular, San Jose, California] groups respectively, P = .17). Retreatment rate was 7.7% in the Cerecyte group and 3.5% in the bare platinum group (P = .064), and clinical outcomes were also similar.
Thus, CCT is directly comparable with neither MAPS nor HELPS.
What are the important questions we need to ask when assessing differences in aneurysm treatment modalities or in comparing outcomes among these 3 important trials? The design of HELPS suggests that aneurysm recurrence with or without retreatment is the most important clinical consideration one must make when deciding how to treat an aneurysm. The outcomes chosen for MAPS indicate that the primary outcomes one must consider are the rates of reintervention. Finally, in CCT, the primary outcome considered is the angiographic appearance of the treated aneurysm relative to the initial postprocedural angiography. These 3 different questions in 3 (apparently) similar trials are bound to create only confusion when we evaluate the evidence for guiding our clinical practice.
It is difficult to determine which of these trials was "successful" and which was not. A cursory examination of the numerous clinical trials in intracranial aneurysm treatment suggests that more confusion is on the horizon. When examining the primary outcomes of a number of trials that are currently being conducted, we see that primary outcomes vary widely (Table). Some ongoing trials use composite angiographic and clinical outcomes, and some studies use angiographic outcomes such as occlusion rates, recurrence rates, and the like. Duration of follow-up varies widely among studies. With this lack of conformity in the design of clinical trials, it appears as though we are losing an opportunity to present the medical community with high levels of evidence that our treatments are effective. Given the generally high success rates of endovascular intracranial aneurysm treatment and the low rates of recurrence, morbidity, and mortality resulting from these treatments, very large studies are needed to demonstrate any significant difference. Because it is so difficult to fund and enroll a sufficient number of subjects into such studies, ultimately, our field may depend on meta-analyses of these RCTs to produce the highest levels of evidence. With the increasing variation in measured outcomes of present and future aneurysm trials, it appears that we may be missing our chance to conduct such meta-analyses.
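The claim that very large studies are needed can be made concrete with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two independent proportions; it is not how any of these trials was actually powered. The two-sided alpha of 0.05 and power of 80% are assumptions, as is treating the reported event rates as the true underlying rates.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per arm to detect p1 vs p2.

    Normal-approximation formula for two independent proportions,
    two-sided test. alpha and power are illustrative assumptions.
    """
    if p1 == p2:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of binomial variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Event rates quoted in the text, treated here as if they were the true rates.
maps_n = n_per_group(0.146, 0.133)   # MAPS primary end point, control vs treatment
helps_n = n_per_group(0.36, 0.29)    # HELPS primary outcome, control vs modified coil
print(f"MAPS-sized difference:  {maps_n} per arm")
print(f"HELPS-sized difference: {helps_n} per arm")
```

Under these assumptions, a difference as small as the one reported for the MAPS primary end point would require on the order of 11,000 subjects per arm, whereas a HELPS-sized difference would require roughly 700 per arm. This illustrates why pooling comparable trials matters when individual studies cannot reach such numbers.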

Homogeneity in Angiographic Outcomes
All 3 studies used different scales to define angiographic outcomes, rendering it difficult to assess these outcomes across trials. Future studies also present variable means of assessing angiographic outcomes. This lack of homogeneity is a problem chiefly because it will make outcomes difficult to compare in future meta-analyses. To address the issue of angiographic occlusion, multiple radiologic, neurologic, and neurosurgical societies produced a consensus statement offering recommendations for the conduct of future RCTs. 6 This consensus statement suggested that the obliteration rate be assessed by using the following formula: Obliteration Rate (%) = (Number of Pixels in Embolic Mass) / (Number of Pixels in Embolic Mass + Number of Pixels in Neck Remnant). The obliteration rate is assessed by using one 2D angiographic projection of the embolized intracranial aneurysm; then, the aneurysm is graded on a 0-5 scale based on the percentage obliteration rate. The main problem arising from using this new measurement is that data regarding intra- and interobserver variability are sparse; thus, it has not been validated. By contrast, intra- and interobserver agreement data for other scales are widely available. Furthermore, the use of one angiographic projection does present problems because we are essentially assessing the 3D property of aneurysm occlusion by using one 2D image.
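As a minimal sketch, the obliteration-rate formula can be written as a short function. The function and parameter names are hypothetical, and multiplying the ratio by 100 to express it as a percentage is an assumption based on the "(%)" in the formula. The cutoffs for the 0-5 grade are not given here, so the grading step is deliberately left out.

```python
def obliteration_rate(embolic_pixels: int, neck_remnant_pixels: int) -> float:
    """Percentage of the aneurysm occupied by embolic mass on one 2D projection.

    Ratio of embolic-mass pixels to embolic-mass plus neck-remnant pixels,
    scaled by 100 (an assumption) to give a percentage.
    """
    if embolic_pixels < 0 or neck_remnant_pixels < 0:
        raise ValueError("pixel counts must be non-negative")
    total = embolic_pixels + neck_remnant_pixels
    if total == 0:
        raise ValueError("no pixels were measured")
    return 100.0 * embolic_pixels / total


# Hypothetical pixel counts from a single projection.
print(obliteration_rate(900, 100))  # 90.0
```

Because the inputs come from a single projection, two readers outlining the same aneurysm on different views could obtain different pixel counts and therefore different rates, which is the 2D-versus-3D concern raised above.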

Core Labs
These 3 RCTs all used core labs to assess angiographic outcomes. These core labs were independent laboratories where angiographic images were reviewed and angiographic occlusion was assessed. The ultimate objective in using core lab analysis is to ensure limited variability in the assessment of outcomes so that data are robust enough to support study findings. All 3 studies emphasized the fact that the neuroradiologists at the core labs were blind to treatment assignment. However, a number of questions should be asked regarding the core labs. First, how many assessors were at each core lab? How many years' experience do they have? How many assessors assessed each image? Was intra- and interobserver agreement assessed in the grading of angiographic occlusion? Angiographically, is there a difference in appearance between a modified coil and a bare platinum coil? Would any of the assessors be able to detect a difference in the 2 coils? Of what quality were the images provided to the core labs? These types of questions are important to minimize or clarify potential biases in the study.
Core lab validation and development of "best practices" are essential to the conduct of future RCTs. With more and more trials being conducted by using a core lab for analysis of images, it is extremely important that certain quality-control measures be applied. Industry and academic leaders in the field should provide a consensus guiding quality control in core laboratories assessing outcomes in clinical trials. The American Society of Echocardiography recently published its standards for echocardiography core laboratories in clinical trials. 7 From a quality assurance perspective, the statement proposes many relevant quality-control measures that can be applied to interventional neuroradiology. Important concepts discussed in this article include the importance of assessing inter- and intraobserver variability within and between core labs. Identifying sources of inter- and intraobserver variability is essential so that these may be addressed to improve the consistency and accuracy of data. Furthermore, trials need to make available training records and levels of experience for core lab employees. Auditing of data obtained from core labs is especially important, whether performed by independent observers or regulatory bodies.
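One common way to quantify interobserver agreement on a categorical occlusion scale is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the text does not specify which agreement statistic a core lab should report, and the reader gradings are invented for the example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b) -> float:
    """Cohen's kappa for two readers grading the same cases categorically."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both readers must grade the same non-empty set of cases")
    n = len(ratings_a)
    # Proportion of cases on which the two readers gave the same grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reader's marginal grade frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both readers used a single identical grade throughout
    return (observed - expected) / (1 - expected)


# Hypothetical gradings on a 3-point scale (not data from any trial).
reader_1 = ["complete", "complete", "near-complete", "incomplete", "near-complete"]
reader_2 = ["complete", "near-complete", "near-complete", "incomplete", "complete"]
print(round(cohen_kappa(reader_1, reader_2), 3))
```

The same function applies to intraobserver agreement if the two lists hold one reader's gradings of the same images on two separate occasions.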

Future Directions
With the rapid industrial and academic growth of interventional neuroradiology, it is increasingly important that industry and academia actively collaborate to produce high-level evidence validating various treatment modalities. Several new devices are being placed on the market every year, and clinicians are in dire need of evidence that these devices are effective in the treatment of disease. It is especially important in the design of clinical trials that outcomes be standardized so as to be clinically relevant and allow comparison across trials. In the cardiology literature, objective imaging measurements that have real clinical consequences are often used as outcomes (ie, percentage stenosis, ejection fraction, and so forth). There is a general consensus as to how these measurements should be made, and it is easy to compare outcomes across trials. In neuroradiology, however, this is not yet the case. Ultimately, standardization of measurements and outcomes across trials is necessary to produce the highest levels of evidence needed to justify our practice. This is an opportunity that must not be missed.