Cementing the Evidence: Time for a Randomized Trial of Vertebroplasty

Jeffrey G. Jarvik; Richard A. Deyo

Percutaneous vertebroplasty is a technique for treating low back pain that appears to be rapidly disseminating throughout the United States, and now O’Brien et al (page 1555) add their series of six patients to the literature in this issue of the AJNR. Yet, there are still no randomized, controlled trials that compare the long-term outcomes of percutaneous vertebroplasty to a control therapy.

Is it too late for a randomized trial? At last year's meeting of the American Society of Neuroradiology, it was suggested that a randomized trial for vertebroplasty would be unethical because patients would be denied the obvious benefit derived from the technique. In 1997, the World Medical Association issued the Declaration of Helsinki, which contained recommendations for physicians using human subjects in medical research (1). This declaration states, “In any medical study, every patient, including those of a control group, if any, should be assured of the best proven diagnostic and therapeutic method. This does not exclude the use of inert placebo in studies where no proven diagnostic or therapeutic method exists.” But this raises the question: what constitutes necessary and sufficient evidence to prove the efficacy of a therapy? After all, it is as wrong to advocate the use of a new therapy that has not been shown to be more beneficial than standard treatment as it is to withhold an unproven therapy. Relying on anecdotal case reports and case series may lead to erroneous and harmful conclusions.

The power of modern Western medicine is derived in great part from its close alliance with the world of science (2), which uses the scientific method to distinguish what is useful from what is not. Sir William Osler said, “The philosophies of one age have become the absurdities of the next…” (3). The history of medicine is littered with examples of treatments that went unquestioned, yet now provoke amusement or even horror. In the 16th century, surgeons treated gunshot wounds by pouring boiling oil over them (4), until Ambroise Paré ran out of oil during an assault on Turin in 1537. He improvised an emulsion of eggs, rosewater, and turpentine and discovered that its use caused less swelling, that the patients suffered less, and that fewer patients died than when he had treated them with boiling oil. In the 19th century, the medical profession gained tremendous authority by adopting the scientific method to determine the value of medical practices. Empirical evidence showed that commonly accepted practices such as bloodletting had no therapeutic value (2). Simultaneously, emerging disciplines such as bacteriology and epidemiology began to benefit the health of the public in an indisputable and quite visible manner. The marriage between medicine and science was proving to be a great success.

As history has revealed, simple conviction that a treatment works can be horribly misleading. Such mistakes are not confined to prior centuries. Several reviews of modern medical practice illustrate treatments that were accepted as standard and beneficial, but were found to be useless or even harmful when evaluated by a randomized trial. Although the use of empirical evidence to justify medical practice is a powerful principle, the quality of evidence for making medical decisions has been and continues to be highly variable. Using poorly acquired or incomplete evidence can result in disastrous decisions, and one does not have to look far back in history to find troubling incidents.

Perhaps the most notorious recent example is that of the antiarrhythmics encainide, flecainide, and moricizine. Their story is chronicled in the book Deadly Medicine by Thomas Moore (5). In the early 1980s these newly introduced antiarrhythmics were found to be highly successful at suppressing arrhythmias. On this basis, these drugs were widely promoted and commonly prescribed. Only when a randomized, controlled trial examined the ultimate outcomes of patients was it realized that, although these drugs suppressed arrhythmias, they actually increased mortality. The Cardiac Arrhythmia Suppression Trial revealed that post-myocardial infarction patients with mild arrhythmias who were put on these drugs had an excess mortality of 56/1000. By the time the results of this trial were published, at least 100,000 such patients had been taking these drugs (5), meaning that about 5600 people (56/1000 × 100,000) had died per year because of these agents.

Although there are numerous factors responsible for this medical mishap, the most important lesson that applies to vertebroplasty is that it is dangerous to rely on surrogate outcomes when assessing the benefit of a medical intervention. Proponents of these antiarrhythmics based their favorable impressions on the ability of the drugs to suppress arrhythmias. That arrhythmia suppression would decrease mortality was a reasonable biological hypothesis, but it proved erroneous in the end. Similarly, it seems reasonable that preventing further vertebral body collapse or even possibly restoring height might reduce the pain associated with osteoporotic compression fractures, but this remains to be proven. Fleming pointed out that surrogate outcomes frequently fail to predict the clinical outcome of interest (6). More importantly, the intervention “might also affect the clinical outcome by unintended, unanticipated, and unrecognized mechanisms of action that operate independently of the disease process.” Fleming concluded that, except for rare circumstances, surrogate outcomes should be avoided for definitive phase three trials.

While short-term pain relief augurs well for long-term benefits, no well-controlled study has shown even this short-term benefit. There is the distinct possibility that these short-term benefits will not last, and that in the long run, patients who undergo vertebroplasty might do no better or even worse than a control cohort.

Are case series adequate evidence to form a conclusive opinion? Although they are valuable for providing preliminary evidence, case series are rarely sufficient for making major medical policy decisions. There are several reasons why case series may be misleading when studying low back pain treatments (7). First, the natural history of acute low back pain in general, and the pain associated with osteoporotic compression fractures specifically, is to improve, usually regardless of the type of therapy. Part of this improvement reflects “regression to the mean.” This is a statistical concept that emphasizes that extreme values at one measurement of a variable tend to move back toward the mean value when measured again. Patients with back pain tend to seek care when their pain is extreme. Regression to the mean implies that when such patients are seen on a follow-up visit, their pain will, on average, have improved (regressed toward some average level), regardless of interventions. Second, because case series do not have a control group, the placebo effect may play a role in improvement. This effect applies not only to the technique of vertebroplasty, but also to the powerful influence of the enthusiasm and conviction of the physician performing vertebroplasty (8).
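The regression-to-the-mean effect described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not an analysis of any real patient data: the function name, the pain scale, the care-seeking threshold, and all distribution parameters are hypothetical choices made for illustration. Each simulated patient has a stable "usual" pain level plus random day-to-day fluctuation, seeks care only when pain on a given day is extreme, and receives no treatment before follow-up.

```python
import random
import statistics


def simulate_regression_to_mean(n_patients=10_000, care_threshold=7.0, seed=0):
    """Simulate UNTREATED patients whose pain fluctuates around a personal average.

    All numbers are hypothetical: pain is on a notional 0-10 scale
    (values are not clamped, for simplicity).
    Returns (mean pain at first visit, mean pain at follow-up, number of care seekers).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    first_visit = []
    follow_up = []
    for _ in range(n_patients):
        usual_pain = rng.gauss(5.0, 1.5)               # assumed long-run average pain
        pain_today = usual_pain + rng.gauss(0.0, 2.0)  # assumed day-to-day fluctuation
        if pain_today >= care_threshold:               # care is sought only when pain is extreme
            # No intervention: follow-up pain is a fresh draw around the same average.
            pain_later = usual_pain + rng.gauss(0.0, 2.0)
            first_visit.append(pain_today)
            follow_up.append(pain_later)
    return statistics.mean(first_visit), statistics.mean(follow_up), len(first_visit)


if __name__ == "__main__":
    first, later, n = simulate_regression_to_mean()
    print(f"{n} care seekers: mean pain {first:.1f} at first visit, {later:.1f} at follow-up")
```

Because patients are selected at a moment of unusually high pain, the average follow-up score in this sketch is lower than the average first-visit score even though nothing was done to them. An uncontrolled case series would record that drop as apparent improvement, which is why a concurrent control group is needed to separate a treatment effect from this statistical artifact.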

The ethical basis for conducting randomized trials relies on the uncertainty as to whether the intervention will be beneficial or harmful. If there is no uncertainty, then there is no need for a trial. If uncertainty does exist, however, then not only is it ethical to perform a trial, but it is necessary to choose the methodology most likely to eliminate the uncertainty. Proponents of a new technology that has been disseminated before it has been rigorously evaluated commonly argue that scientific evaluation would be unethical. In the example of antiarrhythmics cited above, many proponents thought that controlled trials would be unethical because these drugs were effective at suppressing arrhythmias (5). Dixon pointed out that this kind of specious argument is predictable and standard (9). He argued that social forces are more important than scientific forces in determining clinical policies, and that characteristic errors occur in the formation of these policies. One of these errors is the defense of unproven, prematurely disseminated technologies with the argument that it is unethical to stop and rigorously evaluate them.

Nonetheless, ethics insists that we do stop. Vertebroplasty may well be an effective and even cost-effective method for treating low back pain. If the technique is as good as its promoters suggest, then it should be straightforward to demonstrate its efficacy in a well-designed, controlled trial. Whereas reports such as the one by O’Brien et al add to our knowledge of how vertebroplasty can be performed, such articles cannot address whether and when vertebroplasty should be done. The time is right to demonstrate the technique's advantages and convince the scientific community, as well as the public, of its worth.

References

1. World Medical Association Declaration of Helsinki. Recommendations guiding physicians in biomedical research involving human subjects [see comments]. JAMA 1997;277:925-926
2. Starr P. The Social Transformation of American Medicine. New York: Basic Books; 1982
3. Osler W. Aequanimitas: With Other Addresses to Medical Students, Nurses and Practitioners of Medicine. Philadelphia: Blakiston; 1905
4. Gordon R. The Alarming History of Medicine: Amusing Anecdotes from Hippocrates to Heart Transplants. New York: St. Martin's Press; 1993
5. Moore TJ. Deadly Medicine: Why Tens of Thousands of Heart Patients Died in America's Worst Drug Disaster. New York: Simon & Schuster; 1995
6. Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? [see comments]. Ann Intern Med 1996;125:605-613
7. Pocock SJ. Current issues in the design and interpretation of clinical trials. Br Med J (Clin Res Ed) 1985;290:39-42
8. Turner JA, Deyo RA, Loeser JD, Von Korff M, Fordyce WE. The importance of placebo effects in pain treatment and research [see comments]. JAMA 1994;271:1609-1614
9. Dixon AS. The evolution of clinical policies. Med Care 1990;28:201-220