Lumbar Puncture Test in Normal Pressure Hydrocephalus: Does the Volume of CSF Removed Affect the Response to Tap?

BACKGROUND AND PURPOSE: There is limited evidence to support the use of high-volume lumbar taps over lower-volume taps in the diagnosis of normal pressure hydrocephalus. The purpose of this study is to detect whether the volume of CSF removed from patients undergoing high-volume diagnostic lumbar tap test for normal pressure hydrocephalus is significantly associated with post–lumbar tap gait performance. MATERIALS AND METHODS: This retrospective study included 249 consecutive patients who underwent evaluation for normal pressure hydrocephalus. The patients were analyzed both in their entirety and as subgroups that showed robust response to the lumbar tap test. The volume of CSF removed was treated as both a continuous variable and a discrete variable. Statistical tests were repeated with log-normalized volumes. RESULTS: This study found no evidence of a relationship between the volume of CSF removed during the lumbar tap test and subsequent gait test performance in the patient population (Pearson coefficient r = 0.049–0.129). Log normalization of the volume of CSF removed and controlling for age and sex failed to yield a significant relationship. Subgroup analyses focusing on patients who showed greater than 20% improvement in any of the gait end points or who were deemed sufficiently responsive clinically to warrant surgery also yielded no significant relationships between the volume of CSF removed and gait outcomes, but there were preliminary findings that patients who underwent tap with larger-gauge needles had better postprocedure ambulation among patients who showed greater than 20% improvement in immediate time score (P = .04, n = 62). CONCLUSIONS: We found no evidence to support that a higher volume of CSF removal impacts gait testing, suggesting that a high volume of CSF removal may not be necessary in a diagnostic lumbar tap test.

Idiopathic normal pressure hydrocephalus (NPH) is a debilitating disorder characterized by a triad of gait disturbance, cognitive impairment, and urinary incontinence. 1,2 Patients correctly diagnosed with NPH who undergo ventricular shunt placement may have marked improvement of their symptoms. 1 Selection of patients who will undergo ventricular shunting may be challenging. When based solely on patient history, basic neuroimaging with MR imaging or CT, and neurologic testing, the positive predictive value of patient improvement after shunting ranges from less than 50% to 61%. 3 The diagnosis of NPH is characterized by several neuroimaging features, including ventricular dilation in the presence of normal gray matter volume, but despite a very large number of publications since 1964, the role of neuroimaging in predicting the response to shunting remains uncertain. 4,5 Tarnaris et al 6 performed a literature review of 69 studies published between 1980 and 2006 to examine the role of structural as well as functional imaging in providing biomarkers of favorable surgical outcome in NPH. The papers reviewed included studies of structural CT and MR imaging features; 4,6-9 phase-contrast MR imaging studies of aqueductal CSF velocity and stroke volumes; 10-12 and functional studies including xenon-enhanced CT, 13 FDG-PET, 14,15 single-photon emission CT, 16 and MR imaging spectroscopy. 17 Studies showing the value of the individual techniques are often countered by studies showing contradictory results. The authors concluded that at present, no single imaging technique may assist clinicians in selecting patients for shunt placement and that invasive studies will remain the mainstay of the process of selecting patients for CSF diversion procedures. 6 The high-volume lumbar tap test (LTT) is one of the most widely used invasive tests to predict shunt response.
It consists of a lumbar puncture wherein a large volume (typically 40-50 mL) of CSF is removed, with gait testing occurring before, 1-4 hours after, and 24 hours after the LTT. Transient recovery in gait after the LTT has been considered a positive prognostic indicator for surgery, but no response after the LTT warrants further investigation. 18-24 A 2016 review by Mihalj et al 25 found the LTT to have an average published sensitivity of 58%, specificity of 75%, and accuracy of 62% in predicting positive response to shunt over 8 included studies. Complications that may compromise the effectiveness of high-volume LTT include headache and pain that may be pronounced enough to compromise gait testing. 21 External lumbar drainage testing requires hospital admission and placement of a lumbar intrathecal catheter for CSF drainage at 10 mL per hour for 72 hours. Studies evaluating external lumbar drainage routinely exclude patients who responded to the LTT, making it difficult to directly compare external lumbar drainage and LTT. 24 External lumbar drainage has a reported sensitivity between 60% and 100% and a specificity between 80% and 100%. 26 There are several protocols for invasive testing of Ro, the impedance of CSF flow by absorption pathways, to predict shunt response. Given the heterogeneity of techniques and limited studies, sensitivity ranges from 58%-100% and specificity ranges from 44%-92% between protocols and research series. 24,27,28 Complication rates similarly vary, with up to 20% in 1 series of 107 patients reporting headaches after fluid infusion for impedance testing. 24,29 Given the relative ease of administration and low incidence of complications, the LTT is routinely performed as the first test to determine whether a patient will respond to shunting. Those patients who do not respond to the LTT can progress to external lumbar drainage for further evaluation. 26
Although the LTT is commonly performed with large-volume CSF removal, to our knowledge, no study has addressed the optimal amount of CSF that must be removed to obtain an accurate test. In contrast to the high-volume LTT used today, Hakim and Adams, 1 in their original description of NPH, noted improvement the next day in patients after an LTT removing only 10-15 mL of CSF. In 1982, Wikkelsø et al 30 proposed the drainage of 50 mL of CSF, but subsequent studies did not corroborate that this was an optimum volume. We examined the relationship between the volume of CSF removed and change in gait patterns among patients with clinically diagnosed NPH.

Institutional Review Board Status
This anonymous, retrospective, single-center study was exempt from institutional review board approval and patient informed consent.

Patients and Protocol
We analyzed 249 consecutive patients with a clinical diagnosis of NPH who were treated by a standard protocol at an academic center that specializes in brain aging, between 2000 and 2013.
Patients were referred from a variety of academic and outside clinical sources to an academic specialty clinic that has focused on NPH for more than 20 years. Patients were referred for evaluation because of symptoms related to NPH, especially gait impairment with associated urinary incontinence or gait impairment with associated enlarged ventricles on MR or CT studies concerning for hydrocephalus. Patients received detailed neurologic and medical evaluation by a neurologist who specialized in NPH diagnosis and management. Patients were referred for an LTT if the clinical judgment suggested NPH to determine whether the spinal tap resulted in gait improvement. Image analysis was performed by neuroradiologists who were asked to evaluate for NPH and imaging evidence of other etiologies of dementia and were not blinded to patient history. Lumbar taps were performed by attending neuroradiologists or neuroradiology fellows under direct supervision. All taps were performed under fluoroscopic guidance. The neuroradiologists performing the LTTs were instructed to remove up to 50 mL of CSF and had no stated minimum volume of CSF to remove; the designated spinal needle was 18G as per procedure protocol. In a subset of cases, a 20G needle was used as per the performing neuroradiologist's preference. The volume of CSF removed was determined by summing the fluid in each of the vials used to collect the fluid; the vials were calibrated with volume markers 1-8 mL.
The volume of CSF removed was measured during the procedure and recorded in the clinical notes. We excluded patients who required multiple attempts to obtain a lumbar tap, including those patients who required 2 or more appointments because of unsuccessful taps and those patients who required multiple attempts within the same appointment. The number of attempted taps was based on postprocedure medical notes. Clinical diagnosis was based on the presence of ventriculomegaly on imaging and clinical symptoms of incontinence, gait disturbance, and dementia. Patients received gait testing before, 1-2 hours after, and 24 hours after the LTT. Gait testing consisted of the time it took to walk 30 meters (time score) and a composite functional ambulation performance (FAP) score calculated by a Gaitrite machine (CIR Systems, Franklin, New Jersey). FAP ranged from 0 (unable to walk) to 100 (optimal walking ability). 31 Both time score and FAP score were measured 2-5 times at each gait test; we used the mean score of each test. Thus, we had a total of 4 assessment values for gait after LTT: time and FAP scores immediately after the LTT and time and FAP scores 24 hours after the LTT. Both velocity 32 and Gaitrite FAP 31 have been found to improve in patients with NPH after LTT, with those patients showing larger improvement more likely to respond to shunt surgery.
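The gait end points described above can be illustrated with a minimal sketch. It averages the repeated trials within each session and expresses the post-tap mean as a percentage improvement over baseline. The sign convention (a shorter walk time and a higher FAP score both count as improvement), the function name, and all trial values are assumptions made for illustration and are not taken from the study data; the 20% figure is the response cutoff used for the subgroups in this study.

```python
from statistics import mean

def percent_improvement(baseline_trials, post_trials, lower_is_better):
    """Percentage improvement of the post-tap session mean over the baseline mean.

    Assumed convention: for the timed walk a lower value is better, for FAP a
    higher value is better, so a positive result always means improvement.
    """
    base = mean(baseline_trials)
    post = mean(post_trials)
    change = (base - post) if lower_is_better else (post - base)
    return 100.0 * change / base

# Hypothetical patient: 30 m walk times (seconds) and FAP scores,
# each measured several times per session.
time_before = [40.0, 42.0, 38.0]
time_after = [30.0, 32.0, 28.0]
fap_before = [60.0, 64.0]
fap_after = [70.0, 74.0]

delta_t = percent_improvement(time_before, time_after, lower_is_better=True)
delta_fap = percent_improvement(fap_before, fap_after, lower_is_better=False)

RESPONSE_CUTOFF = 20.0  # percent improvement used to define a responder subgroup
print(round(delta_t, 1), delta_t >= RESPONSE_CUTOFF)      # 25.0 True
print(round(delta_fap, 1), delta_fap >= RESPONSE_CUTOFF)  # 16.1 False
```

In this made-up case the patient would fall in the time-score responder subgroup but not in the FAP responder subgroup, which is why the subgroups can overlap without coinciding.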

Statistics
We used IBM SPSS 20 (IBM, Armonk, New York) and R Studio (http://rstudio.org/download/desktop) for our statistical analyses. We analyzed all 249 patients as a group. We created 4 additional overlapping subgroups consisting of patients who showed a 20% or greater improvement in any of the 4 LTT measures. We created an additional subgroup of patients who were deemed clinically responsive to the LTT and subsequently advanced to shunt surgery. Clinical response reflected the treating neurologist's judgment on whether the patient showed sufficient improvement in measured outcomes like gait, as well as in patient-reported measures such as urinary incontinence, to warrant placement of a shunt. The current literature is ambiguous about defining what merits an effective response to an LTT, with no definitive guidelines followed by clinicians. 19,21,32 We defined our cutoffs to maximize the magnitude of improvement while maintaining a sufficient sample size to do meaningful statistical analyses.
We used regression analysis on our total sample and our subgroups to look for a correlation between the volume of CSF removed in LTT and the percentage change in any of our 4 gait assessments (relative to baseline time and FAP scores recorded before the LTT). A partial correlation was run to determine the relationship between the volume of CSF removed and a patient's gait improvement measures while controlling for age and sex. We used Student t tests to examine whether needle gauge (18G versus 20G) had any association with the gait assessments. We used χ² analysis to look for an association between LTT volume ("high" or "low," where high was defined as ≥40 mL) and response to LTT ("response" or "no response," using 20% improvement as a cutoff for response).
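As a rough illustration of the correlation step, the following sketch computes a Pearson coefficient between the volume removed and a percentage change in gait, once on raw volumes and once on log-transformed volumes. The analyses in this study were run in SPSS and R; this stand-alone Python version, the use of the natural logarithm, and every data value are assumptions made for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: volume of CSF removed (mL) and percentage change in
# immediate time score. The volumes cluster near 50 mL to mimic a skewed sample.
volumes_ml = [15, 28, 35, 40, 45, 48, 50, 50]
delta_time_pct = [22.0, 5.0, 18.0, 9.0, 25.0, 3.0, 14.0, 11.0]

r_raw = pearson_r(volumes_ml, delta_time_pct)
# Log transform of the volume (natural log assumed) to reduce the effect of skew.
r_log = pearson_r([math.log(v) for v in volumes_ml], delta_time_pct)

print(f"raw r = {r_raw:.3f}, log-volume r = {r_log:.3f}")
```

Comparing the two coefficients is a quick way to see whether a skewed volume distribution is driving the result; if both are near zero, the transformation has not uncovered a hidden relationship.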

Patient Characteristics
Patients had an average age of 77.4 years (range, 52-91 years) at the time of LTT; 45% were female and 55% were male.

Entire Sample Analysis
Regression analysis was performed to assess the relationship between the volume of CSF removed and the percentage change in each of the following tests: time and FAP scores immediately after the LTT and time and FAP scores 24 hours after the LTT. Pearson coefficients ranged from r = 0.049-0.129. Given the skew in the volume of CSF removed, the analysis was repeated with logarithmic normalization of the volume of CSF removed. This correction yielded no significant changes in these results. Using partial correlations that control for age and sex, we found no relationship between CSF removed and the 4 measures of gait changes: ΔT% (r = 0.006, P = .935), ΔT24% (r = −0.034, P = .658), ΔFAP% (r = 0.121, P = .131), and ΔFAP24% (r = 0.128, P = .126). Next, we assessed the importance of the needle gauge used for the LTT (18G versus 20G). There was no significant association between needle gauge and improvement (P = .283-.410).
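A partial correlation of the kind reported here can be sketched with the standard recursive formula, which removes one control variable at a time. This is not the SPSS procedure used in the study; the function names, the 0/1 coding of sex, and all data values are hypothetical, and no P value is computed because that would require a t distribution outside the Python standard library.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, controls):
    """Correlation of x and y after adjusting for every variable in controls.

    Recursion: r_xy.zW = (r_xy.W - r_xz.W * r_yz.W)
                         / sqrt((1 - r_xz.W**2) * (1 - r_yz.W**2))
    where z is the first control and W is the list of remaining controls.
    """
    if not controls:
        return pearson_r(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data for 8 patients.
volumes_ml = [15, 28, 35, 40, 45, 48, 50, 50]
delta_time_pct = [22.0, 5.0, 18.0, 9.0, 25.0, 3.0, 14.0, 11.0]
age_years = [71, 80, 77, 85, 69, 90, 74, 82]
sex = [0, 1, 1, 0, 1, 0, 0, 1]  # assumed coding: 0 = male, 1 = female

r_adjusted = partial_r(volumes_ml, delta_time_pct, [age_years, sex])
print(f"partial r (age, sex controlled) = {r_adjusted:.3f}")
```

With an empty list of controls the function reduces to the ordinary Pearson coefficient, which makes it easy to compare adjusted and unadjusted values on the same data.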

Subset Analysis
Given that not all patients in our cohort responded to the LTT, the analyses of the possible effects of the volume of CSF removed on response were repeated on subsets of patients who showed robust response. Thus, subsets of patients were created based on either showing a 20% improvement in any of the 4 gait tests or proceeding to shunt surgery. The 5 overlapping subsets, therefore, were patients who showed a response to time score immediately after the LTT (n = 62), patients who showed a response 24 hours after the LTT (n = 60), patients who showed a response to FAP score immediately after the LTT (n = 31), patients who showed a response to FAP score 24 hours after the LTT (n = 23), and patients who underwent shunt surgery (n = 97).
Regression analysis yielded no significant correlation between volume of CSF removed and any of the 4 end points, with no significant change in results with log normalization of the volume of CSF removed. The Figure shows 4 plots of improvement in the end point versus the volume of CSF removed of the nonnormalized data. There was a significant finding, however, between needle gauge and certain end points in some of the subgroups. Among patients who showed an improvement in time score immediately after the LTT, patients whose taps involved a larger-bore needle showed a statistically significantly greater improvement in immediate time score (95% CI, 33.1% ± 5.0% for 18G versus 27.0% ± 2.5% for 20G; P = .04). An analysis of subgroup characteristics showed no significant difference between the 18G and 20G groups in terms of the volume of CSF removed (P = .43), sex (P = .17), and age (P = .18). Among patients who showed improvement in time score 24 hours after the LTT, patients whose taps involved a larger-bore needle had a nonsignificant tendency to have greater improvement in 24-hour time score (95% CI, 32.3% ± 3.7% for 18G, 26.5% ± 4.4% for 20G; P = .06). Again, groups were similar in the volume of CSF removed (P = .72) and sex (P = .99), but this time differed significantly in age (mean of 20G patients was 87.5 years versus 82.1 years for 18G patients; P = .021).
In our final subgroup analysis, we divided patients into those who received higher-volume LTTs (≥40 mL; n = 176) or lower-volume (<40 mL; n = 73) LTTs. There was no difference between these groups in terms of age (P = .93) or sex (P = .40). χ² analysis was performed to assess for a relationship between higher or lower volume of CSF removed and response to the LTT ("response" or "no response"). No statistically significant relationship was found (P = .19-.90) for any of the 4 LTT end points.
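The categoric comparison can be sketched as a Pearson χ² test on a 2 × 2 table of volume group by response. Only the group sizes (176 and 73) come from this study; the split of responders within each group is invented for illustration, and the absence of a continuity correction is an assumption, since the correction used in the original analysis is not stated. For 1 degree of freedom the P value equals erfc(√(χ²/2)), which keeps the sketch within the Python standard library.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, chi-square is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: high volume (>=40 mL, n = 176) and low volume (<40 mL, n = 73).
# Columns: response, no response. The cell counts are HYPOTHETICAL.
high_resp, high_none = 44, 132
low_resp, low_none = 18, 55

stat, p_value = chi_square_2x2(high_resp, high_none, low_resp, low_none)
print(f"chi-square = {stat:.3f}, P = {p_value:.3f}")
```

Because the volume is reduced to two categories, this test is insensitive to the skew of the underlying volume distribution, at the cost of discarding information about how far each tap was from the 40 mL threshold.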

DISCUSSION
This study examined the relationship between the volume of CSF removed in an LTT and changes in patient gait. As prior studies noted, gait disturbance is the symptom that shows the most dramatic improvement after an LTT and shunting. 19 The main conclusion of the current study is that within the 28-50 mL range (values between the 5th and 95th percentile), there is no significant association between the volume of CSF removed and gait outcomes.
Although high-volume spinal tap has been established as an accurate method to diagnose NPH and to identify patients who will likely benefit from shunt surgery, to our knowledge, there is no consensus on the amount of CSF required to be removed. The first formal study of the LTT by Wikkelsø et al 30 in 1982 removed 40-50 mL of CSF in LTT, but the authors noted that there was no prior evidence that this was the ideal range. Prior reports, including the original description by Hakim and Adams, 1 frequently removed much smaller volumes of CSF.
In the current study, there was no association between the volume of CSF removed and improvement of gait testing after an LTT. The range of CSF removed in patients who showed a ≥20% improvement in a gait test was 15-55 mL. Our results indicate that as little as 15 mL removed in an LTT may be enough to have a good outcome.
This study took several steps to reduce the probability that a significant relationship between the volume of CSF removed and gait outcomes was missed. First, this study looked at a sample (n = 249) that was much larger than typical NPH studies. In choosing a large sample, we had considerably more statistical power to detect weak relationships.
Second, several steps were taken to ensure that the distribution of the volume of CSF removed did not impact results. Given that practitioners currently aim for LTTs to remove 50 mL of fluid, the distribution of volume removed was skewed toward higher volumes of CSF. This skew, in turn, could impact regression analysis. Log normalization of the data, however, failed to produce significant results. Furthermore, the χ² analysis treated the volume of CSF removed as a categoric variable, which would minimize the effects of skew on results. The χ² analysis also failed to show a significant relationship.
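To give a sense of what this sample size can detect, the following sketch uses the Fisher z approximation to estimate the smallest correlation that a sample of a given size would be expected to detect. The two-sided α of 0.05 and the 80% power are conventional values assumed here; neither is stated in this study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest population correlation detectable with the given power,
    using the Fisher z approximation (standard error 1 / sqrt(n - 3))."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    return math.tanh((z_alpha + z_beta) / math.sqrt(n - 3))

# Full cohort size from this study; alpha and power are assumed values.
print(round(min_detectable_r(249), 2))
```

Under these assumptions the threshold for the full cohort is roughly r = 0.18, and it rises quickly for the smaller responder subsets, so the subgroup analyses are necessarily less sensitive than the whole-sample analysis.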
Third, given that the data from patients who did not respond to an LTT may mask trends in patients who did respond, a subset analysis was performed that looked only at patients who showed response to an LTT. There is no consensus on what cutoff should be used to determine response, so this study chose 20%, the highest cutoff the authors found in the literature. Because of the high cutoff, these subsets consisted only of patients who showed a very robust response to an LTT. These were the patients we suspected would most likely show a relationship between the volume of CSF removed and LTT outcomes if such a relationship existed. Regardless, there was no such evidence from these subsets.
Fourth, given that some patients may show a response to an LTT that manifested in ways that were appreciated clinically but not necessarily demonstrated in gait testing, this study looked at a subset of patients who went on to receive shunts. As with the other subsets, this subset failed to demonstrate a significant relationship between the volume of CSF removed and any of the gait outcomes.
Given the lack of evidence that the volume of CSF removed by an LTT correlates to gait outcomes, an alternative explanation for why patients show improved gait after an LTT is that there is passive flow of CSF from the puncture site created by the needle. If this were the case, then the data would show that an 18G needle (which has roughly twice the cross-sectional area of a 20G needle) might lead to better gait outcomes. In line with this theory, we found preliminary evidence for a potential role of needle gauge in determining LTT outcomes. There was a relationship between larger-bore needles and improved immediate time score among patients who showed a ≥20% improvement in immediate time score (P = .04) and between larger-bore needles and 24-hour time score in patients who showed a ≥20% improvement in 24-hour time score (P = .06). Of note, there was no difference in baseline features between the 18G and 20G patients who showed a significant response in immediate time score. Among the patients who showed response at 24 hours, the 20G patients were significantly older (87.5 years versus 82.1 years).
The major weakness in this study is its retrospective nature and lack of randomization. Although we demonstrated no difference in age or sex between the groups that had higher or lower volumes of LTT, other differences may exist between the subgroups (eg, failed taps) that may confound the data. The ideal follow-up study should have a randomized, prospective design to demonstrate the lack of clinical benefit in draining 50 mL rather than 30-35 mL of CSF. In our data analysis, however, we did find that no patient characteristics such as age, sex, or needle gauge correlated with CSF removed, so the volume of CSF removed was essentially a random variable in this retrospective investigation.
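The size comparison between the two needles can be checked with a short calculation. The outer diameters below are nominal Birmingham gauge values (1.270 mm for 18G, 0.9081 mm for 20G); they are not given in this study and are assumed here, and the actual dural defect also depends on needle tip design and wall thickness.

```python
import math

# Nominal outer diameters in millimeters (assumed, not from the study).
OUTER_DIAMETER_MM = {"18G": 1.270, "20G": 0.9081}

def cross_sectional_area_mm2(diameter_mm):
    """Area of a circle with the given diameter."""
    return math.pi * (diameter_mm / 2) ** 2

area_18g = cross_sectional_area_mm2(OUTER_DIAMETER_MM["18G"])
area_20g = cross_sectional_area_mm2(OUTER_DIAMETER_MM["20G"])
area_ratio = area_18g / area_20g

print(f"18G: {area_18g:.2f} mm^2, 20G: {area_20g:.2f} mm^2, ratio: {area_ratio:.2f}")
```

With these nominal diameters the ratio comes out slightly below 2, which is consistent with describing the 18G needle as having about twice the cross-sectional area of the 20G needle.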