The value of the cerebrospinal fluid tap test for predicting shunt effectiveness in idiopathic normal pressure hydrocephalus

Background The cerebrospinal fluid (CSF) tap test (TT) has been regarded as an important test for the prediction of shunt effectiveness in patients with suspected idiopathic normal pressure hydrocephalus (iNPH). Although its specificity and sensitivity are reportedly high, there remains some disagreement over this point. Herein, the TT as a test for predicting shunt effectiveness was investigated in our multicenter prospective study named SINPHONI and strategies to increase its predictability were examined. Methods One hundred suspected iNPH patients with the following entry criteria were enrolled in the study: (1) 60 to 85 years old, (2) one or more of the NPH triad signs, (3) ventriculomegaly (Evans index > 0.3), (4) high convexity tightness in coronal-section MRI, and (5) no antecedent disorders. Changes in NPH triad symptoms were assessed using the iNPH grading scale and other measures before and after removal of 30 ml lumbar CSF. A positive response to TT was pre-defined by specific improvements on the grading and other scales. A ventriculoperitoneal shunt was performed with a programmable valve. The sensitivity and specificity of the TT were calculated with a contingency table. A decision tree analysis was performed to increase the predictability of the TT. Results Among 100 patients, 80 were shunt responders. CSF pressure differed significantly between shunt responders and non-responders. The changes in single variables in the iNPH grading scale after TT showed high specificity with low sensitivity. In contrast, change of the total score in the iNPH grading scale showed a relatively high sensitivity of 71.3% with specificity of 65%. A decision tree analysis revealed that using the iNPH grading scale total score and pre-shunt CSF pressure ≥ 15 cmH2O, sensitivity increased to 82.5%, without a decrease in specificity.
Conclusions The sensitivity and specificity of the TT for predicting shunt responsiveness were optimum when improvement on any iNPH grading scale was combined with CSF pressure ≥ 15 cmH2O. To increase the sensitivity of the TT, further effort is necessary. Trial Registration This study is registered with ClinicalTrials.gov, with the number NCT00221091.


Background
Idiopathic normal pressure hydrocephalus (iNPH) is a cerebrospinal fluid (CSF) shunt-responsive syndrome of the elderly involving gait disturbance, dementia and urinary incontinence without antecedent disorders. Hakim and Adams first reported improvement of NPH symptoms by removal of 15 ml CSF using a lumbar tap [1]. Wikkelsø et al. reported that the tap test (TT) with removal of 40-50 ml CSF was useful for diagnosis and the prediction of shunt response in NPH patients [2]. Since then, there have been a number of studies using removal of CSF volumes via a lumbar tap to predict shunt effectiveness in iNPH patients [3][4][5][6][7][8][9][10]. Because it is easy to perform in neurosurgical and neurological clinics, the Japanese guidelines for management of iNPH recommended TT as an initial invasive test [11,12]. The specificity and sensitivity are reportedly high, but there is some disagreement regarding this between different reports [3][4][5][6]. Continuous lumbar drainage for several days with removal of a large CSF volume has been reported to have high sensitivity and specificity [13][14][15][16][17], but it is more invasive for elderly patients who have difficulty in gait, cognition and/or urination. From a clinical standpoint, the effort to increase the predictability of shunt effectiveness with a TT is worthwhile, but there has been no prospective validation study in a large number of iNPH patients. In this study, the predictive value of TT was investigated in patients with iNPH using data from a multicenter, prospective study named "Study of idiopathic normal pressure hydrocephalus on neurological improvement" (SINPHONI) [18]. Special attention was paid to sensitivity and specificity for a number of variables measured before and after the TT.
This study is registered with ClinicalTrials.gov, with the number NCT00221091.

Patients
In 2004, a multicenter, prospective study of idiopathic normal pressure hydrocephalus (SINPHONI) was conducted in Japan [17]. Briefly, it was designed to validate the diagnostic importance of high-convexity tightness in coronal-section MRI [19] with the results of shunt surgery using a programmable valve. The entry criteria were as follows: (1) 60 to 85 years old, (2) one or more of the NPH triad symptoms, (3) ventriculomegaly (Evans index > 0.3), (4) high-convexity tightness in coronal-section MRI, and (5) no antecedent disorders. The study consisted of one-year registration and one-year follow-up, and was completed in 2006. Data were obtained from 100 patients. The study was a multicenter prospective cohort study conducted in compliance with the Guidelines for Good Clinical Practice and the Declaration of Helsinki (2002) of the World Medical Association. The institutional review board at each site approved the study protocol, and all participants (or their representatives when applicable) gave written informed consent for participation.

Tap test
A lumbar tap with removal of 30 ml of CSF was performed in all patients. CSF pressure (CSFP) was measured at the site of puncture. Before and after the tap, all patients were evaluated using the iNPH grading scale (iNPHGS) [8], the Mini-Mental State Examination (MMSE) and the 3-meter timed up-and-go test (TUG). The iNPHGS is a clinician-rated scale that rates separately the severity of each of the triad symptoms of iNPH (disturbances of gait, cognition and urination). The score of each domain ranges from 0 to 4. Grade 0 indicates normal and grade 1 indicates subjective symptoms but no objective disturbance. Grades 2, 3 and 4 indicate mild, moderate and severe disturbances, respectively. The change of gait was evaluated 1 or 2 days after the tap, while change of cognition and urination was evaluated at one week. Assessment was done by neurosurgeons in most cases. Response to the TT was pre-defined by three major scales: iNPHGS, TUG and MMSE. An improvement of one point or more on the iNPHGS (each domain and their total), more than 10% improvement in time on TUG, or more than 3 points improvement in the MMSE was regarded as TT-positive. An additional variable, Tap-any, was defined as positive when any of the iNPHGS total score, TUG or MMSE improved. The sensitivity and specificity of these pre-defined variables as predictors of a response to shunt surgery were calculated. Furthermore, to increase predictability in the responders during clinical practice, a decision tree analysis was applied.
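The pre-defined response criteria above can be sketched in code. The following is a minimal illustration, not the study's actual scoring procedure: the class and function names are hypothetical, the thresholds follow the prose ("one point or more", "more than 10%", "more than 3 points"), and the post-tap values are assumed to have been taken at the time points stated above. Note that the variable labels used later in the analysis (TUG ≥ 10%, MMSE ≥ 3) suggest the cut-offs may have been inclusive; the strict comparisons here are an assumption.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One set of scores (hypothetical container, not from the study)."""
    gs_gait: int         # iNPHGS gait grade, 0 (normal) to 4 (severe)
    gs_cogn: int         # iNPHGS cognition grade, 0 to 4
    gs_urin: int         # iNPHGS urination grade, 0 to 4
    tug_seconds: float   # 3-meter timed up-and-go completion time (> 0)
    mmse: int            # MMSE score, higher is better

    @property
    def gs_total(self) -> int:
        return self.gs_gait + self.gs_cogn + self.gs_urin


def tap_test_response(pre: Assessment, post: Assessment) -> dict:
    """Return TT-positive flags for each pre-defined variable."""
    result = {
        # Lower iNPHGS grades are better, so improvement is pre minus post.
        "GS-Gait-change": pre.gs_gait - post.gs_gait >= 1,
        "GS-Cogn-change": pre.gs_cogn - post.gs_cogn >= 1,
        "GS-Urin-change": pre.gs_urin - post.gs_urin >= 1,
        "GS-Total-change": pre.gs_total - post.gs_total >= 1,
        # Assumed strict thresholds, following the prose definition.
        "TUG-change": (pre.tug_seconds - post.tug_seconds) / pre.tug_seconds > 0.10,
        "MMSE-change": post.mmse - pre.mmse > 3,
    }
    # Tap-any: improvement in any of the iNPHGS total, TUG or MMSE.
    result["Tap-any"] = (
        result["GS-Total-change"] or result["TUG-change"] or result["MMSE-change"]
    )
    return result


# Hypothetical patient: gait improves by one grade, TUG by 5%, MMSE by 1 point.
pre = Assessment(gs_gait=3, gs_cogn=2, gs_urin=2, tug_seconds=20.0, mmse=22)
post = Assessment(gs_gait=2, gs_cogn=2, gs_urin=2, tug_seconds=19.0, mmse=23)
response = tap_test_response(pre, post)
```

In this hypothetical case the patient is TT-positive on GS-Gait-change, GS-Total-change and Tap-any, but not on TUG-change or MMSE-change.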

Shunt surgery
A ventriculo-peritoneal shunt with a Codman-Hakim programmable valve™ (Codman, Johnson and Johnson, Raynham, MA, USA), with the initial pressure setting determined from a quick reference table [20], was installed in all patients within two months after registration. The modified Rankin scale (a scale for measurement of disability) [21] was used as the primary outcome measure, and iNPHGS, TUG and MMSE as secondary outcome measures. Assessment was performed before surgery and repeated at 3, 6, and 12 months after surgery to determine which patients were shunt responders. A shunt responder was defined as someone who showed an improvement of one point or more on the modified Rankin scale over 12 months.

Data analysis and statistics
Statistical analysis was performed using JMP statistical software version 9 (SAS Institute, Cary, USA). Statistical comparison was made between shunt responders and non-responders on baseline data, and pre-tap state of iNPHGS, TUG and MMSE (Table 1). Baseline variables included age, Evans index, and CSFP. Pre-tap variables included scores of the three iNPHGS domains (GS-Gait-pre, GS-Cogn-pre, GS-Urin-pre) and their total scores (GS-Total-pre), MMSE scores (MMSE-pre), and TUG completion times (TUG-pre). These variables were compared between shunt responders and non-responders using the chi-squared test. TT-positive patients were counted for each of the variables (GS-Gait-change, GS-Cogn-change, GS-Urin-change, GS-Total-change, MMSE-change and TUG-change), and their sensitivity (%) and specificity (%) were calculated using a contingency table. Positive predictive values were not calculated, since they would have been affected by the high prevalence of iNPH in the patient group. Furthermore, a decision tree analysis was performed to determine a practical method for selecting shunt responders with higher sensitivity and specificity. The variables included age, Evans index, CSFP, GS-Total-change, TUG ≥ 10% and MMSE ≥ 3. The former and latter three variables were regarded as continuous and nominal data, respectively. The level of statistical significance was set to p < 0.05.
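The dependence of the positive predictive value on prevalence, which is the stated reason for not reporting it, can be illustrated with a short calculation. This is a sketch using Bayes' rule with purely hypothetical sensitivity, specificity and prevalence values; none of the numbers below are study results.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """PPV = P(responder | test positive), via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics (illustrative only).
SENSITIVITY = 0.70
SPECIFICITY = 0.65

# Same test, two different proportions of true responders in the cohort.
ppv_high_prevalence = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.80)
ppv_low_prevalence = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.20)
```

With these illustrative inputs the same test yields a PPV of roughly 0.89 when 80% of the cohort are responders and roughly 0.33 when only 20% are, whereas sensitivity and specificity are unchanged by prevalence.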

Results
In this study on the diagnostic performance of TT in a total of 100 patients, 80% were shunt responders during the one-year follow-up. Among the 80 shunt responders, improvement of one, two, three or four points on the modified Rankin scale was found in 43, 27, 8 and 2 patients, respectively. Comparison of the preoperative variables between shunt responders and non-responders showed a statistically significant difference for CSFP, which was higher in shunt responders (p < 0.05; Table 1). There were no significant differences in Evans index or severity of GS symptoms, TUG or MMSE. The incidence of severe adverse events (SAE) was significantly higher in the non-responders (p < 0.005). Among the non-responders, pneumonia was noted in three patients and surgery-related complications (shunt malfunction and bowel injury) in two. Among the responders, vascular events including cerebral and cardiac infarction occurred in three patients and femoral fracture in two (Table 1).
The sensitivity and specificity for each of the variables were calculated from the number of true positives, true negatives, false positives and false negatives (Table 2). The highest sensitivity was for Tap-any at 92.5%, but its specificity was low at 20%. The highest specificity of 85% was noted on GS-Cogn-change and GS-Urin-change. However, their sensitivity was below 40%. GS-Total-change showed 71.3% sensitivity and 65% specificity. Thus, the sensitivity and specificity differed between variables, and improvement of the total score in iNPHGS was the most promising among the pre-defined variables.
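The contingency-table calculation can be sketched as follows. The counts below are not copied from Table 2; they are back-calculated here from the reported percentages and the 80 responders and 20 non-responders, and should be read as an assumption. The function name is hypothetical.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from 2x2 contingency counts."""
    sensitivity = tp / (tp + fn)   # TT-positive among shunt responders
    specificity = tn / (tn + fp)   # TT-negative among non-responders
    return sensitivity, specificity


# (TP, FN, TN, FP), back-calculated from the reported percentages
# with 80 responders and 20 non-responders (assumed, not from Table 2).
counts = {
    "Tap-any":         (74, 6, 4, 16),    # 92.5% / 20%
    "GS-Total-change": (57, 23, 13, 7),   # 71.3% / 65%
    "GS-Cogn-change":  (20, 60, 17, 3),   # 25% / 85%
    "GS-Urin-change":  (30, 50, 17, 3),   # 37.5% / 85%
}

results = {name: sensitivity_specificity(*c) for name, c in counts.items()}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

For example, 57 TT-positive patients among 80 responders gives 57/80 = 0.7125, which corresponds to the reported 71.3% for GS-Total-change.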
To increase predictability of the TT, a decision tree analysis was applied using the variables of age, Evans index, CSFP, GS-Total-change, TUG ≥ 10% and MMSE ≥ 3 (Figure 1). GS-Total-change was selected as the first node, followed by CSFP ≥ 15 cmH2O as the second node for differentiating the remaining patients. Using this calculation, the sensitivity was 82.5% and the specificity was 65%.
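Read from the text alone (Figure 1 is not reproduced here), the resulting two-node rule can be sketched as a simple function. The function name and signature are hypothetical, and the patient counts in the check are back-calculated from the reported percentages rather than taken from the figure.

```python
def predict_shunt_responder(gs_total_improved: bool,
                            csfp_cmh2o: float,
                            threshold_cmh2o: float = 15.0) -> bool:
    """Two-node rule as described in the text (a sketch, not the fitted tree).

    Node 1: improvement of the iNPHGS total score after the tap test.
    Node 2: for the remaining patients, CSF pressure at or above the threshold.
    """
    if gs_total_improved:
        return True
    return csfp_cmh2o >= threshold_cmh2o


# Back-calculated consistency check (assumed counts):
# 57 of 80 responders are caught by node 1 (71.3%); a sensitivity of 82.5%
# implies 66 of 80 overall, i.e. 9 more responders caught by node 2.
responders_node1 = 57
responders_node2 = 9
combined_sensitivity = (responders_node1 + responders_node2) / 80

# Specificity unchanged at 65% implies node 2 adds no false positives
# among the 20 non-responders (13 remain correctly predicted negative).
combined_specificity = 13 / 20
```

Under this reading, a patient without improvement on the iNPHGS total score is still predicted to respond when the CSF pressure is 15 cmH2O or higher.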

Discussion
The response to a lumbar tap test (TT) is considered to be useful for predicting a favourable response to shunt surgery, particularly in iNPH patients. In previous studies, the volume of CSF removed has varied from 30 ml [6,8], 40 ml [4,7], to 50 ml [2], or until pressure was lowered to zero [5]. In the present study, 30 ml CSF was selected because it was less invasive for the elderly patients. One of the purposes of the SINPHONI study was to clarify the sensitivity and specificity of the removal of 30 ml CSF for predicting the response to shunt surgery. The present study was designed to detect the change of symptoms as efficiently as possible, one or two days after the TT for gait and one week after the TT for cognition and urination. Improvement of gait after removal of CSF was most commonly seen, and it could be observed within one or two days after the tap. Recently, Virhammar et al. recommended assessment of gait within 24 hours [10]. Improvement of cognition and/or urination is usually more delayed, as was experienced in our preliminary studies including the report by Kubo et al. [8]. One disadvantage of the study design was that assessment of iNPHGS, TUG and MMSE was not performed by the same person throughout. This may have caused some inconsistency in the results. This is in contrast to the report by Kubo et al. [8]. The MMSE alone would not have been adequate to assess the response to TT. However, it is popular for assessment of cognition in general. Examining the prognostic value of the MMSE was one of the objectives of the SINPHONI study.
The sensitivity and specificity of the TT have been reported previously as ranging from 72% to 100% for the former and from 33% to 100% for the latter [3,5,7,8]. The specificity of the TT was reported to be high with low sensitivity [3,7,8], but another report was contradictory [5]. In the present study, the specificity of the gait domain was 80% but sensitivity was 51.3%. The cognition and urination domains both showed a specificity of 85%, but a low sensitivity of 25% and 37.5%, respectively. Thus, the present study revealed a high specificity with low sensitivity in each domain of iNPHGS, which agrees with previous reports [3,7,8]. In contrast to each domain of the iNPHGS, the total GS score showed higher sensitivity of 71.3% but lower specificity of 65%. Among pre-defined variables, the calculated variable of Tap-any showed the highest sensitivity of 92.5%, but the specificity was only 20%. Thus, the sensitivity and specificity of the TT depended on the variable under consideration. In clinical practice, a higher sensitivity would be preferable for a diagnostic test, although higher specificity is also important to reduce the false positive cases. To increase both sensitivity and specificity, a decision tree analysis was applied in the present study, which revealed a first node of GS-Total-change. Among the remaining patients, CSF pressure at 15 cmH2O was the best threshold for differentiation. This increased the sensitivity to 82.5%, while the specificity remained at 65%. This suggested that patients with higher CSF pressure would be shunt responders even if their symptoms did not improve by one point or more in the iNPHGS after TT.
In contrast with TT, continuous CSF drainage has been reported to provide higher sensitivity and specificity, ranging from 50% to 100% and from 60% to 100%, respectively [7,14,15,16]. As Marmarou stated, the advantage of continuous CSF drainage is increased sensitivity [13]. Drainage of a larger CSF volume simulates a closer intracranial situation to that following CSF shunt surgery. However, it must be highlighted that most studies involving larger volume drainage defined shunt responders by symptomatic improvement [7,14,15,16], not by improvement of daily life activity. In SINPHONI, shunt responders on the iNPHGS, i.e., symptom-basis, were 89% in contrast with 80% on the modified Rankin scale, i.e., function-basis [18]. Thus, caution is needed when comparing the present data with those obtained after larger volume drainage. Although complications were reportedly very low in larger volume drainage [14,15,19], there is potentially more risk for complications in patients who are elderly with a greater or lesser degree of disturbances in gait, cognition, and/or urination. The SINPHONI study revealed high achievement in the treatment of iNPH patients without support of the TT [18]. The SINPHONI study showed the high predictability and diagnostic importance of MRI features of tight high convexity and enlarged Sylvian fissure with ventricular dilatation, which was designated as "Disproportionately Enlarged Subarachnoid-space Hydrocephalus (DESH)" [18]. However, Iseki et al. reported there were asymptomatic people with MRI features of iNPH in their population-based study [22]. They may have been potential candidates for developing iNPH in the future. Because NPH symptoms are often difficult to differentiate from those of other senile disorders, it is important to see the changes of symptoms after the TT or larger volume drainage. To increase the sensitivity of the TT, further effort is necessary.

Conclusions
The value of the TT for predicting shunt effectiveness was investigated in iNPH patients using the SINPHONI data. The sensitivity and specificity changed with different variables, and improvement in the iNPHGS total score showed a sensitivity of 71.3% and specificity of 65%. A decision tree analysis revealed that improvement in the iNPHGS total score, followed by inclusion of patients with CSFP of 15 cmH2O or higher, increased the sensitivity to 82.5% without a decrease in specificity. Thus, the TT is valid as an initial invasive test to predict the response to shunt for elderly patients having disturbances of gait, cognition and/or urination.