Manual Examination of the Spine: A Systematic Critical Literature Review of Reproducibility
Received 15 September 2005; received in revised form 2 February 2006
Abstract
Objective: Poor reproducibility of spinal palpation has been reported in previously published literature, and authors of recent reviews have criticized study quality. This article critically analyzes the literature pertaining to the inter- and intraobserver reproducibility of spinal palpation to investigate the consistency of study results and assess the level of evidence for reproducibility.
Methods: A systematic review and meta-analysis were performed on relevant literature published from 1965 to 2005, identified using the electronic databases MEDLINE, MANTIS, and CINAHL and by checking reference lists. Descriptive data from included articles were extracted independently by 2 reviewers. A 6-point scale was constructed to assess the methodological quality of original studies. A meta-analysis was conducted among the high-quality studies to investigate the consistency of data, separately on motion palpation, static palpation, osseous pain, soft tissue pain, soft tissue changes, and global assessment. A standardized method was used to determine the level of evidence.
Results: The quality score of 48 included studies ranged from 0% to 100%. There was strong evidence that the interobserver reproducibility of osseous and soft tissue pain is clinically acceptable (κ ≥ 0.4) and that the intraobserver reproducibility of soft tissue pain and global assessment is clinically acceptable. Other spinal procedures are either not reproducible or the evidence is conflicting or preliminary.
Biomechanical dysfunction is thought to be an important contributor to spinal pain, and manual palpation is a widely used procedure for the diagnosis of such dysfunctions among providers of manual medicine.1, 2, 3 Contrary to the expectations of many clinicians, unacceptable levels of reproducibility have been shown in the majority of the previously published literature, and authors of newer reviews have questioned the utility of manual examination procedures in spinal diagnosis altogether.4, 5, 6, 7 Severe criticism has been leveled at the design of the original studies, including the use of asymptomatic subjects,4, 5 inexperienced observers,5 parallel testing,4 unclear definitions of positive findings and rating scales,4, 6 weak description of study results,4, 5, 7 and the need for improvement in overall study quality.4, 7 Furthermore, the dependence of Cohen's κ (the most widely used statistical method in studies on reproducibility) on the prevalence of positive findings and on the composition of the study population has been the subject of discussion.8, 9 Unfortunately, these reviews themselves have important limitations. For instance, some deal with only a minority of manual examination procedures, such as chiropractic procedures only,4 1 spinal region,4, 6, 10 or motion palpation only.5 In only 3 reviews was a predefined quality system applied to assess study quality,4, 6, 7 and in none of the reviews were the number of studies, the methodological quality, and the consistency of the outcomes all considered, as recommended by van Tulder and others.11, 12, 13 Finally, in none of these reviews was the impact of the predefined criteria on the conclusions tested. Therefore, the value of palpation as a diagnostic tool is, at present, still unknown, and so is the ability of practitioners of manual therapy to reliably diagnose spinal dysfunctions using palpation. We therefore decided that another systematic review taking into account the above issues was warranted.
Furthermore, a meta-analysis including comparable studies of adequate methodological standard and assessment of the consistency of study outcomes would be highly useful. The purpose of this paper is therefore to systematically review and critically assess the design and statistical methodology of the literature pertaining to reproducibility of spinal palpation, adopting standardized criteria for judging diagnostic studies. A meta-analysis was conducted to evaluate consistency of study outcomes. Finally, the level of evidence for the reproducibility of spinal palpation was determined.
Methods
Definitions
Palpation was defined according to Bergmann and Petersen,1 and results of the original articles were analyzed according to the palpation procedure, using the following abbreviations: motion palpation (MP), static palpation (SP) (palpation for alignment and/or structure), osseous pain (OP) (pain generated from palpation of osseous structures), soft tissue pain (STP), soft tissue changes (STC), and global assessment (GA) (the latter was introduced to describe the use of 2 or more of the above procedures to make 1 single judgment on the presence/absence of mechanical dysfunction). Each palpation procedure could be applied under 5 conditions (standing, sitting, prone, supine, or side lying) and at different segmental levels. Consequently, a palpation procedure applied under a specific condition at 1 or more segmental levels is denoted a test. A paper could consider a single test or several tests and only 1 palpation procedure or several palpation procedures.
Reproducibility refers to the ability of a single observer to find the same result using the same diagnostic procedure in the same patient at 2 separate moments in time (intraobserver agreement) and/or the ability of 2 observers to find the same result of a given diagnostic procedure in a patient (interobserver agreement).14
Study Selection
Studies were identified by a comprehensive search of the MANTIS (1966-2005), CINAHL (1982-2005), and MEDLINE (1965-2005) databases using the index terms reproducibility, reliability, or observer variation in combination with palpation, motion palpation, physical examination procedures, or spine in text and abstracts. Bibliographies of retrieved documents were checked for any additional studies. The principal investigator (MJS) screened the documents retrieved from this search twice to determine eligibility according to inclusion and exclusion criteria, as listed in Figure 1.
Data Extraction
Using a checklist, data from included documents were extracted and recorded independently by 2 of the authors (MJS and HWC). Completed checklists were then compared, and discordances were resolved by discussion until consensus was reached. If consensus could not be reached, a third investigator (JH) was available to mediate.
Assessment of Methodological Quality of Trials
No standardized and validated method for assessing the quality of reproducibility studies exists. Therefore, a 6-point scale was constructed based on recognized requirements for clinical trials of reproducibility and standard recommendations for systematic reviews of test accuracy.12, 15, 16 The operational definitions of the quality criteria are described in Figure 2. A study was considered high-quality if the methodological quality score, expressed as a percentage of the maximum score, was 50% or higher and low-quality if the score was less than 50%.
The quality score reflects the relevance and appropriateness of 3 separate dimensions that may affect interpretation of results: study population, study design, and statistical analysis. The quality scoring of the trials was performed independently by 2 reviewers (MJS and HWC). Differences in scores were resolved through consensus by the 2 reviewers. The quality scores of the individual trials were used as part of the evidence determination.
Meta-Analysis
To assess the consistency of study outcomes in articles included in the systematic review, a meta-analysis was conducted. Not eligible for inclusion in the meta-analysis were (1) low-quality studies (<50%), (2) studies not using a binary classification of the test outcome, (3) studies not reporting any results at all, (4) studies using a binary outcome but not reporting κ values, and (5) studies not reporting an adequate description of the palpation procedure. When possible, single results from included studies (κ and confidence intervals [CI]) were drawn directly from the original articles. If CIs were not reported in the original studies, CIs were calculated according to Altman17 if the necessary information (prevalence and sample size) was available. Results for individual segmental levels not in sequence were included separately in the analysis. When multiple reproducibility results were reported for several pairs of observers or several spinal segments in sequence, we took the average of the reported κ values and computed a CI, again by applying the Altman formula with the original sample size. This is a conservative approach that ignores a possible gain in precision from averaging. We displayed all available original results in a forest plot.
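
The CI reconstruction described above can be illustrated with a minimal sketch. It is not the authors' actual computation. It assumes the simple large-sample standard error for κ, SE = sqrt(p_o(1 − p_o) / (n(1 − p_e)²)), and it further assumes that both observers share the same prevalence of positive findings, so that chance agreement p_e can be recovered from the prevalence alone. The function name and parameters are hypothetical.

```python
import math


def kappa_ci_from_prevalence(kappa, prevalence, n, z=1.96):
    """Approximate CI for Cohen's kappa from kappa, prevalence, and sample size.

    Assumptions (not taken from the source): both observers have the same
    prevalence of positive findings, and the simple large-sample standard
    error for kappa is adequate.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    # Chance agreement when both observers share the same prevalence.
    p_e = prevalence ** 2 + (1.0 - prevalence) ** 2
    # Observed agreement implied by the reported kappa.
    p_o = kappa * (1.0 - p_e) + p_e
    se = math.sqrt(p_o * (1.0 - p_o) / (n * (1.0 - p_e) ** 2))
    return kappa - z * se, kappa + z * se


# Illustrative values only: kappa = 0.4, prevalence = 50%, 100 subjects.
lower, upper = kappa_ci_from_prevalence(0.4, 0.5, 100)
print(f"{lower:.2f} to {upper:.2f}")
```

For an averaged κ, the same function would be called with the mean κ and the original sample size, which mirrors the conservative choice described in the text.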
No formal modeling and analysis of heterogeneity was performed because (1) information on the precision of the single results was not available in all studies, (2) we partially used a conservative assessment in the single studies, and (3) multiple results within a study cannot be regarded as independent. Overall κ values were computed by first taking the mean κ value within each study and then averaging these mean κ values. Confidence intervals for the overall κ values are based on the empirical variation of the mean κ values and were computed only if at least 4 studies contributed to the overall κ value. In a secondary analysis, the association between several study characteristics and the mean κ value of the study was tested by an analysis of covariance, including the type of palpation, separately for the intra- and interobserver results. The study characteristics were as follows: publication year, definition of positive findings, segmental region, standardization (ie, agreement on procedure, written instructions, and training sessions), application condition, occupation, experience, symptomatic status of test population, and multiple tests.
Assessment of the Level of Evidence
Criteria for determining the level of evidence for reproducibility of spinal palpation were adapted from the Agency for Health Care Policy and Research's guidelines for acute low back pain.18 This method has been used to assess the level of evidence of risk factors for low back pain in systematic reviews of epidemiological studies.13, 19 The method takes into account all available included studies that describe a palpation procedure, report results, and use a valid statistical method (κ, κw, or intraclass correlation coefficient [ICC]).8 The system evaluates the evidence by taking into account (1) the number of studies, (2) the methodological quality expressed by quality scores, and (3) the consistency of the study outcomes. Consistency was checked by visual inspection of the forest plots.
The rating system was applied to each palpation procedure. Five categories were used to describe evidence levels:
- Strong evidence: provided by generally consistent findings in multiple (≥2) high-quality studies
- Moderate evidence: provided by generally consistent findings in 1 high-quality study and 1 or more low-quality studies or in multiple (≥2) low-quality studies
- Preliminary evidence: only 1 study available
- Conflicting evidence: inconsistent findings in multiple (≥2) studies
- No evidence: no studies were identified
The level of acceptable reproducibility has traditionally, and somewhat arbitrarily, been set at κ > 0.4 in studies of manual medicine,8, 20, 21, 22, 23, 24, 25 and thus, a κ value above 0.4 was considered clinically acceptable reproducibility in this review. Levels of clinically acceptable reproducibility expressed in κw or ICC were arbitrarily chosen at 0.4 and 0.8, respectively.
Sensitivity Analysis
To test the robustness of the assumptions behind the weighting of the evidence, the prespecified cut points for adequate methodological quality (50%) and minimal clinically acceptable reproducibility (κ ≥ 0.4) were increased and decreased by ±25% in the quality score and ±0.1 in reproducibility.
Results
Results of the Literature Search
More than 900 publications were retrieved, and 48 original articles published between 1980 and 2005 were included according to the inclusion criteria.20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67 Interobserver reproducibility was reported in all 48 studies, and intraobserver reproducibility was also reported in 19 studies (Appendix A, Appendix B, available online at www.mosby.com/jmpt). All predefined categories of palpation, spinal segments, and application conditions were evaluated.
In 25 articles, a single test was evaluated, and in 22 articles, multiple tests (parallel testing) were assessed. Classification of the palpation procedure was not possible in 1 study due to insufficient description.63 Altogether, 58 tests were considered for interobserver reproducibility and 26 tests for intraobserver reproducibility (Table 1). Motion palpation was the most frequently investigated palpation procedure, followed by palpation for pain.
Methodological Quality
The methodological quality of the studies ranged from 0% to 100% (Appendix C, Appendix D, available online at www.mosby.com/jmpt). Overall, 30 studies (63%) were of high quality; however, only 8 of 19 studies (42%) investigating intraobserver reproducibility were high-quality. The proportion of high-quality studies was higher among articles investigating the cervical and thoracic spine than among articles investigating the lumbar spine and the sacroiliac (SI) joints (67% vs 59%). A trend for increasing quality was seen for more recent articles. The average quality score increased from 27% in articles published before 1988, to 48% in articles published between 1988 and 1995, and to 54% in articles published after 1996.
Meta-Analysis
Of 48 original studies addressing interobserver reproducibility, 22 were considered both high-quality and eligible for inclusion in the meta-analysis according to the predetermined criteria. Twenty-six articles were not included (Fig 3). Fig 4 and Fig 5 give an overview of the single results available for the meta-analysis. Eight original studies addressing intraobserver reproducibility were included in the meta-analysis (Fig 4). Eleven studies were not eligible. Ten studies were low-quality,34, 37, 48, 53, 60, 61, 63, 64, 65, 66 and 1 paper did not use a binary classification of the test outcome.55 Results were only available for 4 procedures (STP, OP, MP, and GA).
Within each procedure, results seem to be comparable and point to midrange to high-range κ values, except for the study of Meijne et al.39 With respect to interobserver reproducibility, most of the results for STP indicate midrange reproducibility (Fig 5). The exception is the results from Boline,58 which showed low-range reproducibility; however, the κ estimate was very imprecise here (large CI). For STC, the results suggest low-range reproducibility, whereas SP shows inconsistent results. Results of OP all suggest mid- to high-range κ values. Most of the results for MP suggest low reproducibility. κ values were inconsistent for GA but had wide, overlapping confidence intervals. We found no significant effect of year of publication, segmental region, standardization of procedures, observer profession or experience, symptomatic status of test population, or number of tests performed on the κ values (data not shown). Thus, our investigation showed that most study characteristics had little influence on the study results. A notable exception was seen when comparing the application conditions, where sitting palpation was associated with slightly smaller κ values and standing palpation was associated with distinctly smaller κ values. These differences were significant (P = .042) for the interobserver studies, and the same tendency was seen in the intraobserver studies (nonsignificant). In the intraobserver analysis, we also observed a tendency toward low mean κ values in studies without parallel testing (κ = 0.23) compared with studies with parallel testing (κ = 0.61) (nonsignificant).
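
The pooling rule stated in the Methods (mean κ within each study, then the mean of these study means, with a CI only when at least 4 studies contribute) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The source says only that the CI is "based on the empirical variation of the mean κ values", so the normal-approximation interval used here is an assumption, and a t-based interval would be wider for so few studies. The function name and the study labels and values in the usage example are hypothetical.

```python
import math
import statistics


def overall_kappa(kappas_by_study, min_studies_for_ci=4, crit=1.96):
    """Pool kappa values: mean within each study, then mean across studies.

    kappas_by_study maps a study label to the list of kappa values it reports.
    Returns (overall_kappa, ci), where ci is None if fewer than
    min_studies_for_ci studies contribute.
    Assumption: normal-approximation CI from the spread of the study means.
    """
    study_means = [statistics.mean(values) for values in kappas_by_study.values()]
    overall = statistics.mean(study_means)
    if len(study_means) < min_studies_for_ci:
        return overall, None
    se = statistics.stdev(study_means) / math.sqrt(len(study_means))
    return overall, (overall - crit * se, overall + crit * se)


# Made-up numbers for illustration only; they are not data from this review.
example = {
    "study A": [0.2, 0.4],
    "study B": [0.5],
    "study C": [0.1, 0.3],
    "study D": [0.4],
}
pooled, ci = overall_kappa(example)
print(round(pooled, 2), ci)
```

Averaging within a study first keeps a study that reports many segmental levels from dominating the overall value, which is consistent with the statement that multiple results within a study cannot be regarded as independent.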
Evidence of Reproducibility
Thirty-one articles were available for the assessment of level of evidence, including 6 studies not reporting a binary outcome (Fig 6).20, 21, 25, 40, 47, 49 Results from the 6 studies using weighted κ or ICC were not directly comparable to the studies using κ, but all 6 studies showed results with similar trends of low interobserver agreement on MP and higher interobserver agreement on evaluation of pain (Table 2). Similarly, we also included 5 low-quality studies, which showed similar trends (Table 2).33, 36, 37, 56, 57
Taking all 31 studies together, strong evidence of clinically acceptable intraobserver reproducibility (κ ≥ 0.4) was found for STP and GA (Table 3). Strong evidence for clinically acceptable interobserver reproducibility was found for OP and STP according to the predefined criteria for assessment of levels of evidence. Strong evidence of clinically unacceptable reproducibility was found for intraobserver MP and interobserver MP and STC. Conflicting evidence was found for interobserver reproducibility of SP and GA. Preliminary evidence of clinically acceptable reproducibility was found for intraobserver OP, and no evidence was found for intraobserver SP and STC.
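
The five evidence categories defined in the Methods can be read as a simple decision rule, sketched below. This is an illustrative reading rather than the authors' procedure. In the review, consistency was judged by visual inspection of forest plots, so it enters here as a flag supplied by the reader. The function names are hypothetical, and the quality scores in the usage example are made up.

```python
def split_by_quality(quality_scores, cut_point=50.0):
    """Count high- and low-quality studies for a quality cut point in percent."""
    n_high = sum(1 for score in quality_scores if score >= cut_point)
    return n_high, len(quality_scores) - n_high


def evidence_level(n_high, n_low, consistent):
    """Map study counts and a consistency judgment to an evidence category.

    consistent is a judgment made by the reviewer (for example, from a forest
    plot); it is not computed here.
    """
    total = n_high + n_low
    if total == 0:
        return "no evidence"
    if total == 1:
        return "preliminary"
    if not consistent:
        return "conflicting"
    if n_high >= 2:
        return "strong"
    # Consistent findings in 1 high-quality plus >=1 low-quality study,
    # or in >=2 low-quality studies.
    return "moderate"


# Made-up quality scores, used to show how moving the cut point from 50% to
# 75% (as in the sensitivity analysis) can change the category.
scores = [75.0, 50.0, 25.0]
for cut in (50.0, 75.0):
    n_high, n_low = split_by_quality(scores, cut)
    print(cut, evidence_level(n_high, n_low, consistent=True))
```

With these illustrative scores, the category drops from strong to moderate when the cut point is raised, which is the kind of shift the sensitivity analysis is designed to detect.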
Sensitivity Analysis
In the meta-analysis, only high-quality studies were included. If low-quality studies reporting binary outcomes and κ values or high-quality studies using κw or ICC had been included, the results would have been unaffected (data not shown). Raising the cut point for adequate methodological quality from 50% to 75%, or any amount of decrease in the cut point, did not affect the weight of the evidence or the overall conclusions, except for intraobserver MP and intraobserver GA, where an increase to 75% would result in conflicting evidence derived from only 2 studies for intraobserver MP and moderate evidence for clinically acceptable intraobserver GA. Raising the cut point for clinical acceptability has an obvious impact, with results for pain being most robust due to high overall κ values.
Discussion
Summary of Results
After reviewing studies dealing with reproducibility of manual palpation of the entire spine, including the SI joints, we found strong evidence for clinically acceptable reproducibility both within and between observers for palpation for OP and STP and within the same observer for GA. Strong evidence for clinically unacceptable levels of reproducibility for intra- and interobserver MP and STC was found. Intraobserver reproducibility was consistently higher than interobserver reproducibility, and reproducibility of palpation for pain response was consistently higher than reproducibility of palpation for motion. The most recent and comprehensive review evaluating the reproducibility of spinal palpation, by Seffinger et al,7 applied different inclusion and general review criteria, and thus, only 27 of 44 articles and 9 of 19 high-quality articles included in this review were evaluated. Furthermore, we included several more recent publications and articles dealing with the SI joints and GA, and we evaluated single results from multiple test regimens.
Our conclusions are based on predefined criteria and an evaluation of consistency of high-quality studies, a method not previously applied, whereas the conclusions by Seffinger et al7 were based on both high- and low-quality studies without an evaluation of consistency. The authors concluded that pain provocation tests are most reliable and that soft tissue paraspinal palpatory diagnostic tests are not reliable. Among the 12 highest-quality articles, pain provocation, motion, and landmark location tests were reliable within the same observer, but not always among observers under similar conditions. Overall, examiners' discipline, experience level, consensus on procedures used, training, or the use of symptomatic subjects did not improve reliability. This is in agreement with our findings. Furthermore, we conclude that palpation of pain is reproducible both within and among observers, whereas MP may be reproducible within the same observer.
Methodological and Clinical Considerations
The experimental design of reproducibility studies has been criticized in previous reviews,4, 5, 6, 7, 68, 69, 70, 71 and we found that 26 of 48 articles were of low methodological quality, used invalid statistical methods, or reported palpation procedures or test results insufficiently. Comparability of the studies included in a review is an important requirement to ensure valid generalizations. We ensured comparability with respect to the palpation procedures used, but the studies were rather heterogeneous with respect to characteristics such as definition of positive findings, segmental region, standardization, occupation, experience, symptomatic status of test population, and parallel testing. However, our investigation showed that most study characteristics had little influence on the study results, with the exception of the application condition. In particular, standing palpation was associated with very low κ values.
Among the reviewed studies, standing palpation is used solely in the “Gillet test” of SI biomechanical dysfunction, and only 2 studies reporting this condition were included in our analysis.39, 59 However, both contributed to the evaluation of the inter- and intraobserver agreement of MP. If we remove these 2 studies, then the average κ for the interobserver agreement increases to 0.19 (0.13-0.26), and the intraobserver agreement increases to 0.44 (0.14-0.73), such that the intraobserver agreement of MP can be regarded as acceptable. Poor reproducibility of MP may reflect the design of reproducibility studies, rather than the quality of the palpation procedure.29, 30, 72 Greater reproducibility may be attained by allowing positive findings in a neighboring spinal segment to count in assessing agreement.29 However, this implies that we define a new, different diagnostic test, which then requires a clinical rationale of test meaningfulness beyond just an increase in κ values.8 Further, parallel testing (test regimens) seems to aid the observer in making the clinical decision, thus enhancing reproducibility,30, 42 a tendency we could also observe in our data. The acceptable intraobserver reproducibility for GA is also in line with this finding. However, when evaluating a combination of tests, information is only given about the reproducibility of the single test as part of this exact combination of tests.14, 73 Moreover, we must be aware that conclusions on a single test from a study involving several tests may be valid only if the test is applied as part of this exact combination of tests. From a clinical perspective, increased reproducibility with parallel testing indicates that, at this point, clinicians should not base their diagnosis on a single clinical examination finding such as palpation but, rather, conduct a range of tests.
It is, however, premature to make clinical guidelines on how to use palpation because many aspects of palpation, such as the validity, still need to be investigated. The reproducibility of palpation for pain response is consistently higher than palpation for motion and, consistently, substantially higher within an observer than among different observers. However, both palpatory pain studies and intraobserver studies in general have inherent problems with blinding of observers. In intraobserver studies, conscious and unconscious cues may render blinding of the observers impossible, and the independence of measures cannot be guaranteed. In palpatory pain studies, blinding of subjects is impossible. Both situations imply the risk of overestimating reproducibility. It should also be noted that intraobserver reproducibility is somewhat higher than interobserver reproducibility by definition (depending on the magnitude of observer-by-subject interaction).74 A dilemma between high internal validity and clinical applicability arises when designing studies of reproducibility. For example, training studies contrast maximal (ideal) reproducibility with actual reproducibility in practice. To enhance the internal validity, rigid testing conditions should be set up with consideration given to blinding, randomization, standardization and training, and parallel testing. However, rigid enforcement of testing conditions often diverges from the clinical situation and, hence, may reduce the external validity. In a clinical situation, a mix of both asymptomatic and symptomatic patients will most likely present to practitioners of manual medicine.
Therefore, the study population should consist of a mix of both symptomatic and asymptomatic subjects so that the reproducibility of the testing procedure has a relation to the characteristics of the study population.14 Finally, in spite of their use in everyday clinical routine, test procedures do not necessarily evaluate the clinical entity they are intended to evaluate, and it is therefore important to discuss the content of the test procedure.14, 75
Statistical Considerations
κ is widely accepted as the statistical method of choice for evaluating agreement between 2 observers for a binary classification.8 It is, however, not without problems to use κ as the sole measure of observer agreement because information is lost when a 4-fold table is summarized into 1 number. Consequently, if a moderate κ value is obtained in a study of reproducibility, we do not know whether it is due to a difference in prevalence estimates between observers or whether observers lack agreement in spite of similar prevalence. κ has been criticized for its dependence on the prevalence of positive findings, which limits its usefulness in meta-analyses, because studies with varying prevalence are typically compared. However, the composition of the study population may have greater impact on κ than the prevalence of positive findings.9 Both a binary outcome and a reported κ value were required for studies to be part of our meta-analysis. However, binary outcomes may vary according to the definition of positive findings (ie, prevalence is directly dependent on the definition of positive findings). For example, if the observer is asked to identify any hypomobile segment(s) in a spinal region, the prevalence can vary from 0% to 100%, depending on the study population. If the observer is to identify the most hypomobile segment, the overall prevalence of positive findings will be 100%, but at any particular segment under investigation, the prevalence of the most hypomobile can be 0% to 100%.
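
The loss of information when a 4-fold table is reduced to a single κ, and the dependence of κ on prevalence, can be made concrete with a short sketch. It uses the standard definition of Cohen's κ, (p_o − p_e) / (1 − p_e). The function name and the two tables are hypothetical and chosen only for illustration; they are not data from the reviewed studies.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 4-fold table of two observers' binary ratings.

    a: both observers positive      b: observer 1 positive, observer 2 negative
    c: observer 1 negative, observer 2 positive      d: both observers negative
    """
    n = a + b + c + d
    p_o = (a + d) / n  # observed agreement
    # Chance agreement from the marginal totals of each observer.
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Two made-up tables with identical observed agreement (80%) but different
# prevalence of positive findings (50% vs 15% for each observer).
balanced = cohen_kappa(40, 10, 10, 40)
low_prevalence = cohen_kappa(5, 10, 10, 75)
print(round(balanced, 2), round(low_prevalence, 2))
```

Both tables show the same percent agreement, yet κ differs markedly because chance agreement rises as positive findings become rarer. A single κ value therefore cannot reveal which situation produced it.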
However, we found no association between the prevalence of positive findings and κ values. This supports the view that the composition of the study populations is probably of greater importance than the prevalence of positive findings, as suggested by Vach.9 Different words and schemes have been used to evaluate the strength of reproducibility, but there are no definitive guidelines for interpreting good concordance.8, 76 Moreover, little research has been done to establish minimal, clinically acceptable reproducibility, and perhaps more important than qualifying the strength of concordance, the quantitative reproducibility indices need to be evaluated in terms of their clinical application.8
Limitations of this Review
Different methodologies have been advocated for systematic reviews of trials addressing therapeutic efficacy,12 but little consensus exists when it comes to assessing the quality of reproducibility studies. We have chosen to evaluate the strength of evidence based on a best-evidence synthesis method, and this is one of the main differences between this review and previously published reviews on the same topic. Heterogeneity across studies, in terms of test procedures, inclusion criteria, study design, and presentation of results, may be masked by the best-evidence approach. Considerable heterogeneity in study characteristics was noted across studies included in this review. However, despite this heterogeneity, the meta-analysis showed very consistent overall findings and only moderate impact of the specific design characteristics on the study outcomes. The exclusion from the meta-analysis of studies that did not report a binary outcome is another important difference between this and previous reviews. To compare studies of reproducibility, the same type of outcome and method of statistics must be applied. On this account, we had to exclude 5 high-quality studies from the meta-analysis.
Results from these studies are not directly comparable to the included studies, but all 5 articles show results with similar trends of low interobserver agreement on MP and higher interobserver agreement on evaluation of pain; they were included in the level of evidence assessment. The restricted number of articles causes the strength of evidence to be preliminary or nonexistent in 3 categories. In return, the power of the conclusions with respect to pain and motion testing is compelling. However, results were, in some categories, based on a relatively small number of original studies, making the conclusions very sensitive to just a few future high-quality studies with different results. A κ value was reported in all high-quality studies using a binary classification. Hence, there was no need to calculate these from a published 4-fold table. No attempts were made to retrieve additional, original results or materials from the primary authors. Although every effort was made to find all published reproducibility studies, selection bias may have occurred because we included only English-language articles. Publication bias may have resulted in an overestimation of test reproducibility because studies arriving at positive conclusions are more likely to get published.77, 78 Furthermore, reviewer bias is also a possible limitation of this review. Reviewers were not blinded to the authors or the results of the individual trials when the methodological scoring was performed because of our familiarity with the literature. Despite acceptable study quality according to our criteria, many trials still had methodological limitations or, at best, inadequate reporting of methods. Nonetheless, reproducibility of spinal manual palpation has been very thoroughly investigated and more than 40 original articles have been evaluated in this review. 
However, to shed light on the clinical usefulness of palpation, its validity needs to be investigated, and innovative research that addresses the accompanying problem of selecting a gold standard in motion testing is warranted. Future research should also address the role of palpation in the overall assessment of patients with neck and back pain and its importance as part of the complete clinical evaluation.

Conclusions

Palpation for pain is reproducible at a clinically acceptable level, both within the same observer and among observers. Palpation for GA is reproducible within the same observer but not among different observers. The level of evidence to support these conclusions is strong. The reproducibility of MP, STC, and SP is not clinically acceptable. The level of evidence is strong for interobserver reproducibility of MP and STC, whereas no evidence or conflicting evidence exists for SP and intraobserver reproducibility of STC. Results are overall robust with respect to the predefined levels of acceptable quality. However, the results are sensitive to changes in the preset level of clinically acceptable reproducibility and to the number of included studies.

Practical Applications

• Palpation for pain is reproducible between observers at a clinically acceptable level.
• Most spinal palpatory procedures investigated are reproducible within the same observer but not between observers.

Appendix A.

| Reference | Test procedure | Segmental level/patient position | Study population (no. [M/F], category, symptomatic status) | Examiners (no., occupation, experience) | Standardization | Additional procedures | Definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results/κ (PA) | Quality score |
|---|---|---|---|---|---|---|---|---|---|---|
| Christensen et al29 | MP STP | T1-T8 Sitting + prone | 107 (68/39) Outpatient Sympt + Asympt | 2 Chiropractors; experience NR | + | – | Abnormality κ > 0.5 | κ (expanded κ): +/+ | MP: 0.13-0.45 (0.60-0.68) (82%-88%); STP: 0.34-0.57 (0.63-0.77) (81%-88%) | 100% |
| Horneij et al30 | MP STP | T7-L5 Prone | 84 (sex NR) Gen pop Sympt + Asympt | 3 Physiotherapists 18-25 y | + | Muscle length | Pain | κ: −/+ | MP: 0.56-0.78 (78%-89%); STP: 0.64-0.78 (83%-89%) | 50.0% |
| French et al34 | GA | T11-L5 + SI Observers own choice | 19 (14/5) Recruitment NR Sympt | 5 Chiropractors 5-18 y | − | History Posture X-ray Neuro Clin | Joint in need of adjustment; allows ± 1 segment | κ: −/− | −0.21 to 1.00 (30%-100%) | 25.0% |
| Vincent-Smith and Gibbons37 | MP | SI Standing | 9 (5/4) Edu/staff Asympt | 9 Osteopathic stud 4-5 y | + | – | Unsymmetrical movement, L> < R | κ: −/− | 0.46 (42%) | 25.0% |
| Hawk et al38 | GA | T12-S1 Observers own choice | 18 (14/4) Edu/staff Sympt + Asympt | 4 Chiropractors 2 > 20 y 2 < 3 y | − | Manual examination | Joint in need of adjustment (segment and functional unit) | κ: +/− | Segment: −0.1 to 0.85; unit: −0.1 to 0.77 | 50.0% |
| Meijne et al39 | MP | SI Standing | 41 (41/0) Edu/staff Sympt + Asympt | 2 Physiotherapy stud Experience NR | + | – | Fixation | κ: −/+ | 0.03-0.08 (71%-83%) | 75.0% |
| Cattrysse et al41 | GA | Cx Supine + sitting | 11 (sex NR) Research Status NR | 4 Manual practitioners 1.5-13 y | − | 3 tests of instability | Instability | κ: −/− | −0.27 to 1.0 (63.6%-100%) | 75.0% |
| Inscoe et al48 | MP | T12-S1 Side posture | 6 (2/4) Edu/staff Sympt | 2 Physiotherapists 4-5 y | + | – | Mobility | Percent agreement | – | 0% |
| Paydar et al51 | MP OP | SI Sitting | 32 (17/15) Edu/staff Asympt | 2 Chiropractic stud 1 y | + | Posture | Restriction Tenderness | κ: −/se | MP: 0.29 (58%); OP: 0.91 (97%) | 50.0% |
| Mior et al53 | MP | SI | >15 (sex NR) Recruitment NR Status NR | 74 Chiropractic stud Experience NR 2 Chiropractors >5 y | +/− | – | Fixation | κ: −/− | NR | 25.0% |
| Leboeuf54 | MP OP STP | Lx + SI Sitting | 45 (29/16) Gen pop Sympt | 4 Chiropractic stud Experience NR | NR | – | NR | Percent agreement | – | 25.0% |
| Herzog et al55 | MP | SI Standing | 11 (sex NR) Prim Care Sympt + Asympt | 10 Chiropractors 1-11 y | + | Gait analysis | Fixation, 3-point scale | Percentage agreement, χ2 | – | 50.0% |
| Mootz et al57 | MP | Lx Sitting | 60 (sex NR) Edu/staff Status NR | 2 Chiropractors 7 + 10 y | + | – | Fixation | κ: +/− | −0.09 to 0.48 | 25.0% |
| Love and Brodeur60 | MP | T1-L5 Sitting | 32 (32/0) Edu/staff Status NR | 8 Chiropractic stud 1 y | − | – | Most hypomobile motor unit | Pearson | – | 0% |
| Carmichael59 | MP | SI Standing | 54 (sex NR) Edu/staff Asympt | 10 stud. 1-3 y | + | – | Fixation | κ: +/se | 0.31 (90%) | 50.0% |
| Bergstrøm and Courtis61 | MP | Lx Sitting | 100 (sex NR) Edu/staff Status NR | 2 Chiropractic stud. Experience NR | − | – | Fixation | Percent agreement | – | 0% |
| Deboer et al63 | Insuff descrip | Cx Sitting | 40 (40/0) Research + Edu/staff Asympt | 3 Chiropractors | − | – | Fixation Pain Muscle | κ | – | 25.0% |
| Mior and King62 | MP | C1 Supine | 62 (sex NR) Edu/staff Status NR | 2 Chiropractic stud Experience NR | NR | – | Fixation | κ: +/− | 0.37-0.52 (71%-79%) | 50.0% |
| Gonella et al66 | MP | T12-S1 | 5 (0/5) Edu/staff Asympt | 5 Physiotherapists 3-20 y | + | – | Mobility, 7-point scale | Mean, SD | – | 0% |
Appendix B.

| Reference | Test procedure | Segmental level/patient position | Study population (no. [M/F], category, symptomatic status) | Examiners (no., occupation, experience) | Standardization | Additional procedures | Definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results/κ (PA) | Quality score |
|---|---|---|---|---|---|---|---|---|---|---|
| Pool et al20 | MP OP | Cx Supine | 32 (12/20) Primary care Sympt | 2 Physiotherapists Experience NR | + | Clin | Mobility Pain, 11-point scale κ > 0.4, ICC > 0.75 | κ and ICC (2,1): +/− | MP: −0.09 to 0.63 (48%-90%); OP: 0.22-0.80 (40.6%-87.4%) | 50% |
| Hicks et al27 | MP OP | Lx Prone | 63 (25/38) Outpatient + Research Sympt | 3 Physiotherapists 1 Physiotherapist/chiropractor 3-8 y | + | Clin General mobility test | Mobility Pain | κ: +/+ | MP: −0.02 to 0.26 (52%-69%); OP: 0.25-0.55 (65%-87%) | 50% |
| Downey et al28 | MP | Lx Prone | 60 (28/32) Prim Care Sympt | 6 Physiotherapists 3-11 y | − | History Clin | Most symptomatic level | κ: +/+ | 0.37 | 50% |
| Sebastian and Chovvath26 | MP | L5 Sitting + prone | 31 (sex NR) Recruitment NR Sympt | 2 Physiotherapists 5-8 y | + | – | Dysfunction | κ: +/− | 0.69 | 16.7% |
| Christensen et al29 | MP STP | T1-T8 Sitting + prone | 107 (68/39) Outpatient Sympt + Asympt | 2 Chiropractors Experience NR | + | – | Abnormality κ > 0.5 | κ (expanded κ): +/+ | MP: −0.03 to 0.0 (0.22-0.24) (68%-80%); STP: 0.38 (0.67-0.70) (77%-79%) | 100% |
| Horneij et al30 | MP STP | T7-L5 Prone | 84 (sex NR) Gen pop Sympt + Asympt | 3 Physiotherapists 18-25 y | + | Muscle length | Pain | κ: −/+ | MP: 0.12-0.49 (61%-77%); STP: 0.31-0.88 (80%-95%) | 66.7% |
| Marcotte et al31 | MP | Cx Supine | 3 (sex NR) Edu/staff Asympt | 24 Chiropractic stud + 1 Chiropractor Experience NR | + | – | Fixation Inclination = 6° | κ: +/se | 0.337-0.682 (81%-90%) | 16.7% |
| Comeaux et al32 | MP STC | C2-T8 Sitting | 54 (27/28) Gen pop Status NR | 3 Occupation NR >10 y | − | – | The most dysfunctional segment | κ: +/− | NR | 50.0% |
| Ghoukassian et al33 | STC | Tx Sitting | 19 (19/0) Recruitment NR Asympt | 10 Osteopathic stud 2 y | + | – | The most significant area of tissue tension | κ: −/− | 0.07 | 33.3% |
| French et al34 | GA | T11-L5 + SI Observers own choice | 19 (14/5) Recruitment NR Sympt | 5 Chiropractors 5-18 y | − | History Posture X-ray Neuro Clin | Joint in need of adjustment; allows ± 1 segment | κ: −/− | −0.16 to 0.25 (48%-64%) | 50.0% |
| Smedmark and Wallin35 | MP | C1-3 + C7-T1 Sitting + prone + side lying | 61 (15/46) Prim Care Sympt | 2 Physiotherapists >25 y | + | 4 tests of mobility | Stiffness (reduced mobility) | κ: −/− | 0.28-0.43 (79%-87%) | 66.7% |
| Van Suijlekom et al36 | SP OP STP | Cx Position NR | 24 (13/11) Outpatient + Research Sympt | 2 Neurologists Experience NR | − | History Clin Tender points | Facet joint pain Impairment | κ: −/− | SP: 0.14-0.37; OP: 0.0-1.0; STP: 0.35-0.87 | 33.3% |
| Vincent-Smith and Gibbons37 | MP | SI Standing | 9 (5/4) Edu/staff Asympt | 9 Osteopathic stud. 4-5 y | + | – | Unsymmetrical movement, L> < R | κ: −/− | 0.05 (42%) | 16.7% |
| Hawk et al38 | GA | T12-S1 Observers own choice | 18 (14/4) Edu/staff Sympt + Asympt | 4 Chiropractors 2 > 20 y 2 < 3 y | − | Manual examination | Joint in need of adjustment (segment and functional unit) | κ: +/− | Segment: −0.42 to 0.44; unit: −0.39 to 0.54 | 66.7% |
| Meijne et al39 | MP | SI Standing | 41 (41/0) Edu/staff Sympt + Asympt | 2 Physiotherapy stud. Experience NR | + | – | Fixation | κ: −/+ | −0.05 to 0.0 (76%-77%) | 66.7% |
| Fjellner et al21 | MP | C0-C5 Sitting + supine | 48 (8/40) Edu/staff + Gen pop Asympt | 2 Physiotherapists 6 + 12 y | + | Clin | If not normal κ > 0.4 | κ(w): +/+ | −0.16 to 0.49 (41%-92%) | 66.7% |
| Lundberg and Gerdle40 | MP | Lx Side posture | 156 (0/156) Gen pop Status NR | 3 Physiotherapists Experience NR | + | Posture Clin | Mobility, 5-point scale | κ(w): −/+ | 0.42-0.75 | 66.7% |
| Strender et al22 | MP SP OP STP STC | C0-C3 Supine | 50 (13/37) Gen pop Sympt + Asympt | 2 Physiotherapists 21 + 23 y | + | Clin | Mobility Consistency Pain Difference between L/R, the most pronounced side κ > 0.4 | κ: +/+ | MP: 0.05-0.15 (26%-44%); SP: 0.24 (70%); OP: 0.37 (58%); STP: 0.31-0.52 (62%-68%); STC: −0.18 (36%) | 75.0% |
| Strender et al23 | MP STP | Lx Prone | 71 (28/43) Outpatient + Prim Care Sympt | 2 Physiotherapists 2 Physicians Experience NR | + | Clin Neuro | Mobility Normality versus pathology κ > 0.4 | κ: +/+ | MP: PT 0.38-0.75 (72%-88%), MD −0.08 to 0.24 (48%-62%); STP: PT 0.27-0.56 (72%-86%), MD 0.22-0.40 (71%-76%) | 66.7% |
| Cattrysse et al41 | GA | Cx Supine + sitting | 11 (sex NR) Research Status NR | 4 Manual practitioners 1.5-13 y | − | 3 tests of instability | Instability | κ: −/− | −0.64 to 1.0 (18%-100%) | 83.3% |
| Jull and Zito42 | GA | C0-C3 Position NR | 40 (12/28) Outpatient Sympt + Asympt | 7 Physiotherapists Experience NR | − | Manual examination | Most dysfunctional segment Order of magnitude | κ: −/− | 0.25-1.0 | 66.7% |
| McPartland and Goodridge43 | MP SP STC | C0-C3 Position NR | 7 + 11 (1/6 + 5/6) Research + Edu/staff Sympt + Asympt | 2 Osteopaths 10 + 40 y 36 Osteopathic stud | NR | – | Dysfunction. Facet joint tenderness. Tissue texture. (Rating 0-10) | κ: −/− | MP: 0.34 (67%); SP: 0.53 (77%); STC: 0.19 (70%) | 58.3% |
| Tuchin et al44 | GA | C1-C7 Position NR | 53 (sex NR) Edu/staff Sympt + Asympt | 8 Chiropractors 2-14 y | − | Manual examination | Vertebral dysfunction | Logistic regression, χ2 | – | 16.7% |
| Haas45 | MP | T3-T12 Sitting | 73 (2/3 males) Edu/staff Sympt/Asympt | 2 Chiropractors >15 y | + | – | End play restriction | κ: −/se | 0.14 | 100% |
| Lindsay46 | MP | Lx + SI Supine + prone | 8 (sex NR) Gen pop Asympt | 2 Physiotherapists 6 + 10 y | − | Posture Clin Muscle length | Beyond slight anomaly | κ: +/− | Lx: −0.30 to 0.0 (14%-50%); SI: 0.0-0.60 (75%-86%) | 66.7% |
| Binkley et al47 | MP | L1-S1 Prone | 18 (9/9) Outpatient Sympt | 6 Physiotherapists 6-13 y | + | – | Motion, 9-point scale | ICC: −/+ | 0.09-0.25 | 33.3% |
| Inscoe et al48 | MP | T12-S1 Side posture | 6 (2/4) Edu/staff Sympt | 2 Physiotherapists 4-5 y | + | – | Mobility | Percent agreement | – | 16.7% |
| Maher and Adams49 | MP OP | Lx Prone | 90 (34/56) Prim Care Sympt | 6 Physiotherapists 8-21 y | − | – | Stiffness, 11-point scale Pain, 11-point scale | ICC (1,1): +/+ | MP: −0.40 to 0.73; OP: 0.27-0.85 | 58.3% |
| Hubka and Phelan50 | SP | C0-C7 Sitting | 30 (11/19) Private clinic Sympt | 2 Chiropractors 1 + 5 y | − | – | The most tender spot | κ: +/+ | 0.68 (77%) | 75.0% |
| Paydar et al51 | MP OP | SI Sitting | 32 (17/15) Edu/staff Asympt | 2 Chiropractic stud. 1 y | + | Posture | Restriction Tenderness | κ: −/se | MP: 0.09 (34%); OP: 0.73 (91%) | 50.0% |
| Boline et al52 | OP STP | Lx Prone | 28 (+/+) Prim Care Sympt | 3 Chiropractors Experience NR | NR | Posture Dermothermography Surface electromyography | Presence of abnormality | κ: +/− | OP: 0.48-0.90 (75%-96%); STP: 0.40-0.78 (89%) | 50.0% |
| Keating et al24 | MP SP OP STP STC | Lx Prone + sitting | 46 (20/26) Recruitment NR Sympt + Asympt | 3 Chiropractors 2-10 y | + | Posture Dermothermography Temperature | Misalignment Pain Fixation κ > 0.4 | κ: +/− | MP: 0.07-0.09; SP: 0.0; OP: 0.48; STP: 0.30; STC: 0.07 | 75.0% |
| Mior et al53 | MP | SI | >15 (sex NR) Recruitment NR Status NR | 74 Chiropractic stud. Experience NR 2 Chiropractors >5 y | +/− | – | Fixation | κ: −/− | NR | 16.7% |
| Leboeuf54 | MP OP STP | Lx + SI Sitting | 45 (29/16) Gen pop Sympt | 4 Chiropractic stud Experience NR | NR | – | NR | Percent agreement | – | 16.7% |
| Herzog et al55 | MP | SI Standing | 11 (sex NR) Prim Care Sympt + Asympt | 10 Chiropractors 1-11 y | + | Gait analysis | Fixation, 3-point scale | Percentage agreement, χ2 | – | 50.0% |
| Nansel et al56 | MP | Middle + lower Cx Sitting + supine | 270 (approximately 50% males) Edu/staff Asympt | 4 Chiropractors Experience NR | + | – | The side of greatest resistance (L> < R), marked segment | κ: +/− | 0.01 (46%-54%) | 16.7% |
| Mootz et al57 | MP | Lx Sitting | 60 (sex NR) Edu/staff Status NR | 2 Chiropractors 7 + 10 y | + | – | Fixation | κ: +/− | −0.17 to 0.17 | 33.3% |
| Boline58 | MP STP STC | Lx Sitting | 50 (27/23) Edu/staff + Outpatient + Prim Care Sympt + Asympt | 2 Chiropractors Experience NR | + | – | Presence of severe abnormality, fixation | κ: +/− | MP: −0.05 to 0.31 (78%-91%); STP: −0.03 to 0.49 (90%-96%); STC: 0.10-0.31 (70%) | 66.7% |
| Carmichael59 | MP | SI Standing | 54 (sex NR) Edu/staff Asympt | 10 stud. 1-3 y | + | – | Fixation | κ: +/se | 0.02 (85%) | 50.0% |
| Love and Brodeur60 | MP | T1-L5 Sitting | 32 (32/0) Edu/staff Status NR | 8 Chiropractic stud 1 y | − | – | Most hypomobile motor unit | Pearson | – | 16.7% |
| Viikari-Juntura25 | OP STP | Cx Seated | 69 (29/23) Outpatient Sympt | 1 Physician 1 Physiotherapist Experience NR | + | Neuro Clin | Tenderness Rating (0-3) κ > 0.4 | κ(w): +/− | OP: 0.47-0.52; STP: 0.24-0.56 | 50.0% |
| Bergstrøm and Courtis61 | MP | Lx Sitting | 100 (sex NR) Edu/staff Status NR | 2 Chiropractic stud. Experience NR | − | – | Fixation | Percent agreement | – | 0% |
| Mior and King62 | MP | C1 Supine | 62 (sex NR) Edu/staff Status NR | 2 Chiropractic stud Experience NR | NR | – | Fixation | κ: +/− | 0.15 (61%) | 50.0% |
| Deboer et al63 | Insuff descrip | Cx Sitting | 40 (40/0) Research + Edu/staff Asympt | 3 Chiropractors Experience NR | − | – | Fixation Pain Muscle | κ | – | 50.0% |
| Potter and Rothstein64 | MP | SI Standing + sitting + side posture + prone | 17 (10/7) Outpatient Sympt | 8 Physiotherapists 2-18 y | + | 13 SI joint tests | Restriction | Percentage agreement, χ2 | – | 33.3% |
| Johnston et al65 | STC | C7-T12 Standing | 30 (sex NR) Edu/staff Status NR | 1 Osteopath 5 Osteopathic stud Experience NR | NR | – | Decreased rebound/dullness | Percent agreement | – (79%-86%) | 0% |
| Gonella et al66 | MP | T12-S1 | 5 (0/5) Edu/staff Asympt | 5 Physiotherapists 3-20 y | + | – | Mobility, 7-point scale | Mean, SD | – | 16.7% |
| Wiles67 | MP | SI | 46 (sex NR) Edu/staff Asympt | 12 Chiropractors average 2.75 y | NR | – | Restriction, 5-point scale | Percentage agreement, Pearson | – | 0% |
Appendix C. Intra-observer reproducibility studies

Appendix D. Inter-observer reproducibility studies

References

1. Bergmann TF, Petersen DH. Joint principles and procedures. In: Bergmann TF, Petersen DH, Lawrence DJ, editors. Chiropractic technique: principles and procedures. New York: Churchill Livingstone Inc; 1993. p. 51–121.
2. Schafer RC, Faye LJ. Introduction to the dynamic chiropractic paradigm. In: Schafer RC, Faye LJ, editors. Motion palpation and chiropractic technique. 1st ed. Huntington Beach, Calif: The Motion Palpation Institute; 1989. p. 1–41.
3. Maitland GD. Vertebral manipulation. 3rd ed. London: Butterworths; 1977.
4. Hestbaek L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manipulative Physiol Ther. 2000;23:258–275.
5. Huijbregts PA. Spinal motion palpation: a review of reliability studies. J Man Manip Ther. 2002;10:24–39.
6. van der Wurff P, Hagmeijer RH, Meyne W. Clinical tests of the sacroiliac joint. A systemic methodological review. Part 1: reliability. Man Ther. 2000;5:30–36.
7. Seffinger MA, Najm WI, Mishra SI, Adams A, Dickerson VM, Murphy LS, et al. Reliability of spinal palpation for diagnosis of back and neck pain: a systematic review of the literature. Spine. 2004;29:E413–E425.
8. Haas M. Statistical methodology for reliability studies. J Manipulative Physiol Ther. 1991;14:119–132.
9. Vach W. The dependence of Cohen's kappa on the prevalence does not matter. J Clin Epidemiol. 2005;58:655–661.
10. Vaughan B. Inter-examiner reliability in detecting cervical spine dysfunction: a short review. J Osteopath Med. 2002;5:24–27.
11. van Tulder MW, Assendelft WJ, Koes BW, et al. Method guidelines for systematic reviews in the Cochrane collaboration back review group for spinal disorders. Spine. 1997;22:2323–2330.
12. Clarke M, Oxmann AD. Cochrane reviewers' handbook 4.2.0. Oxford: Cochrane Collaboration; 2003.
13. Hoogendoorn WE, van Poppel MN, Bongers PM, Koes BW, Bouter LM. Systematic review of psychosocial factors at work and private life as risk factors for back pain. Spine. 2000;25:2114–2125.
14. Patijn J. Reproducibility and validity studies of diagnostic procedures in manual/musculoskeletal medicine. In: International Federation for Manual/Musculoskeletal Medicine Scientific committee. Protocol Formats. 2004.
15. Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ. 2001;323:157–162.
16. Irwig L, Macaskill P, Glasziou P, et al. Meta-analytic methods for diagnostic test accuracy. J Clin Epidemiol. 1995;48:119–130.
17. Altman DG. Some common problems in medical research. In: Altman DG, editor. Practical statistics for medical research. London: Chapman & Hall; 1991. p. 396–439.
18. Bigos S, Bowyer O, Braen G, et al. Acute low back problems in adults. Clinical Practice Guideline No. 14. Rockville (Md): Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health and Human Services; 1994.
19. Hartvigsen J, Lings S, Leboeuf-Yde C, Bakketeig L. Psychosocial factors at work in relation to low back pain and consequences of low back pain; a systematic, critical review of prospective cohort studies. Occup Environ Med. 2004;61:e2.
20. Pool JJ, Hoving JL, De Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther. 2004;27:84–90.
21. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther. 1999;22:511–516.
22. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther. 1997;20:516–520.
23. Strender LE, Sjoblom A, Sundell K, Ludwig R, Taube A. Interexaminer reliability in physical examination of patients with low back pain. Spine. 1997;22:814–820.
24. Keating JC, Bergmann TF, Jacobs GE, Finer BA, Larson K. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality. J Manipulative Physiol Ther. 1990;13:463–470.
25. Viikari-Juntura E. Interexaminer reliability of observations in physical examinations of the neck. Phys Ther. 1987;67:1526–1532.
26. Sebastian D, Chovvath R. Reliability of palpation assessment in non-neutral dysfunctions of the lumbar spine. Orthop Phys Ther Pract. 2004;16:23–26.
27. Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003;84:1858–1864.
28. Downey B, Nicholas T, Niere K. Can manipulative physiotherapists agree on which lumbar level to treat based on palpation? Physiotherapy. 2003;89:74–81.
29. Christensen HW, Vach W, Manniche C, Haghfelt T, Hartvigsen L, Høilund-Carlsen PF. Palpation of the upper thoracic spine: an observer reliability study. J Manipulative Physiol Ther. 2002;25:285–292.
30. Horneij E, Hemborg B, Johnsson B, Ekdahl C. Clinical tests on impairment level related to low back pain: a study of test reliability. J Rehabil Med. 2002;34:176–182.
31. Marcotte J, Normand MC, Black P. The kinematics of motion palpation and its effect on the reliability for cervical spine rotation. J Manipulative Physiol Ther. 2002;25:E7.
32. Comeaux Z, Eland D, Chila A, Pheley A, Tate M. Measurement challenges in physical diagnosis: refining interrater palpation, perception and communication. J Bodyw Mov Ther. 2001;5:245–253.
33. Ghoukassian M, Nicholls B, McLaughlin P. Inter-examiner reliability of the Johnson and Friedman percussion scan of the thoracic spine. J Osteopath Med. 2001;4:15–20.
34. French SD, Green S, Forbes A. Reliability of chiropractic methods commonly used to detect manipulable lesions in patients with chronic low-back pain. J Manipulative Physiol Ther. 2000;23:231–238.
35. Smedmark V, Wallin M. Inter-examiner reliability in assessing passive intervertebral motion of the cervical spine. Man Ther. 2000;5:97–101.
36. van Suijlekom HA, de Vet HC, van den Berg SG, Weber WE. Interobserver reliability in physical examination of the cervical spine in patients with headache. Headache. 2000;40:581–586.
37. Vincent-Smith B, Gibbons P. Inter-examiner and intra-examiner reliability of standing flexion test. Man Ther. 1999;4:87–93.
38. Hawk C, Phongphua C, Bleecker J, Swank L, Lopez D, Rubley T. Preliminary study of the reliability of assessment procedures for indications for chiropractic adjustments of the lumbar spine. J Manipulative Physiol Ther. 1999;22:382–389.
39. Meijne W, van Neerbos K, Aufdemkampe G, van der Wurff P. Intraexaminer and interexaminer reliability of the Gillet test. J Manipulative Physiol Ther. 1999;22:4–9.
40. Lundberg G, Gerdle B. The relationships between spinal sagittal configuration, joint mobility, general low back mobility and segmental mobility in female homecare personnel. Scand J Rehabil Med. 1999;31:197–206.
41. Cattrysse E, Swinkels RAH, Oostendorp RAB, Duquet W. Upper cervical instability: are clinical tests reliable? Man Ther. 1997;2:91–97.
42. Jull G, Zito G. Inter-examiner reliability to detect painful upper cervical joint dysfunction. Aust J Physiother. 1997;43:125–129.
43. McPartland JM, Goodridge JP. Counterstrain and traditional osteopathic examination of the cervical spine compared. J Bodyw Mov Ther. 1997;1:173–178.
44. Tuchin P, Hart J, Colman R, Johnson C, Gee A, Edwards I, et al. Interexaminer reliability of chiropractic evaluation for cervical spine problems: a pilot study. Chiropr J Aust. 1996;5:23–29.
45. Haas M. Reliability of manual end-play palpation of the thoracic spine. Chiropr Tech. 1995;7:120–124.
46. Lindsay DM. Interrater reliability of manual therapy assessment techniques. Phys Ther Can. 1995;47:173–180.
47. Binkley J, Stratford PW, Gill C. Interrater reliability of lumbar accessory motion mobility testing. Phys Ther. 1995;75:786–792.
48. Inscoe EL, Witt PL, Gross MT, Mitchell RU. Reliability in evaluating passive intervertebral motion of the lumbar spine. J Man Manip Ther. 1995;3:135–143.
49. Maher C, Adams R. Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther. 1994;74:801–809.
50. Hubka MJ, Phelan SP. Interexaminer reliability of palpation for cervical spine tenderness. J Manipulative Physiol Ther. 1994;17:591–595.
51. Paydar D, Thiel H, Gemmell H. Intra- and interexaminer reliability of certain pelvic palpatory procedures and the sitting flexion test for sacroiliac joint mobility and dysfunction. J Neuromusculoskel Syst. 1994;2:65–69.
52. Boline PD, Haas M, Meyer JJ, Kassak K, Nelson C, Keating JC. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality: part II. J Manipulative Physiol Ther. 1993;16:363–374.
53. Mior SA, McGregor M, Schut B. The role of experience in clinical accuracy. J Manipulative Physiol Ther. 1990;13:68–71.
54. Leboeuf C. Chiropractic examination procedures: a reliability and consistency study. J Aust Chiropr Assoc. 1989;19:101–104.
55. Herzog W, Read LJ, Conway PJ, Shaw LD, McEwen MC. Reliability of motion palpation procedures to detect sacroiliac joint fixations. J Manipulative Physiol Ther. 1989;12:86–92.
56. Nansel DD, Peneff AL, Jansen RD, Cooperstein R. Interexaminer concordance in detecting joint-play asymmetries in the cervical spines of otherwise asymptomatic subjects. J Manipulative Physiol Ther. 1989;12:428–433.
57. Mootz RD, Keating JC, Kontz HP, Milus TB, Jacobs GE. Intra- and interobserver reliability of passive motion palpation of the lumbar spine. J Manipulative Physiol Ther. 1989;12:440–445.
58. Boline PD. Interexaminer reliability of palpatory evaluations of the lumbar spine. Am J Chiropr Med. 1988;1:5–11.
59. Carmichael JP. Inter- and intra-examiner reliability of palpation for sacroiliac joint dysfunction. J Manipulative Physiol Ther. 1987;10:164–171.
60. Love RM, Brodeur RR. Inter- and intra-examiner reliability of motion palpation for the thoracolumbar spine. J Manipulative Physiol Ther. 1987;10:1–4.
61. Bergstrøm E, Courtis G. An inter- and intra-examiner reliability study of motion palpation of the lumbar spine in lateral flexion in the seated position. Eur J Chiropr. 1986;34:121–141.
62. Mior SA, King R. Intra and interexaminer reliability of motion palpation in the cervical spine. J Can Chiropr Assoc. 1985;29:195–199.
63. Deboer KF, Harmon R, Tuttle CD, Wallace H. Reliability study of detection of somatic dysfunctions in the cervical spine. J Manipulative Physiol Ther. 1985;8:9–16.
64. Potter NA, Rothstein JM. Intertester reliability for selected clinical tests of the sacroiliac joint. Phys Ther. 1985;65:1671–1675.
65. Johnston WL, Allan BR, Hendra JL, Neff DR, Rosen ME, Sills LD, et al. Interexaminer study of palpation in detecting location of spinal segmental dysfunction. J Am Osteopath Assoc. 1983;82:839–845.
66. Gonella C, Paris SV, Kutner M. Reliability in evaluating passive intervertebral motion. Phys Ther. 1982;62:436–444.
67. Wiles MR. Reproducibility and interexaminer correlation of motion palpation findings of the sacroiliac joints. J Can Chiropr Assoc. 1980;24:59–69.
68. Oldreive WL. Manual therapy rounds. A critical review of the literature on tests of the sacroiliac joint. J Man Manip Ther. 1995;3:157–161.
69. Keating JC. Inter-examiner reliability of motion palpation of the lumbar spine: a review of quantitative literature. Am J Chiropr Med. 1989;2:107–110.
70. Panzer DM. The reliability of lumbar motion palpation. J Manipulative Physiol Ther. 1992;15:518–524.
71. Haas M. The reliability of reliability. J Manipulative Physiol Ther. 1991;14:199–208.
72. Humphreys K, Delahaye M, Peterson CK. An investigation into the validity of cervical spine motion palpation using subjects with congenital block vertebrae as a “gold standard”. BMC Musculoskelet Disord. 2004;5:19.
73. van Deursen L, Patijn J, Ockhuysen A, Vortman BJ. The value of some clinical tests of the sacro-iliac joint. Man Med. 1990;5:96–99.
74. Feldt LS, McKee ME. Estimation of the reliability of skill tests. Res Q. 1958;29:279–293.
75. Haas M, Groupp E, Panzer D, Partna L, Lumsden S, Aickin M. Efficacy of cervical endplay assessment as an indicator for spinal manipulation. Spine. 2003;28:1091–1096.
76. Landis JR, Koch GC. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
77. Huxley R, Neil A, Collins R. Unravelling the fetal origins hypothesis: is there really an inverse association between birthweight and subsequent blood pressure? Lancet. 2002;360:659–665.
78. Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA. 2004;291:2457–2465.
a Research Fellow, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark
b Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark
c Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark; and Associate Professor, Institute of Sports Science and Clinical Biomechanics, Part of Clinical Locomotion Science, University of Southern Denmark, Denmark
d Professor, The Department of Statistics, University of Southern Denmark, Denmark
e Professor, Center for Outcomes Studies, Western States Chiropractic College, Portland, Ore
f Senior Researcher, The Back Research Center, Backcenter Funen; and Part of Clinical Locomotion Science, University of Southern Denmark, Denmark
g Professor, Texas Chiropractic College, Pasadena, Tex
h Professor, Department of Research, Wolfe-Harris Center for Clinical Studies, Northwestern Health Sciences University, Bloomington, Minn

Submit requests for reprints to: Mette Jensen Stochkendahl, DC, Nordic Institute of Chiropractic and Clinical Biomechanics, Research Department, Klosterbakken 20, DK-5000 Odense C, Denmark.
This study was funded by the Nordic Institute of Chiropractic and Clinical Biomechanics, Odense, Denmark and the Foundation for Chiropractic Education and Research, grant no. 03-09-01. PII: S0161-4754(06)00155-2 doi:10.1016/j.jmpt.2006.06.011 © 2006 National University of Health Sciences. Published by Elsevier Inc. All rights reserved. | |
|