Research Without Tears

The reliability and validity of the 'Tape' and 'Block' methods for assessing anatomical leg-length discrepancy

Tracy Middleton-Duff, Keith George and Alan Batterham

This study, following on from previous review papers, determined test reliability and validity within a clinical context. The chosen topic was the assessment of intra and inter-tester reliability as well as criterion validity (X-ray) for 'Tape' (TP) and 'Block' (BK) methods of leg length discrepancy (LLD) estimation. Four different testers using both TP and BK methods on two occasions within the same working day assessed 25 subjects. Two testers were designated as experienced (EX) and two as non-experienced (NEX). Intra-tester reliability was perfect for the BK method but demonstrated a range of variability, as assessed by typical error, for the TP method (range 0.17–0.32 cm). One EX tester was more reliable than the other EX and both NEX testers. Inter-tester variability was assessed on the ability of different pairs of testers to categorize any left or right LLD as > or <0.5 cm. Kappa coefficients were only moderate throughout but were generally larger for the BK method. Criterion validity for the most reliable tester (EX-1) was assessed for both TP and BK methods on a sub-sample of subjects using regression analysis and suggested a closer match between TP and X-ray than between BK and X-ray measures. Whilst the intra-tester reliability data for the BK method is better than the TP method, both approaches may be sensitive enough to differentiate 'large' clinically significant LLD with some confidence. Inter-tester reliability data suggests that the same tester should perform serial LLD estimations. Data for criterion validity must be viewed cautiously because of the sample size but suggests that the BK estimation produced a greater degree of error. The experience of the tester may impinge on the reliability of the estimation of LLD, thus careful attention must be paid to training staff appropriately. © 2000 Harcourt Publishers Ltd

Physical Therapy In Sport (2000) 1, 91–99

Tracy Middleton-Duff, Senior I Podiatrist, Podiatry Department, Castleford, Normanton and District Hospital, Castleford, West Yorkshire, WF10 5LT, UK
Keith George PhD, Department of Exercise and Sport Science, Manchester Metropolitan University, Alsager Campus, Alsager, Cheshire, ST7 2HL, UK
Alan Batterham PhD, FACSM, School of Social Sciences, University of Teesside, Borough Road, Middlesborough, TS1 3BA, UK
Correspondence to: Keith George, Tel: 44 (0) 161 247 5527; Fax: 44 (0) 161 247 6375; E-mail: k.p.george@mmu.ac.uk

Introduction
This is the third paper of the 'Research Without Tears' section and is inextricably linked to the previous reviews of validity (George et al. 2000) and reliability (Batterham & George 2000) in clinical research. Whilst the previous papers discussed definitions and general concepts within validity and reliability, the current article will attempt to illuminate some of the practical issues raised within a specific clinical context. This original research study, therefore, addresses the determination of criterion validity as well as intra and inter-tester reliability for two commonly used methods of assessing leg length discrepancy (LLD).

Some degree of discrepancy in the length of the lower limbs has been reported in 65–90% of the general population (Baylis & Rzonca 1988). Varying degrees of leg length discrepancy have long been implicated in a range of sports injuries. For example, Klein (1983) reported a consistent relationship between LLD and knee injuries and McCaw (1992) suggested that LLD was associated with hip injuries. The association between LLD and a range of sports injuries/clinical syndromes has led to LLD estimation becoming routine in lower limb biomechanical analysis. If the measurement of LLD is important within the clinical setting it is important that any LLD is estimated by valid techniques that demonstrate acceptable intra and inter-tester reliability. However, there is no universally accepted clinical method for the determination of LLD. Determination of LLD by X-ray is considered the 'gold-standard' measurement. Indeed, X-ray assessment of LLD has been adopted as the criterion measurement of LLD in past research (Beattie et al. 1990; Friberg et al. 1988; Gogia & Braatz 1986; Woerman & Binder-McLeod 1984; Gross et al. 1998). However, the regular use of X-ray is impracticable due to cost and the risks associated with radiation exposure. As a consequence of this, other methods have been developed for the clinical evaluation of LLD. Two commonly used approaches are the 'Tape' (TP) (e.g. Beattie et al. 1990) and 'Block' (BK) (e.g. Woerman & Binder-McLeod 1984) methods. Some efforts have been made to validate both procedures as well as report values for intra and inter-tester reliability (Beattie et al. 1990; Friberg et al. 1988; Gogia & Braatz 1986; Woerman & Binder-McLeod 1984; Gross et al. 1998). However, research to date is contradictory and has generally used descriptive statistics with small sample sizes (Woerman & Binder-McLeod 1984) or the intra-class correlation coefficient (ICC) (Gross et al. 1998) for the statistical determination of both validity and reliability.

Whilst ICC can describe the degree of association between measures it has limited direct clinical application, as it provides no index or value of agreement between measures in the unit (i.e. mm or cm) that LLD is normally assessed in. The use of typical error, limits of agreement and, for criterion validity, regression analysis data will hopefully facilitate interpretation of any LLD data. Because, unlike ICC, typical error and limits of agreement present error values in the units of the variable being assessed, they allow the clinician to directly estimate, based on their understanding of the topic, what may be acceptable limits for variability in the determination of both validity and reliability. This issue becomes critical in the accurate assessment of what constitutes a clinically significant LLD. Subotnick (1981) suggested that LLD as low as 0.3 cm may be clinically relevant. Whilst this area is controversial it is easy to see that if reliability/validity data are not within a 0.3 cm band then the whole practice of clinical LLD assessment may need to be re-evaluated.

Therefore the purpose of this study was to determine, using appropriate statistical methods, the criterion validity as well as intra and inter-tester reliability of the BK and TP methods of estimating LLD. For the purposes of this study X-ray was adopted as the criterion method for determining validity. A secondary element of this research was the comparison of reliability between experienced and non-experienced practitioners.

Method
Ethical approval was obtained from Wakefield and Pontefract Community NHS Trust Ethical Committee. All subjects who volunteered provided written informed consent to participate and were fully briefed as to the nature of the project.

Subjects and testers
Twenty-five subjects volunteered to participate in the study (age range 21–53). Subjects were either patients (n = 13) who had been referred previously to the podiatric biomechanics clinic with symptoms usually associated with LLD or members of staff (n = 12) from the hospital who were asymptomatic for any problems normally related to LLD. This group composition was adopted, as it would likely provide a large range of LLD, which would facilitate statistical procedures.

The subjects who previously had been patients all presented symptoms associated with LLD as described by Chambers (1996). All subjects met the criteria of full range of motion in the joints of the lower limb (McRae 1990). Any subject, patient or staff, was excluded if they were suffering from any disease that compromised lower limb function (e.g. osteoarthritis) or if they presented with pelvic or spinal problems that can often mimic LLD symptoms. Subjects that demonstrated a functional LLD were also excluded.

All subjects were assessed by four state-registered podiatrists working within the biomechanics clinic. None of these testers had a vested interest in the study. The first author acted as the project coordinator but was not a tester or data recorder. Two of the podiatrists were subjectively designated as 'experienced' (EX) and two as 'non-experienced' (NEX). The EX podiatrists completed daily biomechanical assessments including the use of both TP and BK methods for LLD assessment. Both of these testers had a minimum of 8 years of post-registration experience. The two NEX podiatrists were not involved in daily biomechanical assessment and did not use any procedure for LLD assessment. Both NEX testers were aware of LLD and possible assessment methods only through their undergraduate training. The two NEX testers had between 3 and 4 years of post-registration podiatric experience. The NEX testers were provided with written instructions for both TP and BK assessment procedures 10 minutes prior to the start of the study. These were read but not practised.

Testing procedures
The TP method was the same as that adopted in the study by Beattie et al. (1990). Subjects wore a pair of shorts so that their lower limbs were exposed and such that their hip joint and pelvis could be palpated and accessed with the tape measure. In a supine position, on a couch, the subject's hips were placed in neutral hip rotation as determined by observation. The tester placed the medial malleoli together so that they met in a plane, subjectively assessed to be the mid-sagittal line of the body. The tester then stood on the same side of the couch as the limb they were assessing. The tester palpated the anterior superior iliac spines (ASIS). One end of a tape measure was placed on the ASIS where the tester believed they could palpate the origin of the sartorius on the inferior portion of the ASIS. The opposite end of the tape measure was gradually guided down the antero-medial aspect of the subject's thigh, patella and lower leg until the point where the medial malleolus sloped inferiorly and laterally. The tester held the tape measure taut. The side of the tape facing the tester was blank and the distance measured was read from the opposite side of the tape, by an independent recorder. In this way the testers were blind to any TP data and could not be biased in subsequent measurements. The same procedure was then performed on the contra-lateral limb. Limb order was randomized. The difference in distance measured between the two legs was calculated as the LLD.

The BK method has been described previously by Woerman and Binder-McLeod (1984). The subject adopted a comfortable standing position with feet placed hip-width apart. The tester crouched directly in front of the subject and palpated the most superior portion of the left and right iliac crests. Visual estimation was made of any asymmetry in the height of the left and right iliac crests. If no asymmetry could be seen the tester asked the recorder to record a 'level' score. If one side was perceived to be lower then the tester placed a range of blocks directly under the foot of the lower side. Multiple, non-deforming blocks of 0.1, 0.2, 0.3, 0.4, 0.5 and 1 cm were available. The tester used a progressive combination of blocks until believing the iliac crests to be level. The cumulative size of blocks used was recorded. It was impossible to employ any blinding process to this procedure. The recorder noted the LLD measurement.

The X-ray assessment was performed by the Radiography Department at Pontefract General Infirmary and followed the exact guidelines stated by Beattie et al. (1990). Because of the logistical difficulties with X-ray this procedure was only completed in a subset of 10 subjects for the calculation of criterion validity.

For all assessment procedures, measurements were recorded to the nearest 0.1 cm. For both the TP and BK method, if no LLD was assessed a value of zero was recorded. In the case of the left leg being shorter a positive LLD was recorded and if the right leg was shorter a negative LLD was recorded.

Study design
All subjects attended the Podiatric Biomechanics Clinic on the same day at 9.00 am and then again at 2.00 pm. Subjects and testers stayed in separate rooms throughout the procedure except for when an assessment was taking place. All testers examined individual subjects in the absence of other testers and subjects. All subjects were assigned a number from 1 to 25 and asked to sit in a waiting room. Numbered cards were placed in an opaque box and drawn randomly by the recorder. When their number was drawn the subject entered the assessment room. Each tester was then randomly called into the assessment room and performed the TP assessment followed by the BK assessment. When all testers had finished the subject re-entered the waiting room. Within each assessment all measures were recorded by the same person, who was not a podiatrist and not familiar with LLD assessment. All staff and subjects re-assembled for the afternoon session, which followed exactly the same format as the morning session, and again a randomized order of subject assessment was employed in an attempt to reduce the influence of memory on the BK method.

Statistical analysis
Overall, an LLD for every subject by each method on two occasions was recorded for each tester. Firstly, to determine the potential for any systematic bias, test-retest values were analysed for each tester and method via repeated measures ANOVA. No significant outcomes were apparent, suggesting no systematic bias for any repeated observations by tester or method.

For the TP method random typical error for both the right and left leg measures was calculated via the square root of the mean squares error from the repeated measures ANOVA. Error propagation for the difference in tape scores for left and right legs (TP-LLD) was derived from the 'root sum square' method (Taylor 1997):

$$\text{Typical error} = \sqrt{(\text{right leg typical error})^2 + (\text{left leg typical error})^2}$$

Confidence intervals (±95%) were also recorded for the TP-LLD typical errors. Subsequently, 95% Limits of Agreement (LOA) were derived from the typical error data, where

$$\text{LOA} = 2.06 \times (\sqrt{2} \times \text{typical error})$$

The value 2.06 was selected as the number from the t-distribution with 24 degrees of freedom. A separate check of mean scores against differences revealed no pattern of heteroscedasticity in the data.

For inter-tester reliability, the ability to consistently differentiate 'small' biologically meaningless and 'larger' potentially clinically significant LLD between testers was checked. The exact nature or magnitude of what constitutes a clinically significant LLD is controversial (Gross et al. 1998). Researchers have suggested LLD as small as 0.3 cm (Subotnick 1981) and as large as 4.0 cm (Ingram 19?0) may have clinical implications. Based on the likely distribution of LLD from this sample and the interest in accuracy of measurement and how this may influence clinical practice the 0.5 cm threshold was selected. Friberg (1983) has stated that LLD of 0.5 cm or greater can lead to biomechanical compensation in the spine. Therefore, LLD was categorized into three potential descriptors (based on the first assessment) as being either 'small' (NO-LLD: 0–0.5 cm) or large (YESRIGHT-LLD, YESLEFT-LLD: >0.5 cm). The inter-tester proportion of agreement (corrected for chance) based on this classification was assessed using Cohen's kappa coefficient (±95% confidence intervals).

Between method variability (comparing the TP and BK method within the same tester) was assessed using typical error and LOA as reported earlier for intra-tester reliability. For these purposes the data selected were the average TP score and the first BK score (equivalent to the average BK score) for each tester.

Criterion validity, using X-ray, of TP/BK data from one EX tester was evaluated via linear regression analysis. Available options for regression analysis included the least-squares or the least-product approach (Ludbrook 1997). The least-product approach was adopted as it allows for measurement error in both x and y variables. In reality, with this data set, both methods of regression analysis produced similar outcomes and thus interpretation.
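As a rough illustration of the reliability statistics described in this section, the following Python sketch computes the root-sum-square typical error, the 95% limits of agreement and Cohen's kappa. It is a minimal sketch rather than the authors' SPSS procedure: the function names and the 3 x 3 table of agreement counts are hypothetical, and only the typical errors are taken from the paper (the TP values reported in Table 2).

```python
import math

# Two-tailed 95% t value for 24 degrees of freedom, as quoted in the text.
T_CRIT_24_DF = 2.06


def propagate_typical_error(te_right, te_left):
    """Root-sum-square typical error (cm) of the left-right difference."""
    return math.sqrt(te_right ** 2 + te_left ** 2)


def limits_of_agreement(typical_error, t_crit=T_CRIT_24_DF):
    """95% limits of agreement (cm): t * sqrt(2) * typical error."""
    return t_crit * math.sqrt(2) * typical_error


def cohens_kappa(table):
    """Cohen's kappa from a square table of counts.

    Rows are tester A's categories, columns are tester B's categories.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# TP typical errors (cm) as reported in Table 2 of the paper.
table2_typical_errors = {"EX-1": 0.17, "EX-2": 0.32, "NEX-1": 0.29, "NEX-2": 0.31}
loa = {tester: round(limits_of_agreement(te), 2)
       for tester, te in table2_typical_errors.items()}

# Hypothetical counts for 25 subjects (NO-LLD, YESRIGHT-LLD, YESLEFT-LLD);
# these are illustrative values, not data from the study.
hypothetical_counts = [[12, 2, 1], [2, 4, 0], [1, 0, 3]]
kappa = cohens_kappa(hypothetical_counts)
```

Applied to the Table 2 typical errors, `limits_of_agreement` returns 0.50, 0.93, 0.84 and 0.90 cm, which matches the limits of agreement the paper reports for the four testers. Keeping the t value as a parameter makes it easy to substitute the appropriate value for a different sample size.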
The y-intercept, slope (±95% confidence intervals) and standard error of the estimate (SEE) were calculated. This data was only calculated for one tester and must be treated with some degree of caution, because of the limited sample size (n = 10).

All statistical analyses were performed using the SPSS 7.0 Statistical Package (Chicago, Illinois). Alpha level (where appropriate) was set at 0.05.

Results
A range of LLD data was recorded across individuals and between testers and methods. The group mean ± S.D. (and range values) for LLD for each tester using both methods are presented in Table 1. It is important to remember when interpreting Table 1 that when subjects presented with shorter left legs the LLD was denoted as positive and with shorter right legs the LLD was designated as negative. Thus the group mean is likely to tend towards zero with a S.D. and range either side of zero. The variability in range, mean and S.D. scores suggests that some measurement error occurred both within and between testers and methods. A good exemplar may be seen with NEX-1. This tester using the TP method reported a maximal LLD (with left leg longer) of 1.2 cm on the first assessment and 2.8 cm on the second assessment. This difference, 1.6 cm, would likely be viewed as significant in any clinical assessment and unacceptable in terms of reproducibility.

Table 1 Group LLD data, mean ± S.D. (range), for the four testers and the two methods
Tester | TP-1 (cm) | TP-2 (cm) | BK-1 (cm) | BK-2 (cm)
EX-1 | −0.22 ± 0.65 (−2.0 to 1.2) | −0.10 ± 0.61 (−1.9 to 1.0) | 0.30 ± 0.69 (−1.0 to 2.0) | 0.30 ± 0.69 (−1.0 to 2.0)
EX-2 | −0.10 ± 0.56 (−1.2 to 1.0) | −0.03 ± 0.64 (−2.0 to 1.0) | 0.15 ± 0.56 (−1.0 to 2.0) | 0.15 ± 0.56 (−1.0 to 2.0)
NEX-1 | −0.19 ± 0.55 (−1.2 to 0.9) | −0.25 ± 0.77 (−2.8 to 1.2) | 0.15 ± 0.62 (−1.0 to 2.5) | 0.15 ± 0.62 (−1.0 to 2.5)
NEX-2 | 0.01 ± 0.61 (−1.6 to 0.8) | 0.02 ± 0.66 (−1.9 to 1.2) | 0.09 ± 0.62 (−1.0 to 2.0) | 0.09 ± 0.62 (−1.0 to 2.0)
TP-1 = Tape method first assessment, TP-2 = Tape method second assessment, BK-1 = Block method first assessment, BK-2 = Block method second assessment, EX-1 = Experienced tester one, EX-2 = Experienced tester two, NEX-1 = Non-experienced tester one, NEX-2 = Non-experienced tester two.

Another point that is clear from Table 1 is that the BK data for all testers is exactly the same from test one to test two. Group data, as presented in Table 1, could be misleading in that individual comparisons of LLD data are 'hidden' within group scores. However, when the raw data were evaluated for the BK method it was obvious that for all four testers the test-retest LLD data were exactly the same. This effectively means that the intra-tester reliability of the BK method was perfect within this study, irrespective of the experience of the tester.

This was not the case for intra-tester reliability using the TP method. Intra-tester data for the TP method is presented in Table 2. The typical error and LOA data suggest a range of intra-tester variability. The two NEX testers demonstrated similar variability with c. 0.3 cm as a minimum limit of reliability. Tester EX-2 is also similar to this. However, the first EX tester had reduced variability (c. 0.2 cm) to the point that confidence intervals only just overlapped between this and the other three testers.

Table 2 Intra-tester reliability for the TP method
Tester | Typical Error (95% CI) (cm) | Limits of Agreement* (cm)
EX-1 | ±0.17 (0.13–0.24) | ±0.50
EX-2 | ±0.32 (0.25–0.45) | ±0.93
NEX-1 | ±0.29 (0.23–0.40) | ±0.84
NEX-2 | ±0.31 (0.24–0.43) | ±0.90
CI = confidence intervals; *assumed mean difference of zero (when using ANOVA calculation).

Inter-tester reliability kappa coefficients are presented in Table 3. It is apparent from Table 3 that for the BK method there is similar but moderate inter-tester reliability whether the tester is experienced or not. For the TP method there is greater variability in kappa coefficient (0.01–0.69). Slightly better agreement between the two EX testers (0.49) was noted compared to the two NEX testers (0.19). However, the reliability on the whole would likely be considered as only moderate at best. The kappa coefficient is interpreted much like other correlation coefficients (possible range 0–1). With the clinical impact being the primary factor for interpreting the strength of the kappa coefficient it is clear that the values obtained for this study (0.01–0.69) do not approach what are 'classically' considered high or good correlation coefficients (0.8–1.0).

Table 3 Inter-tester reliability kappa coefficient (with 95% confidence intervals) for both the TP and BK methods of LLD assessment
 | EX-1 TP | EX-2 TP | NEX-1 TP | NEX-2 TP
EX-1 TP | – | 0.49 (0.16–0.82) | 0.69 (0.41–0.97) | 0.17 (−0.13–0.47)
EX-2 TP | | – | 0.43 (0.13–0.73) | 0.01 (−0.31–0.32)
NEX-1 TP | | | – | 0.19 (−0.10–0.48)
 | EX-1 BK | EX-2 BK | NEX-1 BK | NEX-2 BK
EX-1 BK | – | 0.53 (0.22–0.84) | 0.68 (0.40–0.96) | 0.54 (0.24–0.84)
EX-2 BK | | – | 0.64 (0.34–0.94) | 0.40 (0.06–0.74)
NEX-1 BK | | | – | 0.65 (0.34–0.96)

Between method variability within each of the four testers produced a range of typical error and limit of agreement data. Typical error data for all testers ranged from 0.72 to 0.85 cm and limits of agreement for all testers ranged from 2.1 to 2.5 cm. Thus between method variability was fairly consistent between testers but produced typical error and limits of agreement ranges that were far in excess of intra-tester reliability data for one single method.

Least-product linear regression plots were produced for EX-1 to compare both the TP and BK methods against the criterion X-ray process of LLD estimation. These plots, of the sub-sample data, are presented in Figures 1 and 2 (full line = line of identity, dashed line = the regression line). It is quite clear from the figures that the regression line and the line of identity are fairly close together. This is borne out by the regression analysis statistics. For the comparison of TP and X-ray the y-intercept, 0.11 (−0.02 to 0.32, 95% confidence intervals), was not significantly different from zero and the slope, 1.17 (0.94 to 1.44, 95% confidence intervals), was not significantly different from one. The SEE and r² were 0.17 cm and 0.95, respectively. For the comparison of BK and X-ray the y-intercept, −0.08 (−0.36 to 0.30, 95% confidence intervals), was not significantly different from zero, and the slope, 1.05 (0.68 to 1.31, 95% confidence intervals), was not significantly different from one. The SEE and r² were 0.59 cm and 0.45, respectively. Both methods produced fairly broad confidence limits that overlapped each other and the value of 1.0 for the slope and 0 for the y-intercept. However, the SEE was lower and the r² value was higher for the TP method, suggesting a closer prediction of the criterion value.

Fig. 1 Criterion (X-ray) validity assessment of TP method of LLD estimation. [Plot: estimation TP LLD (cm) against criterion X-ray LLD (cm), showing the line of identity and the regression line.]

Discussion
The analysis of data provided a range of interesting outcomes that have clinical relevance as well as highlighting the value of specific statistical approaches to the determination of test reliability and validity.
Hopefully, this data provides a good exemplar or template for other clinical researchers attempting to produce reliability and validity data for a range of assessment or measurement procedures.

Fig. 2 Criterion (X-ray) validity assessment of BK method of LLD estimation. [Plot: estimation BK LLD (cm) against criterion X-ray LLD (cm), showing the line of identity and the regression line.]

Intra-tester reliability data was different for the two testing methods. The BK method provided perfect intra-tester reliability within this sample and, comparatively, it might be suggested that the BK method of LLD assessment is preferable to TP. The difficulty in interpreting this outcome, however, is that it was impossible to blind the tester for the BK method. Despite the randomized order of testing and the delay between individual subject re-tests (up to 8 h) some LLD recall was likely for all testers using the BK method. Conversely, the blinded TP method did produce some degree of intra-tester variability for all testers. This variability may be due to a number of factors associated with the measurement process. Primarily this may be related to the precision of the identification of surface anatomical landmarks and how these represent internal anatomy (Gofton & Trueman 1971). Likewise the deformation of the tape over the surface of the lower extremity may introduce random error (Hoyle et al. 1991). Whilst some of these anatomical factors could affect both TP and BK method (Friberg et al. 1988) it is likely that the non-blinded approach to the BK testing may have alleviated these issues. The fact that the subject was supine in the TP method (compared to standing for BK) may also influence error propagation. The difference in TP intra-tester reliability for EX tester one (c. 0.2 cm) and the other three testers (c. 0.3 cm) may reflect the influence of experience and training in coping with such sources of error.
This data would suggest that for some individuals, though not all, training and experience will improve reliability of TP assessment of LLD and should therefore be considered carefully within the clinical setting.

Despite some individual differences in intra-tester reliability limits for the TP method the range of 0.2–0.3 cm means that most clinically significant LLD may be estimated with an acceptable degree of reliability. Typical error and limits of agreement for intra-tester reliability are difficult to directly compare with ICC or other statistical methods adopted in previous research. However, the perfect data for the BK method compare well with ICC data (0.87 and 0.84) from Jonson and Gross (1997) and Gross et al. (1998). Previous research data for intra-tester reliability for TP assessment of LLD are difficult to interpret because of issues such as small sample sizes (Woerman & Binder-McLeod 1984). The clinical suggestion from this analysis is that although the BK method may appear to demonstrate better intra-tester reliability it is likely that both methods display adequate intra-tester reliability to be used to determine 'significant' LLD (0.3 cm and above) in serial assessments. An adjunct to this is that appropriate training for either technique would seem sensible.

Inter-tester reliability data mirrored, to some degree, the intra-tester data in that generally higher kappa coefficients were calculated for the BK compared to the TP method.
However, the coefficients were variable for both methods between testers and were only moderate at best. Whilst some data would suggest a difference in reliability when comparing two EX testers vs two NEX testers, the general effect of experience upon inter-tester reliability for both methods is rather mixed. Inter-tester reliability in this study utilized Cohen's kappa coefficient for analysis of categorical data. This was selected as an exemplar of a different statistical approach to reliability. It is also important to remember that the clinical categorization of LLD as 'small' or 'large' is an important and contentious issue when interpreting any investigation. Indeed previous research suggests a sizeable portion of individual assessments may even designate the wrong limb as being longer (Friberg et al. 1988). Again the comparison of the data from the current study with previous research is hindered because of the use of different statistical approaches. However, there is some broad agreement with past research in that inter-tester reliability introduces more error than intra-tester reliability (Hoyle et al. 1991). The clinical suggestion from this analysis is clear and simple. In any clinical situation involving multiple LLD assessments the same tester should be used wherever possible.

Data calculated for the error in assessment when the same tester estimated LLD with different methods is quantitatively much larger than either intra or inter-tester reliability. This is likely due to many of the factors already alluded to as well as the propagation of variability from each assessment procedure. Limits of agreement reported (2.0–2.5 cm) seriously question the clinical value of any data derived from a combination of the BK and TP methods. This approach has not been reported to any great extent in previous literature as most research limited comparisons to reliability and criterion validity assessment. Whilst Hoyle et al. (1991) compared the TP method to a 'Metrocom' method, with good agreement, this does not have direct relevance to this study. The clinical suggestion from this analysis is that one method of LLD estimation should be adopted consistently by the same single tester. This is pertinent to serial assessments on the same individual as well as for the approach of a team of clinicians working in the same environment.

The assessment of TP and BK criterion validity, via regression analysis, must be considered within the limitations of the study design. Firstly, the data reported were generated from a small sub-sample and we must assume that the procedures employed in the X-ray measurements were reliable and valid. Having taken these points into account the data presented does not discredit either the TP or BK approach. In both instances, primarily related to the wide confidence intervals reflecting low precision due to small sample size, slope and intercept values are not significantly different from 1 and 0, respectively. This suggests that the degree of error associated with the reliability of both methods is not substantial enough to completely invalidate these indirect processes for the estimation of LLD. The interpretation of this study is similar to the cautious recommendations of Beattie et al. (1990) and Friberg et al. (1988), for TP estimations of LLD, as well as Jonson and Gross (1997) and Gross et al. (1998) for BK methods. In contrast to these findings Gogia and Braatz (1986) reported an ICC of 0.99 between TP and X-ray methods. Whether this reflects study specific differences in the subjects, the assessment procedures or the statistical approach adopted is impossible to determine. The higher r² and lower SEE reported for the TP method might favour this approach to LLD estimation. However, from a theoretical and practical point of view, any clinical recommendation must be tempered within the limits of the study and further research employing larger samples should be conducted on both methods.
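To make the criterion validity analysis more concrete, the sketch below fits a least-products regression line in Python. It assumes that the 'least-product approach' attributed to Ludbrook (1997) is ordinary least-products regression (also known as geometric mean or reduced major axis regression), in which the slope is the ratio of the standard deviations carrying the sign of the covariance. The function name and the data are hypothetical, and confidence intervals and the SEE are not computed.

```python
import math
from statistics import mean, stdev


def least_products_regression(x, y):
    """Ordinary least-products (geometric mean) regression of y on x.

    Returns (intercept, slope). Unlike least squares, the slope
    sign(cov) * SD(y) / SD(x) treats error in x and y symmetrically.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y must be the same length, with at least 3 points")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products; only its sign is needed for the slope.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope sign is undefined")
    slope = math.copysign(stdev(y) / stdev(x), sxy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical clinical estimates (x) and criterion X-ray values (y), in cm.
clinical_lld = [-2.0, -1.0, 0.0, 1.0, 2.0]
xray_lld = [2.0 * value + 1.0 for value in clinical_lld]
intercept, slope = least_products_regression(clinical_lld, xray_lld)
```

In the paper's figures the clinical estimate is on the x axis and the criterion X-ray value on the y axis, so agreement between the two corresponds to a slope near 1 and an intercept near 0.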
This research has added to the available database by the data presented and the approach taken to a range of statistical procedures. However, as with any scientific data the outcomes of this study must be interpreted alongside the recognized limitations of the research. Care in interpretation of the criterion validity findings and the BK intra-tester reliability data have already been alluded to. It is also worthy of note that whilst the sample size investigated in the reliability study is larger than most previous research in this area continued investigation of this topic with larger and more varied populations is warranted. Future research may wish to investigate ways of blinding the BK method as well as investigating the impact of supine vs standing TP assessments. The issue of what procedure should be adopted in the estimation of LLD and the fundamental question of what magnitude constitutes a clinically significant LLD, are still to be conclusively answered.

In conclusion, the current data suggest that the estimation of LLD, via either TP or BK method, is affected by a degree of error in the measurement process. Data ranges for reliability and criterion validity of both methods have been produced that can help the interpretation of any LLD estimation. Neither method should be abandoned but emphasis may be placed upon careful use and interpretation of any data. Some specific clinical suggestions have been made throughout the discussion that may be debated by clinical professionals. As well as suggesting some sample specific advice for LLD assessment this paper has, hopefully, provided some useful exemplars of how to assess reliability and criterion validity within a clinical setting. It is hoped that both the process and the clinical context of the study will be valuable to practising clinicians and especially those involved in research or those investigating aspects of evidence-based practice.

Acknowledgement

The authors would like to gratefully acknowledge the support and time provided for this study by the staff and patients at the Podiatry Department of the Castleford, Normanton and District Hospital.

References

Batterham A, George K 2000 Reliability in evidence-based clinical practice: A primer for allied health professions. Physical Therapy in Sport 1: 54–61
Baylis W J, Rzonca E C 1988 Functional and structural limb length discrepancies: Evaluation and treatment. Clinics in Podiatric Medicine and Surgery 5: 509–519
Beattie P, Isaacson K, Riddel D L, Rothstein J M 1990 Validity of derived measurements of leg length differences obtained by the use of a tape measure. Physical Therapy 70: 150–157
Chambers M R C 1996 Leg length inequality: Types, aetiologies, pathomechanics, values and incidence. The Journal of British Podiatric Medicine 51 (5): 74–81
Friberg O 1983 Clinical symptoms and biomechanics of lumbar spine and hip joint in leg length inequality. Spine 8 (6): 643–651
Friberg O, Nurminen M, Korhonene K, Soininen E, Manttari T 1988 Accuracy and precision of clinical estimation of leg length inequality and lumbar scoliosis: Comparison of clinical and radiological measurements. International Disability Studies 10 (1): 49–53
George K, Batterham A, Sullivan I 2000 Validity in clinical research: a review of basic concepts and definitions. Physical Therapy in Sport 1 (1): 19–27
Gofton J P, Trueman G E 1971 Studies in osteoarthritis of the hip: Part II: Osteoarthritis of the hip and leg-length inequality. Journal of the Canadian Medical Association 104: 791–799
Gogia P P, Braatz J H 1986 Validity and reliability of leg length measurements. Journal of Orthopaedic and Sports Physical Therapy 8 (4): 185–188
Gross M T, Burns C B, Chapman S W, Hudson C J et al. 1998 Reliability and validity of rigid lift and pelvic levelling device method in assessing leg length inequality. Journal of Orthopaedic and Sports Physical Therapy 27 (4): 285–294
Hoyle D A, Latour M, Bohannon R W 1991 Intra-examiner, inter-examiner and interdevice comparability of leg length measurements obtained with measuring tape and Metrocom. Journal of Orthopaedic and Sports Physical Therapy 14 (6): 263–268
Ingram A J 1980 Anterior poliomyelitis. In: Edmonson A S, Crenshaw A H (eds). Campbell's operative orthopaedics. Mosby, St Louis, p 1550–1560
Jonson S R, Gross M T 1997 Intraexaminer reliability, interexaminer reliability and normal values for nine lower extremity skeletal measures. Journal of Orthopaedic and Sports Physical Therapy 25 (4): 253–263
Klein K K 1983 Developmental asymmetries and knee injury. Physician and Sportsmedicine 11 (8): 67–72
Ludbrook J 1997 Comparing methods of measurement. Clinical and Experimental Pharmacology and Physiology 24: 193–203
McCaw S T 1992 Leg length inequality. Implications for running injury prevention. Sports Medicine 14 (6): 422–429
McRae R 1990 Clinical orthopaedic assessment, 3rd edn. Churchill Livingstone, New York
Subotnick S I 1981 Limb length discrepancies of the lower extremity (the short leg syndrome). Journal of Orthopaedic and Sports Physical Therapy 3 (1): 11–16
Taylor J R 1997 An introduction to error analysis: the study of uncertainties in physical measurements, 2nd edn. University Science Books, California
Woerman A L, Binder-McLeod S A 1984 Leg length discrepancy assessment: Accuracy and precision in five clinical methods of evaluation. Journal of Orthopaedic and Sports Physical Therapy 5 (5): 230–239

Physical Therapy in Sport (2000) 1, 91–99