
Research Without Tears

The reliability and validity of the 'Tape' and 'Block' methods for assessing anatomical leg-length discrepancy

Tracy Middleton-Duff, Keith George and Alan Batterham

Physical Therapy in Sport (2000) 1, 91–99
© 2000 Harcourt Publishers Ltd

This study, following on from previous review papers, determined test reliability and validity within a clinical context. The chosen topic was the assessment of intra and inter-tester reliability as well as criterion validity (X-ray) for 'Tape' (TP) and 'Block' (BK) methods of leg length discrepancy (LLD) estimation. Four different testers using both TP and BK methods on two occasions within the same working day assessed 25 subjects. Two testers were designated as experienced (EX) and two as non-experienced (NEX). Intra-tester reliability was perfect for the BK method but demonstrated a range of variability, as assessed by typical error, for the TP method (range 0.17–0.32 cm). One EX tester was more reliable than the other EX and both NEX testers. Inter-tester variability was assessed on the ability of different pairs of testers to categorize any left or right LLD as greater than or less than 0.5 cm. Kappa coefficients were only moderate throughout but were generally larger for the BK method. Criterion validity for the most reliable tester (EX1) was assessed for both TP and BK methods on a sub-sample of subjects using regression analysis and suggested a closer match between TP and X-ray than between BK and X-ray measures. Whilst the intra-tester reliability data for the BK method is better than the TP method, both approaches may be sensitive enough to differentiate 'large' clinically significant LLD with some confidence. Inter-tester reliability data suggests that the same tester should perform serial LLD estimations. Data for criterion validity must be viewed cautiously because of the sample size but suggests that the BK estimation produced a greater degree of error. The experience of the tester may impinge on the reliability of the estimation of LLD, thus careful attention must be paid to training staff appropriately.

Tracy Middleton-Duff, Senior I Podiatrist, Podiatry Department, Castleford, Normanton and District Hospital, Castleford, West Yorkshire, WF10 5LT, UK
Keith George PhD, Department of Exercise and Sport Science, Manchester Metropolitan University, Alsager Campus, Alsager, Cheshire, ST7 2HL, UK
Alan Batterham PhD, FACSM, School of Social Sciences, University of Teesside, Borough Road, Middlesbrough, TS1 3BA, UK
Correspondence to: Keith George, Tel: +44 (0) 161 247 5527; Fax: +44 (0) 161 247 6375; E-mail: k.p.george@mmu.ac.uk

Introduction
This is the third paper of the 'Research Without Tears' section and is inextricably linked to the previous reviews of validity (George et al. 2000) and reliability (Batterham & George 2000) in clinical research. Whilst the previous papers discussed definitions and general concepts within validity and reliability, the current article will attempt to illuminate some of the practical issues raised within a specific clinical context. This original research study, therefore, addresses the determination of criterion validity as well as intra and inter-tester reliability for two commonly used methods of assessing leg length discrepancy (LLD).

Some degree of discrepancy in the length of the lower limbs has been reported in 65–90% of the general population (Baylis & Rzonca 1988). Varying degrees of leg length discrepancy have long been implicated in a range of sports injuries. For example, Klein (1983) reported a consistent relationship between LLD and knee injuries and McCaw (1992) suggested that LLD was associated with hip injuries. The association between LLD and a range of sports injuries/clinical syndromes has led to LLD estimation becoming routine in lower limb
biomechanical analysis. If the measurement of LLD is important within the clinical setting it is important that any LLD is estimated by valid techniques that demonstrate acceptable intra and inter-tester reliability. However, there is no universally accepted clinical method for the determination of LLD. Determination of LLD by X-ray is considered the 'gold-standard' measurement. Indeed, X-ray assessment of LLD has been adopted as the criterion measurement of LLD in past research (Beattie et al. 1990; Friberg et al. 1988; Gogia & Braatz 1986; Woerman & Binder-McLeod 1984; Gross et al. 1998). However, the regular use of X-ray is impracticable due to cost and the risks associated with radiation exposure. As a consequence of this, other methods have been developed for the clinical evaluation of LLD. Two commonly used approaches are the 'Tape' (TP) (e.g. Beattie et al. 1990) and 'Block' (BK) (e.g. Woerman & Binder-McLeod 1984) methods. Some efforts have been made to validate both procedures as well as report values for intra and inter-tester reliability (Beattie et al. 1990; Friberg et al. 1988; Gogia & Braatz 1986; Woerman & Binder-McLeod 1984; Gross et al. 1998). However, research to date is contradictory and has generally used descriptive statistics with small sample sizes (Woerman & Binder-McLeod 1984) or the intra-class correlation coefficient (ICC) (Gross et al. 1998) for the statistical determination of both validity and reliability. Whilst ICC can describe the degree of association between measures, it has limited direct clinical application because it provides no index or value of agreement between measures in the units (i.e. mm or cm) in which LLD is normally assessed. The use of typical error, limits of agreement and, for criterion validity, regression analysis data will hopefully facilitate interpretation of any LLD data. Because, unlike ICC, typical error and limits of agreement present error values in the units of the variable being assessed, they allow the clinician to directly estimate, based on their understanding of the topic, what may be acceptable limits for variability in the determination of both validity and reliability. This issue becomes critical in the accurate assessment of what constitutes a clinically significant LLD. Subotnick (1981) suggested that LLD as low as 0.3 cm may be clinically relevant. Whilst this area is controversial it is easy to see that if reliability and validity data are not within a 0.3 cm band then the whole practice of clinical LLD assessment may need to be re-evaluated.

Therefore the purpose of this study was to determine, using appropriate statistical methods, the criterion validity as well as intra and inter-tester reliability of the BK and TP methods of estimating LLD. For the purposes of this study X-ray was adopted as the criterion method for determining validity. A secondary element of this research was the comparison of reliability between experienced and non-experienced practitioners.

Method
Ethical approval was obtained from Wakefield and Pontefract Community NHS Trust Ethical Committee. All subjects who volunteered provided written informed consent to participate and were fully briefed as to the nature of the project.

Subjects and testers
Twenty-five subjects volunteered to participate in the study (age range 21–53). Subjects were either patients (n = 13) who had been referred previously to the podiatric biomechanics clinic with symptoms usually associated with LLD or members of staff (n = 12) from the hospital who were asymptomatic for any problems normally related to LLD. This group composition was adopted, as it would likely provide a large range of LLD, which would facilitate statistical procedures.

The subjects who previously had been patients all presented symptoms associated with LLD as described by Chambers (1996). All subjects met the criteria of full range of motion in the joints of the lower limb (McRae 1990). Any subject, patient or staff, was excluded if they were suffering from any disease that compromised lower limb function (e.g. osteoarthritis) or if they presented with pelvic or spinal problems that can often mimic LLD symptoms. Subjects that demonstrated a functional LLD were also excluded.

All subjects were assessed by four state-registered podiatrists working within the
biomechanics clinic. None of these testers had a vested interest in the study. The first author acted as the project coordinator but was not a tester or data recorder. Two of the podiatrists were subjectively designated as 'experienced' (EX) and two as 'non-experienced' (NEX). The EX podiatrists completed daily biomechanical assessments including the use of both TP and BK methods for LLD assessment. Both of these testers had a minimum of 8 years of post-registration experience. The two NEX podiatrists were not involved in daily biomechanical assessment and did not use any procedure for LLD assessment. Both NEX testers were aware of LLD and possible assessment methods only through their undergraduate training. The two NEX testers had between 3 and 4 years of post-registration podiatric experience. The NEX testers were provided with written instructions for both TP and BK assessment procedures 10 minutes prior to the start of the study. These were read but not practised.

Testing procedures
The TP method was the same as that adopted in the study by Beattie et al. (1990). Subjects wore a pair of shorts so that their lower limbs were exposed and such that their hip joint and pelvis could be palpated and accessed with the tape measure. In a supine position, on a couch, the subject's hips were placed in neutral hip rotation as determined by observation. The tester placed the medial malleoli together so that they met in a plane, subjectively assessed to be the mid-sagittal line of the body. The tester then stood on the same side of the couch as the limb they were assessing. The tester palpated the anterior superior iliac spines (ASIS). One end of a tape measure was placed on the ASIS where the tester believed they could palpate the origin of the sartorius on the inferior portion of the ASIS. The opposite end of the tape measure was gradually guided down the antero-medial aspect of the subject's thigh, patella and lower leg until the point where the medial malleolus sloped inferiorly and laterally. The tester held the tape measure taut. The side of the tape facing the tester was blank and the distance measured was read from the opposite side of the tape by an independent recorder. In this way the testers were blind to any TP data and could not be biased in subsequent measurements. The same procedure was then performed on the contra-lateral limb. Limb order was randomized. The difference in distance measured between the two legs was calculated as the LLD.

The BK method has been described previously by Woerman and Binder-McLeod (1984). The subject adopted a comfortable standing position with feet placed hip-width apart. The tester crouched directly in front of the subject and palpated the most superior portion of the left and right iliac crests. Visual estimation was made of any asymmetry in the height of the left and right iliac crests. If no asymmetry could be seen the tester asked the recorder to record a 'level' score. If one side was perceived to be lower then the tester placed a range of blocks directly under the foot of the lower side. Multiple, non-deforming blocks of 0.1, 0.2, 0.3, 0.4, 0.5 and 1 cm were available. The tester used a progressive combination of blocks until believing the iliac crests to be level. The cumulative size of blocks used was recorded. It was impossible to employ any blinding process to this procedure. The recorder noted the LLD measurement.

The X-ray assessment was performed by the Radiography Department at Pontefract General Infirmary and followed the exact guidelines stated by Beattie et al. (1990). Because of the logistical difficulties with X-ray this procedure was only completed in a subset of 10 subjects for the calculation of criterion validity.

For all assessment procedures, measurements were recorded to the nearest 0.1 cm. For both the TP and BK method, if no LLD was assessed a value of zero was recorded. In the case of the left leg being shorter a positive LLD was recorded and if the right leg was shorter a negative LLD was recorded.

Study design
All subjects attended the Podiatric Biomechanics Clinic on the same day at 9.00 am and then again at 2.00 pm. Subjects and testers stayed in separate rooms throughout the procedure except for when an assessment was
taking place. All testers examined individual subjects in the absence of other testers and subjects. All subjects were assigned a number from 1 to 25 and asked to sit in a waiting room. Numbered cards were placed in an opaque box and drawn randomly by the recorder. When their number was drawn the subject entered the assessment room. Each tester was then randomly called into the assessment room and performed the TP assessment followed by the BK assessment. When all testers had finished the subject re-entered the waiting room. Within each assessment all measures were recorded by the same person, who was not a podiatrist and not familiar with LLD assessment. All staff and subjects re-assembled for the afternoon session, which followed exactly the same format as the morning session, and again a randomized order of subject assessment was employed in an attempt to reduce the influence of memory on the BK method.

Statistical analysis
Overall, an LLD for every subject by each method on two occasions was recorded for each tester. Firstly, to determine the potential for any systematic bias, test-retest values were analysed for each tester and method via repeated measures ANOVA. No significant outcomes were apparent, suggesting no systematic bias for any repeated observations by tester or method.

For the TP method random typical error for both the right and left leg measures was calculated via the square root of the mean squares error from the repeated measures ANOVA. Error propagation for the difference in tape scores for left and right legs (TP-LLD) was derived from the 'root sum square' method (Taylor 1997):

$$\text{typical error} = \sqrt{(\text{right leg typical error})^2 + (\text{left leg typical error})^2}$$

Confidence intervals (±95%) were also recorded for the TP LLD typical errors. Subsequently, 95% Limits of Agreement (LOA) were derived from the typical error data, where

$$\text{LOA} = \pm 2.06 \times \sqrt{2} \times (\text{typical error})$$

The value 2.06 was selected as the number from the t-distribution with 24 degrees of freedom. A separate check of mean scores against differences revealed no pattern of heteroscedasticity to the data.

For inter-tester reliability, the ability to consistently differentiate 'small' biologically meaningless and 'larger' potentially clinically significant LLD between testers was checked. The exact nature or magnitude of what constitutes a clinically significant LLD is controversial (Gross et al. 1998). Researchers have suggested LLD as small as 0.3 cm (Subotnick 1981) and as large as 4.0 cm (Ingram 1980) may have clinical implications. Based on the likely distribution of LLD from this sample and the interest in accuracy of measurement and how this may influence clinical practice the 0.5 cm threshold was selected. Friberg (1983) has stated that LLD of 0.5 cm or greater can lead to biomechanical compensation in the spine. Therefore, LLD was categorized into three potential descriptors (based on the first assessment) as being either 'small' (NO-LLD: 0–0.5 cm) or large (YESRIGHT-LLD, YESLEFT-LLD: >0.5 cm). The inter-tester proportion of agreement (corrected for chance) based on this classification was assessed using Cohen's kappa coefficient (±95% confidence intervals).

Between method variability (comparing the TP and BK method within the same tester) was assessed using typical error and LOA as reported earlier for intra-tester reliability. For these purposes the data selected were the average TP score and the first BK score (equivalent to the average BK score) for each tester.

Criterion validity, using X-ray, of TP/BK data from one EX tester was evaluated via linear regression analysis. Available options for regression analysis included the least-squares or the least-product approach (Ludbrook 1997). The least-product approach was adopted as it allows for measurement error in both x and y variables. In reality, with this data set, both methods of regression analysis produced similar outcomes and thus interpretation. The y-intercept, slope (±95% confidence intervals) and standard error of the estimate (SEE) were calculated. This data was only calculated for one tester and must be treated with some degree of caution, because of the limited sample size (n = 10).

All statistical analyses were performed using the SPSS 7.0 Statistical Package (Chicago, Illinois). Alpha level (where appropriate) was set at 0.05.
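The typical error and limits of agreement arithmetic described in the statistical analysis can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' SPSS procedure: the function names and the per-leg typical errors of 0.12 cm are hypothetical, while the multiplier 2.06 and the four TP-LLD typical errors are the values quoted in the text and in Table 2.

```python
import math

# t-value quoted in the text for 24 degrees of freedom (25 subjects)
T_CRIT_24_DF = 2.06


def lld_typical_error(right_leg_te: float, left_leg_te: float) -> float:
    """Root-sum-square propagation of the two per-leg typical errors (cm)."""
    return math.sqrt(right_leg_te ** 2 + left_leg_te ** 2)


def limits_of_agreement(typical_error: float, t_crit: float = T_CRIT_24_DF) -> float:
    """Half-width of the 95% limits of agreement: t * sqrt(2) * typical error."""
    return t_crit * math.sqrt(2) * typical_error


# Hypothetical per-leg typical errors (cm), chosen only to show the propagation step.
print(f"TP-LLD typical error: {lld_typical_error(0.12, 0.12):.2f} cm")

# TP-LLD typical errors reported in Table 2 for the four testers.
for tester, reported_te in [("EX-1", 0.17), ("EX-2", 0.32), ("NEX-1", 0.29), ("NEX-2", 0.31)]:
    print(f"{tester}: LOA = +/-{limits_of_agreement(reported_te):.2f} cm")
```

Applied to the typical errors in Table 2, this arithmetic returns limits of agreement of 0.50, 0.93, 0.84 and 0.90 cm, which agree with the tabulated values.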
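The inter-tester classification and chance-corrected agreement can be illustrated as follows. This is a sketch under stated assumptions: the LLD values for the two testers are invented, the mapping of a positive LLD (shorter left leg) to the YESLEFT-LLD label is an assumption about the authors' naming, and the confidence intervals reported in the study are not reproduced.

```python
from collections import Counter

THRESHOLD_CM = 0.5  # threshold selected in the study


def categorise(lld_cm: float) -> str:
    """Map a signed LLD (cm) to one of the three descriptors."""
    if abs(lld_cm) <= THRESHOLD_CM:
        return "NO-LLD"
    # Assumption: positive LLD (shorter left leg) is labelled YESLEFT-LLD.
    return "YESLEFT-LLD" if lld_cm > 0 else "YESRIGHT-LLD"


def cohens_kappa(ratings_a, ratings_b) -> float:
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters need the same non-zero number of ratings")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical first-assessment LLD values (cm) for the same ten subjects.
tester_a = [0.2, -0.8, 1.0, 0.0, -0.3, 0.6, -1.2, 0.4, 0.1, -0.6]
tester_b = [0.4, -0.6, 0.4, 0.0, -0.7, 0.8, -1.0, 0.3, 0.6, -0.2]

kappa = cohens_kappa([categorise(v) for v in tester_a],
                     [categorise(v) for v in tester_b])
print(f"kappa = {kappa:.2f}")
```

In this invented example the two testers agree on six of ten subjects, yet once the agreement expected by chance is removed the coefficient is considerably lower than the raw proportion, which is why kappa rather than simple percentage agreement was used.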
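For readers unfamiliar with least-products regression, the sketch below uses the standard ordinary least products (geometric mean) estimator, in which the slope is the ratio of the standard deviations of y and x, signed by their covariance. It is assumed here, not confirmed by the text, that this is the variant the authors applied; the data are invented and the confidence intervals and SEE reported in the study are not computed.

```python
import math
from statistics import mean, stdev


def least_products_fit(x, y):
    """Ordinary least products regression of y on x; returns (slope, intercept)."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y need the same length, with at least 3 points")
    x_bar, y_bar = mean(x), mean(y)
    # Sign of the covariance decides the sign of the slope.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sd_x, sd_y = stdev(x), stdev(y)
    if sd_x == 0 or sxy == 0:
        raise ValueError("slope is undefined for constant x or zero covariance")
    slope = math.copysign(sd_y / sd_x, sxy)
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical paired LLD values (cm): clinical estimate (x) and X-ray criterion (y).
tape_lld = [-2.0, -1.0, 0.0, 1.0]
xray_lld = [-2.2, -1.1, 0.0, 1.1]

slope, intercept = least_products_fit(tape_lld, xray_lld)
print(f"slope = {slope:.2f}, y-intercept = {intercept:.2f} cm")
```

Unlike least-squares, this estimator treats x and y symmetrically, so swapping the two variables gives the reciprocal slope. A slope near 1 and an intercept near 0 indicate that the clinical estimate tracks the line of identity.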
Results
A range of LLD data was recorded across individuals and between testers and methods. The group mean ± S.D. (and range values) for LLD for each tester using both methods are presented in Table 1. It is important to remember when interpreting Table 1 that when subjects presented with shorter left legs the LLD was denoted as positive and with shorter right legs the LLD was designated as negative. Thus the group mean is likely to tend towards zero with a S.D. and range either side of zero.

The variability in range, mean and S.D. scores suggests that some measurement error occurred both within and between testers and methods. A good exemplar may be seen with NEX-1. This tester using the TP method reported a maximal LLD (with left leg longer) of 1.2 cm on the first assessment and 2.8 cm on the second assessment. This difference, 1.6 cm, would likely be viewed as significant in any clinical assessment and unacceptable in terms of reproducibility.

Table 1 Group LLD data, mean ± S.D. (range), for the four testers and the two methods

Tester   TP-1 (cm)                     TP-2 (cm)                     BK-1 (cm)                    BK-2 (cm)
EX-1     −0.22 ± 0.65 (−2.0 to 1.2)    −0.10 ± 0.61 (−1.9 to 1.0)    0.30 ± 0.69 (−1.0 to 2.0)    0.30 ± 0.69 (−1.0 to 2.0)
EX-2     −0.10 ± 0.56 (−1.2 to 1.0)    −0.03 ± 0.64 (−2.0 to 1.0)    0.15 ± 0.56 (−1.0 to 2.0)    0.15 ± 0.56 (−1.0 to 2.0)
NEX-1    −0.19 ± 0.55 (−1.2 to 0.9)    −0.25 ± 0.77 (−2.8 to 1.2)    0.15 ± 0.62 (−1.0 to 2.5)    0.15 ± 0.62 (−1.0 to 2.5)
NEX-2    0.01 ± 0.61 (−1.6 to 0.8)     0.02 ± 0.66 (−1.9 to 1.2)     0.09 ± 0.62 (−1.0 to 2.0)    0.09 ± 0.62 (−1.0 to 2.0)

TP-1: Tape method first assessment; TP-2: Tape method second assessment; BK-1: Block method first assessment; BK-2: Block method second assessment; EX-1: Experienced tester one; EX-2: Experienced tester two; NEX-1: Non-experienced tester one; NEX-2: Non-experienced tester two.

Another point that is clear from Table 1 is that the BK data for all testers is exactly the same from test one to test two. Group data, as presented in Table 1, could be misleading in that individual comparisons of LLD data are 'hidden' within group scores. However, when the raw data were evaluated for the BK method it was obvious that for all four testers the test-retest LLD data were exactly the same. This effectively means that the intra-tester reliability of the BK method was perfect within this study, irrespective of the experience of the tester. This was not the case for intra-tester reliability using the TP method. Intra-tester data for the TP method is presented in Table 2. The typical error and LOA data suggest a range of intra-tester variability. The two NEX testers demonstrated similar variability with c. 0.3 cm as a minimum limit of reliability. Tester EX-2 is also similar to this. However, the first EX tester had reduced variability (c. 0.2 cm) to the point that confidence intervals only just overlapped between this and the other three testers.

Inter-tester reliability kappa coefficients are presented in Table 3. It is apparent from Table 3 that for the BK method there is similar but moderate inter-tester reliability whether the tester is experienced or not. For the TP method there is greater variability in kappa coefficient (0.01–0.69). Slightly better agreement between the two EX testers (0.49) was noted compared to the two NEX testers (0.19). However, the reliability on the whole would likely be considered as only moderate at best. The kappa coefficient is interpreted much like other correlation coefficients (1 indicates perfect agreement and 0 indicates agreement no better than chance). With the clinical impact being the primary factor for interpreting the strength of the kappa coefficient it is clear that the values obtained for this study (0.01–0.69) do not approach what are 'classically' considered high or good correlation coefficients (0.8–1.0).

Between method variability within each of the four testers produced a range of typical error and limit of agreement data. Typical error data
for all testers ranged from 0.72 to 0.85 cm and limits of agreement for all testers ranged from 2.1 to 2.5 cm. Thus between method variability was fairly consistent between testers but produced typical error and limits of agreement ranges that were far in excess of intra-tester reliability data for one single method.

Least-product linear regression plots were produced for EX-1 to compare both the TP and BK methods against the criterion X-ray process of LLD estimation. These plots, of the sub-sample data, are presented in Figures 1 and 2 (full line = line of identity, dashed line = the regression line). It is quite clear from the figures that the regression line and the line of identity are fairly close together. This is borne out by the regression analysis statistics. For the comparison of TP and X-ray the y-intercept, 0.11 (−0.02 to 0.32, 95% confidence intervals), was not significantly different from zero and the slope, 1.17 (0.94 to 1.44, 95% confidence intervals), was not significantly different from one. The SEE and r2 were 0.17 cm and 0.95, respectively. For the comparison of BK and X-ray the y-intercept, −0.08 (−0.36 to 0.30, 95% confidence intervals), was not significantly different from zero, and the slope, 1.05 (0.68 to 1.31, 95% confidence intervals), was not significantly different from one. The SEE and r2 were 0.59 cm and 0.45, respectively. Both methods produced fairly broad confidence limits that overlapped each other and the value of 1.0 for the slope and 0 for the y-intercept. However, the SEE was lower and the r2 value was higher for the TP method, suggesting a closer prediction of the criterion value.

Table 2 Intra-tester reliability for the TP method

Tester   Typical error (95% CI) (cm)   Limits of agreement* (cm)
EX-1     ±0.17 (0.13–0.24)             ±0.50
EX-2     ±0.32 (0.25–0.45)             ±0.93
NEX-1    ±0.29 (0.23–0.40)             ±0.84
NEX-2    ±0.31 (0.24–0.43)             ±0.90

CI: confidence intervals; *assumed mean difference of zero (when using ANOVA calculation).

Table 3 Inter-tester reliability kappa coefficient (with 95% confidence intervals) for both the TP and BK methods of LLD assessment

            EX-1 TP   EX-2 TP            NEX-1 TP           NEX-2 TP
EX-1 TP     –         0.49 (0.16–0.82)   0.69 (0.41–0.97)   0.17 (−0.13–0.47)
EX-2 TP               –                  0.43 (0.13–0.73)   0.01 (−0.31–0.32)
NEX-1 TP                                 –                  0.19 (−0.10–0.48)

            EX-1 BK   EX-2 BK            NEX-1 BK           NEX-2 BK
EX-1 BK     –         0.53 (0.22–0.84)   0.68 (0.40–0.96)   0.54 (0.24–0.84)
EX-2 BK               –                  0.64 (0.34–0.94)   0.40 (0.06–0.74)
NEX-1 BK                                 –                  0.65 (0.34–0.96)

Fig. 1 Criterion (X-ray) validity assessment of TP method of LLD estimation. [Plot of criterion X-ray LLD (cm) against estimation TP LLD (cm), showing the line of identity and the regression line.]

Discussion
The analysis of data provided a range of interesting outcomes that have clinical relevance as well as highlighting the value of
specific statistical approaches to the determination of test reliability and validity. Hopefully, this data provides a good exemplar or template for other clinical researchers attempting to produce reliability and validity data for a range of assessment or measurement procedures.

Fig. 2 Criterion (X-ray) validity assessment of BK method of LLD estimation. [Plot of criterion X-ray LLD (cm) against estimation BK LLD (cm), showing the line of identity and the regression line.]

Intra-tester reliability data was different for the two testing methods. The BK method provided perfect intra-tester reliability within this sample and, comparatively, it might be suggested that the BK method of LLD assessment is preferable to TP. The difficulty in interpreting this outcome, however, is that it was impossible to blind the tester for the BK method. Despite the randomized order of testing and the delay between individual subject re-tests (up to 8 h) some LLD recall was likely for all testers using the BK method. Conversely, the blinded TP method did produce some degree of intra-tester variability for all testers. This variability may be due to a number of factors associated with the measurement process. Primarily this may be related to the precision of the identification of surface anatomical landmarks and how these represent internal anatomy (Gofton & Trueman 1971). Likewise the deformation of the tape over the surface of the lower extremity may introduce random error (Hoyle et al. 1991). Whilst some of these anatomical factors could affect both TP and BK method (Friberg et al. 1988) it is likely that the non-blinded approach to the BK testing may have alleviated these issues. The fact that the subject was supine in the TP method (compared to standing for BK) may also influence error propagation. The difference in TP intra-tester reliability for EX tester one (c. 0.2 cm) and the other three testers (c. 0.3 cm) may reflect the influence of experience and training in coping with such sources of error. This data would suggest that for some individuals, though not all, training and experience will improve reliability of TP assessment of LLD and should therefore be considered carefully within the clinical setting. Despite some individual differences in intra-tester reliability limits for the TP method the range of 0.2–0.3 cm means that most clinically significant LLD may be estimated with an acceptable degree of reliability.

Typical error and limits of agreement for intra-tester reliability are difficult to directly compare with ICC or other statistical methods adopted in previous research. However, the perfect data for the BK method compare well with ICC data (0.87 and 0.84) from Jonson and Gross (1997) and Gross et al. (1998). Previous research data for intra-tester reliability for TP assessment of LLD are difficult to interpret because of issues such as small sample sizes (Woerman & Binder-McLeod 1984). The clinical suggestion from this analysis is that although the BK method may appear to demonstrate better intra-tester reliability it is likely that both methods display adequate intra-tester reliability to be used to determine 'significant' LLD (0.3 cm and above) in serial
ass
ap
wo
I
deg
hig
the
the
bet
Wh
rel
tw
up
rat
uti
of
exe
rel
the
`lar
wh
pre
ind
wr
Ag
cur
hin
sta
bro
int
tha
Th
cle
inv
sam
D
wh
dif
tha
like
allu
var
Lim
ser
der
me
to
mo
and
et
`M
do
h
a
d
h
m
t
n
T
al
n
es
en
u
e
al
e
P
l
¯
z
g
s
r
e
m
r
i
c
ri
s
a
n
C
h
e
et
p
p
e
in
i
s
n
T
a
p
ro
98 Physical Therap
Physical Therapy in
essments. An adjunct to this is that
propriate training for either technique
uld seem sensible.
nter-tester reliability data mirrored, to some
ree, the intra-tester data in that generally
her kappa coef®cients were calculated for
BK compared to the TP method. However,
coef®cients were variable for both methods
ween testers and were only moderate at best.
ilst some data would suggest a difference in
iability when comparing two EX testers vs
o NEX testers the general effect of experience
on inter-tester reliability for both methods is
her mixed. Inter-tester reliability in this study
lized Cohen's kappa coef®cient for analysis
categorical data. This was selected as an
mplar of a different statistical approach to
iability. It is also important to remember that
clinical categorization of LLD as `small' or
ge' is an important and contentious issue
en interpreting any investigation. Indeed
vious research suggests a sizeable portion of
ividual assessments may even designate the
ong limb as being longer (Friberg et al. 1988).
ain the comparison of the data from the
rent study with previous research is
dered because of the use of different
tistical approaches. However, there is some
ad agreement with past research in that
er-tester reliability introduces more error
n intra-tester reliability (Hoyle et al. 1991).
e clinical suggestion from this analysis is
ar and simple. In any clinical situation
olving multiple LLD assessments then the
e tester should be used wherever possible.
Data calculated for the error in assessment when the same tester estimated LLD with different methods are quantitatively much larger than those for either intra- or inter-tester reliability. This is likely due to many of the factors already alluded to as well as the propagation of variability from each assessment procedure. Limits of agreement reported (2.0–2.5 cm) seriously question the clinical value of any data derived from a combination of the BK and TP methods. This approach has not been reported to any great extent in previous literature as most research limited comparisons to reliability
or criterion validity assessment. Whilst Hoyle et al. (1991) compared the TP method to a `Metrocom' method, with good agreement, this does not have direct relevance to this study.
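To make the idea of limits of agreement between two methods concrete, the sketch below assumes the conventional 95% limits, that is the mean difference plus or minus 1.96 standard deviations of the paired differences. The exact computation used in the study is not restated in this section, so this is an assumption. The function name and the paired TP and BK values (in cm) are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Paired differences between the two methods for each subject.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)        # systematic difference between methods
    spread = stdev(diffs)     # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread


# Hypothetical LLD estimates (cm) from the same tester using two methods.
tp_estimates = [0.5, 1.2, 0.0, 2.0, 1.5, 0.8]
bk_estimates = [1.0, 0.6, 0.5, 1.0, 2.0, 0.2]

bias, lower, upper = limits_of_agreement(tp_estimates, bk_estimates)
print(f"bias = {bias:.2f} cm, limits = {lower:.2f} to {upper:.2f} cm")
```

When the interval between the lower and upper limits is wide relative to the size of LLD considered clinically important, estimates from the two methods cannot be treated as interchangeable.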
The clinical suggestion from this analysis is that one method of LLD estimation should be adopted consistently by the same single tester. This is pertinent to serial assessments on the same individual as well as for the approach of a team of clinicians working in the same environment.
The assessment of TP and BK criterion validity, via regression analysis, must be considered within the limitations of the study design. Firstly, the data reported were generated from a small sub-sample and we must assume that the procedures employed in the X-ray measurements were reliable and valid. Having taken these points into account, the data presented do not discredit either the TP or BK approach. In both instances, primarily related to the wide confidence intervals reflecting low precision due to small sample size, slope and intercept values are not significantly different from 1 and 0, respectively. This suggests that the degree of error associated with the reliability of both methods is not substantial enough to completely invalidate these indirect processes for the estimation of LLD. The interpretation of this study is similar to the cautious recommendations of Beattie et al. (1990) and Friberg et al. (1988), for TP estimations of LLD, as well as Jonson and Gross (1997) and Gross et al. (1998) for BK methods. In contrast to these findings, Gogia and Braatz (1986) reported an ICC of 0.99 between TP and X-ray methods. Whether this reflects study specific differences in the subjects, the assessment procedures or the statistical approach adopted is impossible to determine. The higher r² and lower SEE reported for the TP method might favour this approach to LLD estimation. However, from a theoretical and practical point of view, any clinical recommendation must be tempered within the limits of the study and further research employing larger samples should be conducted on both methods.
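The regression reasoning above (slope compared with 1, intercept compared with 0, together with r² and SEE) can be sketched as follows. This is an illustrative ordinary least-squares calculation, not the authors' analysis. The function name, the orientation of the regression (clinical estimate regressed on the X-ray value) and the data are assumptions, and the critical t value is supplied by the caller (2.306 for a 95% interval with 8 degrees of freedom).

```python
import math


def regress_with_ci(x, y, t_crit):
    """Simple linear regression of y on x with confidence intervals."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares, standard error of estimate (SEE) and r^2.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))
    r_squared = 1.0 - sse / syy

    # Standard errors of the coefficients and their confidence intervals.
    se_slope = see / math.sqrt(sxx)
    se_intercept = see * math.sqrt(1.0 / n + mean_x ** 2 / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "see": see,
        "slope_ci": (slope - t_crit * se_slope, slope + t_crit * se_slope),
        "intercept_ci": (intercept - t_crit * se_intercept,
                         intercept + t_crit * se_intercept),
    }


# Hypothetical LLD values (cm): X-ray criterion and a clinical estimate.
xray = [0.2, 0.5, 0.8, 1.0, 1.3, 1.6, 2.0, 2.4, 2.7, 3.0]
clinical = [0.4, 0.3, 1.0, 0.9, 1.5, 1.4, 2.2, 2.3, 2.9, 2.8]

result = regress_with_ci(xray, clinical, t_crit=2.306)
slope_lo, slope_hi = result["slope_ci"]
int_lo, int_hi = result["intercept_ci"]
print("slope CI includes 1:", slope_lo <= 1.0 <= slope_hi)
print("intercept CI includes 0:", int_lo <= 0.0 <= int_hi)
```

With a small sample the intervals widen, so a slope interval that includes 1 and an intercept interval that includes 0 indicate only that the data do not contradict agreement with the criterion, not that agreement has been demonstrated.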
This research has added to the available database by the data presented and the approach taken to a range of statistical procedures. However, as with any scientific data, the outcomes of this study must be interpreted alongside the recognized limitations of the research. Care in interpretation of the criterion validity findings and the BK intra-tester reliability data has already been alluded to. It is also worthy of note that whilst the sample size investigated in the reliability study is larger than most previous research in this area, continued investigation of this topic with larger and more varied populations is warranted. Future research may wish to investigate ways of blinding the BK method as well as investigating the impact of supine vs standing TP assessments. The issue of what procedure should be adopted in the estimation of LLD and the fundamental question of what magnitude constitutes a clinically significant LLD are still to be conclusively answered.
© 2000 Harcourt Publishers Ltd
In conclusion, the current data suggest that the estimation of LLD, via either TP or BK method, is affected by a degree of error in the measurement process. Data ranges for reliability and criterion validity of both methods have been produced that can help the interpretation of any LLD estimation. Neither method should be abandoned but emphasis may be placed upon careful use and interpretation of any data. Some specific clinical suggestions have been made throughout the discussion that may be debated by clinical professionals. As well as suggesting some sample specific advice for LLD assessment, this paper has, hopefully, provided some useful exemplars of how to assess reliability and criterion validity within a clinical setting. It is hoped that both the process and the clinical context of the study will be valuable to practising clinicians and especially those involved in research or those investigating aspects of evidence-based practice.
Acknowledgement
The authors would like to gratefully acknowledge the support and time provided for this study by the staff and patients at the Podiatry Department of the Castleford, Normanton and District Hospital.
References
Batterham A, George K 2000 Reliability in evidence-based clinical practice: A primer for allied health professions. Physical Therapy in Sport 1: 54–61
Baylis W J, Rzonca E C 1988 Functional and structural limb length discrepancies: Evaluation and treatment. Clinics in Podiatric Medicine and Surgery 5: 509–519
Beattie P, Isaacson K, Riddel D L, Rothstein J M 1990 Validity of derived measurements of leg length differences obtained by the use of a tape measure. Physical Therapy 70: 150–157
Chambers M R C 1996 Leg length inequality: Types, aetiologies, pathomechanics, values and incidence. The Journal of British Podiatric Medicine 51 (5): 74–81
Friberg O 1983 Clinical symptoms and biomechanics of lumbar spine and hip joint in leg length inequality. Spine 8 (6): 643–651
Friberg O, Nurminen M, Korhonene K, Soininen E, Manttari T 1988 Accuracy and precision of clinical estimation of leg length inequality and lumbar scoliosis: Comparison of clinical and radiological measurements. International Disability Studies 10 (1): 49–53
George K, Batterham A, Sullivan I 2000 Validity in clinical research: a review of basic concepts and definitions. Physical Therapy in Sport 1 (1): 19–27
Gofton J P, Trueman G E 1971 Studies in osteoarthritis of the hip: Part II: Osteoarthritis of the hip and leg-length inequality. Journal of the Canadian Medical Association 104: 791–799
Gogia P P, Braatz J H 1986 Validity and reliability of leg length measurements. Journal of Orthopaedic and Sports Physical Therapy 8 (4): 185–188
Gross M T, Burns C B, Chapman S W, Hudson C J et al. 1998 Reliability and validity of rigid lift and pelvic levelling device method in assessing leg length inequality. Journal of Orthopaedic and Sports Physical Therapy 27 (4): 285–294
Hoyle D A, Latour M, Bohannon R W 1991 Intra-examiner, inter-examiner and interdevice comparability of leg length measurements obtained with measuring tape and Metrocom. Journal of Orthopaedic and Sports Physical Therapy 14 (6): 263–268
Ingram A J 1980 Anterior poliomyelitis. In: Edmonson A S, Crenshaw A H (eds). Campbell's operative orthopaedics. Mosby, St-Louis p 1550–1560
Jonson S R, Gross M T 1997 Intraexaminer reliability, interexaminer reliability and normal values for nine lower extremity skeletal measures. Journal of Orthopaedic and Sports Physical Therapy 25 (4): 253–263
Klein K K 1983 Developmental asymmetries and knee injury. Physician and Sportsmedicine 11 (8): 67–72
Ludbrook J 1997 Comparing methods of measurement. Clinical and Experimental Pharmacology and Physiology 24: 193–203
McCaw S T 1992 Leg length inequality. Implications for running injury prevention. Sports Medicine 14 (6): 422–429
McRae R 1990 Clinical orthopaedic assessment, 3rd edn. Churchill Livingstone, New York
Subotnick S I 1981 Limb length discrepancies of the lower extremity (the short leg syndrome). Journal of Orthopaedic and Sports Physical Therapy 3 (1): 11–16
Taylor J R 1997 An introduction to error analysis: the study of uncertainties in physical measurements, 2nd edn. University Science Books, California
Woerman A L, Binder-McLeod S A 1984 Leg length discrepancy assessment: Accuracy and precision in five clinical methods of evaluation. Journal of Orthopaedic and Sports Physical Therapy 5 (5): 230–239
Physical Therapy in Sport (2000) 1, 91–99
	The reliability and validity of the `Tape' and `Block' methods for assessing anatomical leg-length discrepancy
	Introduction
	Method
	Subjects and testers
	Testing procedures
	Study design
	Statistical analysis
	Results
	Discussion
	Acknowledgement
	References
	Figures
	Figure1
	Figure2
	Tables
	Table1
	Table2
	Table3
