Apostila 7, PGE953, copyright Betsabe Blas, 2018
Apostila 7 - Outline
• Gauss-Markov Theorem
• Generalized Least Squares (GLS)
• Weighted Least Squares (WLS)
• The linear mixed model as GLS
• Start distribution theory for linear models
Review: What can we estimate?
• In Result 6.1, we showed that there was a unique decomposition Y = Y₁ + Y₂ with Y₁ ∈ C(X₁) and Y₂ ∈ C⊥(X₁), and that Y₁ = P_X₁Y, the predicted values from the least squares fit using X₁.
• However, since C(X₁) = C(X₂), the same result holds for X₂.
• Hence if C(X₁) = C(X₂), the predicted values from the least squares fits are the same.
• Result 6.2: If two design matrices are reparameterizations of one another, the predicted values of the form P_XY are the same (see the numerical sketch below).
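A small numpy sketch of Result 6.2 (the design matrices and data below are illustrative, not from the course): two design matrices with the same column space produce the same predicted values P_XY.

```python
import numpy as np

# Sketch of Result 6.2 with made-up data: C(X1) = C(X2) implies
# identical predicted values P_X Y for the two parameterizations.
rng = np.random.default_rng(7)
n = 30
x = rng.normal(size=n)
X1 = np.column_stack([np.ones(n), x])           # intercept + slope
X2 = np.column_stack([1.0 - x, 1.0 + x])        # reparameterization: same column space
Y = rng.normal(size=n)

def proj(X):
    # P_X = X (X'X)^- X', here computed with the Moore-Penrose inverse
    return X @ np.linalg.pinv(X.T @ X) @ X.T

print(np.allclose(proj(X1) @ Y, proj(X2) @ Y))  # True
```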
Review: More Properties of Least Squares
• Result 6.3:
E(β̂_ols|X) = GXᵀXβ;
cov(β̂_ols|X) = σ²GXᵀXGᵀ.
• If G is the Moore-Penrose inverse, then GXᵀXGᵀ = G.
Review: More Properties of Least Squares
• Result 6.4:
E(ε̂|X) = 0;
cov(ε̂|X) = σ²(I − P_X).
Review: More Properties of Least Squares
• In practice, it is important to estimate σ². Let r = rank(XᵀX). The usual estimate is
σ̂² = (n − r)⁻¹ ε̂ᵀε̂.
• Result 6.6: E(σ̂²|X) = σ² (see the numerical sketch below).
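A numerical sketch of Result 6.6 (data and dimensions are illustrative): σ̂² is computed with a generalized inverse, so the same formula covers a rank-deficient design.

```python
import numpy as np

# Sketch: sigma2_hat = eps_hat' eps_hat / (n - r) with r = rank(X'X),
# on a deliberately rank-deficient design (third column = 2 x second).
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, 2 * x])     # rank(X'X) = 2, not 3
Y = X @ np.array([1.0, 0.5, 0.25]) + rng.normal(scale=0.3, size=n)

G = np.linalg.pinv(X.T @ X)                     # Moore-Penrose generalized inverse
eps_hat = Y - X @ (G @ X.T @ Y)
r = np.linalg.matrix_rank(X.T @ X)              # r = 2
sigma2_hat = (eps_hat @ eps_hat) / (n - r)
print(sigma2_hat)                               # roughly 0.09 = 0.3**2
```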
Review: Estimability
• We know that if XᵀX is of full rank, then E(β̂_ols|X) = β, and hence OLS gives us a method for estimating β and functions of it.
• What happens if XᵀX is not of full rank?
• Definition 6.2: The linear combination cᵀβ is said to be estimable if there exists an (n × 1) vector b with the property that E(bᵀY|X) = cᵀβ for all β when E(Y|X) = Xβ. In this case, bᵀY is said to be an unbiased estimator of cᵀβ.
• Basically we are asking: what can we actually estimate from a non-full-rank parameterization?
Review: Estimability
• Result 6.7: The function cᵀβ is estimable if and only if cᵀ = bᵀX for some vector b.
• Result 6.8: The function cᵀβ is estimable if and only if cᵀ = cᵀGXᵀX, where G is a generalized inverse of XᵀX (see the numerical check below).
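A numerical check of Result 6.8 (the one-way layout below is an illustrative choice): in an overparameterized model with an intercept and two group dummies, µ + α₁ and α₁ − α₂ are estimable while α₁ alone is not.

```python
import numpy as np

# Check Result 6.8: c'beta is estimable iff c' = c'G X'X.
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)          # rank 2, three columns
G = np.linalg.pinv(X.T @ X)

def estimable(c):
    return np.allclose(c, G @ (X.T @ X) @ c)

print(estimable(np.array([1.0, 1.0, 0.0])))     # True:  mu + alpha1
print(estimable(np.array([0.0, 1.0, -1.0])))    # True:  alpha1 - alpha2
print(estimable(np.array([0.0, 1.0, 0.0])))     # False: alpha1 alone
```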
Review: The Gauss-Markov Theorem
• Gauss-Markov Theorem: Suppose that cᵀβ is estimable. Then among all estimators that are linear combinations of Y, i.e., dᵀY, and that are also unbiased, i.e., E(dᵀY|X) = cᵀβ, cᵀβ̂_ols is the one with the minimum variance.
• We showed that cᵀβ̂_ols is an unbiased estimator of cᵀβ.
The Gauss-Markov Theorem
• Next we have to show the minimum variance property. We showed previously that (XGXᵀX)ᵀ = Xᵀ, i.e., XᵀXGᵀXᵀ = Xᵀ. Remember that cᵀ = cᵀGXᵀX = bᵀX, so that c = Xᵀb. Then
var(cᵀβ̂_ols|X) = σ² cᵀGXᵀXGᵀc
= σ² cᵀGXᵀXGᵀXᵀb
= σ² cᵀGXᵀb
= σ² cᵀGc.
The Gauss-Markov Theorem
• There is a somewhat subtle argument going on here:
cᵀβ̂_ols = cᵀGXᵀY
= bᵀXGXᵀY
= bᵀP_XY.
• The estimate does not depend on the form of the generalized inverse.
• Hence its variance does not either.
• For the Moore-Penrose inverse, G is symmetric, so GXᵀXGᵀ = GXᵀXG = G.
• Hence, the previous argument holds for any G.
The Gauss-Markov Theorem
• Now consider any other linear function dᵀY with the property that it is unbiased, i.e., E(dᵀY|X) = cᵀβ.
• This means that for any β, dᵀXβ = cᵀβ, and hence that dᵀX = cᵀ.
• Then
cov(cᵀβ̂_ols, dᵀY|X) = cov(cᵀGXᵀY, dᵀY|X)
= σ² cᵀGXᵀd
= σ² cᵀGc,
the last step following from dᵀX = cᵀ.
The Gauss-Markov Theorem
• This means that
0 ≤ var(cᵀβ̂_ols − dᵀY|X)
= var(cᵀβ̂_ols|X) + var(dᵀY|X) − 2 cov(cᵀβ̂_ols, dᵀY|X)
= σ²cᵀGc + var(dᵀY|X) − 2σ²cᵀGc
= var(dᵀY|X) − σ²cᵀGc
= var(dᵀY|X) − var(cᵀβ̂_ols|X),
as desired (a simulation sketch follows below).
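A Monte Carlo sketch of the theorem (all settings are illustrative): the competing linear unbiased estimator dᵀY is built as the OLS weights plus a component orthogonal to C(X), which preserves dᵀX = cᵀ but adds variance.

```python
import numpy as np

# Compare var(c' beta_hat_ols) with another linear unbiased estimator d'Y.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([2.0, -1.0])
c = np.array([0.0, 1.0])                        # target c'beta: the slope

b_ols = X @ np.linalg.solve(X.T @ X, c)         # weights with b_ols'Y = c'beta_hat_ols
P = X @ np.linalg.solve(X.T @ X, X.T)           # projection onto C(X)
z = rng.normal(size=n)
d = b_ols + (z - P @ z)                         # d'X = b_ols'X = c': still unbiased

E = rng.normal(size=(20000, n))                 # 20000 simulated error vectors
Y = X @ beta + E                                # one simulated dataset per row
print((Y @ b_ols).mean(), (Y @ d).mean())       # both close to c'beta = -1
print((Y @ b_ols).var(), (Y @ d).var())         # the OLS variance is the smaller one
```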
The Gauss-Markov Theorem
• The Gauss-Markov Theorem is, of course, a powerful result.
• It says that if we restrict attention to linear functions of Y that are unbiased for cᵀβ, then cᵀβ̂_ols is as good as we can do.
• However, both unbiasedness and linearity are constraints on the system.
• There is a large literature on estimators that relax both.
Critical Result
• Result 7.1: Suppose that the m × m symmetric matrix Σ is a positive semidefinite matrix of rank r. Then there is an m × r matrix K such that Σ = KKᵀ, and KᵀK is positive definite of rank r.
• First use the spectral decomposition
Σ = P [ D 0; 0 0 ] Pᵀ,
where P is orthogonal and D is the r × r diagonal matrix of the positive eigenvalues of Σ (in this bracket notation, a semicolon separates block rows).
Critical Result
• Now note that we can rewrite
[ D 0; 0 0 ] = [ D^(1/2); 0_((m−r)×r) ] [ D^(1/2) 0_(r×(m−r)) ]
= [ D^(1/2); 0_((m−r)×r) ] [ D^(1/2); 0_((m−r)×r) ]ᵀ.
• Hence, Σ = KKᵀ, where
K = P [ D^(1/2); 0_((m−r)×r) ].
Critical Result
• Hence, Σ = KKᵀ, where
K = P [ D^(1/2); 0_((m−r)×r) ].
• Because PᵀP = I, we get
KᵀK = D,
as claimed (see the numerical sketch below).
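A numerical sketch of Result 7.1 (the matrix below is an arbitrary rank-2 example): K is built from the spectral decomposition exactly as above.

```python
import numpy as np

# Build K with Sigma = K K' from the eigendecomposition of a
# rank-deficient positive semidefinite matrix.
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 2))
Sigma = A @ A.T                                 # 4 x 4, symmetric psd, rank 2

evals, P = np.linalg.eigh(Sigma)                # Sigma = P diag(evals) P'
keep = evals > 1e-10                            # the r positive eigenvalues
K = P[:, keep] * np.sqrt(evals[keep])           # m x r, K = P [D^(1/2); 0]
print(np.allclose(Sigma, K @ K.T))              # True
print(np.linalg.eigvalsh(K.T @ K))              # K'K = D: positive, so pos. definite
```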
Generalized Least Squares (GLS)
• The GLS model occurs when cov(ε|X) ≠ σ²I.
• Some examples later.
• Let us write
Y = Xβ + ε;
E(Y|X) = Xβ;
cov(ε|X) = σ²V;
cov(Y|X) = σ²V,
where V is positive definite (it is symmetric by definition).
Generalized Least Squares (GLS)
• Since V is symmetric and positive definite, we can write it as KKᵀ, where K is nonsingular. (Why?)
• Define
Y∗ = K⁻¹Y;
X∗ = K⁻¹X;
ε∗ = K⁻¹ε.
Generalized Least Squares (GLS)
• Then, because K is invertible, knowing X is equivalent to knowing X∗, and
E(Y∗|X∗) = E(Y∗|X) = X∗β;
cov(Y∗|X∗) = σ²I. (Why?)
• The OLS estimator for this transformed model is the GLS estimator (see the sketch below):
β̂_gls = (X∗ᵀX∗)⁻X∗ᵀY∗
= (XᵀV⁻¹X)⁻XᵀV⁻¹Y.
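A sketch of GLS by whitening (the AR(1)-type V and all data are illustrative): OLS on the transformed data (Y∗, X∗) reproduces the closed-form GLS estimator.

```python
import numpy as np

# GLS via the transformation Y* = K^-1 Y, X* = K^-1 X, with V = K K'.
rng = np.random.default_rng(3)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
i = np.arange(n)
V = 0.7 ** np.abs(i[:, None] - i[None, :])      # AR(1)-type correlation, pos. def.
K = np.linalg.cholesky(V)                       # V = K K', K nonsingular
Y = X @ np.array([1.0, 2.0]) + K @ rng.normal(size=n)

Xs = np.linalg.solve(K, X)                      # X* = K^-1 X
Ys = np.linalg.solve(K, Y)                      # Y* = K^-1 Y
beta_whitened = np.linalg.lstsq(Xs, Ys, rcond=None)[0]

Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ Y)
print(np.allclose(beta_whitened, beta_gls))     # True
```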
Generalized Least Squares (GLS)
• Remember,
E(Y∗|X∗) = E(Y∗|X) = X∗β;
cov(Y∗|X∗) = σ²I.
• The GLS estimator is the OLS estimator for this equivalent model. We minimize
(Y∗ − X∗β)ᵀ(Y∗ − X∗β) = (Y − Xβ)ᵀV⁻¹(Y − Xβ)
(homework).
Generalized Least Squares (GLS)
• Since XᵀV⁻¹X is symmetric, its Moore-Penrose inverse is as well, and if you use the Moore-Penrose inverse, then
E(β̂_gls|X) = (XᵀV⁻¹X)⁻(XᵀV⁻¹X)β;
cov(β̂_gls|X) = σ²(XᵀV⁻¹X)⁻.
Gauss-Markov for GLS
• Result 7.2: Let cᵀβ be estimable. Then cᵀβ̂_gls is unbiased for cᵀβ and it has minimum variance among all linear estimators that are also unbiased for cᵀβ.
• Because K is invertible, an equivalent model is
E(Y∗|X∗) = E(Y∗|X) = X∗β;
cov(Y∗|X∗) = σ²I.
• Note that, since cᵀβ is estimable and K is invertible,
cᵀ = bᵀX = bᵀKK⁻¹X = b∗ᵀX∗, where b∗ = Kᵀb.
• Thus, if cᵀβ is estimable in the original model, it is estimable in this equivalent model, and hence we can apply Gauss-Markov.
Weighted Least Squares, an Example of GLS
• Suppose that V = diag(w₁², w₂², ..., wₙ²).
• This problem is generally known as weighted least squares.
• As homework, I have asked you to show that GLS is equivalent to dividing Yᵢ and Xᵢ by wᵢ (see the sketch below).
• As homework, I have asked you to show that GLS minimizes
Σᵢ₌₁ⁿ (Yᵢ − Xᵢᵀβ)² / wᵢ².
• This case is known as weighted least squares, with the weights inversely proportional to the regression error variances.
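A sketch of the weighted least squares homework claims (data illustrative): GLS with V = diag(w²) agrees with OLS after dividing each Yᵢ and each row Xᵢ by wᵢ.

```python
import numpy as np

# WLS two ways: the GLS closed form and OLS on the rescaled data.
rng = np.random.default_rng(4)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
w = rng.uniform(0.5, 3.0, size=n)
Y = X @ np.array([1.0, -2.0]) + w * rng.normal(size=n)

Vi = np.diag(1.0 / w**2)                        # V^-1 for V = diag(w_i^2)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ Y)
beta_scaled = np.linalg.lstsq(X / w[:, None], Y / w, rcond=None)[0]
print(np.allclose(beta_gls, beta_scaled))       # True
```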
The Linear Mixed Model
• The most general form of the linear mixed model is that, for independent u and ε,
Y = Xβ + Zu + ε;
E(u|X,Z) = E(ε|X,Z) = 0;
cov(u|X,Z) = D;
cov(ε|X,Z) = σ²I.
• Here, D is not necessarily a diagonal matrix.
Repeated Measures
• The classic example of this model is repeated measures with a random intercept. There are N individuals and mᵢ observations per individual, with n = Σᵢ₌₁ᴺ mᵢ.
• The model takes the form
Yᵢⱼ = Xᵢᵀβ + uᵢ + εᵢⱼ;
uᵢ = [0, σ²_u];
εᵢⱼ = [0, σ²],
where [0, σ²] denotes mean 0 and variance σ².
• Note how the data within an individual are correlated: for j ≠ k,
cov(Yᵢⱼ, Yᵢₖ|Xᵢ) = σ²_u.
Repeated Measures
• The idea of repeated measures analysis is to get better inference for β while taking into account the correlation in the observations for an individual.
• In practice, the gain in efficiency can be very large.
Repeated Measures
• We can write the repeated measures model as a linear mixed model. Let u = (u₁, ..., u_N)ᵀ. Let eᵢ be an mᵢ-vector of ones, and define the block-diagonal matrix
Z = [ e₁ 0 0 · · · 0; 0 e₂ 0 · · · 0; 0 0 e₃ · · · 0; ...; 0 0 0 · · · e_N ].
• This is an n × N matrix, where n is the total sample size and N the number of individuals (see the sketch below).
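A sketch of the construction (the group sizes are illustrative): Z is block diagonal in the ones-vectors eᵢ.

```python
import numpy as np

# Build the n x N random-intercept design Z from the m_i.
m = [3, 2, 4]                                   # m_i observations per individual
n, N = sum(m), len(m)
Z = np.zeros((n, N))
row = 0
for i, mi in enumerate(m):
    Z[row:row + mi, i] = 1.0                    # block i of column i is e_i
    row += mi
print(Z.shape)                                  # (9, 3): n x N
print(Z)
```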
Repeated Measures and the Linear Mixed Model
• With X and ε defined in the usual way, and λ = σ²_u/σ², we have
Y = Xβ + Zu + ε.
• Since cov(u|X,Z) = σ²_u I_N, we have that
E(Y|X,Z) = Xβ;
cov(Y|X,Z) = σ²_u ZZᵀ + σ²I
= σ²(I + λZZᵀ) = σ²V.
• Note that, from a homework assignment, V is positive definite, even though ZZᵀ is not.
Repeated Measures and the Linear Mixed Model
• If we knew V, i.e., if we knew λ = σ²_u/σ², then, from Gauss-Markov, the GLS estimator is just
β̂_gls = β̂_gls(λ) = (XᵀV⁻¹X)⁻XᵀV⁻¹Y
(see the sketch below).
• Later we will learn how to estimate the variance component σ²_u.
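A sketch of β̂_gls(λ) for the random-intercept model, treating λ as known (all sizes and parameter values are illustrative; with few individuals the intercept estimate stays noisy).

```python
import numpy as np

# GLS for the random-intercept model with known lambda = sigma2_u/sigma2.
rng = np.random.default_rng(5)
m = np.array([30, 25, 35, 28, 30, 32])          # m_i per individual
n, N = m.sum(), m.size
ids = np.repeat(np.arange(N), m)                # individual label per row
Z = (ids[:, None] == np.arange(N)).astype(float)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2, sigma2_u = 1.0, 2.0
lam = sigma2_u / sigma2
Y = (X @ np.array([1.0, 0.5])
     + Z @ rng.normal(scale=np.sqrt(sigma2_u), size=N)
     + rng.normal(scale=np.sqrt(sigma2), size=n))

V = np.eye(n) + lam * (Z @ Z.T)                 # cov(Y|X,Z) = sigma2 V
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ Y)
print(beta_gls)                                 # roughly (1.0, 0.5); intercept is noisy
```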
Distribution Theory Relevant for Linear Models
• The standard normal distribution, denoted by Z = Normal(0, 1), has density function
f(z) = (2π)^(−1/2) exp(−z²/2).
• Its first four moments are E(Z) = 0, E(Z²) = 1, E(Z³) = 0 and E(Z⁴) = 3.
• We write X = Normal(µ, σ²) if X = µ + σZ, where Z is standard normal. Its density function is
f_X(x; µ, σ²) = (2πσ²)^(−1/2) exp{−(2σ²)⁻¹(x − µ)²}.
Distribution Theory Relevant for Linear Models
• If X = Normal(µ, σ²), its moment generating function is
m_X(t) = E{exp(tX)} = ∫ exp(tx) f_X(x; µ, σ²) dx
= exp(tµ + t²σ²/2).
• If X = Normal(µ, σ²), then Z = (X − µ)/σ = Normal(0, 1) (homework).
Distribution Theory Relevant for Linear Models
• If it exists, the moment generating function determines the distribution function of a random variable, i.e., if two random variables have the same mgf on an open interval including zero, then they have the same distribution.
• The mgf determines the moments:
E(Xᵏ) = ∂ᵏm_X(t)/∂tᵏ, evaluated at t = 0.
Multivariate Standard Normal
• Z = (Z₁, ..., Zₙ) is multivariate standard normal if the Zᵢ are independent standard normal random variables, and its density function is
f_Z(z) = (2π)^(−n/2) exp(−zᵀz/2).
• The mgf of a multivariate random vector X is
m_X(b) = E{exp(bᵀX)}.
• If the multivariate random vector X = (X₁, ..., Xₙ) has mgf m_X(b), and if Xᵢ has mgf mᵢ(tᵢ), then the Xᵢ are independent if and only if m_X(b) = Πᵢ₌₁ⁿ mᵢ(bᵢ).
Multivariate Standard Normal
• Suppose that X has mean µ and covariance matrix Σ, where rank(Σ) = r > 0, and hence Σ has r non-zero eigenvalues.
• Using the spectral decomposition, we can write Σ = KKᵀ (Result 7.1).
• Definition 7.1: X = Normal(µ, Σ), the multivariate normal with mean µ, covariance matrix Σ and rank r, if it has the same distribution as µ + KZ, where Z is multivariate standard normal (see the sketch below).
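A sketch of Definition 7.1 (the singular Σ below is an arbitrary example): draws of X = µ + KZ reproduce µ and Σ in their sample moments.

```python
import numpy as np

# Sample X = mu + K Z with Sigma = K K' from Result 7.1.
rng = np.random.default_rng(6)
mu = np.array([1.0, 2.0, 3.0])
A = rng.normal(size=(3, 2))
Sigma = A @ A.T                                 # rank 2: a singular covariance

evals, P = np.linalg.eigh(Sigma)
keep = evals > 1e-10
K = P[:, keep] * np.sqrt(evals[keep])           # 3 x 2

Zs = rng.normal(size=(2, 100000))               # standard normal draws, one per column
Xs = mu[:, None] + K @ Zs
print(Xs.mean(axis=1))                          # close to mu
print(np.cov(Xs))                               # close to Sigma
```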
Multivariate Standard Normal
• Result 7.3: Suppose that X = Normal(µ, Σ). Then (homework) its mgf is
m_X(b) = exp(bᵀµ + bᵀΣb/2).
• In addition, let a and B be constants. Then, with another mgf calculation, a + BX = Normal(a + Bµ, BΣBᵀ).
Multivariate Standard Normal
• Thus, if X = (X₁ᵀ, X₂ᵀ)ᵀ and µ = (µ₁ᵀ, µ₂ᵀ)ᵀ, and if
Σ = [ Σ₁₁ Σ₁₂; Σ₂₁ Σ₂₂ ],
then X₁ = Normal(µ₁, Σ₁₁) and X₂ = Normal(µ₂, Σ₂₂).
• Result 7.4: In other words, the marginal distributions of a multivariate normal are also multivariate normal.