Harvard Economics
Ec 1126
Tamer - October 22, 2015
Note 6 - IV Model
Note 6: Instrumental Variable Model - Introduction
1 Motivation
We provide some key motivation for the Instrumental Variable (IV) Model. First, this model is helpful
for estimating regression functions with omitted variables (in cross sections). When the focus is on
predictive effects holding constant some omitted variables, the instrumental variable model (through
its assumptions) provides an approach. Another main use of this model is to obtain causal effects from
predictive effects. The assumptions of the model allow one to interpret predictive effects as causal
ones. Causal effects are predictive effects, but for counterfactuals that are not observed. For
example, would your headache have disappeared had you not taken an aspirin (everything else held
constant)? Or would your wage be this high had you not gone to business school? There is a certain
sense in which causal effects are the most interesting to uncover, as they relate to explanations of
phenomena rather than pure prediction. The causal effect motivation for IV is developed in the next
two notes. Here, we focus on the omitted variable bias motivation for IV.
The original use of the IV model in economics dates back to the work of Wright in the 1920s,
either the father, Philip Wright, or the son, Sewall Wright; there is a controversy as to who wrote
the appendix where this was developed. Please see the interesting tidbit on this at
http://scholar.harvard.edu/stock/content/history-iv-regression
The IV model was used there in the context of the simultaneity that arises in (empirical) models of
demand and supply. Simultaneity is a feature of equilibrium-based models.
2 Omitted Variable Bias
Suppose that we are interested in the long regression:
$$E(Y_i \mid FB_i, ED_i, A_i) = FB_i'\phi + ED_i\beta + A_i$$
but data on Ai are not available, and we run a least-squares fit of Y on FB and ED. The least-squares
coefficients will converge in probability to the coefficients in the following short linear predictor:
$$E^*(Y_i \mid FB_i, ED_i) = FB_i'\tilde{\phi} + ED_i\tilde{\beta}$$
The relationship between the long coefficients φ, β and the short coefficients φ̃ and β̃ was worked
out earlier in class. We need to consider an auxiliary linear predictor of the omitted variable Ai on
FBi and EDi:
$$E^*(A_i \mid FB_i, ED_i) = FB_i'\psi_1 + ED_i\psi_2$$
The omitted variable formula gives
$$\tilde{\phi} = \phi + \psi_1, \qquad \tilde{\beta} = \beta + \psi_2$$
For example, Yi is the log of earnings of individual i, FBi consists of a constant and a set of
family background variables, EDi is years of schooling, and Ai is a measure of initial (prior to the
schooling) ability. The scale of Ai is chosen so that its coefficient equals one in the long regression.
The short least-squares fit provides consistent estimates of the short linear predictor
coefficients φ̃ and β̃. But these differ from the long regression coefficients by the auxiliary
coefficients ψ1 and ψ2. This is the classic problem of omitted variable bias. The instrumental
variable model will provide a solution. This new model requires an additional variable (or set of
variables), called instruments, that satisfy certain exclusion restrictions.
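The omitted variable formula can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, the coefficient values, and the helper name `ols` are all assumptions made for illustration and do not come from the course material. It compares the short least-squares coefficients with the long coefficients plus the auxiliary coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (all numbers are illustrative).
fb = rng.normal(size=n)                              # one family-background variable
a = 0.5 * fb + rng.normal(size=n)                    # ability, unobserved in practice
ed = 12 + 1.0 * fb + 0.8 * a + rng.normal(size=n)    # schooling depends on ability
phi = np.array([1.0, 0.3])                           # long coefficients on (constant, fb)
beta = 0.10                                          # long coefficient on ed
y = phi[0] + phi[1] * fb + beta * ed + a + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), fb, ed])            # regressors of the short fit


def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


short = ols(X, y)    # estimates of (phi_tilde, beta_tilde)
psi = ols(X, a)      # auxiliary coefficients (psi_1, psi_2); feasible only in a simulation
long_plus_psi = np.concatenate([phi, [beta]]) + psi

print("short coefficients:", short.round(3))
print("long + auxiliary  :", long_plus_psi.round(3))
```

In this design ability raises schooling, so ψ2 is positive and the short coefficient on schooling overstates β.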
2.1 Exclusion Restrictions and Random Assignment
Now suppose that we observe an additional variable (or set of variables) SUBi (think of this as an
education subsidy), so that the data we observe are
$$(FB_i, SUB_i, ED_i, Y_i) \quad \text{for } i = 1, \dots, n$$
Ai is not observed. We assume random sampling and maintain the following assumptions which
define the IV model.
The first exclusion restriction is that SUBi does not help to predict Yi if it is added to
the long regression:
$$E(Y_i \mid FB_i, SUB_i, ED_i, A_i) = FB_i'\phi + ED_i\beta + A_i$$
Notice here that we are controlling for Ai, i.e., once we include Ai, the variable SUBi has no
predictive effect (of course, this is no longer true if we omit Ai: then SUBi matters).
The second exclusion restriction is that SUBi does not help to predict Ai in a linear
predictor that includes FBi:
$$E^*(A_i \mid FB_i, SUB_i) = FB_i'\lambda$$
For example, SUBi is an education subsidy that provides encouragement to obtain additional
schooling. So it is correlated with EDi, but the first exclusion restriction is that once we control
for EDi (and the other regressors in the long regression such as Ai), the amount of subsidy that the
individual receives does not have any additional predictive power. The second exclusion restriction
is satisfied if the subsidy is randomly assigned. For example, suppose that the subsidy takes on
only two values, zero and one, and the value that is assigned to i is determined by a coin flip. Then
SUBi will not be correlated with Ai or FBi. A stronger version of the second exclusion
restriction would also require that FBi not help to predict Ai, i.e.,
$$E^*(A_i \mid FB_i, SUB_i) = 0$$
but this is not needed if we are interested in learning β.
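The coin-flip argument can be illustrated with a short simulation. This is only a sketch under assumed values (the relationship between ability and family background is invented for the example): when the subsidy is assigned by a fair coin, its coefficient in the linear predictor of ability given family background and the subsidy should be close to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

fb = rng.normal(size=n)               # hypothetical family-background variable
a = 0.5 * fb + rng.normal(size=n)     # ability, related to family background
sub = rng.integers(0, 2, size=n)      # coin flip: 0 or 1, independent of (fb, a)

# Sample linear predictor of ability given (constant, fb, sub).
Z = np.column_stack([np.ones(n), fb, sub])
coef = np.linalg.lstsq(Z, a, rcond=None)[0]

print("coefficient on fb :", round(coef[1], 3))   # close to 0.5 by construction
print("coefficient on sub:", round(coef[2], 3))   # close to 0 under random assignment
```

The nonzero coefficient on fb is harmless here: the second exclusion restriction only requires that the subsidy drop out of the predictor.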
Now, define the prediction errors:
$$W_i = A_i - E^*(A_i \mid FB_i, SUB_i)$$
$$U_i = Y_i - E(Y_i \mid FB_i, SUB_i, ED_i, A_i)$$
and write the equations
$$A_i = FB_i'\lambda + W_i$$
$$Y_i = FB_i'\phi + ED_i\beta + A_i + U_i$$
Note that Wi and Ui are orthogonal to FBi and SUBi. Substitute for Ai in the Yi equation:
$$\begin{aligned}
Y_i &= FB_i'(\phi + \lambda) + ED_i\beta + (W_i + U_i) \\
    &= FB_i'\delta + ED_i\beta + V_i
\end{aligned}$$
with δ = φ + λ and Vi = Wi + Ui. Note that FBi and SUBi are orthogonal to Vi:
$$E(FB_i V_i) = 0, \qquad E(SUB_i V_i) = 0$$
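These key moment conditions come from both exclusion restrictions that define the model. These
moment conditions will form the basis for estimating β. Note: here, least squares estimation does
not give us the correct answer, because EDi is in general correlated with Vi (through the ability
component Wi), so the least-squares orthogonality condition E(EDiVi) = 0 fails.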
3 Just Identified Case
Again, we have
$$Y_i = FB_i'\delta + ED_i\beta + V_i$$
and define here
$$R_i = (FB_i' \;\; ED_i), \qquad B_i = (FB_i' \;\; SUB_i)', \qquad \gamma = (\delta' \;\; \beta)'$$
where Ri is a row vector and Bi is a column vector. Then, the exclusion restrictions
imply that
$$Y_i = R_i\gamma + V_i, \qquad E(B_i V_i) = 0$$
Here, in the just identified case, since EDi is a scalar, the dimension K of the coefficient
vector γ above is
K = dim(FBi) + 1
The dimension L of Bi is
L = dim(FBi) + dim(SUBi)
So if there is a single variable in SUB, then L = K and the number of orthogonality conditions
equals the number of parameters to be estimated. This is the just-identified case. The estimation
of γ is based on the L orthogonality conditions in E(BiVi) = 0. The resulting estimator is often
called an instrumental variables (IV) estimator. In the estimation context, all the variables in B
are instrumental variables; there is no distinction between FB and SUB in providing orthogonality
conditions. But in terms of the underlying model, FB and SUB play very different roles. The
exclusion restrictions at the core of the model only apply to SUB. The random assignment argument
only applies to SUB. So if we do refer to FB as instrumental variables (in the sense of generating
orthogonality conditions), we should keep in mind that it is the excluded instrumental variables in
SUB that play the key role in an instrumental variable model.
We can exploit the orthogonality conditions above by multiplying the Yi equation by Bi:
$$B_i Y_i = (B_i R_i)\gamma + B_i V_i$$
Taking expectations and using E(BiVi) = 0,
$$E(B_i Y_i) = [E(B_i R_i)]\gamma$$
and so
$$\gamma = [E(B_i R_i)]^{-1} E(B_i Y_i)$$
if E(BiRi) is nonsingular. This is an important assumption and is often called instrument
relevance in the IV literature. I will illustrate this below in the case when we have FBi = 1.
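A sample analogue of this formula replaces the expectations with averages over i. The sketch below uses simulated data with one family-background variable and a binary, randomly assigned subsidy; the design and all numerical values are assumptions chosen for illustration. It solves the sample moment equations for γ and, for comparison, computes the least-squares fit of Y on R.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical design: A_i = 0.5 * fb_i + W_i, so lambda = (0, 0.5).
fb = rng.normal(size=n)
a = 0.5 * fb + rng.normal(size=n)                        # unobserved ability
sub = rng.integers(0, 2, size=n).astype(float)           # randomly assigned subsidy
ed = 12 + fb + 0.8 * a + 2.0 * sub + rng.normal(size=n)  # subsidy raises schooling
beta = 0.10
y = 1.0 + 0.3 * fb + beta * ed + a + rng.normal(scale=0.5, size=n)

R = np.column_stack([np.ones(n), fb, ed])    # row i is R_i = (FB_i', ED_i)
B = np.column_stack([np.ones(n), fb, sub])   # row i is B_i' = (FB_i', SUB_i)

# Sample analogue of gamma = [E(B_i R_i)]^{-1} E(B_i Y_i).
gamma_iv = np.linalg.solve(B.T @ R / n, B.T @ y / n)
gamma_ols = np.linalg.lstsq(R, y, rcond=None)[0]

print("IV  estimate of (delta, beta):", gamma_iv.round(3))
print("OLS estimate of (delta, beta):", gamma_ols.round(3))
```

In this design δ = φ + λ = (1.0, 0.8) and β = 0.10. The IV solution should land near these values, while the least-squares coefficient on schooling is pulled upward by the omitted ability term.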
Suppose we have the special case where FBi = 1. Then the orthogonality conditions above translate
to Cov(SUBi, Vi) = 0, and so
$$Cov(SUB_i, Y_i) = \beta\, Cov(SUB_i, ED_i)$$
Solving for β,
$$\beta = \frac{Cov(SUB_i, Y_i)}{Cov(SUB_i, ED_i)}$$
if
$$Cov(SUB_i, ED_i) \neq 0$$
This is EXACTLY the instrument relevance condition above: it REQUIRES THAT THE INSTRUMENT BE
CORRELATED WITH THE VARIABLE OF INTEREST (for example, subsidies are correlated with whether
someone attends college). Instrument relevance is a condition we can check in the
data. So, to summarize, in addition to the exclusion restrictions on SUB, we require that SUB
be CORRELATED with ED. The estimator for β here that substitutes sample analogues for the
population covariances is called the IV estimator. Its sample formula is
$$\hat{\beta}_{IV} = \frac{\sum_i (SUB_i - \overline{SUB})(Y_i - \overline{Y})}{\sum_i (SUB_i - \overline{SUB})(ED_i - \overline{ED})}$$
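The ratio-of-covariances formula is easy to try on simulated data for the case FBi = 1. The following is a sketch only: the data-generating process, the numerical values, and the helper name `cov` are assumptions introduced for the example. It also reports the sample covariance between the subsidy and schooling, which is the quantity the relevance condition requires to be nonzero.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Hypothetical design with FB_i = 1 (constant only).
a = rng.normal(size=n)                                # unobserved ability
sub = rng.integers(0, 2, size=n).astype(float)        # randomly assigned subsidy
ed = 12 + 0.8 * a + 2.0 * sub + rng.normal(size=n)    # schooling
beta = 0.10
y = 1.0 + beta * ed + a + rng.normal(scale=0.5, size=n)


def cov(u, v):
    """Sample covariance (the 1/n factors cancel in the ratios below)."""
    return np.mean((u - u.mean()) * (v - v.mean()))


relevance = cov(sub, ed)                 # must be away from zero
beta_iv = cov(sub, y) / cov(sub, ed)     # IV estimator
beta_ols = cov(ed, y) / cov(ed, ed)      # least-squares slope, for comparison

print("Cov(SUB, ED):", round(relevance, 3))
print("beta_IV     :", round(beta_iv, 3))
print("beta_OLS    :", round(beta_ols, 3))
```

If the coefficient 2.0 on the subsidy in the schooling equation is moved toward zero, the denominator shrinks and the IV estimate becomes unstable, which is the practical content of the relevance condition.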
4 Example: Returns to Education
Card (1995) used wage and education data (see footnote 1) for a sample of men in 1976 to estimate
the return to education. He used a dummy variable for whether someone grew up near a four-year
college (nearc4) as an instrumental variable for education. In a log(wage) equation, he included
other standard controls: experience, a black dummy variable, dummy variables for living in an SMSA
and living in the South, and a full set of regional dummy variables and an SMSA dummy for where
the man was living in 1966. In order for nearc4 to be a valid instrument, it must be uncorrelated
with the error term in the wage equation (we assume this), and it must be partially correlated with
educ. To check the latter requirement, we regress educ on nearc4 and all of the exogenous variables
appearing in the equation. (That is, we estimate the reduced form for educ.)
Using TSLS (which is IV for just identified models), Card obtains the following estimates
and contrasts them with ordinary least squares (OLS). The results are given in the following table
(reproduced from Wooldridge).
Footnote 1: This example was obtained from Wooldridge (2005).
As you can see, the column OLS states that the predictive effect of an extra year of education
holding all other variables constant is an increase of 7.5% in wages. This regression, though, does
not account for "ability". So if you are interested in the predictive effect for people with the
same ability (and the same values of the other regressors), and ability is NOT observed, then we
use IV. The instrument used here is nearc4, and the regression is estimated via instrumental
variables. Now the effect of an extra year of education appears to increase to about 13%. But the
estimate is MUCH noisier: the standard error increases roughly 18-fold, and the 95% CI for this
effect is
[.132 − 1.96 × .055, .132 + 1.96 × .055] = [.024, .240], which is wide.
The next note will provide formulas based on asymptotic approximations that allow us to
build valid confidence intervals.
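The interval arithmetic for the IV estimate can be reproduced in a few lines. This sketch simply plugs in the estimate and standard error quoted in the text and uses the usual normal critical value 1.96; the variable names are chosen for the example.

```python
estimate = 0.132   # IV coefficient on education quoted in the text
std_err = 0.055    # its standard error quoted in the text
z = 1.96           # normal critical value for a 95% interval

lower = estimate - z * std_err
upper = estimate + z * std_err
print(f"95% CI: [{lower:.4f}, {upper:.4f}]")   # prints 95% CI: [0.0242, 0.2398]
```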