Harvard Economics Ec 1126
Tamer - October 22, 2015
Note 6 - IV Model

Note 6: Instrumental Variable Model - Introduction

1 Motivation

We provide some key motivation for the instrumental variable (IV) model. First, this model is helpful for estimating regression functions with omitted variables (in cross sections). When the focus is on predictive effects holding constant some omitted variables, the instrumental variable model (through its assumptions) provides an approach.

Another main use of this model is to obtain causal effects from predictive effects. The assumptions of the model allow one to interpret predictive effects as causal ones. Causal effects are predictive effects, but for counterfactuals that are not observed. For example, would your headache have disappeared had you not taken an aspirin (everything else held constant)? Or would your wage be this high had you not gone to business school? There is a certain sense in which causal effects are the most interesting to uncover, as they relate to explanations of phenomena rather than pure prediction. The causal effect motivation for IV is covered in the next two notes. Here, we focus on the omitted variable bias motivation for IV.

The original use of the IV model in economics dates back to the work of Wright in the 1920s, either the father (Philip) or the son (Sewall); there is a controversy as to who wrote the appendix where this was developed. Please see the interesting tidbit on this at http://scholar.harvard.edu/stock/content/history-iv-regression

The IV model was used there in the context of the simultaneity that arises in (empirical) models of demand and supply. Simultaneity is a feature of equilibrium-based models.

2 Omitted Variable Bias

Suppose that we are interested in the long regression

$$E(Y_i \mid FB_i, ED_i, A_i) = FB_i'\phi + ED_i\beta + A_i,$$

but data on $A_i$ are not available, and we run a least-squares fit of $Y$ on $FB$ and $ED$.
The least-squares coefficients will converge in probability to the coefficients in the following short linear predictor:

$$E^*(Y_i \mid FB_i, ED_i) = FB_i'\tilde{\phi} + ED_i\tilde{\beta}.$$

The relationship between the long coefficients $\phi$, $\beta$ and the short coefficients $\tilde{\phi}$, $\tilde{\beta}$ was worked out earlier in class. We need to consider an auxiliary linear predictor of the omitted variable $A_i$ on $FB_i$ and $ED_i$:

$$E^*(A_i \mid FB_i, ED_i) = FB_i'\psi_1 + ED_i\psi_2.$$

The omitted variable formula gives

$$\tilde{\phi} = \phi + \psi_1, \qquad \tilde{\beta} = \beta + \psi_2.$$

For example, $Y_i$ is the log of earnings of individual $i$, $FB_i$ consists of a constant and a set of family background variables, $ED_i$ is years of schooling, and $A_i$ is a measure of initial (prior to the schooling) ability. The scale of $A_i$ is chosen so that its coefficient equals one in the long regression.

The short least-squares fit provides consistent estimates of the short linear predictor coefficients $\tilde{\phi}$ and $\tilde{\beta}$. But these differ from the long regression coefficients by the auxiliary coefficients $\psi_1$ and $\psi_2$. This is a classic problem of omitted variable bias. The instrumental variable model will provide a solution. This new model requires an additional variable (or set of variables), called instruments, that satisfy certain exclusion restrictions.

2.1 Exclusion Restrictions and Random Assignment

Now suppose that we observe an additional variable (or set of variables) $SUB_i$ (think of this as an education subsidy), so that the data we observe are

$$(FB_i, SUB_i, ED_i, Y_i) \quad \text{for } i = 1, \dots, n;$$

$A_i$ is not observed. We assume random sampling and maintain the following assumptions, which define the IV model.

The first exclusion restriction is that $SUB_i$ does not help to predict $Y_i$ if it is added to the long regression:

$$E(Y_i \mid FB_i, SUB_i, ED_i, A_i) = FB_i'\phi + ED_i\beta + A_i.$$

Notice here that we are controlling for $A_i$: when we include $A_i$, the variable $SUB_i$ has no predictive effect. (Of course, this need not be true if we omit $A_i$; in that case $SUB_i$ can matter.)
The second exclusion restriction is that $SUB_i$ does not help to predict $A_i$ in a linear predictor that includes $FB_i$:

$$E^*(A_i \mid FB_i, SUB_i) = FB_i'\lambda.$$

For example, $SUB_i$ is an education subsidy that provides encouragement to obtain additional schooling. So it is correlated with $ED_i$, but the first exclusion restriction is that once we control for $ED_i$ (and the other regressors in the long regression, such as $A_i$), the amount of subsidy that the individual receives does not have any additional predictive power.

The second exclusion restriction is satisfied if the subsidy is randomly assigned. For example, suppose that the subsidy takes on only two values, zero and one, and the value that is assigned to $i$ is determined by a coin flip. Then $SUB_i$ will not be correlated with $A_i$ or $FB_i$. We could use a stronger version of the second exclusion restriction in which $FB_i$ does not help to predict $A_i$ either, i.e.,

$$E^*(A_i \mid FB_i, SUB_i) = 0,$$

but this is not needed if we are only interested in learning $\beta$.

Now, define the prediction errors

$$W_i = A_i - E^*(A_i \mid FB_i, SUB_i), \qquad U_i = Y_i - E(Y_i \mid FB_i, SUB_i, ED_i, A_i),$$

and write the equations

$$A_i = FB_i'\lambda + W_i,$$
$$Y_i = FB_i'\phi + ED_i\beta + A_i + U_i.$$

Note that $W_i$ and $U_i$ are orthogonal to $FB_i$ and $SUB_i$. Substitute for $A_i$ in the $Y_i$ equation:

$$Y_i = FB_i'(\phi + \lambda) + ED_i\beta + (W_i + U_i) = FB_i'\delta + ED_i\beta + V_i,$$

with $\delta = \phi + \lambda$ and $V_i = W_i + U_i$. Note that $FB_i$ and $SUB_i$ are orthogonal to $V_i$:

$$E(FB_i V_i) = 0, \qquad E(SUB_i V_i) = 0.$$

These key moment conditions come from both exclusion restrictions that define the model. These moment conditions will form the basis for estimating $\beta$. Note: here, least-squares estimation does not give us the correct answer, because $ED_i$ is in general correlated with $V_i$ through the ability component $W_i$.

3 Just identified case

Again, we have

$$Y_i = FB_i'\delta + ED_i\beta + V_i,$$

and define here

$$R_i = (FB_i' \;\; ED_i), \qquad B_i = (FB_i' \;\; SUB_i)', \qquad \gamma = (\delta' \;\; \beta)',$$

where note here that $R_i$ is a row vector and $B_i$ is a column vector.
Then, the exclusion restrictions imply that

$$Y_i = R_i\gamma + V_i, \qquad E(B_i V_i) = 0.$$

Here, since $ED_i$ is a scalar, the dimension $K$ of the coefficient vector $\gamma$ above is

$$K = \dim(FB_i) + 1.$$

The dimension $L$ of $B_i$ is

$$L = \dim(FB_i) + \dim(SUB_i).$$

So if there is a single variable in $SUB$, then $L = K$ and the number of orthogonality conditions equals the number of parameters to be estimated. This is the just-identified case. The estimation of $\gamma$ is based on the $L$ orthogonality conditions in $E(B_i V_i) = 0$. The resulting estimator is often called an instrumental variables (IV) estimator.

In the estimation context, all the variables in $B$ are instrumental variables; there is no distinction between $FB$ and $SUB$ in providing orthogonality conditions. But in terms of the underlying model, $FB$ and $SUB$ play very different roles. The exclusion restrictions at the core of the model only apply to $SUB$. The random assignment argument only applies to $SUB$. So if we do refer to $FB$ as instrumental variables (in the sense of generating orthogonality conditions), we should keep in mind that it is the excluded instrumental variables in $SUB$ that play the key role in an instrumental variable model.

We can exploit the orthogonality conditions above by multiplying the $Y_i$ equation by $B_i$:

$$B_i Y_i = (B_i R_i)\gamma + B_i V_i.$$

Taking expectations and using $E(B_i V_i) = 0$,

$$E(B_i Y_i) = [E(B_i R_i)]\gamma,$$

and so

$$\gamma = [E(B_i R_i)]^{-1} E(B_i Y_i)$$

if $E(B_i R_i)$ is nonsingular. This is an important assumption and is often called instrument relevance in the IV literature. I will illustrate this below in the case when we have $FB_i = 1$.
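As a numerical sketch of the sample analogue of $\gamma = [E(B_i R_i)]^{-1} E(B_i Y_i)$, the Python snippet below simulates data with $FB_i = 1$ and a coin-flip subsidy. The data-generating process, the variable names, and every numeric value (including the assumed true $\beta = 0.10$) are illustrative assumptions, not taken from the note. It compares the IV solution with the least-squares fit that omits ability, and checks that with $FB_i = 1$ the IV slope coincides with the ratio of the sample covariance of $SUB$ and $Y$ to that of $SUB$ and $ED$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 0.10  # assumed true coefficient on ED (illustrative)

# Hypothetical data-generating process (all numbers are assumptions).
sub = rng.integers(0, 2, size=n).astype(float)  # coin-flip subsidy: independent of ability
a = rng.normal(size=n)                          # unobserved ability A_i
ed = 12.0 + 1.0 * sub + 0.8 * a + rng.normal(size=n)  # schooling depends on SUB and A
y = 1.0 + beta * ed + a + 0.5 * rng.normal(size=n)    # long regression, coefficient 1 on A

ones = np.ones(n)                  # FB_i = 1 (constant only)
R = np.column_stack([ones, ed])    # row i is R_i = (FB_i', ED_i)
B = np.column_stack([ones, sub])   # row i is B_i' = (FB_i', SUB_i)

# IV: solve (sum_i B_i R_i) gamma = sum_i B_i Y_i, the sample moment conditions.
gamma_iv = np.linalg.solve(B.T @ R, B.T @ y)

# Short least-squares fit of Y on (1, ED), which omits ability.
gamma_ols = np.linalg.solve(R.T @ R, R.T @ y)

# With FB_i = 1, the IV slope equals a ratio of sample covariances.
cov_ratio = np.cov(sub, y)[0, 1] / np.cov(sub, ed)[0, 1]

print("IV slope: ", gamma_iv[1])
print("OLS slope:", gamma_ols[1])
print("Cov ratio:", cov_ratio)
```

In this design the IV slope should land near the assumed $\beta$, while the least-squares slope is pushed upward because schooling and ability are positively correlated by construction.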
Suppose we have the special case where $FB_i = 1$. Then we can translate the orthogonality conditions above as

$$\mathrm{Cov}(SUB_i, V_i) = 0,$$

and so

$$\mathrm{Cov}(SUB_i, Y_i) = \beta\,\mathrm{Cov}(SUB_i, ED_i).$$

Solving for $\beta$,

$$\beta = \frac{\mathrm{Cov}(SUB_i, Y_i)}{\mathrm{Cov}(SUB_i, ED_i)} \quad \text{if } \mathrm{Cov}(SUB_i, ED_i) \neq 0.$$

This is EXACTLY the instrument relevance condition above: it REQUIRES THAT THE INSTRUMENT BE CORRELATED WITH THE VARIABLE OF INTEREST (subsidies are correlated with whether someone attends college). Instrument relevance is a condition we can check in the data. So, to summarize, in addition to the exclusion restrictions on $SUB$, we require that $SUB$ be CORRELATED with $ED$.

The estimator for $\beta$ that substitutes sample analogues for the population covariances is called the IV estimator. Its sample formula is

$$\hat{\beta}_{IV} = \frac{\sum_i (SUB_i - \overline{SUB})(Y_i - \overline{Y})}{\sum_i (SUB_i - \overline{SUB})(ED_i - \overline{ED})}.$$

4 Example: Returns to Education

Card (1995) used wage and education data for a sample of men in 1976 to estimate the return to education (this example was obtained from Wooldridge (2005)). He used a dummy variable for whether someone grew up near a four-year college (nearc4) as an instrumental variable for education. In a log(wage) equation, he included other standard controls: experience, a black dummy variable, dummy variables for living in an SMSA and living in the South, and a full set of regional dummy variables and an SMSA dummy for where the man was living in 1966.

In order for nearc4 to be a valid instrument, it must be uncorrelated with the error term in the wage equation (we assume this), and it must be partially correlated with educ. To check the latter requirement, we regress educ on nearc4 and all of the exogenous variables appearing in the equation. (That is, we estimate the reduced form for educ.) Using TSLS (which is IV for just-identified models), Card obtains the following estimates and contrasts them with ordinary least squares (OLS).
The results are given in the following table (reproduced from Wooldridge (2005)).

[Table: OLS and IV estimates of the log(wage) equation]

As you can see, the OLS column states that the predictive effect of an extra year of education, holding all other variables constant, is an increase of 7.5% in wages. This regression, though, does not take care of "ability". So if you are interested in the predictive effect for people with the same ability (and the same values of the other regressors), and of course ability is NOT observed, then we use IV. The instrument used here is nearc4, and the regression is estimated via instrumental variables. Now, the effect of an extra year of education seems to increase to 13%. But the estimate is MUCH MORE noisy: the standard error increases roughly 18-fold, and the 95% CI for this effect is

$$[.132 - 1.96 \times .055,\; .132 + 1.96 \times .055] = [.024,\; .240],$$

which is wide. The next note will provide formulas based on asymptotic approximations that allow us to build valid confidence intervals.
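The confidence interval arithmetic for the IV estimate can be checked with a few lines of Python. This is only a sketch of the calculation: it takes the point estimate .132 and standard error .055 quoted in the note and assumes the usual normal critical value 1.96 on both sides of the interval. The variable names are illustrative.

```python
beta_iv = 0.132  # IV point estimate quoted in the note
se_iv = 0.055    # its standard error quoted in the note
z = 1.96         # assumed normal critical value for a two-sided 95% interval

half_width = z * se_iv
ci = (beta_iv - half_width, beta_iv + half_width)

print(f"95% CI: [{ci[0]:.4f}, {ci[1]:.4f}]")  # prints "95% CI: [0.0242, 0.2398]"
```

The interval excludes zero but spans returns from about 2% to about 24% per year of schooling, which is the sense in which it is wide.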