- [[indirect inference]], [[endogenous variables|endogenous variable]], [[two-sample two-stage least-squares]]
# Idea
Two-stage least squares (2SLS) is a way of performing an [[instrumental variables|instrumental variable analysis]]. It is a method for estimating a causal effect and, under the instrumental variable assumptions, a consistent estimator of the [[local average treatment effect|complier average causal effect]].
![[s20220530_103856.png]]
$
Y_{i}=\beta_{0}+A_{i} \beta_{1}+\epsilon_{i}
$
By [[randomization]], $Z$ and the error term $\epsilon$ are independent.
Rationale: Many variables can cause the treatment. We "split" the treatment into two pieces: the part that can be explained by the instrument, and the part that can be explained by everything else. The part explained by the instrument is the **adjusted treatment variable**.
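The "split" can be illustrated on simulated data. This is a minimal sketch, not part of the original note: the variable names, the confounder `u`, and the true effect of 2 are all assumptions chosen for illustration.

```r
set.seed(1)
n <- 20000
z <- rbinom(n, 1, 0.5)             # randomized binary instrument
u <- rnorm(n)                      # unobserved confounder (hypothetical)
a <- 0.5 * z + u + rnorm(n)        # treatment depends on instrument and confounder
y <- 1 + 2 * a + 3 * u + rnorm(n)  # assumed true effect of a on y is 2

stage1 <- lm(a ~ z)
a_hat <- as.numeric(fitted(stage1))  # piece explained by the instrument
a_res <- as.numeric(resid(stage1))   # piece explained by everything else

# the two pieces add back up to the treatment and are uncorrelated in-sample
split_ok <- isTRUE(all.equal(a, a_hat + a_res))
piece_cor <- cor(a_hat, a_res)

# naive OLS uses both pieces and picks up the confounding;
# regressing on the instrument-explained piece alone does not
ols_est <- unname(coef(lm(y ~ a))["a"])
iv_est <- unname(coef(lm(y ~ a_hat))["a_hat"])
c(ols = ols_est, iv = iv_est)
```

In this setup the OLS slope should land well above 2, while the slope on `a_hat` should be close to 2, because only the instrument-driven variation in treatment is free of the confounder.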
## Two stages
### First stage effect
Estimate the instrument-treatment correlation via regression, `treatment ~ instrument`. We estimate the **predicted value of treatment $A$**, given instrument $Z$.
$
\hat{A}_{i}=\hat{\alpha}_{0}+Z_{i} \hat{\alpha}_{1}
$
$
\hat{A}_{i} \text { is estimate of } \mathrm{E}(\mathrm{A} \mid \mathrm{Z})
$
Matrix notation ($X$ is often used instead of $A$):
$
\hat{X}=Z\left(Z^{\prime} Z\right)^{-1} Z^{\prime} X
$
If treatment $A$ is binary, then we're estimating the probability of treatment given $Z$.
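A small check of the first stage on simulated data, assuming a binary instrument and binary treatment (the data and names are made up for illustration). It compares the projection formula with `lm()` fitted values and with the within-arm treatment proportions. Here `Z` includes an intercept column and `X` holds only the treatment column.

```r
set.seed(2)
n <- 200
z <- rbinom(n, 1, 0.5)            # binary instrument
a <- rbinom(n, 1, 0.2 + 0.5 * z)  # binary treatment, more likely when z = 1

Z <- cbind(1, z)  # instrument matrix with intercept
X <- cbind(a)     # treatment as a one-column matrix

# X_hat = Z (Z'Z)^{-1} Z'X
X_hat <- Z %*% solve(t(Z) %*% Z) %*% t(Z) %*% X

# same numbers as the fitted values from the first-stage regression
stage1 <- lm(a ~ z)
same_as_lm <- isTRUE(all.equal(as.numeric(X_hat), as.numeric(fitted(stage1))))

# with binary a and z, the fitted values are the proportions treated in each arm
arm_props <- as.numeric(tapply(a, z, mean))
fitted_levels <- sort(unique(round(as.numeric(X_hat), 10)))
rbind(arm_props, fitted_levels)
```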
### Second stage effect
Get the estimated treatment effect via the association between outcomes and adjusted treatments, `outcome ~ adjusted_treatment`. Regress the outcome $Y$ on the fitted value from the first stage, $\hat{A}_i$:
$
Y_{i}=\beta_{0}+\hat{A}_{i} \beta_{1}+\epsilon_{i}
$
$\hat{A}$ is the projection of $A$ onto the space spanned by $Z$. The estimate of $\beta_1$ is the estimate of the causal effect.
Matrix notation ($\hat{X}$ = $\hat{A}$):
$
\hat{\boldsymbol{\beta}}_{2 S L S}=\left(\hat{X}^{\prime} \hat{X}\right)^{-1} \hat{X}^{\prime} Y
$
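The matrix formula can be checked against the two `lm()` calls. A sketch on simulated data; the data-generating values are assumptions, and both `Z` and `X` are given an intercept column here.

```r
set.seed(3)
n <- 1000
z <- rbinom(n, 1, 0.5)
u <- rnorm(n)                      # unobserved confounder (hypothetical)
a <- 0.8 * z + u + rnorm(n)
y <- 1 + 2 * a + 3 * u + rnorm(n)

Z <- cbind(1, z)  # instruments, with intercept
X <- cbind(1, a)  # regressors, with intercept

# first stage: project X onto the column space of Z
P_Z <- Z %*% solve(crossprod(Z)) %*% t(Z)
X_hat <- P_Z %*% X

# second stage: beta_2SLS = (X_hat' X_hat)^{-1} X_hat' Y
beta_2sls <- solve(crossprod(X_hat), crossprod(X_hat, y))

# same point estimates from two successive regressions
a_hat <- fitted(lm(a ~ z))
beta_lm <- coef(lm(y ~ a_hat))
cbind(matrix_form = as.numeric(beta_2sls), two_lm_calls = as.numeric(beta_lm))
```

The two columns should agree up to floating-point error; only the point estimates match, not the standard errors that `lm()` would report for the second stage.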
#### Interpretation of $\beta_1$
$
\beta_{1}=E(Y \mid \hat{A}=1)-E(Y \mid \hat{A}=0)
$
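It's the contrast in expected outcome between the treated and untreated. With a binary instrument $Z$, the first-stage fitted values for $Z=1$ and $Z=0$ differ by the first-stage slope:
$
(\hat{\alpha}_{0}+\hat{\alpha}_{1}) - \hat{\alpha}_{0} = \hat{\alpha}_{1}
$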
It's the [[local average treatment effect|complier average causal effect]]:
$
\beta_{1}=\mathrm{CACE}=\frac{\mathrm{E}(\mathrm{Y} \mid \mathrm{Z}=1)-\mathrm{E}(\mathrm{Y} \mid \mathrm{Z}=0)}{E(A \mid Z=1)-E(A \mid Z=0)}
$
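Why the ratio above matches the second-stage slope: a sketch for a binary instrument, where $\bar{Y}_{z}$ and $\bar{A}_{z}$ are shorthand (introduced here, not in the original note) for the sample means of $Y$ and $A$ in the group with $Z=z$. The regressor $\hat{A}$ takes only two values, $\hat{\alpha}_{0}$ and $\hat{\alpha}_{0}+\hat{\alpha}_{1}$, so the least-squares slope is the difference in mean outcomes divided by the difference in fitted treatments:
$
\hat{\beta}_{1}=\frac{\bar{Y}_{1}-\bar{Y}_{0}}{(\hat{\alpha}_{0}+\hat{\alpha}_{1})-\hat{\alpha}_{0}}=\frac{\bar{Y}_{1}-\bar{Y}_{0}}{\hat{\alpha}_{1}}=\frac{\bar{Y}_{1}-\bar{Y}_{0}}{\bar{A}_{1}-\bar{A}_{0}}
$
The last step uses the fact that, in a regression of $A$ on a binary $Z$, the slope $\hat{\alpha}_{1}$ equals the difference in group means of $A$.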
## Code implementation
```r
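# stage 1: regress treatment on instrument
stage1 <- lm(treatment ~ instrument, data = dt)
dt$adjusted_treatment <- predict(stage1, newdata = dt) # predict() takes `newdata`; `data` is silently ignored
# stage 2: regress outcome on the adjusted (fitted) treatment
stage2 <- lm(outcome ~ adjusted_treatment, data = dt)
# actual example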
library(ivpack); library(data.table)
data(card.data)
d <- data.table(card.data)
d[, educ12 := ifelse(educ > 12, 1, 0)] # make treatment binary
prop_complier <- d[, mean(educ12), keyby = .(nearc4)][, diff(V1)] # proportion of compliers
prop_complier
# intent-to-treat effect
itt <- d[, mean(lwage), keyby = .(nearc4)][, diff(V1)]
itt
# complier avg causal effect
itt / prop_complier
# 2-stage least squares approach
s1 <- lm(educ12 ~ nearc4, d)
d[, predx := predict(s1, type = 'response')]
s2 <- lm(lwage ~ predx, d) # SEs are incorrect! doesn't adjust for predictions from the first stage
summary(s2)
# fixest library
# https://lrberge.github.io/fixest/reference/feols.html
library(fixest)
# use ~1 to estimate model without exogenous variables
femodel <- feols(lwage ~ 1 | educ12 ~ nearc4, data = d)
femodel
feols(lwage ~ 1 | educ12 ~ nearc4, data = d, vcov = "HC1")
femodel$iv_first_stage
femodel$iv_first_stage$educ12$scores
summary(femodel, stage = 1)
summary(femodel, stage = 1:2)
summary(femodel, stage = 2:1)
femodel2 <- feols(lwage ~ exper + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 | educ12 ~ nearc4, data = d, vcov = "HC1")
femodel2
# ivpack library
ivmodel <- ivreg(lwage ~ educ12 | nearc4, x = TRUE, data = d)
summary(ivmodel)
robust.se(ivmodel)
table(ivmodel$x$projected)
ivmodel2 <- ivreg(lwage ~ educ12 + exper + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 | nearc4 + exper + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668, x = TRUE, data = d)
summary(ivmodel2)
```
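The IV estimate is the ratio of the reduced-form effect (instrument on outcome) to the first-stage effect (instrument on treatment): $\frac{\text{reduced form}}{\text{first stage}}$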
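The code comment "SEs are incorrect!" refers to the second-stage `lm()` computing residuals from the fitted treatment rather than the actual treatment. A sketch of the usual homoskedastic correction in base R, on simulated data (the data and names are assumptions; packages such as `fixest` do this for you, and robust variants differ):

```r
set.seed(4)
n <- 2000
z <- rbinom(n, 1, 0.5)
u <- rnorm(n)                      # unobserved confounder (hypothetical)
a <- 0.8 * z + u + rnorm(n)
y <- 1 + 2 * a + 3 * u + rnorm(n)

a_hat <- as.numeric(fitted(lm(a ~ z)))
s2 <- lm(y ~ a_hat)  # manual second stage
beta <- coef(s2)

X <- cbind(1, a)          # actual treatment
X_hat <- cbind(1, a_hat)  # fitted treatment

res_naive <- as.numeric(y - X_hat %*% beta)  # residuals lm() uses
res_2sls <- as.numeric(y - X %*% beta)       # residuals 2SLS should use

# sigma^2 from the correct residuals, then sigma^2 * (X_hat' X_hat)^{-1}
k <- ncol(X)
sigma2 <- sum(res_2sls^2) / (n - k)
vcov_2sls <- sigma2 * solve(crossprod(X_hat))

se_naive <- sqrt(diag(vcov(s2)))
se_2sls <- sqrt(diag(vcov_2sls))
cbind(estimate = beta, se_naive = se_naive, se_2sls = se_2sls)
```

The point estimates are untouched; only the residual variance, and hence the standard errors, changes.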
# Sensitivity analysis
[[exclusion restriction]]: If $Z$ does directly affect $Y$ by an amount $\rho$, would my conclusions change? Vary $\rho$.
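One simple way to vary $\rho$, sketched on simulated data: assume the direct effect enters additively, subtract $\rho Z$ from the outcome, and recompute the ratio estimate. The additive form, the grid of values, and the helper `wald()` are all assumptions for illustration.

```r
set.seed(5)
n <- 5000
z <- rbinom(n, 1, 0.5)
u <- rnorm(n)                      # unobserved confounder (hypothetical)
a <- 0.8 * z + u + rnorm(n)
y <- 1 + 2 * a + 3 * u + rnorm(n)

# hypothetical helper: reduced form divided by first stage, binary instrument
wald <- function(y, a, z) {
  (mean(y[z == 1]) - mean(y[z == 0])) / (mean(a[z == 1]) - mean(a[z == 0]))
}

# for each assumed direct effect rho, remove it from the outcome and re-estimate
rho_grid <- seq(-0.5, 0.5, by = 0.25)
sens <- data.frame(
  rho = rho_grid,
  estimate = sapply(rho_grid, function(rho) wald(y - rho * z, a, z))
)
sens
```

Each unit of $\rho$ shifts the estimate by $-1/\text{first stage}$, so a weak first stage makes the conclusions more sensitive to violations of the exclusion restriction.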
[[monotonicity assumption]]: If the proportion of defiers were $\pi$, would my conclusions change? Vary $\pi$.
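One way to make the question about defiers concrete, as a sketch under added assumptions: treatment is binary, the exclusion restriction still holds, $\pi_c$ is the proportion of compliers, and $\mathrm{DACE}$ is the average causal effect among defiers (these symbols are introduced here, not in the original note). The ratio estimand then mixes the two groups:
$
\frac{E(Y \mid Z=1)-E(Y \mid Z=0)}{E(A \mid Z=1)-E(A \mid Z=0)}=\frac{\pi_{c} \, \mathrm{CACE}-\pi \, \mathrm{DACE}}{\pi_{c}-\pi}
$
Since the first stage equals $\pi_{c}-\pi$, solving for the complier effect gives:
$
\mathrm{CACE}=\frac{\left[E(Y \mid Z=1)-E(Y \mid Z=0)\right]+\pi \, \mathrm{DACE}}{\left[E(A \mid Z=1)-E(A \mid Z=0)\right]+\pi}
$
If $\mathrm{DACE}=\mathrm{CACE}$, the ratio still equals $\mathrm{CACE}$, so conclusions change only to the extent that the defier effect differs. A sensitivity analysis along these lines would vary both $\pi$ and an assumed $\mathrm{DACE}$.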
# References
- [12.1 The IV Estimator with a Single Regressor and a Single Instrument | Introduction to Econometrics with R](https://www.econometrics-with-r.org/12.1-TIVEWASRAASI.html)
- [Two stage leasxt squares - Instrumental Variables Methods | Coursera](https://www.coursera.org/learn/crash-course-in-causality/lecture/5B3AW/two-stage-leasxt-squares)
- [IV analysis in R - Instrumental Variables Methods | Coursera](https://www.coursera.org/learn/crash-course-in-causality/lecture/D19Ae/iv-analysis-in-r)
- https://campus.datacamp.com/courses/causal-inference-with-r-instrumental-variables-rdd/instrumental-variables-in-practice?ex=14
- https://campus.datacamp.com/courses/causal-inference-with-r-instrumental-variables-rdd/instrumental-variables-in-practice?ex=18