- [[Judea Pearl]], [[machine learning]], [[Bayesian network]], [[causal revolution]], [[statistical mediation]], [[causal inference - Pearl's 3 levels]], [[potential outcomes]], [[identification strategy]], [[bias makes association different from causation]], [[sensitivity analysis]], [[propensity score]], [[time-dependent confounding]], [[optimal dynamic treatment strategies]], [[targeted learning]], [[incident user design]], [[marginal structural models]], [[counterfactuals]]

# Idea

Statisticians started thinking about causal modeling in the 1920s, but it only became its own area of research in the 1970s. In both statistical [[machine learning]] and computer science, learning causal relations is much more difficult than identifying correlations. Causal inference aims to identify **causal relations** between variables, whereas [[machine learning solves problems by framing all problems as prediction problems]].

Causal inference is essentially a [[missing data]] problem: since we can only observe the outcome that actually happened, we are always missing the [[counterfactuals|counterfactual outcome]]. Because we can never know both potential outcomes for an individual, we need a different way to estimate causal effects. The most reliable way to do this is [[randomization]].

![[20250216175805.png]]

![[20250216175823.png]]

Key concepts:

- [[correlation is not causation]]
- [[counterfactuals|counterfactual thinking]]
- figuring out how to determine the [[conditional average treatment effect|individual treatment effect]] from the [[average treatment effect]] (a [[missing data]] problem)

In general, causal inference focuses on the causal effect of some treatment (or exposure) $A$ on some outcome $Y$. Example:

- $A = 1$ if the unit receives a vaccine; $A = 0$ otherwise

The letters $A$, $T$, or $D$ are used to denote the treatment received. Let $T_i$ be the treatment received by unit $i$:

$$
T_{i}=\begin{cases}1 & \text{if unit } i \text{ received the treatment} \\ 0 & \text{otherwise}\end{cases}
$$

The individual treatment (causal) effect is:

$$
Y_{1i}-Y_{0i}
$$

## Observed vs potential outcomes

Broadly, $A$ has a causal effect on $Y$ if $Y^1$ differs from $Y^0$; these are [[potential outcomes]], not observed outcomes. However, we can never observe the same unit/person both with and without treatment, which is the [[fundamental problem of causal inference]].

When we say causal effects, we usually mean the **causal effects of some intervention or action**, that is, the causal effects of variables that can be **manipulated** (hypothetically or in reality).

> Holland 1986: "no causation without manipulation"

## Different ways to think about causal inference

It is more than association between variables: [[bias makes association different from causation]]. The herculean task of causal inference is to find clever ways to remove [[bias]] and make the treated and the untreated comparable, so that any difference we see is only the [[average treatment effect|average treatment effect]].

It is **prediction** of intervention. Knowing a **cause** means being able to **predict** the consequences of an intervention: if you do something to a system, you cause a change and you know what that change will be.

It is **imputation** of missing observations. Knowing a cause means being able to construct unobserved counterfactual outcomes: what if I had done something else? (A minimal simulation illustrating this missing-data view is sketched below.)
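To make the missing-data view concrete, here is a minimal simulation sketch (Python with NumPy; the effect sizes and variable names are invented for illustration, not taken from any of the references). It generates *both* potential outcomes for every unit, which is only possible in a simulation, and shows that under randomized treatment assignment the simple difference in observed group means recovers the [[average treatment effect]], even though one term of every individual effect $Y_{1i}-Y_{0i}$ is missing.

```python
# Sketch: potential outcomes, the missing counterfactual, and randomization.
import numpy as np

rng = np.random.default_rng(seed=42)
n = 100_000

# Hypothetical potential outcomes: treatment shifts the outcome by 2 on
# average, so the true ATE is approximately 2.
y0 = rng.normal(loc=10, scale=3, size=n)          # outcome without treatment
y1 = y0 + 2 + rng.normal(loc=0, scale=1, size=n)  # outcome with treatment

true_ate = np.mean(y1 - y0)  # knowable only because we simulated both outcomes

# Randomized assignment: treatment is independent of the potential outcomes.
a = rng.binomial(n=1, p=0.5, size=n)

# The fundamental problem of causal inference: each unit reveals only one
# potential outcome; the other is missing data.
y_observed = np.where(a == 1, y1, y0)

# Under randomization, the difference in observed group means estimates the ATE.
ate_hat = y_observed[a == 1].mean() - y_observed[a == 0].mean()

print(f"true ATE:      {true_ate:.3f}")
print(f"estimated ATE: {ate_hat:.3f}")  # close to 2 because A is independent of (Y0, Y1)
```

If assignment instead depended on the potential outcomes (say, units with larger $Y_1$ were more likely to be treated), the same difference in means would be biased, which is exactly why [[bias makes association different from causation]].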
## Causal assumptions

To identify causal effects, we must make certain untestable assumptions, often referred to as [[causal assumptions]].

- [[no hidden versions or variations of treatment]]
- when we refer to the causal effects of $x$, it means $x$ can at least hypothetically be manipulated/randomized
- immutable variables: certain variables cannot be manipulated (e.g., race, gender, age, SES), so it is hard to think of their causal effects within the [[potential outcomes]] framework (though they **do** have causal effects!)
- remember that when we think of the [[potential outcomes]] $Y^a$, we imagine we could, hypothetically, set treatment to $A = a$ and then observe an outcome

# References

- [A Missing Data Problem (Potential Outcomes Framework) | Codecademy](https://www.codecademy.com/courses/learn-the-basics-of-causal-inference-with-r/lessons/potential-outcomes-framework/exercises/a-missing-data-problem)
- [Causal Inference in R](https://www.r-causal.org/)
- [Online Causal Inference Seminar](https://sites.google.com/view/ocis/home)
- [01 - Introduction To Causality — Causal Inference for the Brave and True](https://matheusfacure.github.io/python-causality-handbook/01-Introduction-To-Causality.html)
- [[Luo 2020 when causal inference meets deep learning]]
- [[Lubke 2020 causal inference - linear regression with simulated data]]
- [[Pearl 2018 book of why]]
- [Causal Inference for the Brave and True](https://matheusfacure.github.io/python-causality-handbook/landing-page.html)
- [Confusion over causality - Welcome and Introduction to Causal Effects | Coursera](https://www.coursera.org/learn/crash-course-in-causality/lecture/x4UMR/confusion-over-causality)
- [Potential outcomes and counterfactuals - Welcome and Introduction to Causal Effects | Coursera](https://www.coursera.org/learn/crash-course-in-causality/lecture/0XWFc/potential-outcomes-and-counterfactuals)
- [Hypothetical interventions - Welcome and Introduction to Causal Effects | Coursera](https://www.coursera.org/learn/crash-course-in-causality/lecture/Lgb6O/hypothetical-interventions)