(lecture_06)=
:::{post} Jan 7, 2024
:tags: statistical rethinking, bayesian inference, causal inference, controls
:category: intermediate
:author: Dustin Stansbury
:::
This notebook is part of the PyMC port of the Statistical Rethinking 2023 lecture series by Richard McElreath.
Being clever is unreliable and opaque.
Using explicit causal models allows one to:
The gold standard is randomization. However, randomization often isn't possible:
The goal is to mimic randomization: analyze the data in such a way that all the non-causal arrows entering $X$ have been removed.
It turns out that we can analyze the graph structure to determine whether such a procedure exists.
In the Fork example, we showed that by stratifying by the confound we "close" the fork: conditioning on $U$ removes the non-causal association it induces between $X$ and $Y$, allowing us to isolate the treatment's effect on $Y$.
This procedure is part of what is known as do-calculus. The operator $\text{do}(X)$ means intervening on $X$, i.e. setting it to a specific value that is independent of the confound.
i.e. the distribution of $Y$ under an intervention on $X$ is equivalent to the distribution of $Y$, stratified by the treatment $X$ and the confound $U$, averaged over the distribution of the confound.
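Using $U$ for the confound, this is the standard backdoor-adjustment formula:

$$
p(Y \mid \text{do}(X)) = \sum_U p(Y \mid X, U)\, p(U) = \mathbb{E}_U\left[p(Y \mid X, U)\right]
$$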
Note that when we use a linear regression estimator that stratifies by the treatment and the confound (e.g. in the model form $Y \sim \mathcal{N}(\alpha + \beta_X X + \beta_Z Z, \sigma^2)$), the marginalization and averaging over the confound happens implicitly.
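As a minimal sketch of that estimator (not the lecture's own code; the simulated coefficients, priors, seed, and sample size below are illustrative), we can simulate a fork where $Z$ confounds both $X$ and $Y$ and check that the posterior for $\beta_X$ concentrates near the true causal effect:

```python
import numpy as np
import pymc as pm

# Illustrative fork: Z confounds both X and Y; the true effect of X on Y is 1
rng = np.random.default_rng(1)
n = 500
Z = rng.normal(size=n)
X = rng.normal(Z, 1)          # Z -> X
Y = rng.normal(X + 2 * Z, 1)  # X -> Y and Z -> Y

with pm.Model() as fork_model:
    alpha = pm.Normal("alpha", 0, 1)
    beta_X = pm.Normal("beta_X", 0, 1)
    beta_Z = pm.Normal("beta_Z", 0, 1)
    sigma = pm.Exponential("sigma", 1)
    mu = alpha + beta_X * X + beta_Z * Z
    pm.Normal("Y_obs", mu, sigma, observed=Y)
    idata = pm.sample()

# The posterior for beta_X should concentrate near 1, the simulated causal effect;
# dropping Z from the model would instead bias the estimate upward.
```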
The backdoor criterion is a shortcut for applying do-calculus graphically, "with your eyeballs." It gives a general rule for finding the minimal sufficient adjustment set of variables to condition on: identify all backdoor (non-causal) paths entering the treatment, then find the set of variables that, once conditioned on, closes all of those paths while leaving the causal paths open.
Backdoor path highlighted in red.
Here we simulate a situation where $Y$ is caused by $X$ and an unmeasured confound $U$ that also affects $Z$ and $X$. (We could also prove this mathematically, but simulation is quite convincing as well--for me, anyway.)
NOTE: the model coefficient $\beta_Z$ means nothing in terms of the causal effect of $Z$ on $Y$. To determine the causal effect of $Z$ on $Y$ you would need a different estimator. In general, coefficients for variables in the adjustment set are not causally interpretable. This is related to the "Table 2 Fallacy".
Choosing B over A turns out to be more statistically efficient, though not causally different from choosing A.
$Z$ is a collider for unobserved variables $u$ and $v$, which independently affect $X$ and $Y$
No backdoor path here, so there is no need to control for any confounds. In fact, stratifying by $Z$ (the bad mediator) will introduce bias into the estimate, because it opens a path that passes along the effect of $u$, which would otherwise be blocked.
$Z$ is often a post-treatment variable, e.g. below, where "Happiness" is affected by the treatment "Win Lottery".
Though there is no causal effect of $X$ on $Y$, stratifying by the collider leads you to conclude a negative effect of $X$ on $Y$.
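A minimal sketch of this phenomenon, assuming the illustrative DAG $X \rightarrow Z \leftarrow u \rightarrow Y$ with no effect of $X$ on $Y$ (the lecture's own simulation may use a different structure and coefficients):

```python
import numpy as np

# Illustrative DAG: X -> Z <- u -> Y, with NO causal effect of X on Y
# (e.g. X = win lottery, Z = happiness, Y = lifespan; u is unobserved)
rng = np.random.default_rng(2)
n = 10_000
u = rng.normal(size=n)
X = rng.normal(size=n)     # randomized treatment
Z = rng.normal(X + u, 1)   # post-treatment collider
Y = rng.normal(u, 1)       # X plays no role in Y

def ols_coef(design, target):
    """Least-squares coefficients for a design matrix."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

ones = np.ones(n)
print(ols_coef(np.column_stack([ones, X]), Y)[1])     # ~0: correct, no effect
print(ols_coef(np.column_stack([ones, X, Z]), Y)[1])  # ~-0.5: spurious negative effect
```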
Generally, Avoid the Collider!
Adding descendants of the outcome variable is almost always a terrible idea, because you're selecting groups based on the outcome. This is known as Case-Control Bias (selection on the outcome).
Colliders are not always so obvious.
The collider is formed by the unobserved variable $u$.
Stratifying on a variable affected by the outcome is a very bad practice.
The estimated causal effect has been reduced because the descendant reduces the variation in $Y$ that can be explained by $X$.
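A minimal sketch of case-control bias under the illustrative DAG $X \rightarrow Y \rightarrow Z$ (coefficients and seed are arbitrary, not the lecture's):

```python
import numpy as np

# Illustrative DAG: X -> Y -> Z, so Z is a descendant of the outcome
rng = np.random.default_rng(3)
n = 10_000
X = rng.normal(size=n)
Y = rng.normal(X, 1)   # true causal effect of X on Y is 1
Z = rng.normal(Y, 1)   # Z is caused by the outcome

def ols_coef(design, target):
    """Least-squares coefficients for a design matrix."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

ones = np.ones(n)
print(ols_coef(np.column_stack([ones, X]), Y)[1])     # ~1.0: correct
print(ols_coef(np.column_stack([ones, X, Z]), Y)[1])  # ~0.5: attenuated toward zero
```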
The descendant no longer has any effect here, so we should recover the same (correct) inference for both stratified and unstratified models
Now $Z$ is a parent of $X$
Stratifying by $Z$ doesn't add bias (the estimate is centered on the correct value), but it does increase the variance of the estimator. This loss of precision grows with the strength of the causal relationship between $Z$ and $X$.
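A minimal sketch of the precision parasite under the illustrative DAG $Z \rightarrow X \rightarrow Y$ (coefficients are arbitrary): repeating the simulation many times shows both estimators are centered on the true effect, but the $Z$-stratified one has a wider sampling distribution.

```python
import numpy as np

# Illustrative DAG: Z -> X -> Y; Z affects Y only through X
rng = np.random.default_rng(4)

def ols_coef(design, target):
    """Least-squares coefficients for a design matrix."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

def simulate_once(n=100, beta_ZX=2.0, beta_XY=1.0):
    Z = rng.normal(size=n)
    X = rng.normal(beta_ZX * Z, 1)
    Y = rng.normal(beta_XY * X, 1)
    ones = np.ones(n)
    without_Z = ols_coef(np.column_stack([ones, X]), Y)[1]
    with_Z = ols_coef(np.column_stack([ones, X, Z]), Y)[1]
    return without_Z, with_Z

estimates = np.array([simulate_once() for _ in range(1_000)])
print("means:", estimates.mean(axis=0))  # both centered near 1: no bias
print("stds: ", estimates.std(axis=0))   # larger spread when stratifying by Z
```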
Stratifying on an ancestor of the treatment when there are other confounds, particularly unobserved forks, amplifies bias. This is like the Precision Parasite scenario, but it also adds bias.
Above we see that both estimators are biased -- even in the best case, we can't observe, and thus control for, the confound $u$. But when stratifying by the ancestor, things are MUCH WORSE.
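A minimal sketch of bias amplification, assuming the illustrative DAG $Z \rightarrow X$, $u \rightarrow X$, $u \rightarrow Y$, $X \rightarrow Y$ with $u$ unobserved (the lecture's simulation may use different coefficients):

```python
import numpy as np

# Illustrative DAG: Z -> X, u -> X, u -> Y, X -> Y, with u unobserved
rng = np.random.default_rng(5)

def ols_coef(design, target):
    """Least-squares coefficients for a design matrix."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

def simulate_once(n=1_000, beta_XY=1.0):
    u = rng.normal(size=n)
    Z = rng.normal(size=n)
    X = rng.normal(2.0 * Z + u, 1)
    Y = rng.normal(beta_XY * X + u, 1)
    ones = np.ones(n)
    without_Z = ols_coef(np.column_stack([ones, X]), Y)[1]
    with_Z = ols_coef(np.column_stack([ones, X, Z]), Y)[1]
    return without_Z, with_Z

estimates = np.array([simulate_once() for _ in range(200)])
print("mean beta_X, ignoring Z:     ", estimates[:, 0].mean())  # biased above the true value of 1
print("mean beta_X, stratified by Z:", estimates[:, 1].mean())  # the bias is amplified
```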
Hand-wavy explanation:
You have to do scientific modeling to do scientific analysis
Adjustment set is $\{S, A\}$
Conditioning on $A$ and $S$ essentially removes the arrows going into $X$, so $\beta_X$ gives us the direct effect of $X$ on $Y$.
In the unconditional model, the total causal effect of $A$ on $Y$ flows through all paths:
This gets trickier if we consider unobserved confounds on variables!
:::{include} ../page_footer.md
:::