(difference_in_differences)=
:::{post} Sept, 2022
:tags: counterfactuals, causal inference, time series, regression, posterior predictive, difference in differences, quasi experiments, panel data
:category: intermediate
:author: Benjamin T. Vincent
:::
This notebook provides a brief overview of the difference in differences approach to causal inference, and shows a working example of how to conduct this type of analysis under the Bayesian framework, using PyMC. While the notebook provides a high level overview of the approach, I recommend consulting two excellent textbooks on causal inference. Both The Effect {cite:p}`huntington2021effect` and Causal Inference: The Mixtape {cite:p}`cunningham2021causal` have chapters devoted to difference in differences.
Difference in differences would be a good approach to take for causal inference if:
Otherwise there are likely better suited approaches you could use.
Note that our desire to estimate the causal impact of a treatment involves counterfactual thinking. This is because we are asking "What would the post-treatment outcome of the treatment group be if treatment had not been administered?" but we can never observe this.
A classic example is given by a study by {cite:t}`card1993minimum`. This study examined the effects of increasing the minimum wage upon employment in the fast food sector. This is a quasi-experimental setting because the intervention (increase in minimum wages) was not applied to different geographical units (e.g. states) randomly. The intervention was applied to New Jersey in April 1992. If they had measured pre and post intervention employment rates in New Jersey only, then they would have failed to control for omitted variables changing over time (e.g. seasonal effects) which could provide alternative causal explanations for changes in employment rates. But selecting a control state (Pennsylvania) allows one to treat the change in employment in Pennsylvania as a stand-in for the counterfactual: what would have happened if New Jersey had not received the intervention?
The causal DAG for difference in differences is given below. It says:
We are primarily interested in the effect of the treatment upon the outcome and how this changes over time (pre to post treatment). If we only focused on treatment, time and outcome in the treatment group (i.e. did not have a control group), then we would be unable to attribute changes in the outcome to the treatment rather than to any number of other factors affecting the treatment group over time. Another way of saying this is that treatment would be fully determined by time, so there would be no way to tell whether the change between the pre and post outcome measures was caused by treatment or by time.
But by adding a control group, we are able to compare the changes over time in the control group with the changes over time in the treatment group. One of the key assumptions in the difference in differences approach is the parallel trends assumption: that, in the absence of treatment, both groups would change in similar ways over time. Another way of saying this is that if the control and treatment groups would otherwise change in similar ways over time, then we can be fairly convinced that any difference between the groups' changes over time is due to the treatment.
Note: I'm defining this model slightly differently compared to what you might find in other sources. This is to facilitate counterfactual inference later on in the notebook, and to emphasise the assumptions about trends over continuous time.
First, let's define a Python function to calculate the expected value of the outcome:
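A minimal sketch of such a function follows. The argument names (`control_intercept`, `treat_intercept_delta`, `trend`, `delta`, `group`, `treated`) are assumptions chosen for illustration: the outcome is modelled as a group-specific intercept, plus a linear trend over time shared by both groups, plus a treatment effect that applies only to treated observations in the treatment group.

```python
def outcome(t, control_intercept, treat_intercept_delta, trend, delta, group, treated):
    """Expected outcome under a linear difference in differences model.

    Assumed conventions: `group` is 0 (control) or 1 (treatment), and
    `treated` is 1 only for observations made after the intervention.
    """
    return (
        control_intercept
        + treat_intercept_delta * group  # baseline offset of the treatment group
        + trend * t                      # linear trend shared by both groups
        + delta * treated * group        # causal impact of the treatment
    )


# Hypothetical parameter values, for illustration only.
params = dict(control_intercept=1.0, treat_intercept_delta=0.25, trend=1.0, delta=0.5)

# Control group before the intervention: only the control intercept contributes.
print(outcome(t=0, group=0, treated=0, **params))
# Treatment group after the intervention: all four terms contribute.
print(outcome(t=1, group=1, treated=1, **params))
```

Because the function uses only arithmetic, it should work equally well with scalars or with array-like inputs that support elementwise operations.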
But we should take a closer look at this with mathematical notation. The expected value of the $i^{th}$ observation is $\mu_i$ and is defined by:
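One linear form consistent with the description in this notebook is the following. The symbol names are assumptions chosen for illustration: $\beta_c$ is the intercept of the control group, $\beta_\Delta$ is the offset of the treatment group from that intercept, $\mathrm{trend}$ is the slope shared by both groups, and $\Delta$ is the causal impact of the treatment.

$$
\mu_i = \beta_c + (\beta_\Delta \cdot \mathrm{group}_i) + (\mathrm{trend} \cdot t_i) + (\Delta \cdot \mathrm{treated}_i \cdot \mathrm{group}_i)
$$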
where there are the following parameters:
and the following observed data:
We can underline this latter point, that treatment is causally influenced by time and group, by looking at the DAG above and by writing a Python function that computes the treatment indicator from time and group.
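A minimal sketch of such a function, assuming a hypothetical `intervention_time` parameter that marks when the intervention happens, and assuming `group` is 1 for the treatment group and 0 for the control group:

```python
def is_treated(t, intervention_time, group):
    """Return 1 for treatment-group observations made after the intervention, else 0."""
    # The comparison is True only after the intervention; multiplying by
    # `group` zeroes it out for the control group.
    return (t > intervention_time) * group


# Enumerate the four group/time combinations of a two-period design.
# The intervention is assumed to fall between t=0 and t=1.
for group in (0, 1):
    for t in (0, 1):
        print(f"group={group}, t={t}, treated={is_treated(t, 0.5, group)}")
```

Only the treatment group in the post-intervention period is marked as treated, which is exactly the sense in which treatment is determined by time and group.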
Very often a picture is worth a thousand words, so if the description above was confusing, then I'd recommend re-reading it after getting some more visual intuition from the plot below.
So we can summarise the intuition of difference in differences by looking at this plot:
If we can answer that question and estimate this counterfactual quantity, then we can ask: "What is the causal impact of the treatment?" And we can answer this question by comparing the observed post treatment outcome of the treatment group against the counterfactual quantity.
We can think about this visually and state it another way. By looking at the pre/post difference in the control group, we can attribute any difference between the pre/post differences of the control and treatment groups to the causal effect of the treatment. And that is why the method is called difference in differences.
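Written out for the two-period case, with $\bar{y}$ denoting a group mean (notation introduced here for illustration), the estimate is the pre/post change in the treatment group minus the pre/post change in the control group:

$$
\widehat{\Delta} = \left(\bar{y}_{\text{treatment, post}} - \bar{y}_{\text{treatment, pre}}\right) - \left(\bar{y}_{\text{control, post}} - \bar{y}_{\text{control, pre}}\right)
$$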
So we see that we have panel data with just two points in time: the pre ($t=0$) and post ($t=1$) intervention measurement times.
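Data with this structure can be simulated along the following lines. This is a sketch using only the standard library; the function name `simulate_panel`, the parameter values and the dictionary keys are all hypothetical. Each unit belongs to the control (`group=0`) or treatment (`group=1`) group and is measured once before and once after the intervention.

```python
import random


def simulate_panel(n_units=10, control_intercept=1.0, treat_intercept_delta=0.25,
                   trend=1.0, delta=0.5, noise_sd=0.1, seed=42):
    """Simulate two-period panel data as a list of row dictionaries."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        group = unit % 2  # alternate units between control (0) and treatment (1)
        for t in (0, 1):  # pre (t=0) and post (t=1) measurement times
            # The intervention is assumed to occur between the two measurements.
            treated = int(t > 0.5) * group
            mu = (control_intercept + treat_intercept_delta * group
                  + trend * t + delta * treated * group)
            rows.append({"unit": unit, "group": group, "t": t,
                         "treated": treated, "y": rng.gauss(mu, noise_sd)})
    return rows


rows = simulate_panel()
print(len(rows))  # two rows (pre and post) per unit
```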
If we wanted, we could calculate a point estimate of the difference in differences (in a non-regression approach) like this.
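A sketch of this calculation using only the standard library. It assumes each observation is a dictionary with hypothetical keys `group` (0 for control, 1 for treatment), `t` (0 for pre, 1 for post) and `y` (the outcome); the numbers are made up for illustration.

```python
from statistics import mean


def did_point_estimate(rows):
    """Difference in differences of group means for two-period data."""
    def cell_mean(group, t):
        return mean(r["y"] for r in rows if r["group"] == group and r["t"] == t)

    treatment_change = cell_mean(1, 1) - cell_mean(1, 0)  # post minus pre, treatment group
    control_change = cell_mean(0, 1) - cell_mean(0, 0)    # post minus pre, control group
    return treatment_change - control_change


# Illustrative observations: the control group rises by 1.0 and the
# treatment group by 1.5, so the estimate is 0.5.
rows = [
    {"group": 0, "t": 0, "y": 1.0},
    {"group": 0, "t": 1, "y": 2.0},
    {"group": 1, "t": 0, "y": 1.25},
    {"group": 1, "t": 1, "y": 2.75},
]
print(did_point_estimate(rows))
```

A point estimate like this carries no measure of uncertainty, which is one motivation for the model-based approach.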
But hang on, we are Bayesians! Let's Bayes...
For those already well-versed in PyMC, you can see that this model is pretty simple. We just have a few components:
- the `outcome` function that we already defined above

NOTE: Technically we are doing 'pushforward prediction' for $\mu$ as this is a deterministic function of its inputs. Posterior prediction would be a more appropriate label if we generated predicted observations - these would be stochastic based on the normal likelihood we've specified for our data. Nevertheless, this section is called 'posterior prediction' to emphasise the fact that we are following the Bayesian workflow.
We can plot what we've learnt below:
This is an awesome plot, but there are quite a few things going on here, so let's go through it:
So there we have it, we have a full posterior distribution over our estimated causal impact using the difference in differences approach.
Of course, when using the difference in differences approach for real applications, there is a lot more due diligence that's needed. Readers are encouraged to check out the textbooks listed above in the introduction as well as a useful review paper {cite:p}`wing2018designing` which covers the important contextual issues in more detail. Additionally, {cite:t}`bertrand2004much` take a skeptical look at the approach as well as proposing solutions to some of the problems they highlight.
:::{bibliography}
:filter: docname in docnames
:::

:::{include} ../page_footer.md
:::