(lecture_15)=
:::{post} Jan 7, 2024
:tags: statistical rethinking, bayesian inference, social networks
:category: intermediate
:author: Dustin Stansbury
:::
This notebook is part of the PyMC port of the Statistical Rethinking 2023 lecture series by Richard McElreath.
Video - Lecture 15 - Social Networks

# Lecture 15 - Social Networks
We'll loop between steps 2 and 3 often as we build up the complexity of our model.
At first, we'll ignore the backdoor paths so we can establish a good workflow and get the model running, then add them in later.
Use `LKJCholeskyCov` instead of `LKJCorr` for numerical stability/efficiency.

Run the model on the real data samples
We use the same social ties network, but augment gift-giving behavior based on (unmeasured) household wealth:
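A minimal sketch of this kind of simulation, using only the Python standard library, is given here for orientation. It is not the notebook's simulation code: the function names, the directed-tie probability, and every coefficient value (`alpha`, `tie_effect`, `beta_give`, `beta_receive`) are assumptions chosen for illustration. Wealth enters the log rate of the gift counts once for the giver and once for the receiver.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's method (adequate for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_gifts(n_households=25, p_tie=0.1, alpha=-3.0, tie_effect=3.0,
                   beta_give=0.5, beta_receive=-1.0, seed=12):
    """Simulate directed gift counts from social ties and household wealth.

    All default values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    n = n_households
    # Unmeasured, standardized household wealth
    wealth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Directed ties: ties[i][j] == 1 means household i has a tie toward j
    ties = [[int(i != j and rng.random() < p_tie) for j in range(n)]
            for i in range(n)]
    gifts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-gifts
            # Giver wealth and receiver wealth each shift the log gift rate
            log_rate = (alpha + tie_effect * ties[i][j]
                        + beta_give * wealth[i] + beta_receive * wealth[j])
            gifts[i][j] = sample_poisson(math.exp(log_rate), rng)
    return wealth, ties, gifts


wealth, ties, gifts = simulate_gifts()
print("directed ties:", sum(map(sum, ties)))
print("total gifts:", sum(map(sum, gifts)))
```

Because wealth is never handed to the statistical model, it acts as an unobserved common cause of giving and receiving rates across dyads.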
Now we'll add predictor features to the model. Specifically, we'll add GLM parameters for
We can see that including parameters for giving and receiving reduces the posterior standard deviation associated with social ties. This is expected because we're explaining away more variance with those additional parameters.
Using this model--which is highly aligned with the data simulation--we're able to recover the coefficients from the underlying generative model.
Controlling for the correct confounds provides a better model of the data, in terms of cross-validation scores
Giving/receiving is mostly explained by friendship and/or household wealth, so after accounting for those variables, the giving/receiving dynamics defined in the dyads have less signal.
The x-shape in the joint distribution is indicative of an independent set of variables.
This indicates social ties are more-or-less random after accounting for friendship and wealth.
In the lecture McElreath reports results on the real data, but given the version of the dataset I have in hand, it's somewhat unclear how to move forward.
To fit the model we need to know which columns in the real dataset are associated with
Looking at the dataset, it's not entirely clear which columns we should/could associate with each of those variables to replicate the figure in lecture. That said, if we DID know those columns--or how to derive them--it would be easy to fit the model via
Blocks and clusters are still discrete subgroups. What about "continuous clusters" like age or spatial distance? That is the goal of the next lecture, on Gaussian Processes.
⚠️ The example below is using the simulated data
The observed gifting network is denser than the social ties network estimated from the data, indicating that the model's pooling is adding regularization (compression) to the network of social ties.
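One way to make "denser" concrete is the directed network density: the fraction of possible directed edges (excluding self-loops) that are present. The sketch below uses toy matrices, not the notebook's data or posterior, and the rule of thresholding posterior tie probabilities at 0.5 is an assumption made purely for illustration.

```python
def directed_density(adjacency):
    """Fraction of possible directed edges (no self-loops) that are present."""
    n = len(adjacency)
    edges = sum(1 for i in range(n) for j in range(n)
                if i != j and adjacency[i][j])
    return edges / (n * (n - 1))


def binarize(matrix, threshold):
    """Convert a weighted matrix to 0/1 adjacency: edge if weight > threshold."""
    return [[int(value > threshold) for value in row] for row in matrix]


# Toy values for illustration only
observed_gifts = [
    [0, 2, 0, 1],
    [1, 0, 0, 0],
    [0, 3, 0, 1],
    [0, 0, 1, 0],
]
posterior_tie_prob = [
    [0.0, 0.9, 0.1, 0.2],
    [0.8, 0.0, 0.1, 0.1],
    [0.1, 0.3, 0.0, 0.4],
    [0.1, 0.1, 0.2, 0.0],
]

# An edge in the gift network is any dyad direction with at least one gift
gift_density = directed_density(binarize(observed_gifts, 0))
# An edge in the tie network is a posterior tie probability above 0.5 (assumed rule)
tie_density = directed_density(binarize(posterior_tie_prob, 0.5))
print(f"gift density: {gift_density:.3f}, tie density: {tie_density:.3f}")
```

In this toy case the gift network is denser than the thresholded tie network, which is the same qualitative pattern described above.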
It gets worse, though. In the scenario below, where we want to estimate the causal effect of $X$ on GDP/P, the fork created by $P$ isn't removed by simply calculating GDP/P.
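A small simulation can illustrate the point. In this sketch $X$ has no causal effect on GDP at all; $P$ drives both. The functional forms and numbers are assumptions chosen for illustration (in particular, GDP is assumed not to be exactly proportional to $P$), and `pearson` is a helper written here, not a library function.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
n = 5000
# P is a common cause (fork) of both X and GDP; X does NOT affect GDP here
population = [rng.uniform(1, 10) for _ in range(n)]
x = [p + rng.gauss(0, 1) for p in population]
gdp = [10 + 2 * p + rng.gauss(0, 1) for p in population]

# Dividing by P does not remove the dependence on P
gdp_per_capita = [g / p for g, p in zip(gdp, population)]
corr_ratio = pearson(x, gdp_per_capita)

# Within a narrow stratum of P, the spurious association largely disappears
stratum = [i for i, p in enumerate(population) if 5.0 <= p < 5.5]
corr_within = pearson([x[i] for i in stratum],
                      [gdp_per_capita[i] for i in stratum])

print(f"corr(X, GDP/P) overall:          {corr_ratio:.2f}")
print(f"corr(X, GDP/P) within P stratum: {corr_within:.2f}")
```

Under these assumptions GDP/P still depends on $P$, so $X$ and GDP/P remain associated through the fork even though $X$ has no effect. Holding $P$ approximately fixed is what closes the path.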
For example, in the plant growth experiment, $H_0$ and $H_1$ are the starting and ending heights of the plant, and $X$ is the antifungal treatment.
:::{include} ../page_footer.md
:::