conditional_autoregressive_priors.ipynb

https://github.com/pymc-devs/pymc-examples/blob/main/examples/spatial/conditional_autoregressive_priors.ipynb

(conditional_autoregressive_priors)=

Conditional Autoregressive (CAR) Models for Spatial Data

:::{post} Jul 29, 2022 :tags: spatial, autoregressive, count data :category: beginner, tutorial :author: Conor Hassan, Daniel Saunders :::

:::{include} ../extra_installs.md :::

Conditional Autoregressive (CAR) model

A conditional autoregressive CAR prior on a set of random effects ${\phi_i}_{i=1}^N$ models the random effect $\phi_i$ as having a mean, that is the weighted average the random effects of observation $i$'s adjacent neighbours. Mathematically, this can be expressed as

$\phi_i \big | \mathbf{\phi}_{j\sim i} \sim \text{Normal} \bigg( \alpha \frac{ \sum_{j=1}^{n_i}w_{ij} \phi_j}{n_i}, \sigma_{i}^2\bigg)$

where ${j \sim i}$ indicates the set of adjacent neighbours to observation $i$, $n_i$ denotes the number of adjacent neighbours that observation $i$ has, $w_{ij}$ is the weighting of the spatial relationship between observation $i$ and $j$. If $i$ and $j$ are not adjacent, then $w_{ij}=0$. Lastly, $\sigma_i^2$ is a spatially varying variance parameter for each area. Note that information such as an adjacency matrix, indicating the neighbour relationships, and a weight matrix $\textbf{w}$, indicating the weights of the spatial relationships, is required as input data. The parameters that we infer are ${\phi}{i=1}^N, {\sigma_i}{i=1}^N$, and $\alpha$.

Model specification

Here we will demonstrate the implementation of a CAR model using a canonical example: the lip cancer risk data in Scotland between 1975 and 1980. The original data is from [1]. This dataset includes observed lip cancer case counts ${y_i}{i=1}^N$ at $N=56$ spatial units in Scotland, with the expected number of cases ${E_i}{i=1}^N$ as an offset term, an intercept parameter, and and a parameter for an area-specific continuous variable for the proportion of the population employed in agriculture, fishing, or forestry, denoted by ${x_i}{i=1}^N$. We want to model how the lip cancer rates relate to the distribution of employment among industries, as exposure to sunlight is a risk factor. Mathematically, the model is \begin{align*} y_i &\sim \text{Poisson}\big (\lambda_i),\ \log \lambda_i &= \beta_0+\beta_1x_i + \phi_i + \log E_i,\ \phi_i \big | \mathbf{\phi}{j\sim i}&\sim\text{Normal}\big(\alpha\sum_{j=1}^{n_i}w_{ij}\phi_j, \sigma_{i}^2\big ), \ \beta_0, \beta_1 &\sim \text{Normal}\big (0, a\big ), \end{align*} where $a$ is the some chosen hyperparameter for the variance of the prior distribution of the regression coefficients.

Preparing the data

We need to load in the dataset to access the variables ${y_i, x_i, E_i}_{i=1}^N$. But more unique to the use of CAR models, is the creation of the necessary spatial adjacency matrix. For the models that we fit, all neighbours are weighted as $1$, circumventing the need for a weight matrix.

Below are the steps that we take to create the necessary adjacency matrix, where the entry $i,j$ of the matrix is $1$ if observations $i$ and $j$ are considered neighbours, and $0$ otherwise.

Visualizing the data

An important aspect of modelling spatial data is the ability to effectively visualize the spatial nature of the data, and whether the model that you have chosen captures this spatial dependency.

We load in an alternate version of the Scottish lip cancer dataset, from the libpysal package, to use for plotting.

We initially plot the observed number of cancer counts over the expected number of cancer counts for each area. The spatial dependency that we observe in this plot indicates that we may need to consider a spatial model for the data.

Writing some models in PyMC

Our first model: an independent random effects model

We begin by fitting an independent random effect's model. We are not modelling any spatial dependency between the areas. This model is equivalent to a Poisson regression model with a normal random effect, and mathematically looks like \begin{align*} y_i &\sim \text{Poisson}\big (\lambda_i),\ \log \lambda_i &= \beta_0+\beta_1x_i + \theta_i + \log E_i,\ \theta_i &\sim\text{Normal}\big(\mu=0, \tau=\tau_{\text{ind}}\big ), \ \beta_0, \beta_1 &\sim \text{Normal}\big (\mu=0, \tau = 1e^{-5}\big ), \ \tau_{\text{ind}} &\sim \text{Gamma}\big (\alpha=3.2761, \beta=1.81\big), \end{align*} where $\tau_\text{ind}$ is an unknown parameter for the precision of the independent random effects. The values for the $\text{Gamma}$ prior are chosen specific to our second model and thus we will delay explaining our choice until then.

We can plot the residuals of this first model.

The mean of the residuals for the areas appear spatially correlated. This leads us to explore the addition of a spatially dependent random effect, by using a conditional autoregressive (CAR) prior.

Our second model: a spatial random effects model (with fixed spatial dependence)

Let us fit a model that has two random effects for each area: an independent random effect, and a spatial random effect first. This models looks \begin{align*} y_i &\sim \text{Poisson}\big (\lambda_i),\ \log \lambda_i &= \beta_0+\beta_1x_i + \theta_i + \phi_i + \log E_i,\ \theta_i &\sim\text{Normal}\big(\mu=0, \tau=\tau_{\text{ind}}\big ), \ \phi_i \big | \mathbf{\phi}{j\sim i} &\sim \text{Normal}\big(\mu=\alpha\sum{j=1}^{n_i}\phi_j, \tau=\tau_{\text{spat}}\big ),\ \beta_0, \beta_1 &\sim \text{Normal}\big (\mu = 0, \tau = 1e^{-5}\big), \ \tau_{\text{ind}} &\sim \text{Gamma}\big (\alpha=3.2761, \beta=1.81\big), \ \tau_{\text{spat}} &\sim \text{Gamma}\big (\alpha=1, \beta=1\big ), \end{align*} where the line $\phi_i \big | \mathbf{\phi}{j\sim i} \sim \text{Normal}\big(\mu=\alpha\sum{j=1}^{n_i}\phi_j, \tau=\tau_{\text{spat}}\big )$ denotes the CAR prior, $\tau_\text{spat}$ is an unknown parameter for the precision of the spatial random effects, and $\alpha$ captures the degree of spatial dependence between the areas. In this instance, we fix $\alpha=0.95$.

Side note: Here we explain the prior's used for the precision of the two random effect terms. As we have two random effects $\theta_i$ and $\phi_i$ for each $i$, they are independently unidentifiable, but the sum $\theta_i + \phi_i$ is identifiable. However, by scaling the priors of this precision in this manner, one may be able to interpret the proportion of variance explained by each of the random effects.

We can see by plotting the residuals of the second model, by accounting for spatial dependency with the CAR prior, the residuals of the model appear more independent with respect to the spatial location of the observation.

If we wanted to be fully Bayesian about the model that we specify, we would estimate the spatial dependence parameter $\alpha$. This leads to ...

Our third model: a spatial random effects model, with unknown spatial dependence

The only difference between model 3 and model 2, is that in model 3, $\alpha$ is unknown, so we put a prior $\alpha \sim \text{Beta} \big (\alpha = 1, \beta=1\big )$ over it.

We can plot the marginal posterior for $\alpha$, and see that it is very near $1$, suggesting strong levels of spatial dependence.

Comparing the regression parameters $\beta_0$ and $\beta_1$ between the three models that we have fit, we can see that accounting for the spatial dependence between observations has the ability to greatly impact the interpretation of the effect of covariates on the response variable.

Limitations

Our final model provided some evidence that the spatial dependence parameter might be $1$. However, in the definition of the CAR prior, $\alpha$ can only take on values up to and excluding $1$. If $\alpha = 1$, we get an alternate prior called the intrinsic conditional autoregressive (ICAR) prior. The ICAR prior is more widely used in spatial models, specifically the BYM {cite:p}besag1991bayesian, Leroux {cite:p}leroux2000estimation and BYM2 {cite:p}riebler2016intuitive models. It also scales efficiently with large datasets, a limitation of the CAR distribution. Currently, work is being done to include the ICAR prior within PyMC.

Authors

Adapted from {ref}another PyMC example notebook <conditional_autoregressive_model> by Conor Hassan (pymc-examples#417) and Daniel Saunders (pymc-examples#547).

References

:::{bibliography} :filter: docname in docnames :::

Watermark

:::{include} ../page_footer.md :::