(lecture_17)=
:::{post} Jan 7, 2024
:tags: statistical rethinking, bayesian inference, measurement error
:category: intermediate
:author: Dustin Stansbury
:::
This notebook is part of the PyMC port of the Statistical Rethinking 2023 lecture series by Richard McElreath.
Video - Lecture 17 - Measurement and Misclassification

# Lecture 17 - Measurement & Misclassification
McElreath poses a re-dressing of the classic Monty Hall Problem. The original problem uses an old gameshow called "Let's Make a Deal" (hosted by Monty Hall) as the backdrop for a scenario where the correct, optimal strategy for winning a game is given by following the rules of probability theory. What's interesting about the Monty Hall problem is that the optimal strategy doesn't align with our intuitions.
In lecture, instead of opening doors to find donkeys or prizes, as in the case of the Monty Hall problem, we have pancakes that are either burnt or perfectly cooked on either side. The thought experiment goes like this: there are three pancakes, one burnt on both sides, one burnt on only one side, and one burnt on neither side. You are served one of the pancakes at random, and the side facing up is burnt. What is the probability that the other side is burnt as well?
Most folks would say 1/2, which intuitively feels correct. However, the correct answer is given by Bayes rule:
If we define $U, D$ as observing the up side $U$ or down side $D$ not being burnt, and $U', D'$ as the up or down side being burnt, then the quantity we want is $\Pr(D' \mid U')$, the probability that the down side is burnt given that the up side is burnt.
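Written out, assuming the standard setup above, with the three pancakes (burnt-burnt, burnt-unburnt, unburnt-unburnt) equally likely to be served and equally likely to land on either side:

$$
\Pr(D' \mid U') = \frac{\Pr(U', D')}{\Pr(U')} = \frac{1/3}{1/2} = \frac{2}{3}
$$

Only the burnt-burnt pancake can show a burnt up side together with a burnt down side, so $\Pr(U', D') = 1/3$, while three of the six equally likely pancake sides are burnt, so $\Pr(U') = 1/2$. The answer is $2/3$, not $1/2$.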
In the scenario above, measurement error can cause us to over-estimate the actual effect.
In the scenario above, measurement error can cause us to under-estimate the actual effect.
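As a rough illustration of both failure modes, here is a generic simulation sketch (not the lecture's exact example; the variable names and noise settings are made up): recall error on the predictor that depends on the outcome can inflate the estimated effect, while purely random error on the predictor attenuates it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
beta_true = 0.5                      # true effect of parental income P on child income C

P = rng.normal(size=n)               # true parental income (standardized)
C = rng.normal(beta_true * P, 1.0)   # child income

# Recall error on P that depends on the outcome C -> tends to over-estimate the effect
P_recall = P + 0.5 * C

# Purely random error on P -> classical attenuation, under-estimates the effect
P_noisy = rng.normal(P, 1.0)

for label, x in [("true P  ", P), ("recall P", P_recall), ("noisy P ", P_noisy)]:
    slope = np.polyfit(x, C, 1)[0]   # simple OLS slope of C on the measured predictor
    print(f"{label}: estimated effect ≈ {slope:.2f}")
```

With these settings the recalled predictor over-estimates the effect (roughly 0.62 versus the true 0.5), and the noisy predictor under-estimates it (roughly 0.25).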
With $X$ the observed paternity measurement, $p$ the actual rate of extra-pair paternity, and $f$ the false-positive (misclassification) rate:
Not being clever, we can just write out all possible measurement outcomes.
The probability of observing $X=1$ given $p$ is the red path below
The probability of observing $X=0$ given $p$ is the red path below
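Concretely, following the two paths and writing $f$ for the false-positive rate (this matches the standard misclassification formulas; the lecture's notation may differ slightly):

$$
\begin{aligned}
\Pr(X=1 \mid p, f) &= p + (1 - p)\,f \\
\Pr(X=0 \mid p, f) &= (1 - p)(1 - f)
\end{aligned}
$$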
AFAICT, the dataset in the Scelza et al. paper isn't publicly available, so let's simulate one from the process defined by the generative model.
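A minimal simulation sketch of that process (the parameter values here are hypothetical; the notebook's own simulation may use different settings):

```python
import numpy as np

rng = np.random.default_rng(12345)

n_children = 300
p_true = 0.10   # actual rate of extra-pair paternity (hypothetical)
f_rate = 0.05   # false-positive rate of the paternity assessment (hypothetical)

# True (latent) paternity status for each child
true_paternity = rng.binomial(1, p_true, size=n_children)

# Observed measurement: true positives stay positive; true negatives
# flip to positive with probability f_rate
X = np.where(true_paternity == 1, 1, rng.binomial(1, f_rate, size=n_children))

print("actual paternity rate:  ", true_paternity.mean())
print("observed paternity rate:", X.mean())
```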
We print out the actual and observed paternity rates above. In principle, we should be able to recover these with our model.
Because the misclassification process gives us the log probability of each observation directly, rather than a standard observed likelihood, we add it to the model with the `pm.Potential` function, which takes in the log probability of each observation and adds it to the model's joint log probability.
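A minimal sketch of such a model, reusing `X` and `f_rate` from the simulation sketch above (the notebook's actual model, priors, and parameterization may differ):

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as misclassification_model:
    # Prior over the actual (latent) rate of extra-pair paternity
    p = pm.Beta("p", alpha=1, beta=1)

    # Log probabilities of each measurement outcome under the misclassification process:
    #   Pr(X=1 | p, f) = p + (1 - p) * f
    #   Pr(X=0 | p, f) = (1 - p) * (1 - f)
    logp_x1 = pm.math.logsumexp(pt.stack([pt.log(p), pt.log1p(-p) + np.log(f_rate)]))
    logp_x0 = pt.log1p(-p) + np.log1p(-f_rate)

    # pm.Potential adds the total observation log probability to the model's joint logp
    n_pos = int(X.sum())
    n_neg = len(X) - n_pos
    pm.Potential("X_logp", n_pos * logp_x1 + n_neg * logp_x0)

    idata = pm.sample()
```

Since each observation is a Bernoulli outcome, only the counts of positive and negative tests matter, so the per-observation log probabilities are aggregated into two terms before being added via `pm.Potential`.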
Note: rather than the custom function used in lecture, here I use PyMC's built-in `logsumexp`, etc. functions, as the model compiles just fine.

There are a number of problems and solutions related to modeling measurement error and misclassification.
Some terms, e.g. `pm.math.log(1 - p)`, can become numerically unstable. Numerically stable alternatives include:

- `pm.math.logsumexp`: efficiently computes the log of the sum of exponentials of the input elements.
- `pm.math.log1mexp`: calculates $\log(1 - \exp(-x))$.
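A quick illustration of why `logsumexp` helps (plain NumPy/SciPy here; inside a PyMC model you would use the `pm.math` equivalents on tensors):

```python
import numpy as np
from scipy.special import logsumexp

# Two log-probability "paths" that are far too small to exponentiate directly
log_path_1, log_path_2 = -800.0, -805.0

# Naive: exp() underflows to 0, so the log is -inf
naive = np.log(np.exp(log_path_1) + np.exp(log_path_2))

# Stable: logsumexp never leaves log space
stable = logsumexp([log_path_1, log_path_2])

print(naive, stable)  # -inf  vs.  ≈ -799.993
```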
Alongside `logsumexp`, `log1p` and a Taylor series approximation for small $p$ are also useful here.

:::{include} ../page_footer.md
:::