https://github.com/pymc-devs/pymc-examples/blob/main/examples/bart/bart_heteroscedasticity.ipynb
(bart_heteroscedasticity)=
:::{post} January, 2023
:tags: BART, regression
:category: beginner, reference
:author: Juan Orduz
:::
In this notebook we show how to use BART to model heteroscedasticity, as described in Section 4.1 of the pymc-bart paper {cite:p}`quiroga2022bart`. We use the marketing data set provided by the R package datarium {cite:p}`kassambara2019datarium`. The idea is to model a marketing channel's contribution to sales as a function of its budget.
We start by looking at the data, focusing on the YouTube channel.
We clearly see that both the mean and variance are increasing as a function of budget. One possibility is to manually select an explicit parametrization of these functions, e.g. square root or logarithm. However, in this example we want to learn these functions from the data using a BART model.
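To make the idea of a mean and a variance that both grow with budget concrete, here is a minimal sketch on synthetic data. It does not use the marketing data set: the square-root mean and standard-deviation functions, the sample size, and all variable names are assumptions chosen purely for illustration.

```python
import numpy as np

# Synthetic stand-in for a budget/sales relationship (NOT the marketing data).
rng = np.random.default_rng(42)
x = rng.uniform(1.0, 300.0, size=5_000)

# Assumed ground truth: mean and standard deviation both grow like sqrt(x).
mu = 2.0 * np.sqrt(x)
sigma = np.sqrt(x)

# Draw positive responses from a Gamma distribution with that mean and sd.
shape = (mu / sigma) ** 2
scale = sigma**2 / mu
y = rng.gamma(shape=shape, scale=scale)

# Split the budget range into 5 equal-width bins and summarise y per bin.
edges = np.linspace(x.min(), x.max(), num=6)
bin_idx = np.digitize(x, edges[1:-1])  # integer labels 0..4
bin_means = np.array([y[bin_idx == k].mean() for k in range(5)])
bin_stds = np.array([y[bin_idx == k].std(ddof=1) for k in range(5)])

print("mean per bin:", np.round(bin_means, 2))
print("sd per bin:  ", np.round(bin_stds, 2))
```

A binned summary like this is only a rough diagnostic, since each bin also contains some variation of the mean itself. It is still enough to see whether the spread changes with the predictor.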
We proceed to prepare the data for modeling. We are going to use the budget as the predictor and sales as the response.
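As a rough sketch of this preparation step, the predictor can be arranged as a two-dimensional array with one column and the response as a one-dimensional array. The numbers below are hypothetical placeholders, not values from the marketing data set, and the names `budget`, `sales`, `X`, `Y`, and `n_obs` are chosen here for illustration. With a pandas data frame one would first convert the relevant columns with `.to_numpy()`.

```python
import numpy as np

# Hypothetical budget and sales values standing in for the two columns of interest.
budget = np.array([10.0, 50.0, 120.0, 200.0, 300.0])
sales = np.array([4.0, 9.0, 14.0, 20.0, 24.0])

# Predictor matrix with shape (n_obs, 1) and response vector with shape (n_obs,).
X = budget.reshape(-1, 1)
Y = sales
n_obs = X.shape[0]

# A Gamma likelihood (and a log transform of Y) needs strictly positive responses.
assert np.all(Y > 0), "responses must be strictly positive"

print(X.shape, Y.shape, n_obs)
```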
Next, we specify the model. Note that a single BART distribution suffices: it can be vectorized to model both the mean and the variance. We use a Gamma likelihood because we expect sales to be positive.
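A Gamma distribution can be parametrized by its mean `mu` and standard deviation `sigma` instead of the usual shape `alpha` and rate `beta`, using `alpha = mu**2 / sigma**2` and `beta = mu / sigma**2`. The sketch below checks this numerically with NumPy. The two unconstrained values `w_mean` and `w_sigma` are hypothetical stand-ins for real-valued model outputs, and the exponential link that maps them to positive numbers is an assumption of this illustration.

```python
import numpy as np


def gamma_shape_rate(mu, sigma):
    """Convert a (mean, standard deviation) pair into Gamma (shape, rate)."""
    alpha = (mu / sigma) ** 2
    beta = mu / sigma**2
    return alpha, beta


# Hypothetical unconstrained outputs, e.g. from a regression on the real line.
w_mean, w_sigma = 2.5, 0.8

# An exponential link guarantees a positive mean and standard deviation.
mu, sigma = np.exp(w_mean), np.exp(w_sigma)
alpha, beta = gamma_shape_rate(mu, sigma)

# NumPy parametrizes the Gamma by shape and scale, where scale = 1 / rate.
rng = np.random.default_rng(0)
draws = rng.gamma(shape=alpha, scale=1.0 / beta, size=200_000)

print(f"target mean {mu:.3f}, sample mean {draws.mean():.3f}")
print(f"target sd   {sigma:.3f}, sample sd   {draws.std():.3f}")
```

Because both parameters are free, the likelihood can widen or narrow independently of where its mean sits, which is what a heteroscedastic model needs.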
We now fit the model.
We can now visualize the posterior predictive distribution of the mean and the likelihood.
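A posterior predictive band of this kind boils down to summarizing an array of draws per observation. The sketch below uses synthetic draws of shape `(n_draws, n_obs)` in place of real posterior predictive samples, and it computes an equal-tailed 90% percentile interval, which is a simpler substitute for a highest-density interval. All names and numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical predictor values and synthetic "posterior predictive" draws
# with shape (n_draws, n_obs); the spread grows with x by construction.
x = rng.uniform(1.0, 300.0, size=50)
mu = 2.0 * np.sqrt(x)
pp_draws = rng.gamma(shape=4.0, scale=mu / 4.0, size=(1000, x.size))

# Per-observation summaries across draws (axis 0).
pp_mean = pp_draws.mean(axis=0)
lower, upper = np.percentile(pp_draws, [5, 95], axis=0)

# Sort everything by x so the band can be drawn as a continuous region.
order = np.argsort(x)
x_sorted = x[order]
mean_sorted = pp_mean[order]
lower_sorted = lower[order]
upper_sorted = upper[order]

band_width = upper_sorted - lower_sorted
print("band width at smallest x:", round(float(band_width[0]), 2))
print("band width at largest x: ", round(float(band_width[-1]), 2))
```

The sorted arrays can then be passed to a plotting call such as Matplotlib's `fill_between` to shade the interval around the mean curve.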
The fit looks good! As expected, both the mean and the variance increase as a function of the budget.
:::{bibliography}
:filter: docname in docnames
:::
:::{include} ../page_footer.md
:::