(Multi-output-GPs_Coregion)=

# Multi-output Gaussian Processes: Coregionalization models using Hadamard product

:::{post} October, 2022
:tags: gaussian process, multi-output
:category: intermediate
:author: Danh Phan, Bill Engels, Chris Fonnesbeck
:::

This notebook shows how to implement the Intrinsic Coregionalization Model (ICM) and the Linear Coregionalization Model (LCM) using a Hadamard product between the Coregion kernel and input kernels. Multi-output Gaussian processes are discussed in {cite:t}`bonilla2007multioutput`. For further information about ICM and LCM, please check out the talk on Multi-output Gaussian Processes by Mauricio Alvarez, and his slides with more references on the last page.

The advantage of multi-output Gaussian processes is their capacity to simultaneously learn and infer many outputs that share the same source of uncertainty from the inputs. In this example, we model the average spin rates of several pitchers across different games from a baseball dataset.

## Preparing the data

The baseball dataset contains the average spin rate of several pitchers on different game dates.

### Top N popular pitchers

### Create a game date index

### Create training data

### Visualise training data
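The preparation steps above can be sketched as follows. This is a minimal pandas sketch with hypothetical stand-in data: the column names (`pitcher`, `game_date`, `avg_spin_rate`) and all values are assumptions for illustration, not the actual dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the baseball spin-rate data (assumed values).
df = pd.DataFrame(
    {
        "pitcher": ["A", "A", "B", "B", "C", "C"],
        "game_date": pd.to_datetime(
            ["2021-04-01", "2021-04-08", "2021-04-01", "2021-04-15", "2021-04-08", "2021-04-15"]
        ),
        "avg_spin_rate": [2200.0, 2250.0, 2400.0, 2380.0, 2100.0, 2150.0],
    }
)

# Top-N most frequent pitchers (here N=3 keeps everyone).
top = df["pitcher"].value_counts().head(3).index
df = df[df["pitcher"].isin(top)]

# Integer game-date index shared across pitchers (sorted so codes follow time).
df = df.sort_values("game_date")
df["x"] = pd.factorize(df["game_date"])[0]

# Output (pitcher) index, used later by the coregion kernel.
df["pitcher_idx"] = pd.factorize(df["pitcher"])[0]

# Training inputs: [game-date index, pitcher index]; target: spin rate.
X = df[["x", "pitcher_idx"]].to_numpy(float)
y = df["avg_spin_rate"].to_numpy(float)
```

The key design choice is that the pitcher identity is appended as an extra integer input column, so a single GP over `X` can index the coregion matrix per data point.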

## Intrinsic Coregionalization Model (ICM)

The Intrinsic Coregionalization Model (ICM) is a particular case of the Linear Coregionalization Model (LCM) with one input kernel, for example:

$$K_{ICM} = B \otimes K_{ExpQuad}$$

where $B(o,o')$ is the output (coregion) kernel and $K_{ExpQuad}(x,x')$ is an input kernel. The coregion matrix $B$ is built from a low-rank matrix $W$ and a diagonal term:

$$B = WW^T + \text{diag}(\kappa)$$
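To make the construction concrete, here is a minimal NumPy sketch of the ICM covariance (not the PyMC implementation, which builds the same structure from covariance objects): $B$ is lifted to the data points through each point's output index, then multiplied elementwise (Hadamard product) with the input kernel. The toy values of `x`, `outputs`, `W`, and `kappa` are assumptions for illustration.

```python
import numpy as np

def expquad(x1, x2, ls=1.0):
    # Exponentiated-quadratic (RBF) input kernel on 1-D inputs.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def icm_cov(x, outputs, W, kappa, ls=1.0):
    """ICM covariance: Hadamard product of the coregion matrix
    B = W W^T + diag(kappa), indexed by each row's output id,
    with the input kernel K_ExpQuad."""
    B = W @ W.T + np.diag(kappa)       # (n_outputs, n_outputs) output covariance
    Bij = B[np.ix_(outputs, outputs)]  # lift B to data points via output indices
    return Bij * expquad(x, x, ls=ls)  # elementwise (Hadamard) product

# Toy example: 2 outputs observed on a shared 1-D input (assumed values).
x = np.array([0.0, 1.0, 0.5, 2.0])
outputs = np.array([0, 0, 1, 1])  # which output each row belongs to
W = np.array([[1.0], [0.5]])      # rank-1 mixing weights
kappa = np.array([0.1, 0.2])      # per-output independent variance
K = icm_cov(x, outputs, W, kappa)
```

Because $B$ and $K_{ExpQuad}$ are both positive semi-definite, their Hadamard product is as well (Schur product theorem), so `K` is a valid joint covariance over all outputs.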

### Prediction

It can be seen that the average spin rate of Rodriguez Richard decreases significantly from around the 75th game date. Kopech Michael's performance improves after a break of several weeks in the middle, while Hearn Taylor has performed better recently.

## Linear Coregionalization Model (LCM)

The LCM is a generalization of the ICM with two or more input kernels, so the LCM kernel is basically a sum of several ICM kernels. The LCM allows several independent samples from GPs with different covariances (kernels).

In this example, in addition to an ExpQuad kernel, we add a Matern32 kernel for input data.

$$K_{LCM} = B \otimes K_{ExpQuad} + B \otimes K_{Matern32}$$
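Extending the earlier sketch, the LCM kernel can be written in NumPy as a sum of two Hadamard (ICM) terms, one per input kernel. This is an illustrative sketch, not the PyMC implementation; the coregion matrix is shared by both terms as in the formula above, and the toy inputs are assumed values.

```python
import numpy as np

def expquad(x1, x2, ls=1.0):
    # Exponentiated-quadratic (RBF) input kernel on 1-D inputs.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def matern32(x1, x2, ls=1.0):
    # Matern 3/2 input kernel on 1-D inputs.
    a = np.sqrt(3.0) * np.abs(x1[:, None] - x2[None, :]) / ls
    return (1.0 + a) * np.exp(-a)

def lcm_cov(x, outputs, W, kappa, ls=1.0):
    """LCM covariance: sum of ICM terms with different input kernels."""
    B = W @ W.T + np.diag(kappa)       # coregion matrix
    Bij = B[np.ix_(outputs, outputs)]  # lift B to data points via output indices
    return Bij * expquad(x, x, ls=ls) + Bij * matern32(x, x, ls=ls)

# Toy example with assumed values: 2 outputs on a shared 1-D input.
x = np.array([0.0, 1.0, 0.5, 2.0])
outputs = np.array([0, 0, 1, 1])
W = np.array([[1.0], [0.5]])
kappa = np.array([0.1, 0.2])
K = lcm_cov(x, outputs, W, kappa)
```

Each term is positive semi-definite, so their sum remains a valid covariance; in practice each ICM term can also carry its own coregion matrix for extra flexibility.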

### Prediction

## Acknowledgement

This work is supported by Google Summer of Code 2022 and NUMFOCUS.

## Authors

## References

:::{bibliography}
:filter: docname in docnames
:::

## Watermark

:::{include} ../page_footer.md
:::