(GP-MeansAndCovs)=
:::{post} Mar 22, 2022
:tags: gaussian process
:category: intermediate, reference
:author: Bill Engels, Oriol Abril Pla
:::
A large set of mean and covariance functions is available in PyMC, and it is relatively easy to define custom mean and covariance functions. Since PyMC uses PyTensor, their gradients do not need to be defined by the user.
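The examples throughout this page assume imports and inputs along the following lines (the seed and the input grid are illustrative choices, not requirements):

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

rng = np.random.default_rng(42)

# One-dimensional inputs, 2 "seconds" long, reused throughout this page
X = np.linspace(0, 2, 200)[:, None]
```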
The following mean functions are available in PyMC.
- `pymc.gp.mean.Zero`
- `pymc.gp.mean.Constant`
- `pymc.gp.mean.Linear`

All follow a similar usage pattern. First, the mean function is specified. Then it can be evaluated over some inputs. The first two mean functions are very simple. Regardless of the inputs, `gp.mean.Zero` returns a vector of zeros with the same length as the number of input values.
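For example, a minimal sketch of evaluating the zero mean function over a few inputs:

```python
zero_func = pm.gp.mean.Zero()
print(zero_func(X[:5]).eval())  # -> [0. 0. 0. 0. 0.]
```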
The default mean function for all GP implementations in PyMC is `Zero`.
`gp.mean.Constant` returns a vector filled with a constant value that you provide.
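For example (the constant 25.2 is arbitrary):

```python
const_func = pm.gp.mean.Constant(25.2)
print(const_func(X[:5]).eval())  # -> [25.2 25.2 25.2 25.2 25.2]
```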
As long as the shape matches the input it will receive, `gp.mean.Constant` can also accept a PyTensor tensor or a vector of PyMC random variables.
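A sketch with a PyTensor vector (a vector of random variables would work the same way):

```python
const_func_vec = pm.gp.mean.Constant(pt.ones(5))
print(const_func_vec(X[:5]).eval())  # -> [1. 1. 1. 1. 1.]
```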
`gp.mean.Linear` takes as input a matrix of coefficients and a vector of intercepts (or a slope and scalar intercept in one dimension).
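For example, with three input dimensions (the random coefficients here are purely illustrative):

```python
beta = rng.normal(size=3)  # one coefficient per input column
b = 0.0                    # scalar intercept

lin_func = pm.gp.mean.Linear(coeffs=beta, intercept=b)

X3 = rng.normal(size=(5, 3))  # five points in three dimensions
print(lin_func(X3).eval())
```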
To define a custom mean function, subclass `gp.mean.Mean` and provide `__call__` and `__init__` methods. For example, the code for the `Constant` mean function is essentially:
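```python
class Constant(pm.gp.mean.Mean):
    def __init__(self, c=0):
        super().__init__()
        self.c = c

    def __call__(self, X):
        # broadcast the constant across the number of input points
        return pt.alloc(1.0, X.shape[0]) * self.c
```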
Remember that PyTensor must be used instead of NumPy.
PyMC contains a much larger suite of {mod}`built-in covariance functions <pymc.gp.cov>`. The following shows functions drawn from a GP prior with a given covariance function, and demonstrates how composite covariance functions can be constructed with Python operators in a straightforward manner. Our goal was for our API to follow kernel algebra (see Ch. 4 of {cite:t}`rasmussen2003gaussian`) as closely as possible. See the main documentation page for an overview of their usage in PyMC.
It is easy to define kernels with higher dimensional inputs. Notice that the `ls` (lengthscale) parameter below is an array of length 2. Lists of PyMC random variables can be used for automatic relevance determination (ARD), as in the sketch that follows.
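A sketch (the lengthscale values are illustrative):

```python
# A two-dimensional ExpQuad with one lengthscale per input column
ls = [2, 5]
cov = pm.gp.cov.ExpQuad(input_dim=2, ls=ls)
K = cov(rng.normal(size=(10, 2))).eval()

# For ARD, the lengthscales can be random variables instead
with pm.Model():
    ell = pm.Gamma("ell", alpha=2, beta=1, shape=2)
    cov_ard = pm.gp.cov.ExpQuad(input_dim=2, ls=ell)

# The same covariance as a product of two one-dimensional ExpQuad
# kernels, each acting on one input column via active_dims
cov_prod = pm.gp.cov.ExpQuad(2, ls=2, active_dims=[0]) * pm.gp.cov.ExpQuad(
    2, ls=5, active_dims=[1]
)
```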
Note that this product is equivalent to using a two-dimensional `ExpQuad` with separate lengthscale parameters for each dimension.
A covariance function `cov` can also be multiplied with a NumPy matrix, `K_cos`, as long as the shapes are compatible.
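For example (evaluating a Cosine covariance on `X` is one way to obtain such a matrix; the parameter values are illustrative):

```python
# Evaluate one covariance function into a fixed NumPy matrix...
K_cos = pm.gp.cov.Cosine(1, ls=0.2)(X).eval()

# ...and multiply it with another covariance function
cov = pm.gp.cov.Matern32(1, ls=0.5) * K_cos
K = cov(X).eval()
```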
If $k(x, x')$ is a valid covariance function, then so is $k(w(x), w(x'))$.
The first argument of the warping function must be the input X. The remaining arguments can be anything else, including random variables.
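A sketch with a tanh warping function (the parameter values are arbitrary, and any of them could be PyMC random variables):

```python
def warp_func(x, a, b, c):
    return 1.0 + x + a * pt.tanh(b * (x - c))

cov_exp = pm.gp.cov.ExpQuad(1, ls=0.2)
cov = pm.gp.cov.WarpedInput(1, cov_func=cov_exp, warp_func=warp_func, args=(1.0, 5.0, 1.0))
K = cov(X).eval()
```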
### Periodic using WarpedInput

The WarpedInput kernel can be used to create the Periodic covariance. This covariance models functions that are periodic, but are not an exact sine wave (like the Cosine kernel is).
The periodic kernel is given by

$$
k(x, x') = \exp\left( -\frac{\sin^2\left(\pi |x - x'| \frac{1}{T}\right)}{2 \ell^2} \right)
$$

where $T$ is the period and $\ell$ is the lengthscale. It can be derived by warping the input of an ExpQuad kernel with the function $\mathbf{u}(x) = \left(\sin\left(2\pi x \frac{1}{T}\right),\, \cos\left(2\pi x \frac{1}{T}\right)\right)$. Here we use the WarpedInput kernel to construct it.
The input `X`, which is defined at the top of this page, is 2 "seconds" long. We use a period of $0.5$, which means that functions drawn from this GP prior will repeat 4 times over 2 seconds.
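A sketch of this construction (note that the warped kernel is a two-dimensional ExpQuad, since the mapping sends each scalar input to a (sin, cos) pair; the lengthscale is illustrative):

```python
def mapping(x, T):
    c = 2.0 * np.pi / T
    return pt.concatenate((pt.sin(c * x), pt.cos(c * x)), axis=1)

T = 0.5   # the period
ls = 0.4  # lengthscale of the ExpQuad on the warped inputs

cov = pm.gp.cov.WarpedInput(1, cov_func=pm.gp.cov.ExpQuad(2, ls=ls), warp_func=mapping, args=(T,))
K = cov(X).eval()
```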
There is no need to construct the periodic covariance this way every time. A more efficient implementation of this covariance function is built in.
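A one-line sketch with the built-in kernel (up to the precise lengthscale parametrization):

```python
cov = pm.gp.cov.Periodic(1, period=0.5, ls=0.4)
K = cov(X).eval()
```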
The Circular kernel is similar to the Periodic one, but it has an additional nuisance parameter $\tau$.
In {cite:t}`padonou2015polar`, the Weinland function is used to solve this problem and to ensure a positive definite kernel on the circular domain (and not only there):

$$
W_c(t) = \left(1 + \tau \frac{t}{c}\right)\left(1 - \frac{t}{c}\right)_+^\tau
$$

where $c$ is the maximum value of $t$ and $\tau \ge 4$ is some positive number.
The kernel itself, for the geodesic distance (arc length) on a circle, looks like

$$
k(x, x') = W_c\left(\mathrm{dist}_{\mathit{geo}}(x, x')\right).
$$
Briefly, you can think of the Circular kernel as a Periodic kernel for functions defined on a circular domain.
We can see the effect of $\tau$: it adds more non-smooth patterns.
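A sketch comparing two values of $\tau$ (the values are illustrative):

```python
cov1 = pm.gp.cov.Circular(1, period=0.5, tau=4)
cov2 = pm.gp.cov.Circular(1, period=0.5, tau=40)

# varying tau changes how rough functions drawn from the GP prior look
K1, K2 = cov1(X).eval(), cov2(X).eval()
```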
The Gibbs covariance function applies a positive warping function to the lengthscale. As with WarpedInput, the lengthscale warping function can be specified with parameters that are either fixed or random variables.
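A sketch using a tanh lengthscale function (parameter values illustrative; any of them could be random variables):

```python
def tanh_func(x, ls1, ls2, w, x0):
    """
    ls1: left saturation value
    ls2: right saturation value
    w:   transition width
    x0:  transition location
    """
    return (ls1 + ls2) / 2.0 - (ls1 - ls2) / 2.0 * pt.tanh((x - x0) / w)

cov = pm.gp.cov.Gibbs(1, tanh_func, args=(0.05, 0.6, 0.3, 1.0))
K = cov(X).eval()
```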
One can construct a new kernel or covariance function by multiplying some base kernel by a nonnegative function $\phi(x)$,

$$
k_{\mathrm{scaled}}(x, x') = \phi(x) \, k(x, x') \, \phi(x') \,.
$$

This is useful for specifying covariance functions whose amplitude changes across the domain.
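For example, scaling an ExpQuad by a logistic function so the amplitude grows across the domain (parameter values illustrative):

```python
def logistic(x, a, x0, c, d):
    # d scales the height of the sigmoid, c shifts it upward;
    # a and x0 control the slope and location of the transition
    return d * pm.math.invlogit(a * (x - x0)) + c

cov_base = pm.gp.cov.ExpQuad(1, ls=0.2)
cov = pm.gp.cov.ScaledCov(1, cov_func=cov_base, scaling_func=logistic, args=(2.0, 1.0, 0.1, 2.0))
K = cov(X).eval()
```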
### Changepoint using ScaledCov

The ScaledCov kernel can be used to create the Changepoint covariance. This covariance models a process that gradually transitions from one type of behavior to another.
The changepoint kernel is given by

$$
k(x, x') = \phi(x) \, k_1(x, x') \, \phi(x') + \left(1 - \phi(x)\right) \, k_2(x, x') \, \left(1 - \phi(x')\right)
$$

where $\phi(x)$ is the logistic function.
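A sketch: before the changepoint $x_0$ the first kernel dominates, after it the second (the kernels and values here are illustrative):

```python
def logistic(x, a, x0):
    # phi(x); flipping the sign of a turns phi into 1 - phi
    return pm.math.invlogit(a * (x - x0))

a, x0 = 15.0, 1.0
cov1 = pm.gp.cov.ScaledCov(1, pm.gp.cov.Matern32(1, ls=0.2), logistic, (-a, x0))
cov2 = pm.gp.cov.ScaledCov(1, pm.gp.cov.Cosine(1, ls=0.2), logistic, (a, x0))
cov = cov1 + cov2
K = cov(X).eval()
```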
You can combine different covariance functions to model complex data. In particular, you can perform the following operations on any covariance functions:

- Add them together, or add a scalar
- Multiply them together, or multiply by a scalar
- Exponentiate them by a scalar
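A sketch of each operation (kernels and values illustrative):

```python
cov1 = pm.gp.cov.ExpQuad(1, ls=0.3)
cov2 = pm.gp.cov.Matern32(1, ls=0.3)

cov_add = cov1 + cov2 + 1.0  # addition with covariances and scalars
cov_mul = 2.0 * cov1 * cov2  # multiplication with covariances and scalars
cov_pow = cov1**2            # exponentiation with a scalar
```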
Covariance function objects in PyMC need to implement the `__init__`, `diag`, and `full` methods, and subclass `gp.cov.Covariance`. `diag` returns only the diagonal of the covariance matrix, and `full` returns the full covariance matrix. The `full` method has two inputs, `X` and `Xs`, where `Xs` is optional: `full(X)` returns the square covariance matrix, and `full(X, Xs)` returns the cross-covariances between the two sets of inputs.
For example, here is (essentially) the implementation of the `WhiteNoise` covariance function:
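```python
class WhiteNoise(pm.gp.cov.Covariance):
    def __init__(self, sigma):
        super().__init__(1)
        self.sigma = sigma

    def diag(self, X):
        return pt.alloc(pt.square(self.sigma), X.shape[0])

    def full(self, X, Xs=None):
        if Xs is None:
            return pt.diag(self.diag(X))
        else:
            # the cross-covariance between distinct points is zero for white noise
            return pt.alloc(0.0, X.shape[0], Xs.shape[0])
```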
If we have forgotten an important covariance or mean function, please feel free to submit a pull request!
:::{bibliography}
:filter: docname in docnames
:::
:::{include} ../page_footer.md
:::