model_builder.ipynb

https://github.com/pymc-devs/pymc-examples/blob/main/examples/howto/model_builder.ipynb

Using ModelBuilder class for deploying PyMC models

:::{post} Feb 22, 2023
:tags: deployment
:category: advanced
:author: Shashank Kirtania, Thomas Wiecki, Michał Raczycki
:::

Motivation

Many users have difficulty deploying their PyMC models to production because saving, loading, and deploying a user-created model is not well standardized. One reason is that PyMC has no direct way to save or load a model, as scikit-learn and TensorFlow do. The new ModelBuilder class aims to improve this workflow by providing a scikit-learn inspired API to wrap your PyMC models.

The new {class}ModelBuilder <pymc_extras.model_builder.ModelBuilder> class provides the methods fit(), predict(), save(), and load(). Users can create any model they want, inherit from the {class}ModelBuilder <pymc_extras.model_builder.ModelBuilder> class, and use these predefined methods.

Let's go through the full workflow, starting with a simple linear regression PyMC model as it's usually written. Of course, this model is just a placeholder for your own model.

Standard syntax

Usually a PyMC model will have this form:

How would we deploy this model? Save the fitted model, load it on an instance, and predict? Not so simple.

ModelBuilder is built for this purpose. It is currently part of the {ref}pymc-experimental package, which has since been renamed pymc-extras (hence the pymc_extras import path) and can be installed with pip install pymc-extras. As the original name implies, this feature is still experimental and subject to change.

Model builder class

Let's import the ModelBuilder class.

To define our desired model we inherit from the ModelBuilder class. There are a couple of methods we need to define.

Now we can create the LinearModel object. The first step is to generate data:
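A minimal sketch of this step, assuming a single predictor stored in a DataFrame column named "input" (an illustrative name) and a noisy linear response with an assumed slope of 0.3 and intercept of 0.5:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# One predictor column; the name "input" is an illustrative choice
X = pd.DataFrame({"input": np.linspace(0, 1, 100)})

# Noisy linear response: slope 0.3, intercept 0.5, unit-variance noise
y = pd.Series(0.3 * X["input"] + 0.5 + rng.normal(0, 1, size=len(X)), name="y")
```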

After creating a LinearModel object, we can fit the model with the .fit() method.

Fitting to data

The fit() method takes the data on which to fit the model. The metadata is saved in the InferenceData object, which also stores the trace. These are the fields that are stored:

id : A unique id given to a model based on model_config, sampler_config, version, and model_type. Users can use it to check whether the model matches another model they have defined.
model_type : The kind of model. In this case it outputs Linear Model.
version : If you want to improve on models, you can keep track of them by their version. As the version changes, the unique hash in the id also changes.
sampler_config : The sampler configuration values set by the user for this particular model.
model_config : The model configuration values set by the user for this particular model.
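To see why the id changes whenever the configuration or version changes, here is a self-contained sketch of one way such an id could be derived. The function name make_model_id, the choice of SHA-256, and the 16-character truncation are assumptions for illustration, not the library's actual implementation.

```python
import hashlib
import json


def make_model_id(
    model_config: dict, sampler_config: dict, version: str, model_type: str
) -> str:
    """Hash the configuration, version, and model type into a short id."""
    hasher = hashlib.sha256()
    # sort_keys makes the hash independent of dictionary insertion order
    hasher.update(json.dumps(model_config, sort_keys=True).encode())
    hasher.update(json.dumps(sampler_config, sort_keys=True).encode())
    hasher.update(version.encode())
    hasher.update(model_type.encode())
    return hasher.hexdigest()[:16]


model_config = {"a_mu_prior": 0.0, "a_sigma_prior": 1.0}
sampler_config = {"draws": 1000, "chains": 3}

# Same configuration, different version: the two ids differ
id_v1 = make_model_id(model_config, sampler_config, "0.1", "LinearModel")
id_v2 = make_model_id(model_config, sampler_config, "0.2", "LinearModel")
```

Comparing two ids is then a cheap way to tell whether two model objects were defined with the same settings.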

Saving model to file

After fitting the model, we can save it to a file so that it can be shared and used again. Saving and loading are each a single method call, save() and load() respectively.

This saves a file at the given path and name: a NetCDF .nc file that stores the inference data of the model.
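What such a file contains can be sketched directly with ArviZ, which provides the InferenceData container. This illustrates the storage format using assumed attribute values and a stand-in posterior of random numbers; it is not the internals of save().

```python
import json
import os
import tempfile

import arviz as az
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a fitted trace: 2 chains x 50 draws of a parameter "a"
idata = az.from_dict(posterior={"a": rng.normal(size=(2, 50))})

# Metadata travels with the trace as string attributes
idata.attrs["model_type"] = "LinearModel"
idata.attrs["version"] = "0.1"
idata.attrs["model_config"] = json.dumps({"a_mu_prior": 0.0})

# Write to a temporary directory so the example leaves no files behind in cwd
fname = os.path.join(tempfile.mkdtemp(), "linear_model_v1.nc")
idata.to_netcdf(fname)

# Reading the file back recovers both the samples and the metadata
loaded = az.from_netcdf(fname)
restored_config = json.loads(loaded.attrs["model_config"])
```

Storing the configuration as a JSON string is what makes it possible to rebuild an equivalent model object from the file alone.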

Loading a model

Now if we wanted to deploy this model, or just have other people use it to predict data, they need two things:

the LinearModel class (probably in a .py file)
the linear_model_v1.nc file

With these, you can easily load a fitted model in a different environment (e.g. production):

Note that load() is a class method, so we do not need to instantiate a LinearModel object first.

Prediction

Next we might want to predict on new data. The predict() method allows users to do posterior prediction with the fitted model on new data.

Our first task is to create data on which we need to predict.
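A minimal sketch, assuming the model expects a DataFrame with a single predictor column named "input" (an illustrative name) and that we want ten new points drawn from the interval [1, 2):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Ten new predictor values, drawn uniformly from [1, 2)
x_pred = rng.uniform(low=1, high=2, size=10)
prediction_data = pd.DataFrame({"input": x_pred})
```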

ModelBuilder provides two methods for prediction:

point estimates (the mean) with predict()
full posterior prediction (samples) with predict_posterior()

After calling predict(), we can plot the data and see graphically how well our LinearModel fits.

Authors

Authored by Shashank Kirtania and Thomas Wiecki in 2023.
Modified and updated by Michał Raczycki in 08/2023

:::{include} ../page_footer.md
:::