(Forecasting Hurricane Trajectories with State Space Models)=

Forecasting Hurricane Trajectories with State Space Models

:::{post} June 15, 2025 :tags: state space model :category: intermediate, tutorial :author: Jonathan Dekermanjian :::

Introduction

In this case study we are going to forecast the paths of hurricanes by applying several State Space Models (SSM). We will begin with a simple two-dimensional constant acceleration tracking model, where we only have one parameter to estimate. Subsequently, we will progressively add complexity and parameters as we develop up our model.

As a brief introduction to SSMs, the general idea is that we define our system using two equations.
The state equation and the observation equation.

x_{t+1} = T_{t}x_{t} + c_{t} + R_{t}\epsilon_{t}

y_{t} = Z_{t}x_{t} + d_{t} + \eta_{t}

The process/state covariance is given by $\epsilon_{t} \sim N(0, Q_{t})$ where $Q_{t}$ is the process/state innovations and the observation/measurement covariance is given by $\eta_{t} \sim N(0, H_{t})$ where $H_{t}$ describes the uncertainty in the measurement device or measurement procedure.

We have the following matrices:

State Equation variables	Definition
$T_{t}$	The state transition matrix at time $t$ defines the kinematics of the process generating the series.
$x_{t}$	The state vector at time $t$ describes the current state of the system.
$c_{t}$	Intercept vector at time $t$ can include covariates/control/exogenous variables that are deterministically measured.
$R_{t}$	Selection matrix at time $t$ selects which process innovations are allowed to affect the next state.
$\epsilon_{t}$	State/Process innovations at time $t$ defines the shocks influencing the changes in the state matrix.

Observation Equation variables	Definition
$Z_{t}$	The design matrix at time $t$ defines which states directly influence the observed variables.
$x_{t}$	The state vector at time $t$ describes the current state of the system.
$d_{t}$	Intercept vector at time $t$ can include covariates/control/exogenous variables that are deterministically measured.
$\eta_{t}$	observation/measurement error at time $t$ defines the uncertainty in the observation.

Estimation occurs in an iterative fashion (after an initialization step). In which the following steps are repeated:

Predict the next state vector $x_{t+1|t}$ and the next state/process covariance matrix $P_{t+1|t}$
Compute the Kalman gain
Estimate the current state vector and the current state/process covariance matrix

Where $P_{t}$ is the uncertainty in the state predictions at time $t$.

The general idea is that we make predictions based on our current state vector and state/process covariance (uncertainty) then we correct these predictions once we have our observations.

The following equations define the process:

Description	Equation
Predict the next state vector	$\hat{x}{t+1\|t} = T{t}\hat{x}_{t\|t}$
Predict the next state/process covariance	$P_{t+1\|t} = T_{t}P_{t+1\|t}T_{t}^{T} + Q$
Compute Kalman Gain	$K_{t} = P_{t\|t-1}Z^{T}(ZP_{t\|t-1}Z^{T} + H_{t})^{-1}$
Estimate current state vector	$\hat{x}{t\|t} = \hat{x}{t\|t-1} + K_{t}(y_{t} - Z\hat{x}_{t\|t-1})$
Estimate current state/process covariance	$P_{t\|t} = (I - K_{t}Z_{t})P_{t\|t-1}(I - K_{t}Z_{t})^{T} + K_{t}H_{t}K_{t}^{T}$

:::{note} We wrote the equation for $P_{t|t}$ above using Joseph form, which is more numerically stable but also wordier. In different texts you may encounter this equation written in "standard" form. :::

Imports

:::{include} ../extra_installs.md :::

Helper Functions

Load and Process the Dataset

The data comes from the National Oceanic and Atmospheric Administration (NOAA) and is stored in an odd format (likely to save space). We need to wrangle it before we can proceed.

Generate visualizations

Hurricane Originations

Let's plot the origination points of the hurricanes in our dataset. There are a few different origination definitions when looking at the tropical cyclones within the HURDAT dataset:

A tropical depression when the maximum sustained surface wind is $\le$ 33.89 knots
A tropical cyclone when the maximum sustained surface wind is $\gt$ 33.89 knots
A Hurricane when the maximum sustained surface wind is $\ge$ 64.30 knots

We are going to define the first instance of tracking data per hurricane as our origin. These are important because hurricanes curve differently, due to the coriolis effect, depending on where they are located. For example, hurricanes in the northern hemisphere curve to the right. Whereas, in the southern hemisphere they curve to the left. In addition, the origination point influences the strength of the hurricane and its likelihood of making landfall. For example, Hurricanes that originate in the Gulf of Mexico have little time to gather energy but are surrounded by land making it more likely for landfall.

Hurricane Fiona's Path

From here on out our task is to forecast the trajectory of Hurricane Fiona. Let's plot the path it took. We mark the origination point in blue and the last observed location of Fiona in red. We see that Fiona initially travels westward and curves to the right making its way northward. This trajectory is typical for Hurricanes that originate in the Northern Hemisphere of the Atlantic Ocean.

Tracking Hurricane Fiona using a State Space Model

The simplest state space model for tracking an object in 2-dimensional space is one in which we define the kinematics using Newtonian equations of motion. In this example we assume constant acceleration but place a stochastic innovation $\sigma_{a}^{2}$ to account for unmodeled angular acceleration. Furthermore, we fix the observation/measurement noise, $H$, to a small value, reflecting our confidence in the collected data. We will also assume that the states in the x/longitude direction do not affect the states in the y/latitude direction. This means knowing the position/velocity/acceleration in x/longitude gives us no information on the position/velocity/acceleration in y/latitude

Let us begin by defining our matrices/vectors.

As a reminder the observation equation and the state equation define our linear system.

y_{t} = Z_{t}x_{t} + d_{t} + \eta_{t}

x_{t+1} = T_{t}x_{t} + c_{t} + R_{t}\epsilon_{t}

In this case we are assuming there is no state or observation intercepts ($c_{t} = 0$, $d_{t} = 0$). I will also drop the $t$ subscript over matrices that are fixed (do not change over time).

Our states are the following components derived from the Newtonian equations of motion.

x_{t} = \begin{bmatrix}longitude_{t} \\ latitude_{t} \\ longitude\_velocity_{t} \\ latitude\_velocity_{t} \\ longitude\_acceleration_{t} \\ latitude\_acceleration_{t} \end{bmatrix}

In order for our system to evolve in accordance with Newtonian motion our transitioin matrix is defined as:

T = \begin{bmatrix}1&0&\Delta t&0&\frac{\Delta t^{2}}{2}&0 \\ 0&1&0&\Delta t&0&\frac{\Delta t^{2}}{2} \\ 0&0&1&0&\Delta t&0 \\ 0&0&0&1&0&\Delta t \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix}

The following design matrix tells us that only the positions ($longitude_{t}$ and $latitude_{t}$) are observed states

Z = \begin{bmatrix}1&0&0&0&0&0 \\ 0&1&0&0&0&0 \end{bmatrix}

The selection matrix is defined as the identity allowing the process/state covariance to affect all of our states.

R = I

Where

\epsilon_{t} \sim N(0, Q)

\eta_{t} \sim N(0, H)

In this example we fix our observation/measurement error to a small value (0.5) reflecting our confidence in the measurements. Here we are assuming that our measuring instrument is fairly accurate and that it may be off by $\sqrt{0.5} \approx 0.7$ degrees.

H = \begin{bmatrix} 0.5&0 \\ 0&0.5\end{bmatrix}

and finally, the state/process covariance matrix that is derived from using Newtonian motion is as follows:

Q = \sigma_{a}^{2}\begin{bmatrix} \frac{\Delta t^{4}}{4}&0&\frac{\Delta t^{3}}{2}&0&\frac{\Delta t^{2}}{2}&0 \\ 0&\frac{\Delta t^{4}}{4}&0&\frac{\Delta t^{3}}{2}&0&\frac{\Delta t^{2}}{2} \\ \frac{\Delta t^{3}}{2}&0&\Delta t^{2}&0&\Delta t&0 \\ 0&\frac{\Delta t^{3}}{2}&0&\Delta t^{2}&0&\Delta t \\ \frac{\Delta t^{2}}{2}&0&\Delta t&0&1&0 \\ 0&\frac{\Delta t^{2}}{2}&0&\Delta t&0&1 \end{bmatrix}

Let's briefly go over how we came about our $T$ and $Q$ matrices.

The $T$ transition matrix is built such that when we expand the observation model we end up with the Newtonian equations of motion. For example, if we expand the matrix vector multiplaction for the longitude (position) term we end up with:

\hat{y}_{longitude_{t+1}} = longitude_{t} + longitude\_velocity_{t}\Delta t + \frac{longitude\_acceleration_{t}\Delta t^{2}}{2}

This is the Newtonian motion equation of position. Where the new position is the old position plus the change in velocity plus the change in acceleration. The rest of the equations can be derived by completing all the entries of the matrix vector multiplication of the observation equation.

The process/state covariance matrix $Q$ is just the variance (diagonals) covariance (off-diagonals) of the Newtonian equations. For example, the variance of the longitudinal position entry is:

Var(longitude_{t}) = Var(longitude_{t} + longitude\_velocity_{t}\Delta t + \frac{longitude\_acceleration_{t}\Delta t^{2}}{2}) = Var(\frac{longitude\_acceleration_{t}\Delta t^{2}}{2})

= \frac{\Delta t^{4}}{4}Var(longitude\_acceleration_{t}) = \frac{\Delta t^{4}}{4}\sigma_{a}^{2}

Where in this case we assume the same stochastic innovations on the acceleration term in both dimensions. You can derive the rest of the entries in $Q$ by taking the variance or covariance of the Newtonian equations.

Optional Material

We can also derive the Newtonian equations of motion from a system of ordinary differntial equations (ODE)s. Here we have a system that consists of:

$\dot{p}(t) = v(t)$
$\dot{v}(t) = a(t)$
$\dot{a}(t) = 0$

our state vector (in one dimension)

x_{t} = \begin{bmatrix}p(t) \\ v(t) \\ a(t) \end{bmatrix}

and our ODE system becomes

\frac{d}{dt} \begin{bmatrix}p(t) \\ v(t) \\a(t) \end{bmatrix} = \begin{bmatrix}v(t) \\ a(t) \\ 0 \end{bmatrix}

Now we integrate our system over $\Delta{t}$ and we have

p(t + \Delta{t}) = p(t) + \int_{t}^{t + \Delta{t}}v(t')dt'

assuming that the change in time is small and that the change in velocity is negligible we can approximate the integrals using the forward Euler method and get

p(t + \Delta{t}) \approx p(t) + v(t)\Delta{t}

integrating the rest of our system equations using the same methodology we find that

v(t + \Delta{t}) \approx v(t) + a(t)\Delta{t}

a(t + \Delta{t}) = a(t)

We can now express our system of equations over a time interval $\Delta{t}$ as

\begin{bmatrix}p(t + \Delta{t}) \\ v(t + \Delta{t}) \\ a(t + \Delta{t}) \end{bmatrix} = \begin{bmatrix}p(t) \\ v(t) \\ a(t) \end{bmatrix} + \Delta{t} \begin{bmatrix}v(t) \\ a(t) \\ 0 \end{bmatrix}

Which when you write out the matrix vector multiplications will yield the Newtonian equations of motion.

We are going to use the PyMC StateSpace module in the pymc-extras package to code up the state space model we defined above. In This model we are going to set up 3 variables:

$x_{0|0}$ The initial state vector (for initializing the estimation steps described earlier)
$P_{0|0}$ The initial state/process covariance matrix (again for initializing the recursive estimator)
$\sigma_{a}^{2}$ The acceleration innovations (this is useful for when the acceleration is not actually constant as we have assumed in the kinematics of the model)

We will set deterministic values for both the initial values $x_{0}$ and $P_{0}$. Therefore, in this simplest model, we will only estimate one parameter $\sigma_{a}^{2}$

Not bad for a model with only one parameter. We can see that the forecast gets wonky in the middle where the trajectory of the Hurricane changes directions over short time periods. Again, it is important to keep in mind that what we are plotting are the one-step/period ahead forecast. In our case, our periods are six hours apart. Unfortunately, a 6-hour ahead hurricane forecast is not very practical. Let's see what we get when we generate a 4-period (24-hour) ahead forecast.

:::{note}

More often than not we build parametric models that are quite flexible in adapting to the data. In such cases, we have multiple parameters that give the model it's flexibility in fitting the data. In this particular case, we impose a rigid structure on the data by insisting that the data generating process adhere to the laws of Newtonian physics. In doing so, the model is much less flexible with only one parameter that accounts for the unmodeled angular acceleration.

:::

Generate 24-hour forecasts with our Newtonian model

The 4-period (24-hour) forecasts exhibit a smaller error compared to the one-step ahead forecast. This result is due to the smoothing effect that occurs when we forecast out longer time periods.

Adding Deterministic Covariates/Exogenous Variables

In our dataset we have variables that aren't a part of the Newtonian system process, but may carry information that we can leverage to better track the path of the hurricane. We have two options when introducing these exogenous variables into our model. We can add them in as time invariant or time-varying variables. In our case, we are going to add exogenous variables as time invariant. Our aim then is to model our observations as:

\hat{y}_{longitude_{t+1}} = longitude_{t} + longitude\_velocity_{t}\Delta t + \frac{longitude\_acceleration_{t}\Delta t^{2}}{2} + \beta_{exogenous_{longitude}} exogenous\_data

\hat{y}_{latitude_{t+1}} = latitude_{t} + latitude\_velocity_{t}\Delta t + \frac{latitude\_acceleration_{t}\Delta t^{2}}{2} + \beta_{exogenous_{latitude}} exogenous\_data

The max_wind variable measures the maximum sustained surface wind at 10 meter elevations and is a uni-dimensional measure (i.e not measured in terms of longitude and latitude). Therefore, we are going to use the same data to estimate two-parameters $\beta_{wind_{longitude}}$ and $\beta_{wind_{latitude}}$. This is less than ideal but demonstrates how to add exogenous variables to a StateSpace model.

The min_pressure variable measures the minumum central pressure. The lower the central pressure, typically, the more intense the hurricane. This variable again is only measured in one dimension, so we will handle that the same way we are handling it for max_wind.

Finally, we will also include the interaction between the maximum sustained surface wind and the minimum central pressure.

In order to put this in matrix form, we are going to add the newly added $\beta$ parameters to our state vector and we will add the corresponding measured wind_speed, min_pressure, and interaction_wind_pressure data into the design matrix. So our new state vector will be:

x_{t} = \begin{bmatrix}longitude_{t} \\ latitude_{t} \\ longitude\_velocity_{t} \\ latitude\_velocity_{t} \\ longitude\_acceleration_{t} \\ latitude\_acceleration_{t} \\ \beta_{wind_{longitude}} \\ \beta_{wind_{latitude}} \\ \beta_{pressure_{longitude}} \\ \beta_{pressure_{latitude}} \\ \beta_{wind\_pressure_{longitude}} \\ \beta_{wind\_pressure_{latitude}} \end{bmatrix}

and our design matrix will be:

Z' = \begin{bmatrix} Z & X_{exogenous\_data} \end{bmatrix}

Where $Z$ is our previously defined design matrix and $X_{exogenous}$ is the measured wind_speed, minimum_pressure, and interaction_wind_pressure exogenous data.

We also need to make adjustments to our $T$ state transition matrix and $R$ selection matrix.

T = \begin{bmatrix}1&0&\Delta t&0&\frac{\Delta t^{2}}{2}&0&1&0&1&0&1&0 \\ 0&1&0&\Delta t&0&\frac{\Delta t^{2}}{2}&0&1&0&1&0&1 \\ 0&0&1&0&\Delta t&0&0&0&0&0&0&0 \\ 0&0&0&1&0&\Delta t&0&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0&0&0&0 \\ 0&0&0&0&0&1&0&0&0&0&0&0 \\ 0&0&0&0&0&0&1&0&0&0&0&0 \\ 0&0&0&0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&0&0&0&1 \end{bmatrix}

The last 2 columns we added indicates what states our exogenous variables affect, and the last 2 rows indicate that the processes of our exogenous parameters are constant.

R = \begin{bmatrix} 1&0&0&0&0&0 \\ 0&1&0&0&0&0 \\ 0&0&1&0&0&0 \\ 0&0&0&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \end{bmatrix}

The additions to the R matrix imply that the exogenous parameters do not exhibit any process innovations.

Note that because our variables are uni-dimensional we duplicate and mirror the data to apply each variable in two dimensions. The mirroring occurs in the model definition of the design matrix. This is necessary to ensure that the correct data is associated with the proper state during matrix multiplication. Take a look at how the states are defined above. In order to have a univariate variable apply to both the longitude and latitude positions (our observed states) we must duplicate our exogenous data for each observed stated and mirror the data so that when we are multiplying the exogenous data with the longitudinal states the data is non-zero on the longitudinal state but is zero for the latitudinal states and vice versa.

Typically, the surface wind speed and the central pressure of a hurricane carry little information on the path the hurricane will take. The path of a hurricane is, generally, influenced by surrounding atmospheric conditions like pressure gradients. Knowing this, it makes sense to see that many of our beta parameters are close to zero, indicating little to no influence on the hurricanes' path.

Make in-sample forecasts with new exogenous model

Our one-period ahead forecasts seem to be slightly worse than our Newtonian model. You will notice that at the end of the forecast we see that our trajectory is erroneously more north rather than north-east. Since the exogenous variables we added to the model don't carry additional information with respect to the hurricane's trajectory, this results are expected.

Generate 24-hour forecasts with our Exogenous SSM

We need to be careful here because we must ensure that we don't allow our model to peak into the future by supplying it with exogenous data that we would not have seen at the time of generating our forecasts. To ensure good and fair forecasts, we will be carrying forward the last exogenous data points from our prior period.

Similarly, our 24-hour forecasts are also slightly worse off compared to those produced by the simple model.

Add B-Splines

In the previous section, we tried adding an interaction term between the maximum sustained surface wind speed and the minumum central pressure. However, our estimated parameters were not too far off from zero. In this section we are going to attempt to model the non-linear complexities of the path, particularly in the mid-section, using cubic B-splines.

We first need to define what variables we are going to model as a smooth function. In our case, we are going to model the longitude values as a smooth function of the latitude values and vice versa.

To keep things simple, we are going to define a constant number of knots that are equal in both variables (same number of knots for both longitude and latitude) and we are going to place the knots using a quantile function over the variable space. You can see the knots plotted out below for each variable.

Next, we need to create the basis functions over the defined variable space knot locations for each variable.

Our new models' structure is going to be similar to our last model that had exogenous variables. However, in this case our data are going to be the basis functions we created earlier. These will be inserted into our design matrix ($Z$) and the beta parameters corresponding to each spline will be added to our state vector ($x_{t}$). Again, these parameters will be constant (non-time varying). We will also have to adjust our transition matrix ($T$) and selection matrix ($R$) similar to how we did previously. We will now have:

\hat{y}_{longitude_{t+1}} = longitude_{t} + longitude\_velocity_{t}\Delta t + \frac{longitude\_acceleration_{t}\Delta t^{2}}{2} + \beta_{spline\_params_{longitude}} longitude\_spline\_basis\_functions

\hat{y}_{latitude_{t+1}} = latitude_{t} + latitude\_velocity_{t}\Delta t + \frac{latitude\_acceleration_{t}\Delta t^{2}}{2} + \beta_{spline\_params_{latitude}} latitude\_spline\_basis\_functions

our design matrix will be:

Z' = \begin{bmatrix} Z & X_{basis\_functions} \end{bmatrix}

Where $Z$ is our previously defined design matrix and $X_{exogenous}$ are the basis functions we defined earlier.

Our transition matrix will be

T' = \begin{bmatrix} T & F \\ 0 & I_{n\_spline\_params} \end{bmatrix}

Where $T$ is the transition matrix defined for the Newtonian kinematics (the top-left 6x6 block in our previous model) and

F = \begin{bmatrix} 1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0 \\ 0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1&0&1 \\ 0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \end{bmatrix}

and the 0 in the matrix above is a matrix of 0s of shape (number of spline parameters by number of spline parameters)

Finally, we have

R' = \begin{bmatrix} R \\ 0 \end{bmatrix}

Where R is the selection matrix over our endogenous states (identity matrix of shape (number of states)) and again the 0 in the matrix is a matrix of 0s with shape (number of spline parameters by number of states)

Most of our spline parameters are around zero, with a handful of exceptions. Let's take a look at how these effect our forecasts.

Make in-sample forecasts with new spline model

Our one-period ahead forecasts, look better than the ones we generated from the Exogenous covariates model, but worse than the original model that purely follows Newtonian kinematics.

Our 24-hour (4-period) forecasts, look pretty good. So far, this follows the true trajectory during the mid-section the best.

Closing Remarks

In this case study we looked at how we can track a hurricane in two-dimensional space using a state space representation of Newtonian kinematics. We proceeded to expand on the pure Newtonian model and added exogenous variables that may hold information pertintent to the Hurricane's track. We then expanded our model by modeling our variables as smooth functions using cubic B-splines.

Throughout, the case study we have been evaluating our 24-hour forecasts and our overall mean error is smallest with our first Newtonian model. Below you will find the errors from all three models plotting against one another. It seems that (as expected) the exogenous information we included in the exogenous model was not informative with respect to the hurricances' trajectory. However, it is worth noting that in the period (around 40 through 56) where the hurricane manuevers we obtain less spikes in error in that section with our cubic B-spline model. This implies that the model could benefit from some non-linear specification to handle the angular acceleration. Hopefully, someday the StateSpace module in pymc-extras may support non-linear state space specifications with either the Extended Kalman Filter or with the Unscented Kalman Filter. Until then you can learn more about how to build your own custom state space models with the StateSpace module on the pymc-extras official GitHub repository.

Authors

Authored by Jonathan Dekermanjian in June, 2025

Acknowledgements

Chris Fonnesbeck who reviewed and provided valueable feedback that improved the text
Jesse Grabowski reviewed and provided valueable feedback that improved the text and content, and suggested additional content that improve the overall quality of the work.

References

:::{bibliography} :filter: docname in docnames

becker2023kalman :::

ssm_hurricane_tracking.ipynb