Bounded forecasts

In forecasting, we often want predictions to stay within a certain range. For example, when predicting the sales of a product, we may require all forecasts to be positive. In such cases, the forecasts need to be bounded.

With TimeGPT, you can create bounded forecasts by transforming your data prior to calling the forecast function.

1. Import packages

First, we import the required packages and initialize the Nixtla client.

import pandas as pd
import numpy as np

from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)

2. Load data

We use the annual egg prices dataset from Forecasting: Principles and Practice. We expect egg prices to be strictly positive, so we want to bound our forecasts to be positive.

Note

The dataset is stored as an R data file, which we read with pyreadr. You can install pyreadr with pip:

pip install pyreadr

import pyreadr
from pathlib import Path

# Download and store the dataset
url = 'https://github.com/robjhyndman/fpp3package/raw/master/data/prices.rda'
dst_path = str(Path.cwd().joinpath('prices.rda'))
result = pyreadr.read_r(pyreadr.download_file(url, dst_path), dst_path)
# Perform some preprocessing
df = result['prices'][['year', 'eggs']]
df = df.dropna().reset_index(drop=True)
df = df.rename(columns={'year':'ds', 'eggs':'y'})
df['ds'] = pd.to_datetime(df['ds'], format='%Y')
df['unique_id'] = 'eggs'

df.tail(10)
            ds       y  unique_id
84  1984-01-01  100.58       eggs
85  1985-01-01   76.84       eggs
86  1986-01-01   81.10       eggs
87  1987-01-01   69.60       eggs
88  1988-01-01   64.55       eggs
89  1989-01-01   80.36       eggs
90  1990-01-01   79.79       eggs
91  1991-01-01   74.79       eggs
92  1992-01-01   64.86       eggs
93  1993-01-01   62.27       eggs

Plotting the series shows how egg prices evolved over the 20th century: the price is trending downward.

nixtla_client.plot(df)

3. Bounded forecasts with TimeGPT

First, we transform the target data. In this case, we log-transform the data prior to forecasting, so that the back-transformed forecasts can only be positive.

df_transformed = df.copy()
df_transformed['y'] = np.log(df_transformed['y'])
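The logarithm is only defined for strictly positive values, so it can be worth checking the target before transforming it. The following is a minimal sketch of such a guard, not part of the Nixtla API: the helper name `log_transform_target` and the two-row example frame (built from the last two rows shown above) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def log_transform_target(df: pd.DataFrame, target_col: str = "y") -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with a log-transformed target."""
    # np.log is undefined at zero and for negative numbers, so fail early.
    if (df[target_col] <= 0).any():
        raise ValueError("log transform requires strictly positive values")
    out = df.copy()
    out[target_col] = np.log(out[target_col])
    return out


# Small illustrative frame using the 1992 and 1993 egg prices from the table above.
example = pd.DataFrame(
    {
        "unique_id": ["eggs", "eggs"],
        "ds": pd.to_datetime(["1992-01-01", "1993-01-01"]),
        "y": [64.86, 62.27],
    }
)
example_transformed = log_transform_target(example)
```

If a series contains zeros, a plain log-transform does not apply and a different transformation would be needed.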

We create forecasts for the next 10 years and request 80%, 90% and 99.5% prediction intervals.

timegpt_fcst_with_transform = nixtla_client.forecast(df=df_transformed, h=10, freq='Y', level=[80, 90, 99.5])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: AS-JAN
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

After creating the forecasts, we need to invert the transformation that we applied earlier. With a log-transformation, this simply means exponentiating the point forecasts and the interval bounds:

cols_to_transform = [col for col in timegpt_fcst_with_transform if col not in ['unique_id', 'ds']]
for col in cols_to_transform:
    timegpt_fcst_with_transform[col] = np.exp(timegpt_fcst_with_transform[col])
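Why this guarantees positive forecasts: the exponential of any real number is strictly positive, and because it is increasing, the ordering lower bound < point forecast < upper bound is preserved. The sketch below illustrates this on a made-up log-scale forecast frame. The helper name `inverse_log_transform` and all numeric values are assumptions for illustration, not TimeGPT output.

```python
import numpy as np
import pandas as pd


def inverse_log_transform(fcst: pd.DataFrame, id_cols=("unique_id", "ds")) -> pd.DataFrame:
    """Hypothetical helper: exponentiate every forecast column except the id columns."""
    out = fcst.copy()
    value_cols = [col for col in out.columns if col not in id_cols]
    out[value_cols] = np.exp(out[value_cols])
    return out


# Made-up forecasts on the log scale; note the negative lower bound in the second row.
log_fcst = pd.DataFrame(
    {
        "unique_id": ["eggs", "eggs"],
        "ds": pd.to_datetime(["1994-01-01", "1995-01-01"]),
        "TimeGPT-lo-80": [3.9, -0.5],
        "TimeGPT": [4.2, 4.1],
        "TimeGPT-hi-80": [4.5, 4.8],
    }
)

back = inverse_log_transform(log_fcst)

# Every back-transformed value is positive, even where the log-scale value was negative.
all_positive = bool((back[["TimeGPT-lo-80", "TimeGPT", "TimeGPT-hi-80"]] > 0).all().all())
```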

Now, we can plot the forecasts together with the 80%, 90% and 99.5% prediction intervals.

nixtla_client.plot(
    df, 
    timegpt_fcst_with_transform, 
    level=[80, 90, 99.5],
    max_insample_length=20
)

The forecast and the prediction intervals look reasonable, and all interval bounds are positive.

Let’s compare these forecasts to the situation where we don’t apply a transformation. In this case, it may be possible to forecast a negative price.

timegpt_fcst_without_transform = nixtla_client.forecast(df=df, h=10, freq='Y', level=[80, 90, 99.5])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: AS-JAN
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

Indeed, we now observe prediction intervals that become negative:

nixtla_client.plot(
    df, 
    timegpt_fcst_without_transform, 
    level=[80, 90, 99.5],
    max_insample_length=20
)

For example, the lower bounds for 1995, 2002 and 2003 are negative:

timegpt_fcst_without_transform
   unique_id          ds    TimeGPT  TimeGPT-lo-99.5  TimeGPT-lo-90  TimeGPT-lo-80  TimeGPT-hi-80  TimeGPT-hi-90  TimeGPT-hi-99.5
0       eggs  1994-01-01  66.859756        43.103240      46.131448      49.319034      84.400479      87.588065        90.616273
1       eggs  1995-01-01  64.993477       -20.924112      -4.750041      12.275298     117.711656     134.736995       150.911066
2       eggs  1996-01-01  66.695808         6.499170       8.291150      10.177444     123.214173     125.100467       126.892446
3       eggs  1997-01-01  66.103325        17.304282      24.966939      33.032894      99.173756     107.239711       114.902368
4       eggs  1998-01-01  67.906517         4.995371      12.349648      20.090992     115.722042     123.463386       130.817663
5       eggs  1999-01-01  66.147575        29.162207      31.804460      34.585779      97.709372     100.490691       103.132943
6       eggs  2000-01-01  66.062637        14.671932      19.305822      24.183601     107.941673     112.819453       117.453343
7       eggs  2001-01-01  68.045769         3.915282      13.188964      22.950736     113.140802     122.902573       132.176256
8       eggs  2002-01-01  66.718903       -42.212631     -30.583703     -18.342726     151.780531     164.021508       175.650436
9       eggs  2003-01-01  67.344078       -86.239911     -44.959745      -1.506939     136.195095     179.647901       220.928067
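Rather than scanning the table by eye, the rows with a negative lower bound can be found programmatically. The sketch below rebuilds a small frame from four rows of the output above (lower-bound columns only, so that it runs on its own). In practice the same filter could be applied to `timegpt_fcst_without_transform` directly, assuming it has the column names shown.

```python
import pandas as pd

# Lower-bound columns for 1994, 1995, 2002 and 2003, copied from the table above.
fcst = pd.DataFrame(
    {
        "ds": pd.to_datetime(["1994-01-01", "1995-01-01", "2002-01-01", "2003-01-01"]),
        "TimeGPT-lo-99.5": [43.103240, -20.924112, -42.212631, -86.239911],
        "TimeGPT-lo-90": [46.131448, -4.750041, -30.583703, -44.959745],
        "TimeGPT-lo-80": [49.319034, 12.275298, -18.342726, -1.506939],
    }
)

# Select the lower-bound columns by name and keep rows where any of them is below zero.
lower_cols = [col for col in fcst.columns if "-lo-" in col]
negative_rows = fcst[(fcst[lower_cols] < 0).any(axis=1)]
negative_years = negative_rows["ds"].dt.year.tolist()
```

Here `negative_years` contains 1995, 2002 and 2003, while 1994 is excluded because all of its lower bounds are positive.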

This demonstrates the value of the log-transformation for obtaining bounded forecasts with TimeGPT: the prediction intervals respect the positivity constraint and are therefore more plausible for a price series.

References