Hierarchical forecasting

In forecasting, we often find ourselves in need of forecasts for both lower- and higher (temporal) granularities, such as product demand forecasts but also product category or product department forecasts. These granularities can be formalized through the use of a hierarchy. In hierarchical forecasting, we create forecasts that are coherent with respect to a pre-specified hierarchy of the underlying time series.

With TimeGPT, we can create forecasts for multiple time series. We can subsequently post-process these forecasts using hierarchical forecasting techniques of HierarchicalForecast.

1. Import packages

First, we import the required packages and initialize the Nixtla client.

import pandas as pd
import numpy as np

from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)

2. Load data

We use the Australian Tourism dataset, from Forecasting, Principles and Practices which contains data on Australian Tourism. We are interested in forecasts for Australia’s 7 States, 27 Zones and 76 Regions. This constitutes a hierarchy, where forecasts for the lower levels (e.g. the regions Sidney, Blue Mountains and Hunter) should be coherent with the forecasts of the higher levels (e.g. New South Wales).

Map of Australia color coded by state. The states are from west to east, and then north to south - Western Australia, Northern Territory, South Australia, Queensland, New South Wales, Victoria. Australian Capital Territory is a small area within New South Wales. Tasmania is an island to the southeast. Australia hierarchy. Australia at the top with New South Wales and Queenslad below. Sidney, Blue Mountains, and Hunter in New South Wales. Brisbane and Cairns in Queensland.

The dataset only contains the time series at the lowest level, so we need to create the time series for all hierarchies.

Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/tourism.csv')
Y_df = Y_df.rename({'Trips': 'y', 'Quarter': 'ds'}, axis=1)
Y_df.insert(0, 'Country', 'Australia')
Y_df = Y_df[['Country', 'Region', 'State', 'Purpose', 'ds', 'y']]
Y_df['ds'] = Y_df['ds'].str.replace(r'(\d+) (Q\d)', r'\1-\2', regex=True)
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

Y_df.head(10)
C:\Users\ospra\AppData\Local\Temp\ipykernel_28980\3753786659.py:6: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  Y_df['ds'] = pd.to_datetime(Y_df['ds'])
CountryRegionStatePurposedsy
0AustraliaAdelaideSouth AustraliaBusiness1998-01-01135.077690
1AustraliaAdelaideSouth AustraliaBusiness1998-04-01109.987316
2AustraliaAdelaideSouth AustraliaBusiness1998-07-01166.034687
3AustraliaAdelaideSouth AustraliaBusiness1998-10-01127.160464
4AustraliaAdelaideSouth AustraliaBusiness1999-01-01137.448533
5AustraliaAdelaideSouth AustraliaBusiness1999-04-01199.912586
6AustraliaAdelaideSouth AustraliaBusiness1999-07-01169.355090
7AustraliaAdelaideSouth AustraliaBusiness1999-10-01134.357937
8AustraliaAdelaideSouth AustraliaBusiness2000-01-01154.034398
9AustraliaAdelaideSouth AustraliaBusiness2000-04-01168.776364

The dataset can be grouped in the following hierarchical structure.

spec = [
    ['Country'],
    ['Country', 'State'], 
    ['Country', 'Purpose'], 
    ['Country', 'State', 'Region'], 
    ['Country', 'State', 'Purpose'], 
    ['Country', 'State', 'Region', 'Purpose']
]

Using the aggregate function from HierarchicalForecast we can get the full set of time series.

Note

You can install hierarchicalforecast with pip:

pip install hierarchicalforecast
from hierarchicalforecast.utils import aggregate
Y_df, S_df, tags = aggregate(Y_df, spec)
Y_df = Y_df.reset_index()

Y_df.head(10)
unique_iddsy
0Australia1998-01-0123182.197269
1Australia1998-04-0120323.380067
2Australia1998-07-0119826.640511
3Australia1998-10-0120830.129891
4Australia1999-01-0122087.353380
5Australia1999-04-0121458.373285
6Australia1999-07-0119914.192508
7Australia1999-10-0120027.925640
8Australia2000-01-0122339.294779
9Australia2000-04-0119941.063482

We use the final two years (8 quarters) as test set.

Y_test_df = Y_df.groupby('unique_id').tail(8)
Y_train_df = Y_df.drop(Y_test_df.index)

3. Hierarchical forecasting with TimeGPT

First, we create base forecasts for all the time series with TimeGPT. Note that we set add_history=True, as we will need the in-sample fitted values of TimeGPT.

We will predict 2 years (8 quarters), starting from 01-01-2016.

timegpt_fcst = nixtla_client.forecast(df=Y_train_df, h=8, freq='QS', add_history=True)
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Calling Historical Forecast Endpoint...
timegpt_fcst_insample = timegpt_fcst.query("ds < '2016-01-01'")
timegpt_fcst_outsample = timegpt_fcst.query("ds >= '2016-01-01'")

Let’s plot some of the forecasts, starting from the highest aggregation level (Australia), to the lowest level (Australia/Queensland/Brisbane/Holiday). We can see that there is room for improvement in the forecasts.

nixtla_client.plot(
    Y_df, 
    timegpt_fcst_outsample, 
    max_insample_length=4 * 12, 
    unique_ids=['Australia', 'Australia/Queensland','Australia/Queensland/Brisbane', 'Australia/Queensland/Brisbane/Holiday']
)

We can make these forecasts coherent to the specified hierarchy by using a HierarchicalReconciliation method from NeuralForecast. We will be using the MinTrace method.

from hierarchicalforecast.methods import MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation
reconcilers = [
    MinTrace(method='ols'),
    MinTrace(method='mint_shrink'),
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_df_with_insample_fcsts = Y_df.copy()
Y_df_with_insample_fcsts['ds'] = Y_df_with_insample_fcsts['ds'].astype(str)
Y_df_with_insample_fcsts = timegpt_fcst_insample.merge(Y_df_with_insample_fcsts)
Y_df_with_insample_fcsts = Y_df_with_insample_fcsts.set_index('unique_id')

Y_rec_df = hrec.reconcile(Y_hat_df=timegpt_fcst_outsample.set_index('unique_id'), Y_df=Y_df_with_insample_fcsts, S=S_df, tags=tags)

Again, we plot some of the forecasts. We can see a few, mostly minor differences in the forecasts.

nixtla_client.plot(
    Y_df, 
    Y_rec_df.reset_index(), 
    max_insample_length=4 * 12, 
    unique_ids=['Australia', 'Australia/Queensland','Australia/Queensland/Brisbane', 'Australia/Queensland/Brisbane/Holiday']
)

Let’s numerically verify the forecasts to the situation where we don’t apply a post-processing step. We can use HierarchicalEvaluation for this.

from hierarchicalforecast.evaluation import HierarchicalEvaluation
def rmse(y, y_hat):
    return np.mean(np.sqrt(np.mean((y-y_hat)**2, axis=1)))

eval_tags = {}
eval_tags['Total'] = tags['Country']
eval_tags['Purpose'] = tags['Country/Purpose']
eval_tags['State'] = tags['Country/State']
eval_tags['Regions'] = tags['Country/State/Region']
eval_tags['Bottom'] = tags['Country/State/Region/Purpose']
eval_tags['All'] = np.concatenate(list(tags.values()))

evaluator = HierarchicalEvaluation(evaluators=[rmse])
evaluation = evaluator.evaluate(
        Y_hat_df=Y_rec_df.rename(columns={'y': 'TimeGPT'}), Y_test_df=Y_test_df.set_index('unique_id'),
        tags=eval_tags, Y_df=Y_train_df.set_index('unique_id')
)
evaluation = evaluation.drop(labels='Overall', level='level')
evaluation.columns = ['Base', 'MinTrace(ols)', 'MinTrace(mint_shrink)']
evaluation = evaluation.map('{:.2f}'.format)
evaluation
BaseMinTrace(ols)MinTrace(mint_shrink)
levelmetric
Totalrmse1433.141436.121639.77
Purposermse482.10475.65511.14
Statermse275.86278.39295.03
Regionsrmse49.4047.9148.10
Bottomrmse19.3219.1118.87
Allrmse43.0042.5043.54

We made a small improvement in overall RMSE by reconciling the forecasts with MinTrace(ols), and made them slightly worse using MinTrace(mint_shrink), indicating that the base forecasts were relatively strong already.

However, we now have coherent forecasts too - so not only did we make a (small) accuracy improvement, we also got coherency to the hierarchy as a result of our reconciliation step.

References