Improve Forecast Accuracy with TimeGPT

In this notebook, we demonstrate how to use TimeGPT for forecasting and explore five strategies to improve forecast accuracy. We use hourly electricity price data from Germany as our example dataset. Before running the notebook, instantiate a NixtlaClient object with your api_key in the code snippet below.

Result Summary

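| Step | Description | MAE | MAE Improvement (%) | RMSE | RMSE Improvement (%) |
|------|-------------|-----|---------------------|------|----------------------|
| 0 | Zero-Shot TimeGPT | 18.5 | N/A | 20.0 | N/A |
| 1 | Add Fine-Tuning Steps | 12.0 | 35.14% | 13.3 | 33.5% |
| 2 | Adjust Fine-Tuning Loss | 9.2 | 50.27% | 12.0 | 40.0% |
| 3 | Fine-tune more parameters | 9.1 | 50.81% | 11.4 | 43.0% |
| 4 | Add Exogenous Variables | 10.1 | 45.41% | 11.4 | 43.0% |
| 5 | Switch to Long-Horizon Model | 6.4 | 65.38% | 7.7 | 61.50% |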

First, we import the required packages and initialize the Nixtla client. The evaluation metrics (MAE and RMSE) and the evaluate helper come from utilsforecast.

import numpy as np
import pandas as pd

from utilsforecast.evaluation import evaluate
from utilsforecast.plotting import plot_series
from utilsforecast.losses import mae, rmse
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
    # api_key = 'my_api_key_provided_by_nixtla'
)

1. Load the Dataset

In this notebook, we use hourly electricity prices as our example dataset, which consists of 5 time series, each with approximately 1700 data points. For demonstration purposes, we focus on the German electricity price series. The time series is split, with the last 48 steps (2 days) set aside as the test set.

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv')
df['ds'] = pd.to_datetime(df['ds'])
df_sub = df.query('unique_id == "DE"')
df_train = df_sub.query('ds < "2017-12-29"')
df_test = df_sub.query('ds >= "2017-12-29"')
df_train.shape, df_test.shape
((1632, 12), (48, 12))
plot_series(df_train[['unique_id','ds','y']][-200:], forecasts_df= df_test[['unique_id','ds','y']].rename(columns={'y': 'test'}))
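As a quick sanity check on the split sizes, the arithmetic behind the cutoff date can be sketched with the standard library alone. The start timestamp is the first row printed by df_train.head() later in the notebook; the end timestamp (2017-12-30 23:00) is an assumption inferred from the 48-row test set, not something the dataset description states.

```python
from datetime import datetime, timedelta

start = datetime(2017, 10, 22, 0)   # first timestamp shown by df_train.head()
end = datetime(2017, 12, 30, 23)    # assumed last timestamp of the DE series
cutoff = datetime(2017, 12, 29)     # same cutoff as in the query above

# Build the hourly grid between start and end (inclusive).
n_hours = int((end - start) / timedelta(hours=1)) + 1
timestamps = [start + timedelta(hours=i) for i in range(n_hours)]

# Mirror the two queries: strictly before the cutoff is train, the rest is test.
train = [ts for ts in timestamps if ts < cutoff]
test = [ts for ts in timestamps if ts >= cutoff]

print(len(train), len(test))
```

Under these assumptions the counts agree with the shapes reported above (1632 training rows and 48 test rows).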

2. Benchmark Forecasting using TimeGPT

We used TimeGPT to generate a zero-shot forecast for the time series. As illustrated in the plot, TimeGPT captures the overall trend reasonably well, but it falls short in modeling the short-term fluctuations and cyclical patterns present in the actual data. During the test period, the model achieved a Mean Absolute Error (MAE) of 18.5 and a Root Mean Square Error (RMSE) of 20. This forecast serves as a baseline for further comparison and optimization.

fcst_timegpt = nixtla_client.forecast(df = df_train[['unique_id','ds','y']],
                                      h=2*24,
                                      target_col = 'y',
                                      level = [90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: H
INFO:nixtla.nixtla_client:Querying model metadata...
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
metrics = [mae, rmse]
evaluation = evaluate(
    fcst_timegpt.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 18.519004 |
| 1 | DE | rmse | 20.037751 |
plot_series(df_sub.iloc[-150:], forecasts_df= fcst_timegpt, level = [90])
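For readers who want to see what the two metrics measure, here is a minimal standard-library sketch of MAE and RMSE. The function names and the sample values are made up for illustration; the notebook itself relies on the mae and rmse functions from utilsforecast.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; large errors weigh more."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values, not taken from the electricity dataset.
actual = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 27.0, 41.0]

print(mean_absolute_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```

Because RMSE squares each error before averaging, the single large miss in the last point pushes it above the MAE for the same forecast.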

3. Methods to Improve Forecast Accuracy

3a. Add Finetune Steps

The first way to improve forecast accuracy is to fine-tune the model for more steps. Fine-tuning adjusts the weights of TimeGPT so that it fits your own data more closely, which lets it learn the nuances of your time series and produce more accurate forecasts. The finetune_steps parameter sets the number of fine-tuning iterations. With 30 fine-tuning steps, the MAE in the run shown below falls to about 11.5 and the RMSE to about 12.6.

fcst_finetune_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          finetune_steps = 30,
                                          level=[90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: H
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
evaluation = evaluate(
    fcst_finetune_df.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 11.450266 |
| 1 | DE | rmse | 12.634852 |
plot_series(df_sub[-200:], forecasts_df= fcst_finetune_df, level = [90])

3b. Finetune with Different Loss Function

The second way to reduce forecast error is to change the loss function used during fine-tuning. You select the loss with the finetune_loss parameter; here we use 'mae'. With this change, the MAE in the run shown below falls to about 9.6 and the RMSE to about 10.9.

fcst_finetune_mae_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          finetune_steps = 30,
                                          finetune_loss = 'mae',
                                          level=[90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: H
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
evaluation = evaluate(
    fcst_finetune_mae_df.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 9.636473 |
| 1 | DE | rmse | 10.948288 |
plot_series(df_sub[-200:], forecasts_df= fcst_finetune_mae_df, level = [90])
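To build some intuition for why the loss function matters, the sketch below compares the best constant prediction under an absolute-error loss and under a squared-error loss. This is a general property of the two losses, not a description of how TimeGPT fine-tunes internally, and the sample values are made up.

```python
from statistics import mean, median

def mean_abs_loss(values, guess):
    """Mean absolute error of a single constant guess."""
    return sum(abs(v - guess) for v in values) / len(values)

def mean_sq_loss(values, guess):
    """Mean squared error of a single constant guess."""
    return sum((v - guess) ** 2 for v in values) / len(values)

# Hypothetical prices with one spike.
values = [10.0, 11.0, 12.0, 50.0]

center_abs = median(values)  # the median minimizes the absolute-error loss
center_sq = mean(values)     # the mean minimizes the squared-error loss

print("median:", center_abs, "mean:", center_sq)
print("absolute loss at median vs mean:",
      mean_abs_loss(values, center_abs), mean_abs_loss(values, center_sq))
print("squared loss at median vs mean:",
      mean_sq_loss(values, center_abs), mean_sq_loss(values, center_sq))
```

The spike pulls the squared-error optimum toward it, while the absolute-error optimum stays near the bulk of the data. For a spiky series such as electricity prices, this is one plausible reason a different fine-tuning loss can change the resulting metrics.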

3c. Adjust the number of parameters being fine-tuned

Using the finetune_depth parameter, we can control how many of the model's parameters are fine-tuned. By default, finetune_depth=1, meaning that few parameters are tuned. We can set it to any value from 1 to 5, where 5 means that we fine-tune all of the parameters of the model. In the run shown below, finetune_depth=2 gives an MAE of about 9.1 and an RMSE of about 11.4.

fcst_finetune_depth_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                                h=24*2,
                                                finetune_steps = 30,
                                                finetune_depth=2,
                                                finetune_loss = 'mae',
                                                level=[90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: H
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
evaluation = evaluate(
    fcst_finetune_depth_df.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 9.05529 |
| 1 | DE | rmse | 11.41665 |
plot_series(df_sub[-200:], forecasts_df= fcst_finetune_depth_df, level = [90])
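Since the best depth is not known in advance, one option is to try several values and compare the resulting forecasts. The helper below is a hypothetical sketch, not part of the nixtla package: it assumes that client is a NixtlaClient (or any object with a compatible forecast method) and only passes along the parameters already used in this notebook.

```python
def sweep_finetune_depth(client, df, h, depths=(1, 2, 3), **forecast_kwargs):
    """Call client.forecast once per depth and return {depth: forecast}.

    Hypothetical helper: `client` is assumed to expose forecast(df=..., h=...,
    finetune_depth=..., **kwargs), as NixtlaClient does in the cells above.
    """
    forecasts = {}
    for depth in depths:
        forecasts[depth] = client.forecast(
            df=df,
            h=h,
            finetune_depth=depth,
            **forecast_kwargs,
        )
    return forecasts

# Example usage (each depth triggers one API call):
# forecasts = sweep_finetune_depth(
#     nixtla_client,
#     df_train[['unique_id', 'ds', 'y']],
#     h=24 * 2,
#     finetune_steps=30,
#     finetune_loss='mae',
# )
```

Each returned forecast can then be scored with the same evaluate call used above. Keep in mind that comparing configurations on the test set itself risks overfitting to it; a separate validation window is a safer basis for the choice.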

3d. Forecast with Exogenous Variables

Exogenous variables are external factors or predictors that are not part of the target time series but can influence its behavior. Incorporating these variables can provide the model with additional context, improving its ability to understand complex relationships and patterns in the data.

To use exogenous variables in TimeGPT, pair each point in your input time series with the corresponding external data. If you have future values available for these variables during the forecast period, include them using the X_df parameter. Otherwise, you can omit this parameter and still see improvements using only historical values. In the example below, we incorporate 9 exogenous variables (Exogenous1, Exogenous2, and the seven indicator columns day_0 to day_6) along with their values during the test period, which reduces the MAE and RMSE to 9.2 and 11.9, respectively.

df_train.head()
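|      | unique_id | ds | y | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
|------|-----------|----|---|------------|------------|-------|-------|-------|-------|-------|-------|-------|
| 1680 | DE | 2017-10-22 00:00:00 | 19.10 | 587.25 | 16972.75 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1681 | DE | 2017-10-22 01:00:00 | 19.03 | 623.00 | 16254.50 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1682 | DE | 2017-10-22 02:00:00 | 16.90 | 650.00 | 15940.25 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1683 | DE | 2017-10-22 03:00:00 | 12.98 | 687.25 | 15959.50 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1684 | DE | 2017-10-22 04:00:00 | 9.24 | 717.25 | 16071.50 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |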
future_ex_vars_df = df_test.drop(columns = ['y'])
future_ex_vars_df.head()
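|      | unique_id | ds | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
|------|-----------|----|------------|------------|-------|-------|-------|-------|-------|-------|-------|
| 3312 | DE | 2017-12-29 00:00:00 | 917.50 | 17347.00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3313 | DE | 2017-12-29 01:00:00 | 925.75 | 16587.25 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3314 | DE | 2017-12-29 02:00:00 | 930.75 | 16396.00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3315 | DE | 2017-12-29 03:00:00 | 933.50 | 16481.25 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3316 | DE | 2017-12-29 04:00:00 | 927.50 | 16827.75 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |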
fcst_ex_vars_df = nixtla_client.forecast(df=df_train,
                                         X_df=future_ex_vars_df,
                                         h=24*2,
                                         level=[90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: h
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Using future exogenous features: ['Exogenous1', 'Exogenous2', 'day_0', 'day_1', 'day_2', 'day_3', 'day_4', 'day_5', 'day_6']
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
evaluation = evaluate(
    fcst_ex_vars_df.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 9.165934 |
| 1 | DE | rmse | 11.900955 |
plot_series(df_sub[-200:], forecasts_df= fcst_ex_vars_df, level = [90])
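Calendar features are a convenient kind of exogenous variable because their future values are always known. The sketch below shows one way such indicator columns could be derived from a timestamp. It assumes that day_0 to day_6 encode Monday to Sunday; this matches the rows printed above (2017-10-22 was a Sunday, 2017-12-29 a Friday) but is not confirmed by the dataset's documentation, and the function name is made up.

```python
from datetime import datetime

def day_of_week_dummies(ts):
    """One-hot encode the weekday of `ts` as day_0 (Monday) ... day_6 (Sunday).

    The Monday-to-Sunday ordering is an assumption inferred from the sample rows.
    """
    return {f"day_{k}": 1.0 if ts.weekday() == k else 0.0 for k in range(7)}

# First training row and first test row from the tables above.
print(day_of_week_dummies(datetime(2017, 10, 22, 0)))
print(day_of_week_dummies(datetime(2017, 12, 29, 0)))
```

Features built this way can be computed for any future date, so they are always available for the forecast horizon when filling X_df.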

3e. TimeGPT for Long Horizon Forecasting

When the forecasting period is too long, the predicted results may not be as accurate. TimeGPT performs best with forecast periods that are shorter than one complete cycle of the time series. For longer forecast periods, switching to the timegpt-1-long-horizon model can yield better results. You can specify this model by using the model parameter.

In the electricity price time series used here, one cycle is 24 steps (representing one day). Since we're forecasting two days (48 steps) into the future, using timegpt-1-long-horizon significantly improves the forecasting accuracy, reducing the MAE to 6.4 and the RMSE to 7.7.

fcst_long_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          model = 'timegpt-1-long-horizon',
                                          level=[90, 95])
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: h
INFO:nixtla.nixtla_client:Querying model metadata...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
evaluation = evaluate(
    fcst_long_df.merge(df_test, on=['unique_id', 'ds']),
    metrics=metrics,
    models=['TimeGPT']
)
evaluation
|   | unique_id | metric | TimeGPT |
|---|-----------|--------|---------|
| 0 | DE | mae | 6.365540 |
| 1 | DE | rmse | 7.738188 |
plot_series(df_sub[-200:], forecasts_df= fcst_long_df, level = [90])
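The rule of thumb above can be written as a small helper that picks a model name from the horizon and the cycle length. This is a hypothetical sketch: the function name is made up, the exact threshold (strictly more than one cycle) is an assumption, and 'timegpt-1' is taken to be the name of the default model.

```python
def choose_timegpt_model(h, season_length):
    """Pick a model name based on how the horizon compares to one seasonal cycle.

    Assumption: horizons longer than one cycle go to the long-horizon model,
    everything else to the default model (assumed here to be 'timegpt-1').
    """
    if h > season_length:
        return "timegpt-1-long-horizon"
    return "timegpt-1"

# Hourly data with a daily cycle, forecasting two days ahead as in this notebook.
print(choose_timegpt_model(h=24 * 2, season_length=24))
```

The returned string can be passed to the model parameter of nixtla_client.forecast, as in the call above.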

4. Conclusion and Next Steps

In this notebook, we demonstrated five strategies for improving forecast accuracy with TimeGPT:

  1. Increasing the number of fine-tuning steps.
  2. Adjusting the fine-tuning loss function.
  3. Fine-tuning more of the model's parameters with finetune_depth.
  4. Incorporating exogenous variables.
  5. Switching to the long-horizon model for extended forecasting periods.

We encourage you to experiment with these hyperparameters to identify the optimal settings that best suit your specific needs. Additionally, please refer to our documentation for further features, such as model explainability and more.

In the examples provided, after applying these methods, we observed significant improvements in forecast accuracy metrics, as summarized below.

Result Summary

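| Step | Description | MAE | MAE Improvement (%) | RMSE | RMSE Improvement (%) |
|------|-------------|-----|---------------------|------|----------------------|
| 0 | Zero-Shot TimeGPT | 18.5 | N/A | 20.0 | N/A |
| 1 | Add Fine-Tuning Steps | 12.0 | 35.14% | 13.3 | 33.5% |
| 2 | Adjust Fine-Tuning Loss | 9.2 | 50.27% | 12.0 | 40.0% |
| 3 | Fine-tune more parameters | 9.1 | 50.81% | 11.4 | 43.0% |
| 4 | Add Exogenous Variables | 10.1 | 45.41% | 11.4 | 43.0% |
| 5 | Switch to Long-Horizon Model | 6.4 | 65.38% | 7.7 | 61.50% |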
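The improvement columns appear to be relative reductions against the zero-shot baseline. The sketch below assumes the formula (baseline - value) / baseline × 100, which is consistent with the table's figures up to rounding; the function name is made up for illustration.

```python
def improvement_pct(baseline, value):
    """Relative error reduction versus the baseline, in percent."""
    return (baseline - value) / baseline * 100.0

# MAE values copied from the Result Summary table.
baseline_mae = 18.5
mae_by_step = {
    "Add Fine-Tuning Steps": 12.0,
    "Adjust Fine-Tuning Loss": 9.2,
    "Fine-tune more parameters": 9.1,
    "Add Exogenous Variables": 10.1,
}

for name, value in mae_by_step.items():
    print(f"{name}: {improvement_pct(baseline_mae, value):.2f}%")
```

The same function applies to the RMSE column with a baseline of 20.0.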