Exogenous variables

Exogenous variables, or external factors, are often important in time series forecasting because they carry additional information that can influence the prediction. They can include holiday markers, marketing spend, weather data, or any other external data that correlates with the series you are forecasting.

For example, if you’re forecasting ice cream sales, temperature is a useful exogenous variable: on hotter days, sales tend to increase. To use exogenous variables in TimeGPT, pair each point in your time series with the corresponding external data.
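
The pairing step can be sketched with a plain pandas merge on the series identifier and the timestamp. This is only an illustration: the shop name, the sales figures, and the temperature readings are made-up placeholder values, and the `temperature` column name is an assumption rather than anything TimeGPT requires.

```python
import pandas as pd

# Hypothetical daily ice cream sales (the target) for a single series.
sales = pd.DataFrame({
    "unique_id": ["shop_1"] * 4,
    "ds": pd.date_range("2024-07-01", periods=4, freq="D"),
    "y": [120.0, 135.0, 160.0, 150.0],  # placeholder values
})

# Hypothetical temperature readings (the exogenous variable) for the same days.
temperature = pd.DataFrame({
    "unique_id": ["shop_1"] * 4,
    "ds": pd.date_range("2024-07-01", periods=4, freq="D"),
    "temperature": [24.5, 26.0, 31.2, 29.8],  # placeholder values
})

# Pair every observation with its external value by series id and timestamp.
# validate="one_to_one" raises if a key is duplicated on either side.
train = sales.merge(
    temperature, on=["unique_id", "ds"], how="left", validate="one_to_one"
)

# Every target row should now carry an exogenous value.
assert train["temperature"].notna().all()
print(train)
```

A left merge keeps every row of the target series, so a missing external reading shows up as a `NaN` that can be caught before forecasting.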


import pandas as pd
from nixtlats import TimeGPT

timegpt = TimeGPT(
    # defaults to os.environ.get("TIMEGPT_TOKEN")
    token = 'my_token_provided_by_nixtla'
)

Let’s look at an example of predicting day-ahead electricity prices. The following dataset contains the hourly electricity price (`y` column) for five markets in Europe and the US, identified by the `unique_id` column. The columns from `Exogenous1` to `day_6` are the exogenous variables that TimeGPT will use to predict the prices.


df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv')
df.head()

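|   | unique_id | ds | y | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BE | 2016-12-01 00:00:00 | 72.00 | 61507.0 | 71066.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 1 | BE | 2016-12-01 01:00:00 | 65.80 | 59528.0 | 67311.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 2 | BE | 2016-12-01 02:00:00 | 59.99 | 58812.0 | 67470.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 3 | BE | 2016-12-01 03:00:00 | 50.69 | 57676.0 | 64529.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | BE | 2016-12-01 04:00:00 | 52.58 | 56804.0 | 62773.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
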
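
The `day_0` to `day_6` columns look like one-hot day-of-week indicators: 2016-12-01 was a Thursday, and `day_3` is the column set to 1.0 in the rows shown. Assuming that reading is right (the source does not state it), similar columns could be derived from the timestamps as in the sketch below.

```python
import pandas as pd

# Timestamps matching the first rows of the dataset shown above.
ds = pd.date_range("2016-12-01 00:00:00", periods=5, freq="h")
frame = pd.DataFrame({"unique_id": "BE", "ds": ds})

# dayofweek is 0 for Monday through 6 for Sunday.
dow = frame["ds"].dt.dayofweek

# One indicator column per weekday, named day_0 ... day_6 to mirror the dataset
# (assumed naming convention, inferred from the rows shown).
for d in range(7):
    frame[f"day_{d}"] = (dow == d).astype(float)

print(frame)
```

Because the indicators depend only on the calendar, the same code can produce them for future timestamps as well.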
To produce forecasts, we also have to provide the future values of the exogenous variables. Let’s read that dataset. Since we want to predict 24 steps ahead, it contains 24 observations for each `unique_id`.

future_ex_vars_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-future-ex-vars.csv')
future_ex_vars_df.head()

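|   | unique_id | ds | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BE | 2016-12-31 00:00:00 | 64108.0 | 70318.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 1 | BE | 2016-12-31 01:00:00 | 62492.0 | 67898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | BE | 2016-12-31 02:00:00 | 61571.0 | 68379.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 3 | BE | 2016-12-31 03:00:00 | 60381.0 | 64972.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 4 | BE | 2016-12-31 04:00:00 | 60298.0 | 62900.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
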
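
Before forecasting, it can help to confirm that the future frame really holds the `h` timestamps that follow each series. The helper below is a hypothetical sketch, not part of `nixtlats`; it assumes the `ds` columns are already datetimes (for data read from CSV, convert them with `pd.to_datetime` first) and that the frequency is hourly. The small frames use placeholder values.

```python
import pandas as pd


def check_future_exog(df, X_df, h, freq="h", id_col="unique_id", time_col="ds"):
    """Hypothetical helper: list the series whose future rows are not
    exactly the h timestamps that follow the last observed one."""
    problems = []
    last_seen = df.groupby(id_col)[time_col].max()
    for uid, last in last_seen.items():
        future = X_df.loc[X_df[id_col] == uid, time_col].sort_values()
        # The h timestamps immediately after the last historical observation.
        expected = pd.date_range(last, periods=h + 1, freq=freq)[1:]
        if len(future) != h or not (future.to_numpy() == expected.to_numpy()).all():
            problems.append(uid)
    return problems


# Tiny stand-in frames with placeholder values.
hist = pd.DataFrame({
    "unique_id": ["BE"] * 3,
    "ds": pd.date_range("2016-12-30 21:00:00", periods=3, freq="h"),
    "y": [1.0, 2.0, 3.0],
    "Exogenous1": [10.0, 11.0, 12.0],
})
future = pd.DataFrame({
    "unique_id": ["BE"] * 2,
    "ds": pd.date_range("2016-12-31 00:00:00", periods=2, freq="h"),
    "Exogenous1": [13.0, 14.0],
})

# Series with missing or misaligned future rows, if any.
print(check_future_exog(hist, future, h=2))
```

An empty list means every series has exactly `h` future rows, aligned with the end of its history.
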
Let’s call the `forecast` method, adding this information:

timegpt_fcst_ex_vars_df = timegpt.forecast(df=df, X_df=future_ex_vars_df, h=24, level=[80, 90])
timegpt_fcst_ex_vars_df.head()

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Inferred freq: H
INFO:nixtlats.timegpt:Calling Forecast Endpoint...

|   | unique_id | ds | TimeGPT | TimeGPT-lo-90 | TimeGPT-lo-80 | TimeGPT-hi-80 | TimeGPT-hi-90 |
|---|---|---|---|---|---|---|---|
| 0 | BE | 2016-12-31 00:00:00 | 38.861762 | 33.821073 | 34.368669 | 43.354854 | 43.902450 |
| 1 | BE | 2016-12-31 01:00:00 | 35.382102 | 30.014594 | 31.493322 | 39.270882 | 40.749610 |
| 2 | BE | 2016-12-31 02:00:00 | 33.811425 | 26.658821 | 28.543087 | 39.079764 | 40.964029 |
| 3 | BE | 2016-12-31 03:00:00 | 31.707475 | 24.896205 | 26.818795 | 36.596155 | 38.518745 |
| 4 | BE | 2016-12-31 04:00:00 | 30.316475 | 21.125143 | 24.432148 | 36.200801 | 39.507807 |

timegpt.plot(
    df[['unique_id', 'ds', 'y']], 
    timegpt_fcst_ex_vars_df, 
    max_insample_length=365, 
    level=[80, 90], 
)
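
The `level=[80, 90]` argument yields a lower and an upper bound per level, so the 90% interval should enclose the 80% one. A quick sanity check on the five forecast rows printed earlier might look like this; the frame is rebuilt by hand here so the snippet runs without an API call.

```python
import pandas as pd

# First five forecast rows for BE, copied from the forecast output shown earlier.
fcst = pd.DataFrame({
    "ds": pd.date_range("2016-12-31 00:00:00", periods=5, freq="h"),
    "TimeGPT": [38.861762, 35.382102, 33.811425, 31.707475, 30.316475],
    "TimeGPT-lo-90": [33.821073, 30.014594, 26.658821, 24.896205, 21.125143],
    "TimeGPT-lo-80": [34.368669, 31.493322, 28.543087, 26.818795, 24.432148],
    "TimeGPT-hi-80": [43.354854, 39.270882, 39.079764, 36.596155, 36.200801],
    "TimeGPT-hi-90": [43.902450, 40.749610, 40.964029, 38.518745, 39.507807],
})

# The bounds should be ordered: lo-90 <= lo-80 <= point <= hi-80 <= hi-90.
nested = (
    (fcst["TimeGPT-lo-90"] <= fcst["TimeGPT-lo-80"])
    & (fcst["TimeGPT-lo-80"] <= fcst["TimeGPT"])
    & (fcst["TimeGPT"] <= fcst["TimeGPT-hi-80"])
    & (fcst["TimeGPT-hi-80"] <= fcst["TimeGPT-hi-90"])
)

# Interval widths per level; the higher level should never be narrower.
width_80 = fcst["TimeGPT-hi-80"] - fcst["TimeGPT-lo-80"]
width_90 = fcst["TimeGPT-hi-90"] - fcst["TimeGPT-lo-90"]

print(nested.all())
print(pd.DataFrame({"ds": fcst["ds"], "width_80": width_80, "width_90": width_90}))
```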


We can also inspect the relative importance of each exogenous feature.


timegpt.weights_x.plot.barh(x='features', y='weights')


You can also add country holidays as additional features using the `CountryHolidays` class.


from nixtlats.date_features import CountryHolidays


timegpt_fcst_ex_vars_df = timegpt.forecast(
    df=df, X_df=future_ex_vars_df, h=24, level=[80, 90], 
    date_features=[CountryHolidays(['US'])]
)
timegpt.weights_x.plot.barh(x='features', y='weights')

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Inferred freq: H
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
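
To make the idea of holiday features concrete, the sketch below builds 0/1 holiday indicators by hand for a few fixed-date US holidays. It illustrates the general technique only and is not how `CountryHolidays` is implemented; the holiday list is deliberately incomplete, and the column names are hypothetical and may differ from the ones the library produces.

```python
import pandas as pd

# Hypothetical, deliberately tiny holiday table: (month, day) -> column name.
FIXED_HOLIDAYS = {
    (1, 1): "US_New_Years_Day",
    (7, 4): "US_Independence_Day",
    (12, 25): "US_Christmas_Day",
}


def holiday_indicators(ds: pd.Series) -> pd.DataFrame:
    """Return one 0/1 column per holiday for the given timestamps."""
    out = pd.DataFrame(index=ds.index)
    for (month, day), name in FIXED_HOLIDAYS.items():
        out[name] = ((ds.dt.month == month) & (ds.dt.day == day)).astype(int)
    return out


# Three days of hourly timestamps around Christmas 2016.
ds = pd.Series(pd.date_range("2016-12-24 00:00:00", "2016-12-26 23:00:00", freq="h"))
features = holiday_indicators(ds)

# Number of flagged hours per holiday column.
print(features.sum())
```

Like the day-of-week columns, these indicators depend only on the calendar, so they can be generated for both the historical and the future timestamps.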