Holidays and Special Dates

Calendar variables and special dates are among the most common exogenous variables used in forecasting applications. They provide additional context on the current state of the time series, especially for window-based models such as TimeGPT-1. These variables typically encode each observation’s month, week, day, or hour. For example, in high-frequency hourly data, the current month of the year gives the model context that the limited history in the input window cannot, which can improve the forecasts. In this tutorial we show how to add calendar variables to a dataset automatically using the date_features argument.
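As a rough illustration of what calendar variables look like, the sketch below derives a month and a weekday column from a timestamp column using plain pandas. The toy dataframe and its column names are made up for this example, and the code is not how TimeGPT builds its features internally.

```python
import pandas as pd

# Hypothetical toy series: five daily observations spanning a month boundary.
toy_df = pd.DataFrame({
    "date": pd.date_range("2023-01-30", periods=5, freq="D"),
    "y": [10.0, 11.0, 12.0, 13.0, 14.0],
})

# Derive calendar variables from the timestamp column with the .dt accessor.
toy_df["month"] = toy_df["date"].dt.month
toy_df["weekday"] = toy_df["date"].dt.weekday  # Monday=0 ... Sunday=6

print(toy_df)
```

Each new column is known in advance for any future date, which is what makes calendar variables convenient exogenous inputs.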


import pandas as pd
from nixtlats import TimeGPT

timegpt = TimeGPT(
    # defaults to os.environ.get("TIMEGPT_TOKEN")
    token = 'my_token_provided_by_nixtla'
)

Given how widely calendar variables are used, the forecast method can create common calendar variables automatically as a pre-processing step. To add them, use the date_features argument.


pltr_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/pltr.csv')


fcst_pltr_calendar_df = timegpt.forecast(
    df=pltr_df.tail(2 * 14), h=14, freq='B',
    time_col='date', target_col='Close',
    date_features=['month','weekday']
)
fcst_pltr_calendar_df.head()

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
         date    TimeGPT
0  2023-09-25  14.677374
1  2023-09-26  14.825757
2  2023-09-27  15.126798
3  2023-09-28  14.398899
4  2023-09-29  14.387407

timegpt.plot(
    pltr_df, 
    fcst_pltr_calendar_df, 
    id_col='series_id',
    time_col='date',
    target_col='Close',
    max_insample_length=90,
)


We can also plot the importance of each of the date features:


timegpt.weights_x.plot.barh(x='features', y='weights', figsize=(10, 10))


You can also add country holidays using the CountryHolidays class.


from nixtlats.date_features import CountryHolidays


fcst_pltr_calendar_df = timegpt.forecast(
    df=pltr_df, h=14, freq='B',
    time_col='date', target_col='Close',
    date_features=[CountryHolidays(['US'])]
)
timegpt.weights_x.plot.barh(x='features', y='weights', figsize=(10, 10))

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
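To give a feel for what a holiday feature is, here is a minimal sketch that flags US federal holidays on a business-day index. It uses pandas' built-in USFederalHolidayCalendar as a stand-in, so it is only an approximation of the idea and not the implementation behind CountryHolidays. The column name is_us_holiday is made up for this example.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Business days (Mon-Fri) in the week of 4 July 2023.
dates = pd.bdate_range("2023-07-03", "2023-07-07")

# Federal holidays inside the same range, per pandas' built-in calendar.
us_holidays = USFederalHolidayCalendar().holidays(
    start=dates.min(), end=dates.max()
)

# Binary indicator: 1 if the date is a federal holiday, else 0.
holiday_df = pd.DataFrame({
    "date": dates,
    "is_us_holiday": dates.isin(us_holidays).astype(int),
})

print(holiday_df)
```

Note that a business-day frequency does not skip holidays by itself, which is one reason an explicit holiday indicator can carry extra information.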


Here’s a breakdown of how the date_features parameter works:

  • date_features (bool or list of str or callable): This
    parameter specifies which date attributes to consider.
    • If set to True, the model will automatically add the most
      common date features related to the frequency of the given
      dataframe (df). For a daily frequency, this could include
      features like day of the week, month, and year.
    • If provided a list of strings, it will consider those specific
      date attributes. For example,
      date_features=['weekday', 'month'] will only add the day of
      the week and month as features.
    • If provided a callable, it should be a function that takes dates
      as input and returns the desired feature. This gives flexibility
      in computing custom date features.
  • date_features_to_one_hot (bool or list of str): After
    determining the date features, one might want to one-hot encode
    them, especially if they are categorical in nature (like weekdays).
    One-hot encoding transforms these categorical features into a binary
    matrix, making them more suitable for many machine learning
    algorithms.
    • If date_features=True, then by default, all computed date
      features will be one-hot encoded.
    • If provided a list of strings, only those specific date features
      will be one-hot encoded.

By leveraging the date_features and date_features_to_one_hot parameters, one can efficiently incorporate the temporal effects of date attributes into their forecasting model, potentially enhancing its accuracy and interpretability.
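The callable and one-hot options described above can be sketched with plain pandas. The function is_month_start below is a hypothetical custom feature, and the sketch assumes the dates arrive as a pandas DatetimeIndex. The exact contract that the library expects from a callable is not spelled out here, so treat this as an illustration rather than a drop-in argument.

```python
import pandas as pd


def is_month_start(dates: pd.DatetimeIndex) -> pd.Series:
    """Hypothetical custom date feature: 1 on the first day of a month, else 0."""
    return pd.Series(dates.is_month_start.astype(int), name="is_month_start")


dates = pd.date_range("2023-01-30", periods=5, freq="D")

features = pd.DataFrame({"date": dates})
features["weekday"] = features["date"].dt.weekday      # categorical: 0..6
features["is_month_start"] = is_month_start(dates)     # already binary

# One-hot encode only the categorical weekday column, mirroring the idea of
# passing a list of feature names to date_features_to_one_hot.
encoded = pd.get_dummies(features, columns=["weekday"], dtype=int)

print(encoded.columns.tolist())
```

One-hot encoding the weekday avoids implying that Friday (4) is somehow "four times" Monday (0), while the binary flag needs no further encoding.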