Holidays and special dates

Calendar variables and special dates are among the most common additional variables used in forecasting applications. They provide additional context on the current state of the time series, especially for window-based models such as TimeGPT-1. These variables typically encode each observation’s month, week, day, or hour. For example, in high-frequency hourly data the input window covers only a short stretch of history, so providing the current month of the year gives the model context it could not infer from the window alone, which can improve the forecasts.

In this tutorial we will show how to add calendar variables to a dataset using the tools in nixtla.date_features, and how the date_features parameter of the forecast method can add them automatically.

1. Import packages

First, we import the required packages and initialize the Nixtla client.

import pandas as pd
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
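
The comment in the snippet above notes that the client falls back to the NIXTLA_API_KEY environment variable when no key is passed. A minimal sketch of that lookup order follows. The helper name resolve_api_key is hypothetical and is not part of the nixtla package; it only illustrates preferring an explicit key and failing early when none is available.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else NIXTLA_API_KEY."""
    key = explicit or os.environ.get("NIXTLA_API_KEY")
    if not key:
        # Fail before any request is made, with a message that says how to fix it.
        raise ValueError(
            "No API key found: pass one explicitly or set NIXTLA_API_KEY."
        )
    return key
```

The returned value could then be passed along as NixtlaClient(api_key=resolve_api_key()), which keeps the key out of the notebook itself.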

👍
Use an Azure AI endpoint
To use an Azure AI endpoint, remember to also set the base_url argument:
nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")

2. Load data

We will use a monthly Google Trends dataset on search interest in chocolate.

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/google_trend_chocolate.csv')
df['month'] = pd.to_datetime(df['month']).dt.to_period('M').dt.to_timestamp('M')

df.head()

	month	chocolate
0	2004-01-31	35
1	2004-02-29	45
2	2004-03-31	28
3	2004-04-30	30
4	2004-05-31	29
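
The date conversion above moves every timestamp to the last day of its month, which is why the month column shows values such as 2004-01-31. As a small illustration on made-up input strings (not the actual CSV), the same month-end alignment can be sketched with pd.offsets.MonthEnd(0), an alternative to the to_period/to_timestamp chain used in the tutorial:

```python
import pandas as pd

# Illustrative month labels; the real dataset is loaded from the CSV above.
raw = pd.Series(["2004-01", "2004-02", "2004-03"])

# Parse to the first day of each month, then roll forward to the month end.
# MonthEnd(0) leaves dates that already fall on a month end unchanged.
month_end = pd.to_datetime(raw, format="%Y-%m") + pd.offsets.MonthEnd(0)

print(month_end.dt.strftime("%Y-%m-%d").tolist())
# ['2004-01-31', '2004-02-29', '2004-03-31']
```

Keeping every timestamp on the month end matters later, because the holiday features are merged on the month column and the dates must match exactly.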

3. Forecasting with holidays and special dates

Because calendar variables are so widely used, the forecast method can create common calendar variables automatically as a pre-processing step. Let’s create a future dataframe that contains the upcoming holidays in the United States.

# Create future dataframe with exogenous features

start_date = '2024-05'
dates = pd.date_range(start=start_date, periods=14, freq='M')

dates = dates.to_period('M').to_timestamp('M')

future_df = pd.DataFrame(dates, columns=['month'])

from nixtla.date_features import CountryHolidays

us_holidays = CountryHolidays(countries=['US'])
dates = pd.date_range(start=future_df.iloc[0]['month'], end=future_df.iloc[-1]['month'], freq='D')
holidays_df = us_holidays(dates)
monthly_holidays = holidays_df.resample('M').max()

monthly_holidays = monthly_holidays.reset_index(names='month')

future_df = future_df.merge(monthly_holidays)

future_df.head()

	month	US_Juneteenth National Independence Day	US_Independence Day	US_Labor Day
0	2024-05-31	0	0	0
1	2024-06-30	1	0	0
2	2024-07-31	0	1	0
3	2024-08-31	0	0	0
4	2024-09-30	0	0	1
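
The key step above is collapsing daily holiday indicators into one row per month: a month gets a 1 for a holiday if any of its days is flagged. The sketch below reproduces that aggregation on a hand-built daily frame, so it runs without the nixtla package. The two holiday dates are set manually for illustration, and it groups by month end rather than calling resample, which is an assumption made to stay independent of the pandas frequency alias in use.

```python
import pandas as pd

# Daily indicator frame for two months, built by hand for illustration.
days = pd.date_range("2024-06-01", "2024-07-31", freq="D")
columns = ["US_Juneteenth National Independence Day", "US_Independence Day"]
daily = pd.DataFrame(0, index=days, columns=columns)
daily.loc[pd.Timestamp("2024-06-19"), columns[0]] = 1
daily.loc[pd.Timestamp("2024-07-04"), columns[1]] = 1

# Map every day to the last day of its month, then take the max per month:
# the result is 1 if the holiday occurs anywhere in that month, else 0.
month_end = daily.index + pd.offsets.MonthEnd(0)
monthly = daily.groupby(month_end).max().rename_axis("month").reset_index()

print(monthly)
```

Taking the maximum rather than the sum keeps the features binary even when a holiday name appears on more than one day in the same month.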

We perform the same steps for the input dataframe.

# Add exogenous features to input dataframe

dates = pd.date_range(start=df.iloc[0]['month'], end=df.iloc[-1]['month'], freq='D')
holidays_df = us_holidays(dates)
monthly_holidays = holidays_df.resample('M').max()

monthly_holidays = monthly_holidays.reset_index(names='month')

df = df.merge(monthly_holidays)

df.tail()

	month	chocolate	US_New Year's Day	US_Christmas Day	US_Martin Luther King Jr. Day	US_Washington's Birthday
239	2023-12-31	90	0	1	0	0
240	2024-01-31	64	1	0	1	0
241	2024-02-29	66	0	0	0	1
242	2024-03-31	59	0	0	0	0
243	2024-04-30	51	0	0	0	0
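
Calling merge without arguments performs an inner join on the columns the two dataframes share, here month. Since the historical and future holiday features are built separately, it can be worth confirming that both dataframes end up with the same exogenous columns before forecasting. The helper below is a hypothetical check written for this tutorial, not a nixtla function:

```python
import pandas as pd


def exog_mismatch(df, future_df, time_col, target_col):
    """Hypothetical check: list exogenous columns present on one side only."""
    hist = set(df.columns) - {time_col, target_col}
    fut = set(future_df.columns) - {time_col}
    return sorted(hist - fut), sorted(fut - hist)


# Tiny illustrative frames; the future frame is missing one holiday column.
hist_example = pd.DataFrame({
    "month": pd.to_datetime(["2024-03-31", "2024-04-30"]),
    "chocolate": [59, 51],
    "US_Labor Day": [0, 0],
    "US_Christmas Day": [0, 0],
})
future_example = pd.DataFrame({
    "month": pd.to_datetime(["2024-05-31"]),
    "US_Labor Day": [0],
})

missing_in_future, missing_in_history = exog_mismatch(
    hist_example, future_example, "month", "chocolate"
)
print(missing_in_future)   # ['US_Christmas Day']
print(missing_in_history)  # []
```

Two empty lists mean the historical and future frames describe the same set of holidays.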

Great! Now, TimeGPT will consider the holidays as exogenous variables and the upcoming holidays will help it make predictions.

fcst_df = nixtla_client.forecast(
    df=df,
    h=14,
    freq='M',
    time_col='month',
    target_col='chocolate',
    X_df=future_df
)

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: M
WARNING:nixtla.nixtla_client:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtla.nixtla_client:Using the following exogenous variables: US_New Year's Day, US_Memorial Day, US_Juneteenth National Independence Day, US_Independence Day, US_Labor Day, US_Veterans Day, US_Thanksgiving, US_Christmas Day, US_Martin Luther King Jr. Day, US_Washington's Birthday, US_Columbus Day
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

📘
Available models in Azure AI
If you are using an Azure AI endpoint, please be sure to set model="azureai":
nixtla_client.forecast(..., model="azureai")
For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.
By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.

nixtla_client.plot(
    df, 
    fcst_df, 
    time_col='month',
    target_col='chocolate',
)

We can then plot the weight of each holiday to see which ones matter most when forecasting the interest in chocolate.

nixtla_client.weights_x.plot.barh(x='features', y='weights', figsize=(10, 10))

Here’s a breakdown of how the date_features parameter works:

date_features (bool or list of str or callable): This parameter specifies which date attributes to consider.
- If set to True, the model will automatically add the most common date features related to the frequency of the given dataframe (df). For a daily frequency, this could include features like day of the week, month, and year.
- If provided a list of strings, it will consider those specific date attributes. For example, date_features=['weekday', 'month'] will only add the day of the week and month as features.
- If provided a callable, it should be a function that takes dates as input and returns the desired feature. This gives flexibility in computing custom date features.
date_features_to_one_hot (bool or list of str): After determining the date features, one might want to one-hot encode them, especially if they are categorical in nature (like weekdays). One-hot encoding transforms these categorical features into a binary matrix, making them more suitable for many machine learning algorithms.
- If date_features=True, then by default, all computed date features will be one-hot encoded.
- If provided a list of strings, only those specific date features will be one-hot encoded.

By leveraging the date_features and date_features_to_one_hot parameters, one can efficiently incorporate the temporal effects of date attributes into their forecasting model, potentially enhancing its accuracy and interpretability.
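
To make the idea of one-hot encoded date features concrete, here is a minimal sketch of a callable that takes dates and returns month indicators. It uses only pandas and illustrates the kind of output described above. It is not the encoding code used inside the nixtla client, and the function name month_one_hot is an assumption made for this example.

```python
import pandas as pd


def month_one_hot(dates: pd.DatetimeIndex) -> pd.DataFrame:
    """Illustrative date feature: one 0/1 column per month seen in `dates`."""
    months = pd.Series(dates.month.to_numpy(), index=dates)
    # get_dummies turns the categorical month number into binary columns.
    return pd.get_dummies(months, prefix="month", dtype=int)


dates = pd.to_datetime(["2024-01-31", "2024-02-29", "2024-03-31"])
encoded = month_one_hot(dates)
print(encoded)
```

Each row has exactly one 1, in the column for its month. Without one-hot encoding, the month would instead be a single integer column, which implies an ordering between months that may not reflect how they affect the series.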
