Quantile Forecasts

Quantile forecasts are central to how TimeGPT expresses forecast uncertainty. A traditional point forecast returns a single expected value; a quantile forecast instead returns specific percentiles of the forecast distribution. The underlying idea is that a future value of a time series is a random variable with its own distribution, termed the “forecast distribution,” and each quantile is a point of that distribution tied to a probability level. For instance, the 25th and 75th percentiles mark the lower and upper quartiles of expected outcomes, while the 50th percentile, or median, offers a central estimate.

With quantile forecasts, users can assess the likelihood of a range of scenarios rather than a single expected outcome, which helps when planning under uncertainty. TimeGPT uses conformal prediction to produce the quantiles.
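To make the idea of a forecast distribution concrete, the sketch below draws samples from a made-up distribution for a single future value and reads off its quartiles with Python’s standard library. The normal distribution, its parameters, and the sample size are illustrative assumptions; this is not how TimeGPT computes its quantiles.

```python
import random
import statistics

# Hypothetical forecast distribution for one future time step:
# a normal distribution centered at 440 with standard deviation 10.
rng = random.Random(0)
samples = [rng.gauss(440, 10) for _ in range(10_000)]

# With n=4, statistics.quantiles returns the three quartile cut points.
q25, q50, q75 = statistics.quantiles(samples, n=4)

print(f"25th percentile: {q25:.1f}")
print(f"50th percentile (median): {q50:.1f}")
print(f"75th percentile: {q75:.1f}")
```

Roughly half of the simulated outcomes fall between the 25th and 75th percentiles, which is the kind of statement a point forecast alone cannot support.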


import pandas as pd
from nixtlats import NixtlaClient


nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key='my_api_key_provided_by_nixtla'
)

When forecasting with TimeGPT, you can choose the quantiles to predict by passing a list of levels between 0 and 1 to the `quantiles` argument of the `forecast` method. Here’s how you could do it:


df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()

|   | timestamp  | value |
|---|------------|-------|
| 0 | 1949-01-01 | 112   |
| 1 | 1949-02-01 | 118   |
| 2 | 1949-03-01 | 132   |
| 3 | 1949-04-01 | 129   |
| 4 | 1949-05-01 | 121   |

quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
timegpt_quantile_fcst_df = nixtla_client.forecast(
    df=df, h=12, 
    quantiles=quantiles, 
    time_col='timestamp', target_col='value',
)
timegpt_quantile_fcst_df.head()

INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
|   | timestamp  | TimeGPT    | TimeGPT-q-10 | TimeGPT-q-20 | TimeGPT-q-30 | TimeGPT-q-40 | TimeGPT-q-50 | TimeGPT-q-60 | TimeGPT-q-70 | TimeGPT-q-80 | TimeGPT-q-90 |
|---|------------|------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
| 0 | 1961-01-01 | 437.837952 | 431.987091   | 435.043799   | 435.384363   | 436.402155   | 437.837952   | 439.273749   | 440.291541   | 440.632104   | 443.688812   |
| 1 | 1961-02-01 | 426.062744 | 412.704956   | 414.832837   | 416.042432   | 421.719196   | 426.062744   | 430.406293   | 436.083057   | 437.292651   | 439.420532   |
| 2 | 1961-03-01 | 463.116577 | 437.412564   | 444.234985   | 446.420233   | 450.705762   | 463.116577   | 475.527393   | 479.812921   | 481.998169   | 488.820590   |
| 3 | 1961-04-01 | 478.244507 | 448.726837   | 455.428375   | 465.570038   | 469.879114   | 478.244507   | 486.609900   | 490.918976   | 501.060638   | 507.762177   |
| 4 | 1961-05-01 | 505.646484 | 478.409872   | 493.154315   | 497.990848   | 499.138708   | 505.646484   | 512.154260   | 513.302121   | 518.138654   | 532.883096   |

TimeGPT will return forecasts in the format `TimeGPT-q-{int(100 * q)}` for each quantile `q`.
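The naming rule can be turned into a small helper for picking quantile columns out of the result. This is a minimal sketch: `quantile_columns` is a hypothetical helper, not part of the client library, and it only applies the format stated above.

```python
def quantile_columns(quantiles, model="TimeGPT"):
    """Build the expected column name for each quantile level."""
    return [f"{model}-q-{int(100 * q)}" for q in quantiles]


quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
cols = quantile_columns(quantiles)
print(cols)
```

The resulting list can then be used to index the forecast frame, for example `timegpt_quantile_fcst_df[cols]`.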

nixtla_client.plot(
    df, timegpt_quantile_fcst_df, 
    time_col='timestamp', target_col='value',
)


The choice of quantile (or quantiles) depends on your use case, and in particular on which direction of error is more costly. If underestimating the target is the expensive mistake, plan against an upper quantile such as the 80th or 90th percentile; if overestimating is the expensive mistake, plan against a lower quantile such as the 10th or 20th percentile. If the two costs are similar, a quantile close to the median, like the 50th percentile, balances caution and efficiency. For instance, if you’re managing inventory for a retail business during a big sale event, planning against a higher quantile of demand might help you avoid running out of stock, even if it means you might overstock a bit. But if you’re scheduling staff for a restaurant, you might go with a quantile closer to the middle so that you have enough staff on hand without significantly overstaffing. Ultimately, the choice comes down to the balance between risk and cost in your specific context, and quantile forecasts from TimeGPT let you tailor your strategy to that balance.
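One common way to make this trade-off explicit is the critical ratio from the classic newsvendor model: if each unit of under-forecast costs `c_u` and each unit of over-forecast costs `c_o`, the cost-minimizing quantile is `c_u / (c_u + c_o)`. The sketch below applies that ratio and snaps it to the nearest requested quantile. The cost figures and both helper names are hypothetical.

```python
def target_quantile(underage_cost, overage_cost):
    """Newsvendor critical ratio: the quantile that minimizes expected cost."""
    if underage_cost <= 0 or overage_cost <= 0:
        raise ValueError("costs must be positive")
    return underage_cost / (underage_cost + overage_cost)


def nearest_quantile(target, available):
    """Pick the requested quantile level closest to the target."""
    return min(available, key=lambda q: abs(q - target))


quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Hypothetical retail case: a lost sale costs 4 per unit, an unsold unit costs 1.
retail_q = nearest_quantile(target_quantile(4, 1), quantiles)

# Hypothetical staffing case: being short and being over cost about the same.
staffing_q = nearest_quantile(target_quantile(1, 1), quantiles)

print(f"retail:   plan against TimeGPT-q-{int(100 * retail_q)}")
print(f"staffing: plan against TimeGPT-q-{int(100 * staffing_q)}")
```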

Historical Forecast

You can also compute quantile forecasts for historical forecasts by adding the `add_history=True` parameter as follows:


timegpt_quantile_fcst_df = nixtla_client.forecast(
    df=df, h=12, 
    quantiles=quantiles, 
    time_col='timestamp', target_col='value',
    add_history=True,
)
timegpt_quantile_fcst_df.head()

INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Calling Historical Forecast Endpoint...
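|   | timestamp  | TimeGPT    | TimeGPT-q-10 | TimeGPT-q-20 | TimeGPT-q-30 | TimeGPT-q-40 | TimeGPT-q-50 | TimeGPT-q-60 | TimeGPT-q-70 | TimeGPT-q-80 | TimeGPT-q-90 |
|---|------------|------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
| 0 | 1951-01-01 | 135.483673 | 111.937767   | 120.020593   | 125.848879   | 130.828935   | 135.483673   | 140.138411   | 145.118467   | 150.946753   | 159.029579   |
| 1 | 1951-02-01 | 144.442413 | 120.896508   | 128.979334   | 134.807620   | 139.787675   | 144.442413   | 149.097151   | 154.077207   | 159.905493   | 167.988319   |
| 2 | 1951-03-01 | 157.191910 | 133.646004   | 141.728830   | 147.557116   | 152.537172   | 157.191910   | 161.846648   | 166.826704   | 172.654990   | 180.737815   |
| 3 | 1951-04-01 | 148.769379 | 125.223473   | 133.306299   | 139.134585   | 144.114641   | 148.769379   | 153.424117   | 158.404172   | 164.232458   | 172.315284   |
| 4 | 1951-05-01 | 140.472946 | 116.927041   | 125.009866   | 130.838152   | 135.818208   | 140.472946   | 145.127684   | 150.107740   | 155.936026   | 164.018852   |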

nixtla_client.plot(
    df, timegpt_quantile_fcst_df, 
    time_col='timestamp', target_col='value',
)
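Because historical forecasts overlap with observed values, they can be used to check how often the actuals fall inside a quantile band. This is a minimal sketch, assuming the actual values and two quantile columns have already been aligned by timestamp and converted to plain lists. The `empirical_coverage` helper and the toy numbers are hypothetical and are not TimeGPT output.

```python
def empirical_coverage(actuals, lower, upper):
    """Fraction of actual values that fall within [lower, upper]."""
    if not (len(actuals) == len(lower) == len(upper)):
        raise ValueError("inputs must have the same length")
    if not actuals:
        raise ValueError("inputs must not be empty")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lower, upper))
    return hits / len(actuals)


# Toy values for illustration only.
actuals = [100, 105, 98, 120, 110]
lower = [95, 99, 99, 104, 101]     # stands in for a lower quantile column
upper = [108, 112, 110, 118, 119]  # stands in for an upper quantile column

coverage = empirical_coverage(actuals, lower, upper)
print(f"empirical coverage: {coverage:.0%}")
```

A band between the 10th and 90th percentiles is nominally an 80% interval, so coverage far from 0.8 on real data would suggest that the quantiles are miscalibrated for that series.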

Cross Validation

The `quantiles` argument can also be passed to the `cross_validation` method, which lets you compare the performance of TimeGPT across different windows and different quantiles.


timegpt_cv_quantile_fcst_df = nixtla_client.cross_validation(
    df=df, 
    h=12, 
    n_windows=5,
    quantiles=quantiles, 
    time_col='timestamp', 
    target_col='value',
)
timegpt_cv_quantile_fcst_df.head()

INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: MS
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Validating inputs...

from IPython.display import display


cutoffs = timegpt_cv_quantile_fcst_df['cutoff'].unique()
for cutoff in cutoffs:
    fig = nixtla_client.plot(
        df.tail(100), 
        timegpt_cv_quantile_fcst_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'value']),
        time_col='timestamp', 
        target_col='value'
    )
    display(fig)
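A common metric for comparing quantile forecasts across windows is the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically according to the quantile level. The sketch below implements it with plain lists. The `pinball_loss` helper and the toy numbers are hypothetical; in practice the inputs would come from the `value` column and one `TimeGPT-q-*` column of a single cross-validation window.

```python
def pinball_loss(actuals, forecasts, q):
    """Mean pinball loss of quantile forecasts at level q (0 < q < 1)."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(actuals, forecasts):
        diff = y - y_hat
        # Under-prediction (diff > 0) is weighted by q,
        # over-prediction (diff < 0) by 1 - q.
        total += max(q * diff, (q - 1) * diff)
    return total / len(actuals)


# Toy values for illustration only.
actuals = [100, 110, 120]
forecasts = [90, 115, 120]

loss_q90 = pinball_loss(actuals, forecasts, 0.9)
loss_q10 = pinball_loss(actuals, forecasts, 0.1)
print(f"pinball loss at q=0.9: {loss_q90:.3f}")
print(f"pinball loss at q=0.1: {loss_q10:.3f}")
```

Lower values are better, and the loss is only comparable between forecasts evaluated at the same quantile level.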