Ray

Run TimeGPT in a distributed manner on top of Ray

Ray is an open-source unified compute framework for scaling Python workloads. This guide explains how to use TimeGPT on top of Ray.

Outline:

Installation
Load Data
Initialize Ray
Use TimeGPT on Ray
Shutdown Ray

1. Installation

Install Ray through Fugue. Fugue provides an easy-to-use interface for distributed computing that lets users execute Python code on top of several distributed computing frameworks, including Ray.

Note

You can install fugue with pip:
pip install fugue[ray]

If executing on a distributed Ray cluster, ensure that the nixtla library is installed across all the workers.
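One way to get the library onto every worker is Ray's runtime_env option, which can ask the cluster to pip-install packages for a job. The following is a minimal sketch, not the only approach: the helper names and the package list are assumptions made for illustration, and clusters with pre-built images may not need it at all.

```python
# Hypothetical package list: adjust to whatever your workers actually need.
WORKER_PACKAGES = ["nixtla", "fugue[ray]"]


def build_runtime_env(packages=WORKER_PACKAGES):
    """Return a Ray runtime_env dict that asks workers to pip-install `packages`."""
    return {"pip": list(packages)}


def init_ray_with_packages(address="auto"):
    """Connect to an existing cluster and request the packages on its workers.

    `address="auto"` assumes a cluster is already running and discoverable.
    """
    import ray  # imported lazily so build_runtime_env works without Ray installed

    return ray.init(address=address, runtime_env=build_runtime_env())


print(build_runtime_env())
```

Calling init_ray_with_packages() only makes sense against a real cluster; on a single local machine, installing the packages in the active environment is enough.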

2. Load Data

You can load your data as a pandas DataFrame. In this tutorial, we will use a dataset that contains hourly electricity prices from different markets.

import pandas as pd

df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv',
    parse_dates=['ds'],
)
df.head()

	unique_id	ds	y
0	BE	2016-10-22 00:00:00	70.00
1	BE	2016-10-22 01:00:00	37.10
2	BE	2016-10-22 02:00:00	37.10
3	BE	2016-10-22 03:00:00	44.75
4	BE	2016-10-22 04:00:00	37.10
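The preview shows the long format used throughout this guide: one row per series and timestamp, with the columns unique_id, ds, and y. Before distributing your own data, a quick column check can catch naming mismatches early. The helper below is a hypothetical sketch (missing_columns is not part of any library), and it only inspects column names, not dtypes or values.

```python
# Column names taken from the preview above.
REQUIRED_COLUMNS = ("unique_id", "ds", "y")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


print(missing_columns(["unique_id", "ds", "y"]))  # prints []
print(missing_columns(["ds", "value"]))           # prints ['unique_id', 'y']
```

With a pandas DataFrame, this could be called as missing_columns(df.columns).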

3. Initialize Ray

Initialize Ray and convert the pandas DataFrame to a Ray DataFrame (a Ray Dataset created with ray.data.from_pandas).

import ray
from ray.cluster_utils import Cluster

ray_cluster = Cluster(
    initialize_head=True,
    head_node_args={"num_cpus": 2}
)
ray.init(address=ray_cluster.address, ignore_reinit_error=True)

2024-05-10 11:09:17,240 WARNING cluster_utils.py:157 -- Ray cluster mode is currently experimental and untested on Windows. If you are using it and running into issues please file a report at https://github.com/ray-project/ray/issues.
2024-05-10 11:09:19,076 INFO worker.py:1564 -- Connecting to existing Ray cluster at address: 127.0.0.1:63694...
2024-05-10 11:09:19,092 INFO worker.py:1740 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265

Python version:	3.10.14
Ray version:	2.20.0
Dashboard:	http://127.0.0.1:8265

ray_df = ray.data.from_pandas(df)
ray_df

MaterializedDataset(
   num_blocks=1,
   num_rows=8400,
   schema={unique_id: object, ds: object, y: float64}
)

4. Use TimeGPT on Ray

Using TimeGPT on top of Ray is almost identical to the non-distributed case. The only difference is that you need to use a Ray DataFrame.

First, instantiate the NixtlaClient class.

from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
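Because the client falls back to the NIXTLA_API_KEY environment variable, the key does not have to appear in the script. The sketch below shows one way to resolve the key and fail early when none is available. The resolve_api_key helper is hypothetical and not part of the nixtla package.

```python
import os


def resolve_api_key(explicit_key=None, env_var="NIXTLA_API_KEY"):
    """Return the explicit key if given, otherwise the value of `env_var`.

    Raises RuntimeError when neither source provides a non-empty key.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key found; set {env_var} or pass one explicitly.")
    return key
```

The result could then be passed along as NixtlaClient(api_key=resolve_api_key()), which keeps the secret out of notebooks and version control.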

👍
Use an Azure AI endpoint
To use an Azure AI endpoint, set the base_url argument:
nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")

Then use any method from the NixtlaClient class, such as forecast or cross_validation.

%%capture
fcst_df = nixtla_client.forecast(ray_df, h=12)

📘
Available models in Azure AI
If you are using an Azure AI endpoint, please be sure to set model="azureai":
nixtla_client.forecast(..., model="azureai")
For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.
By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.

To inspect the result, use the to_pandas method to convert the Ray output to a pandas DataFrame.

fcst_df.to_pandas().tail()

	unique_id	ds	TimeGPT
55	NP	2018-12-24 07:00:00	55.387066
56	NP	2018-12-24 08:00:00	56.115517
57	NP	2018-12-24 09:00:00	56.090714
58	NP	2018-12-24 10:00:00	55.813717
59	NP	2018-12-24 11:00:00	55.528519

%%capture
cv_df = nixtla_client.cross_validation(ray_df, h=12, freq='H', n_windows=5, step_size=2)

cv_df.to_pandas().tail()

	unique_id	ds	cutoff	TimeGPT
295	NP	2018-12-23 19:00:00	2018-12-23 11:00:00	53.632019
296	NP	2018-12-23 20:00:00	2018-12-23 11:00:00	52.512775
297	NP	2018-12-23 21:00:00	2018-12-23 11:00:00	51.894035
298	NP	2018-12-23 22:00:00	2018-12-23 11:00:00	51.06572
299	NP	2018-12-23 23:00:00	2018-12-23 11:00:00	50.32592
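To see how h=12, n_windows=5, and step_size=2 relate to the cutoff column, the window arithmetic can be worked out by hand. This is a sketch that assumes the usual rolling-origin scheme (each window's cutoff sits h steps before its last forecast, and consecutive cutoffs are step_size steps apart). It illustrates the arithmetic only and is not the library's implementation. The last timestamp is taken from the tail of the output above.

```python
from datetime import datetime, timedelta


def rolling_cutoffs(last_timestamp, h, n_windows, step_size, freq=timedelta(hours=1)):
    """Return window cutoffs, oldest first, under a rolling-origin scheme."""
    return [
        last_timestamp - (h + i * step_size) * freq
        for i in reversed(range(n_windows))
    ]


last = datetime(2018, 12, 23, 23)  # last observed hour in the table above
cutoffs = rolling_cutoffs(last, h=12, n_windows=5, step_size=2)
for cutoff in cutoffs:
    print(cutoff.isoformat(sep=" "))
# prints 2018-12-23 03:00:00, 05:00:00, 07:00:00, 09:00:00, 11:00:00 (one per line)

rows_per_series = 5 * 12  # n_windows * h = 60 forecast rows per series
```

The final cutoff, 2018-12-23 11:00:00, matches the cutoff shown in the table, and 60 rows per series is consistent with the index reaching 299 when the data holds five series.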

You can also use exogenous variables with TimeGPT on top of Ray. To do this, please refer to the Exogenous Variables tutorial. Just keep in mind that you need to use a Ray DataFrame instead of a pandas DataFrame.

5. Shutdown Ray

When you are done, shut down the Ray session.

ray.shutdown()
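If a forecasting step raises an exception, a plain ray.shutdown() at the end of a script is never reached. One option is to wrap the session in a context manager so that shutdown always runs. The ray_session helper below is a hypothetical sketch. It takes the ray module as an argument, an illustrative design choice that lets it be exercised without a running cluster.

```python
from contextlib import contextmanager


@contextmanager
def ray_session(ray_module, **init_kwargs):
    """Call ray_module.init(**init_kwargs) and guarantee shutdown() on exit."""
    ray_module.init(**init_kwargs)
    try:
        yield ray_module
    finally:
        # Runs even if the body of the `with` block raises.
        ray_module.shutdown()


# Usage sketch (assumes Ray is installed):
# import ray
# with ray_session(ray, ignore_reinit_error=True):
#     ...  # build the Ray DataFrame and call nixtla_client.forecast here
```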
