Overview
In this article, we’ll follow the Feast tutorial and run it on a Mac.
Prerequisites
Refer to Setting up a Python development environment on Mac with UV to prepare your environment. Once you’ve set up UV, install Feast in that environment.
Installing Feast and Launching the UI
uv pip install feast==0.40.1
Note
As of 2025/01/13, there is a known bug in Feast that causes the UI to fail on the latest version (Issue), which is why an older release is pinned above.
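If you want to confirm that the pinned version is the one active in your environment, here is a quick check that uses only the Python standard library:

# Prints the Feast version installed in the active environment.
from importlib.metadata import version
print(version("feast"))  # expect 0.40.1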
Initialize a Feature Repository:
feast init my_feature_repo
Creating a new Feast repository in /Users/kouhei/Program/ML/feast/my_feature_repo.
You’ll see a new repository like this:
tree my_feature_repo
.
└── my_feature_repo
    ├── README.md
    ├── __init__.py
    └── feature_repo
        ├── __init__.py
        ├── data
        │   └── driver_stats.parquet
        ├── example_repo.py
        ├── feature_store.yaml
        └── test_workflow.py
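Of these files, example_repo.py holds the feature definitions that feast apply registers, and feature_store.yaml holds the store configuration. As a rough orientation, here is a simplified sketch of the kind of definitions example_repo.py contains (the names follow the generated example; the actual file is longer and may differ between Feast versions):

# Simplified sketch of feature definitions like those in example_repo.py:
# an entity, a file source over driver_stats.parquet, and a feature view.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64

driver = Entity(name="driver", join_keys=["driver_id"])

driver_stats_source = FileSource(
    name="driver_hourly_stats_source",
    path="data/driver_stats.parquet",
    timestamp_field="event_timestamp",
)

driver_hourly_stats = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="acc_rate", dtype=Float32),
        Field(name="avg_daily_trips", dtype=Int64),
    ],
    source=driver_stats_source,
)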
Apply the repository configuration:
cd my_feature_repo/feature_repo
feast apply
Then launch the Feast UI:
feast ui
Access http://0.0.0.0:8888/p/my_feature_repo to see the interface.
Manipulating Data in Feast
1. Creating a Training Dataset
Starting from step 5 in the tutorial, you'll use Jupyter Notebook. Install it in the same environment, for example:
uv pip install notebook
Then launch the notebook from the feature_repo directory:
jupyter notebook
In a Jupyter Notebook, run the following code to create a dataset for training:
from feast import FeatureStore
import pandas as pd
from datetime import datetime

entity_df = pd.DataFrame.from_dict({
    "driver_id": [1001, 1002, 1003, 1004],
    "event_timestamp": [
        datetime(2021, 4, 12, 10, 59, 42),
        datetime(2021, 4, 12, 8, 12, 10),
        datetime(2021, 4, 12, 16, 40, 26),
        datetime(2021, 4, 12, 15, 1, 12)
    ]
})

store = FeatureStore(repo_path=".")

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        'driver_hourly_stats:conv_rate',
        'driver_hourly_stats:acc_rate',
        'driver_hourly_stats:avg_daily_trips'
    ],
).to_df()

print(training_df.head())

# Train model
# model = ml.fit(training_df)
Running this cell prints the first few rows of the joined training DataFrame.
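The # Train model placeholder at the end of the cell is left open in the tutorial. As a minimal sketch of what it could look like, assuming scikit-learn is installed and, purely for illustration, treating conv_rate as the target:

# Minimal sketch of the "# Train model" placeholder above.
# Assumes scikit-learn; using conv_rate as the target is purely illustrative.
from sklearn.linear_model import LinearRegression

X = training_df[["acc_rate", "avg_daily_trips"]]
y = training_df["conv_rate"]

model = LinearRegression().fit(X, y)
print(model.coef_)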
2. Materializing the Online Store
To populate the Online Store, run:
feast materialize 1970-01-01T00:00:00Z 2025-01-04T01:24:24Z
Note
The tutorial's suggested command, feast materialize-incremental $CURRENT_TIME, may not work as-is, so we specify a broad time range here instead.
01/04/2025 10:28:40 AM root WARNING: _list_feature_views will make breaking changes. ...
Materializing 2 feature views from 1970-01-01 09:00:00+09:00 to 2025-01-04 10:24:24+09:00 into the sqlite online store.

driver_hourly_stats_fresh:
  0%|                                                               | 0/5 ...
100%|███████████████████████████████████████████████████████████████| 5/5 ...
driver_hourly_stats:
100%|███████████████████████████████████████████████████████████████| 5/5 ...
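If you prefer to stay inside the notebook, the same window can be materialized through the Python SDK instead of the CLI; FeatureStore.materialize takes explicit start and end datetimes:

# Same materialization window as the CLI command above, via the Python SDK.
from datetime import datetime, timezone
from feast import FeatureStore

store = FeatureStore(repo_path=".")
store.materialize(
    start_date=datetime(1970, 1, 1, tzinfo=timezone.utc),
    end_date=datetime(2025, 1, 4, 1, 24, 24, tzinfo=timezone.utc),
)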
3. Retrieving Data from the Online Store
Use Jupyter Notebook to fetch data from the Online Store:
from pprint import pprint
from feast import FeatureStore

store = FeatureStore(repo_path=".")

feature_vector = store.get_online_features(
    features=[
        'driver_hourly_stats:conv_rate',
        'driver_hourly_stats:acc_rate',
        'driver_hourly_stats:avg_daily_trips'
    ],
    entity_rows=[{"driver_id": 1001}]
).to_dict()

pprint(feature_vector)

# Make prediction
# model.predict(feature_vector)
Example output:
{'acc_rate': [0.5004482269287109],
 'avg_daily_trips': [691],
 'conv_rate': [0.3067885637283325],
 'driver_id': [1001]}
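Similarly, the # Make prediction placeholder can be filled in; a minimal sketch, assuming the illustrative LinearRegression model from the training step:

# Minimal sketch of the "# Make prediction" placeholder above.
# Assumes `model` from the illustrative training sketch earlier.
import pandas as pd

online_df = pd.DataFrame(feature_vector)              # one row per entity_row
X_online = online_df[["acc_rate", "avg_daily_trips"]]
print(model.predict(X_online))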
We’ve successfully materialized data in the Online Store and fetched it using Feast.
Summary
We followed the Feast tutorial to prepare training data, materialize that data into the Online Store, and retrieve it. By managing both training and inference datasets in Feast, you can avoid training-serving skew, enhancing the consistency of your ML workflows.
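To make the skew point concrete, here is a minimal sketch reusing store and entity_df from the notebook cells above: one shared feature list drives both the offline (training) and online (serving) retrieval paths, so the two cannot silently drift apart.

# One shared feature list for both retrieval paths.
# Assumes `store` and `entity_df` from the earlier notebook cells.
FEATURES = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
]

training_df = store.get_historical_features(entity_df=entity_df, features=FEATURES).to_df()
online_row = store.get_online_features(features=FEATURES, entity_rows=[{"driver_id": 1001}]).to_dict()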