
Feature Store with Feast

Kaptain comes preconfigured with Feast, a feature store that bridges the gap between models and data and so simplifies the development of ML/AI models and features. Feast is compatible with all Kaptain clusters, regardless of license type, cluster type (managed, attached, Essential), or environment (networked, air-gapped, on-prem).

In Kaptain, Feast is available in your Jupyter notebooks. Among other things, it allows you to:

  • Consolidate and feed data from different data sources into your notebooks and production environments

  • Ensure consistency between training and serving data 

  • Transform data and share features across teams

When using Feast as a data store, you can feed data from different source types to your models for training or serving.

You configure Feast as a data store directly in your notebook, and the configuration differs for each type of data source. For online or streaming data, set up and customize Redis as a key-value store. For offline or batch data, set up a data warehouse or object store such as BigQuery, an S3 bucket, or GCS. For the registry, you can use a scalable database-backed registry such as MySQL, which is available with Kaptain by default.

Set up Feast for production use in Kaptain

Prerequisites

  • You have deployed Kaptain into a cluster.

  • Depending on the type of data storage, ensure that the data is accessible from your notebook, that the notebook has read access to it, and that the environment variables are set correctly in your notebook.

  • You have defined entities for each one of your data sources. Refer to the Feast tutorial for more information on how to do this. 

  • You have obtained MySQL and Redis credentials from the administrator.
    OR
    The administrator has pre-configured the Secrets described in the Inject the Feast configuration into a notebook server section.
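
To check the environment-variable prerequisite from inside a notebook, a small helper like the following can be used. It is a minimal sketch: the function name missing_feast_env is hypothetical, and the variable names are assumed to match the keys of the feast-conf Secret described later on this page.

```python
import os

# Variable names assumed to match the keys of the feast-conf Secret on this page.
REQUIRED_VARS = (
    "MYSQL_HOST", "MYSQL_PORT", "MYSQL_USER", "MYSQL_PASSWORD",
    "REDIS_HOST", "REDIS_PORT", "REDIS_PASSWORD",
)


def missing_feast_env(environ=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_feast_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All Feast connection variables are set.")
```

If the list is not empty, the notebook was most likely started without the injected configuration, or the Secret is missing keys.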

Set up a database-backed registry 

Kaptain provides a consolidated MySQL cluster with primary-primary replication that stores the Pipelines execution history and artifacts, as well as Katib experiment results. It can also be used as a scalable registry for Feast projects. Kaptain notebook images come with the libraries that Feast needs to integrate with MySQL registries pre-installed.

To enable the SQL registry for your Feast project, set the following configuration in the feature_store.yaml file:

CODE
project: my_project
registry:
  registry_type: sql
  path: mysql+pymysql://<user>:<password>@<mysql_host>:<mysql_port>/feast
provider: local
online_store:
  type: sqlite
  path: data/online_store.db
entity_key_serialization_version: 2
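
The registry path above is a SQLAlchemy-style database URL, so special characters in the user name or password (for example "@" or "/") need to be percent-encoded before they are placed in it. The following is a minimal sketch of building such a path; the helper name mysql_registry_path and the example host and credentials are hypothetical.

```python
from urllib.parse import quote_plus


def mysql_registry_path(user, password, host, port=3306, database="feast"):
    """Build a SQLAlchemy-style URL for a MySQL-backed Feast registry."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # are not mistaken for URL delimiters.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example values only; use the credentials issued by your administrator.
print(mysql_registry_path("feast_user", "p@ss/word", "mysql.example", 3306))
# prints: mysql+pymysql://feast_user:p%40ss%2Fword@mysql.example:3306/feast
```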

Use Redis as the online store

Kaptain includes a fault-tolerant, distributed, highly available Redis cluster that can be used as an online store in Feast. The Redis cluster configuration is described on the Configuration page.

For security reasons, D2iQ recommends changing the default password of the Redis cluster during the installation of Kaptain.

To configure a Feast project to use Redis as an online store, set the following configuration in the feature_store.yaml file:

CODE
project: my_project
registry: data/registry.db
provider: local
online_store:
  type: redis
  redis_type: redis_cluster
  connection_string: "<redis_host>:<redis_port>,password=<password>"
entity_key_serialization_version: 2
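
The connection string above is a comma-separated list of host:port entries followed by key=value options. As an illustration, it could be assembled as follows. This is a sketch only: the helper name redis_connection_string is hypothetical, and the comma check reflects the assumption that a comma inside the password would be ambiguous in a comma-separated string.

```python
def redis_connection_string(nodes, password=None, ssl=False):
    """Join (host, port) pairs and options into a comma-separated connection string."""
    if not nodes:
        raise ValueError("at least one (host, port) pair is required")
    parts = [f"{host}:{port}" for host, port in nodes]
    if ssl:
        parts.append("ssl=true")
    if password:
        # The string is comma-separated, so a comma in the password would be ambiguous.
        if "," in password:
            raise ValueError("password must not contain a comma")
        parts.append(f"password={password}")
    return ",".join(parts)


# Example values only.
print(redis_connection_string([("redis.kubeflow", 6379)], password="secret"))
# prints: redis.kubeflow:6379,password=secret
```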

Inject the Feast configuration into a notebook server 

To connect Feast to your environment as a Feature Store, create Secrets that contain the Redis and MySQL connection details and distribute them to the respective user namespaces and Kubeflow profiles. To do so, you need admin rights or knowledge of the credentials stored in the Secret.

For security and traceability reasons, it is best practice for the Kaptain administrator to be the only entity that generates user credentials for MySQL databases. For Redis, the administrator should be the only person who distributes the shared password to users.
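
Kubernetes expects values under a Secret's data field to be base64-encoded, whereas values under stringData are given as plain text. When preparing credentials for the data field, an administrator could encode them with a few lines of Python. This is a minimal sketch; the helper name encode_secret_values and the example user name are hypothetical.

```python
import base64


def encode_secret_values(values):
    """Base64-encode each value as required by the `data` field of a Secret."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


# Example value only; never print real passwords in a shared notebook.
encoded = encode_secret_values({"MYSQL_USER": "feast_user"})
print(encoded["MYSQL_USER"])
```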

  1. Create a Secret containing the configuration properties for the MySQL and Redis clusters, and update the credentials, if necessary:

    CODE
    export USER_NAMESPACE=<user namespace>
    cat << EOF | kubectl apply -n ${USER_NAMESPACE} -f -
    apiVersion: v1
    kind: Secret
    metadata:
     name: feast-conf
    type: Opaque
    stringData:
     MYSQL_HOST: kaptain-mysql-store-haproxy.kubeflow
     MYSQL_PORT: "3306"
     REDIS_HOST: redis.kubeflow
     REDIS_PORT: "6379"
     FEAST_USAGE: "False"
    data:
     # Values under "data" must be base64-encoded.
     MYSQL_USER: <base64-encoded MySQL user name>
     MYSQL_PASSWORD: <base64-encoded MySQL password>
     REDIS_PASSWORD: <base64-encoded Redis password>
    EOF
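  2. Create a Feast repository configuration file and store it in a ConfigMap in the same namespace. The ${...} placeholders refer to the environment variables provided by the Secret from the previous step: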

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
     name: feature-store
    data:
     feature_store.yaml: |
       project: wine
       registry:
         registry_type: sql
         path: mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@${MYSQL_HOST}:${MYSQL_PORT}/feast
       provider: local
       online_store:
         type: redis
         redis_type: redis_cluster
         connection_string: "${REDIS_HOST}:${REDIS_PORT},password=${REDIS_PASSWORD}"
       entity_key_serialization_version: 2
  3. Create a PodDefault referencing the previously created Secret:

    CODE
    cat << EOF | kubectl apply -n ${USER_NAMESPACE} -f -
    apiVersion: "kubeflow.org/v1alpha1"
    kind: PodDefault
    metadata:
     name: feast-conf
    spec:
     selector:
       matchLabels:
         feast-conf: "true"
     desc: "Inject Feast configuration"
     envFrom:
     - secretRef:
         name: feast-conf
     volumeMounts:
     - name: feature-store
       mountPath: <feast-project-dir, e.g. /home/kubeflow/wine>
     volumes:
     - name: feature-store
       configMap:
         name: feature-store
    EOF
  4. When you create a new notebook server, make the Feast configuration properties available as environment variables. To do so, under Configuration, select Inject Feast configuration in the Jupyter notebook UI.

The Feature Store is configured and ready to use in the notebook.
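
To confirm that the placeholders in feature_store.yaml resolve as expected, you can render the file against the notebook's environment before using it. The following is a minimal sketch that assumes the ${...} placeholders map one-to-one to environment variables; the helper name preview_config is hypothetical, and passwords are masked so the preview is safe to display.

```python
import os
from string import Template

# Variables whose values should never be shown in a preview.
SECRET_VARS = {"MYSQL_PASSWORD", "REDIS_PASSWORD"}


def preview_config(template_text, environ=None):
    """Substitute ${VAR} placeholders, masking secrets; raise KeyError if one is unset."""
    env = dict(os.environ if environ is None else environ)
    for name in SECRET_VARS:
        if name in env:
            env[name] = "***"
    return Template(template_text).substitute(env)


# Example with inline text and example values instead of the mounted file.
example = 'connection_string: "${REDIS_HOST}:${REDIS_PORT},password=${REDIS_PASSWORD}"'
print(preview_config(example, {
    "REDIS_HOST": "redis.kubeflow",
    "REDIS_PORT": "6379",
    "REDIS_PASSWORD": "s3cret",
}))
# prints: connection_string: "redis.kubeflow:6379,password=***"
```

A KeyError names the first placeholder that has no matching variable, which usually points to a missing key in the Secret or a notebook started without the injected configuration.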

More on Feast

For more information on how to use Feast, refer to the Feast documentation.

For an example of how to use Feast with Kaptain in your ML/AI environment, refer to the Feast tutorial.
