Python, PySpark, Spark, TensorFlow, Scikit-Learn, PyTorch
Feature Store Notebooks
This page contains example notebooks for Feature Engineering, Feature Ingestion, Feature Selection/Joining, Training Dataset Creation, Model Training, and Model Serving on Hopsworks.
The Feature Store API: hsfs
Try now on Colab
Create your first Features
Training Data
Online Feature Store
Online Feature Store end-to-end
- 0. Create a Kafka topic
- 1. Create empty feature groups for Online Feature Store
- 2. Generate credit card transactions data and send it to a Kafka topic
- 3. Windowed aggregations using Spark streaming and ingestion to the Online Feature Store
- 4. Create training dataset from Online Feature Store enabled feature groups
- 5. Online Feature Serving
Online transformations
Data Validation
Feature Store Tags
AWS and the Feature Store
SageMaker
Databricks and the Feature Store
Databricks Ingestion with PySpark
Delta Lake and the Feature Store
Azure and the Feature Store
Azure Machine Learning with the Feature Store
General Feature Store
What-if-Tool Feature Analytics (Jupyter only, not JupyterLab)
Snowflake ingestion into the Feature Store
Snowflake ingestion to the Feature Store
Model Training on Hopsworks
Hyperparameter Tuning
Model Serving on Hopsworks
Inferencing (Model Serving)
Model Serving with KFServing
- Model Serving with KFServing and TensorFlow - MNIST Classification
- Model Serving with KFServing, TensorFlow and Transformers - MNIST Classification
- Model Serving with KFServing and Scikit-learn - Iris Flower Classification
- Model Serving with KFServing, Scikit-learn and Transformers - Iris Flower Classification
- Model Serving with KFServing, Scikit-learn and Predictors - Iris Flower Classification
- Model Serving with KFServing, Scikit-learn, Predictors and Transformers - Iris Flower Classification
Model Serving with Docker or Kubernetes
Maggy - Distributed Transparent ML
ML Experiments on Hopsworks using TensorFlow
Maggy Distributed Training on Hopsworks using PyTorch
- Creating a Petastorm Dataset from ImageNet
- Creating a Petastorm dataset from MNIST example
- Maggy Distributed Training with PyTorch and DeepSpeed ZeRO example
- Maggy Distributed Training with PyTorch's sharded optimizer example
- Maggy Mixed precision training with PyTorch example
- Maggy PyTorch HParam Tuning Example
- Maggy distributed training ResNet-50 on ImageNet (Petastorm)
- Maggy distributed training ResNet-50 on ImageNet using PyTorch
- Maggy precision training using PyTorch
ML experiments on local machine
End-to-end Examples
Use cases
Credit Card Fraud Detection
- 0. Create Kafka topic for financial transactions
- 1. Create empty feature groups in the Online Feature Store
- 2. Generate credit card transactions data
- 3. Windowed aggregations using Spark streaming and ingestion to the Online Feature Store
- 4. Create training dataset from Online Feature Store enabled feature groups
- 5. Train a credit card fraud detector model
- 6. Serve the autoencoder and detect anomalous credit card activity