Snowflake: Scala External Tables (On-Demand FGs)

Snowflake (On-Demand) Feature Group

This tutorial notebook will show how you can define an external feature group for a table in Snowflake.

In this notebook we assume that you already have snowflake account and did the getting started with snowflake tutorial.

import com.logicalclocks.hsfs._
Starting Spark application
IDYARN Application IDKindStateSpark UIDriver log
0application_1614610677629_0001sparkidleLinkLink
SparkSession available as 'spark'.
import com.logicalclocks.hsfs._
val connection = HopsworksConnection.builder().build();
val fs = connection.getFeatureStore();
connection: com.logicalclocks.hsfs.HopsworksConnection = com.logicalclocks.hsfs.HopsworksConnection@295412ba
fs: com.logicalclocks.hsfs.FeatureStore = FeatureStore{id=67, name='project1_featurestore', projectId=119, featureGroupApi=com.logicalclocks.hsfs.metadata.FeatureGroupApi@4851643}
val snowflakeConn = fs.getSnowflakeConnector("sfconnector")
var sfOptions = snowflakeConn.sparkOptions
sfOptions.put("query", "select * from TELCO")
val df = spark.read.format("net.snowflake.spark.snowflake").options(sfOptions).load()
df.show(10)
df: org.apache.spark.sql.DataFrame = [CUSTOMER_ID: string, GENDER: string ... 19 more fields]
+-----------+------+--------------+-------+----------+------+-------------+----------------+----------------+---------------+-------------+-----------------+------------+------------+----------------+--------------+-----------------+--------------------+---------------+-------------+-----+
|CUSTOMER_ID|GENDER|SENIOR_CITIZEN|PARTNER|DEPENDENTS|TENURE|PHONE_SERVICE|  MULTIPLE_LINES|INTERNET_SERVICE|ONLINE_SECURITY|ONLINE_BACKUP|DEVICE_PROTECTION|TECH_SUPPORT|STREAMING_TV|STREAMING_MOVIES|      CONTRACT|PAPERLESS_BILLING|      PAYMENT_METHOD|MONTHLY_CHARGES|TOTAL_CHARGES|CHURN|
+-----------+------+--------------+-------+----------+------+-------------+----------------+----------------+---------------+-------------+-----------------+------------+------------+----------------+--------------+-----------------+--------------------+---------------+-------------+-----+
| 7590-VHVEG|Female|         false|    Yes|        No|     1|           No|No phone service|             DSL|             No|          Yes|               No|          No|          No|              No|Month-to-month|              Yes|    Electronic check|          29.85|        29.85|   No|
| 5575-GNVDE|  Male|         false|     No|        No|    34|          Yes|              No|             DSL|            Yes|           No|              Yes|          No|          No|              No|      One year|               No|        Mailed check|          56.95|       1889.5|   No|
| 3668-QPYBK|  Male|         false|     No|        No|     2|          Yes|              No|             DSL|            Yes|          Yes|               No|          No|          No|              No|Month-to-month|              Yes|        Mailed check|          53.85|       108.15|  Yes|
| 7795-CFOCW|  Male|         false|     No|        No|    45|           No|No phone service|             DSL|            Yes|           No|              Yes|         Yes|          No|              No|      One year|               No|Bank transfer (au...|           42.3|      1840.75|   No|
| 9237-HQITU|Female|         false|     No|        No|     2|          Yes|              No|     Fiber optic|             No|           No|               No|          No|          No|              No|Month-to-month|              Yes|    Electronic check|           70.7|       151.65|  Yes|
| 9305-CDSKC|Female|         false|     No|        No|     8|          Yes|             Yes|     Fiber optic|             No|           No|              Yes|          No|         Yes|             Yes|Month-to-month|              Yes|    Electronic check|          99.65|        820.5|  Yes|
| 1452-KIOVK|  Male|         false|     No|       Yes|    22|          Yes|             Yes|     Fiber optic|             No|          Yes|               No|          No|         Yes|              No|Month-to-month|              Yes|Credit card (auto...|           89.1|       1949.4|   No|
| 6713-OKOMC|Female|         false|     No|        No|    10|           No|No phone service|             DSL|            Yes|           No|               No|          No|          No|              No|Month-to-month|               No|        Mailed check|          29.75|        301.9|   No|
| 7892-POOKP|Female|         false|    Yes|        No|    28|          Yes|             Yes|     Fiber optic|             No|           No|              Yes|         Yes|         Yes|             Yes|Month-to-month|              Yes|    Electronic check|          104.8|      3046.05|  Yes|
| 6388-TABGU|  Male|         false|     No|       Yes|    62|          Yes|              No|             DSL|            Yes|          Yes|               No|          No|          No|              No|      One year|               No|Bank transfer (au...|          56.15|      3487.95|   No|
+-----------+------+--------------+-------+----------+------+-------------+----------------+----------------+---------------+-------------+-----------------+------------+------------+----------------+--------------+-----------------+--------------------+---------------+-------------+-----+
only showing top 10 rows

External (On-Demand) Feature Group

An on-Demand Feature Group stores the metadata for features in the feature store. The actual feature data is read from the external source using a storage connector. The next cell shows how to create an on-demand feature group using the snowflake connector.

val telcoOnDmd = (fs.createOnDemandFeatureGroup()
                    .name("telco_snowflake_scala")
                    .version(1)
                    .query("select * from CUSTOMER_CHURN")
                    .description("On-demand feature group for telecom customer data")
                    .storageConnector(snowflakeConn)
                    .statisticsConfig(new StatisticsConfig(true, true, true, true))
                    .build())
telcoOnDmd: com.logicalclocks.hsfs.OnDemandFeatureGroup = com.logicalclocks.hsfs.OnDemandFeatureGroup@55afc64e
telcoOnDmd.save()
import java.util.Arrays;
telcoOnDmd.select(Arrays.asList("customer_id", "internet_service", "phone_service", "total_charges", "churn")).show(5)
import java.util.Arrays
+-----------+----------------+-------------+-------------+-----+
|customer_id|internet_service|phone_service|total_charges|churn|
+-----------+----------------+-------------+-------------+-----+
| 7590-VHVEG|             DSL|           No|        29.85|   No|
| 5575-GNVDE|             DSL|          Yes|       1889.5|   No|
| 3668-QPYBK|             DSL|          Yes|       108.15|  Yes|
| 7795-CFOCW|             DSL|           No|      1840.75|   No|
| 9237-HQITU|     Fiber optic|          Yes|       151.65|  Yes|
+-----------+----------------+-------------+-------------+-----+
only showing top 5 rows