`feast apply` does not register all features in the feature repository


Expected Behavior

When I sync the feature repository using `feast apply`, all metadata registered in the feature repository should match what I defined in my definition files.

Current Behavior

This is the Feast objects definition file:

from datetime import timedelta

from feast import BigQuerySource, FeatureView, Field
from feast.types import Float32, Int32, Int64, String

from marketplace_sandbox.config import PROJECT_ID
from marketplace_sandbox.examples.entities import item, portal

entities = [item, portal]

fields = [
    Field(
        name="listing_price_local_currency",
        dtype=Float32,
    ),
    Field(
        name="listing_currency",
        dtype=String,
    ),
    Field(
        name="item_color1_id",
        dtype=Int32,
    ),
    Field(
        name="item_color2_id",
        dtype=Int32,
    ),
    Field(
        name="first_visible_at",
        dtype=Int64,
    ),
    Field(
        name="promoted_until",
        dtype=Int64,
    ),
    Field(
        name="item_catalog_parent_id",
        dtype=Int64,
    ),
    Field(
        name="item_catalog_parent_id_1",
        dtype=Int64,
    ),
    Field(
        name="item_catalog_parent_id_2",
        dtype=Int64,
    ),
    Field(
        name="item_catalog_parent_id_3",
        dtype=Int64,
    ),
    Field(
        name="item_catalog_parent_id_4",
        dtype=Int64,
    ),
    Field(
        name="item_photo_count",
        dtype=Int32,
    ),
    Field(
        name="description_word_count",
        dtype=Int32,
    ),
    Field(
        name="item_id_7d_impressions_3600srt",
        dtype=Int32,
    ),
    Field(
        name="item_id_7d_clicks_3600srt",
        dtype=Int32,
    ),
    Field(
        name="item_id_7d_ctr_3600srt",
        dtype=Float32,
    ),
    Field(
        name="item_views_14d_hourly",
        dtype=Int32,
    ),
    Field(
        name="item_brand_id",
        dtype=Int32,
    ),
    Field(
        name="item_catalog_id",
        dtype=Int32,
    ),
    Field(
        name="item_country_id",
        dtype=Int32,
    ),
    Field(
        name="item_user_id",
        dtype=Int64,
    ),
    Field(
        name="item_language_id",
        dtype=Int32,
    ),
    Field(
        name="item_size_id",
        dtype=Int32,
    ),
    Field(
        name="item_status_id",
        dtype=Int32,
    ),
    Field(
        name="delay_international_visibility_by",
        dtype=Int32,
    ),
]

BQ_DATASET_NAME = "example_item_stats_per_portal"
BQ_TABLE_NAME = "v1"
BQ_TABLE_REFERENCE = f"{PROJECT_ID}.{BQ_DATASET_NAME}.{BQ_TABLE_NAME}"

batch_source = BigQuerySource(
    table=BQ_TABLE_REFERENCE,
    created_timestamp_column="created_timestamp",
    timestamp_field="event_timestamp",
    description=(
        "Example of BigQuery table that contains item features per portal."
    ),
    owner="VMIP",
    tags={},
)


example_item_stats_per_portal_v1_fv = FeatureView(
    name=f"{BQ_DATASET_NAME}_{BQ_TABLE_NAME}",
    entities=[entity.name for entity in entities],
    ttl=timedelta(weeks=52),
    online=True,
    batch_source=batch_source,
    schema=fields,
    description=(
        "Example of feature view that contains item features per portal."
    ),
    owner="VMIP",
    tags={},
)

After running `feast apply` and retrieving the feature view definition with `feast feature-views describe example_item_stats_per_portal_v1`, I get:

spec:
  name: example_item_stats_per_portal_v1
  entities:
  - example_item
  - example_portal
  features:
  - name: listing_price_local_currency
    valueType: FLOAT
  - name: listing_currency
    valueType: STRING
  - name: item_color1_id
    valueType: INT32
  - name: item_color2_id
    valueType: INT32
  - name: first_visible_at
    valueType: INT64
  - name: promoted_until
    valueType: INT64
  - name: item_catalog_parent_id
    valueType: INT64
  - name: item_catalog_parent_id_2
    valueType: INT64
  - name: item_catalog_parent_id_3
    valueType: INT64
  - name: item_catalog_parent_id_4
    valueType: INT64
  - name: item_photo_count
    valueType: INT32
  - name: description_word_count
    valueType: INT32
  - name: item_id_7d_impressions_3600srt
    valueType: INT32
  - name: item_id_7d_clicks_3600srt
    valueType: INT32
  - name: item_id_7d_ctr_3600srt
    valueType: FLOAT
  - name: item_views_14d_hourly
    valueType: INT32
  - name: delay_international_visibility_by
    valueType: INT32
  ttl: 31449600s
...

The definition file declares 25 features, but the feature repository returns only 17. The 8 missing features are: ['item_catalog_parent_id_1', 'item_brand_id', 'item_catalog_id', 'item_country_id', 'item_user_id', 'item_language_id', 'item_size_id', 'item_status_id']
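
The counts above can be double-checked with a plain set difference. This is only an illustrative sketch: both lists are copied by hand from the definition file and from the `feast feature-views describe` output above, and the variable names are made up for this example.

```python
# Feature names declared in the definition file (copied from the schema above).
defined = [
    "listing_price_local_currency", "listing_currency",
    "item_color1_id", "item_color2_id",
    "first_visible_at", "promoted_until",
    "item_catalog_parent_id", "item_catalog_parent_id_1",
    "item_catalog_parent_id_2", "item_catalog_parent_id_3",
    "item_catalog_parent_id_4",
    "item_photo_count", "description_word_count",
    "item_id_7d_impressions_3600srt", "item_id_7d_clicks_3600srt",
    "item_id_7d_ctr_3600srt", "item_views_14d_hourly",
    "item_brand_id", "item_catalog_id", "item_country_id",
    "item_user_id", "item_language_id", "item_size_id",
    "item_status_id", "delay_international_visibility_by",
]

# Feature names listed in the registered spec (copied from the output above).
registered = [
    "listing_price_local_currency", "listing_currency",
    "item_color1_id", "item_color2_id",
    "first_visible_at", "promoted_until",
    "item_catalog_parent_id",
    "item_catalog_parent_id_2", "item_catalog_parent_id_3",
    "item_catalog_parent_id_4",
    "item_photo_count", "description_word_count",
    "item_id_7d_impressions_3600srt", "item_id_7d_clicks_3600srt",
    "item_id_7d_ctr_3600srt", "item_views_14d_hourly",
    "delay_international_visibility_by",
]

registered_set = set(registered)
# Keep definition order so the result is easy to compare with the file.
missing = [name for name in defined if name not in registered_set]

print(len(defined), len(registered), len(missing))  # prints: 25 17 8
print(missing)
```

The result matches the list of missing features reported above, in the same order as the definition file.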

I see the same behavior (identical registered and missing features) with a similar feature view that is defined using Feature instead of Field.

Definition here:

from datetime import timedelta

from feast import BigQuerySource, Feature, FeatureView, ValueType

from marketplace_sandbox.config import BQ_POC_RERANKER_TABLE_REFERENCE
from marketplace_sandbox.entities import item, portal
from marketplace_sandbox.utils import generate_data_bq_query

FEATURE_VIEW_NAME = "item_stats_per_portal_v1"

entities = [item, portal]

features = [
    Feature(
        name="listing_price_local_currency",
        dtype=ValueType.FLOAT,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="listing_currency",
        dtype=ValueType.STRING,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_color1_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_color2_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="first_visible_at",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="promoted_until",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_parent_id",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_parent_id_1",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_parent_id_2",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_parent_id_3",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_parent_id_4",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_photo_count",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="description_word_count",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_id_7d_impressions_3600srt",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_id_7d_clicks_3600srt",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_id_7d_ctr_3600srt",
        dtype=ValueType.FLOAT,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_views_14d_hourly",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_brand_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_catalog_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_country_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_user_id",
        dtype=ValueType.INT64,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_language_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_size_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="item_status_id",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
    Feature(
        name="delay_international_visibility_by",
        dtype=ValueType.INT32,
        labels={"owner": "VMIP"},
    ),
]

batch_source = BigQuerySource(
    name=f"{BQ_POC_RERANKER_TABLE_REFERENCE}.{FEATURE_VIEW_NAME}",
    query=generate_data_bq_query(
        features=features,
        entities=entities,
        tables=[BQ_POC_RERANKER_TABLE_REFERENCE],
    ),
    event_timestamp_column="event_time",
)

item_stats_per_portal_fv = FeatureView(
    name=FEATURE_VIEW_NAME,
    entities=[entity.name for entity in entities],
    ttl=timedelta(weeks=52),
    features=features,
    batch_source=batch_source,
    owner="VMIP",
)

Steps to reproduce

  1. Create a definition file. It can be similar to the one given in Current Behavior.
  2. Run `feast apply`.
  3. Describe the feature view using `feast feature-views describe <FEATURE VIEW NAME>`.
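
The comparison in step 3 could also be scripted against the registry rather than read off the CLI output. The helper below is a minimal sketch, not part of the original report: `missing_features` is a hypothetical name, and it assumes `store` behaves like a `feast.FeatureStore`, that is, `get_feature_view(name)` returns an object whose `features` list holds items with a `name` attribute.

```python
from typing import Iterable, List


def missing_features(store, view_name: str, expected: Iterable[str]) -> List[str]:
    """Return the expected feature names that the registered view lacks.

    `store` is assumed to expose get_feature_view(name), as feast.FeatureStore
    does; the returned view is assumed to have a `features` list whose items
    carry a `name` attribute.
    """
    view = store.get_feature_view(view_name)
    registered = {feature.name for feature in view.features}
    # Preserve the order of `expected` so the output lines up with the file.
    return [name for name in expected if name not in registered]


# Hypothetical usage from inside the feature repository, after `feast apply`
# (`fields` is the schema list from the definition file):
#
#   from feast import FeatureStore
#   store = FeatureStore(repo_path=".")
#   print(missing_features(
#       store,
#       "example_item_stats_per_portal_v1",
#       [field.name for field in fields],
#   ))
```

An empty list would mean every declared feature was registered; with the behavior described in this issue, the list would be expected to contain the missing names.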

Specifications

  • Version: v0.20.0
  • Platform: GCP
  • Subsystem:

Possible Solution

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8

Top GitHub Comments

1 reaction
KarolisKont commented, Apr 21, 2022

Closing this issue, since it apparently occurs only on our forked repo. Sorry for the false alarm.

0 reactions
KarolisKont commented, Apr 22, 2022

I haven’t looked into why it behaves like that (not pushing all of the feature view’s features).

@felixwang9817 I also tried installing feast-dev/feast from the HEAD of the master branch, and it behaves the same way (not selecting all features).

I also noticed that the git histories differ, and I can’t properly merge the master branch with v0.20-branch. Merging shows weird conflicts, and some files aren’t modified at all: they keep what is on master.

I tested with the tags v0.20.0 and v0.20.1, and both seem to work properly.

So I am a bit confused about why some commits are pushed directly to the master branch and others to the dedicated minor release branch.

Is it normal that the master HEAD doesn’t work properly until you add a commit that is tagged?
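
When comparing a tagged release against an install from master or a fork, it can help to confirm which build is actually present in the environment. The snippet below is a small sketch using only the Python standard library; `installed_version` is a hypothetical helper name, and the exact version string reported for a branch install depends on how the package was built.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Shows which feast build (if any) this environment would import.
print("feast:", installed_version("feast"))
```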
