`feast apply` does not register all features in feature repository
Expected Behavior
When I sync the feature repository using `feast apply`, all metadata in the feature repository matches what I defined in my definition files.
Current Behavior
This is the Feast objects definition file:
from datetime import timedelta
from feast import BigQuerySource, FeatureView, Field
from feast.types import Float32, Int32, Int64, String
from marketplace_sandbox.config import PROJECT_ID
from marketplace_sandbox.examples.entities import item, portal
entities = [item, portal]
fields = [
Field(
name="listing_price_local_currency",
dtype=Float32,
),
Field(
name="listing_currency",
dtype=String,
),
Field(
name="item_color1_id",
dtype=Int32,
),
Field(
name="item_color2_id",
dtype=Int32,
),
Field(
name="first_visible_at",
dtype=Int64,
),
Field(
name="promoted_until",
dtype=Int64,
),
Field(
name="item_catalog_parent_id",
dtype=Int64,
),
Field(
name="item_catalog_parent_id_1",
dtype=Int64,
),
Field(
name="item_catalog_parent_id_2",
dtype=Int64,
),
Field(
name="item_catalog_parent_id_3",
dtype=Int64,
),
Field(
name="item_catalog_parent_id_4",
dtype=Int64,
),
Field(
name="item_photo_count",
dtype=Int32,
),
Field(
name="description_word_count",
dtype=Int32,
),
Field(
name="item_id_7d_impressions_3600srt",
dtype=Int32,
),
Field(
name="item_id_7d_clicks_3600srt",
dtype=Int32,
),
Field(
name="item_id_7d_ctr_3600srt",
dtype=Float32,
),
Field(
name="item_views_14d_hourly",
dtype=Int32,
),
Field(
name="item_brand_id",
dtype=Int32,
),
Field(
name="item_catalog_id",
dtype=Int32,
),
Field(
name="item_country_id",
dtype=Int32,
),
Field(
name="item_user_id",
dtype=Int64,
),
Field(
name="item_language_id",
dtype=Int32,
),
Field(
name="item_size_id",
dtype=Int32,
),
Field(
name="item_status_id",
dtype=Int32,
),
Field(
name="delay_international_visibility_by",
dtype=Int32,
),
]
BQ_DATASET_NAME = "example_item_stats_per_portal"
BQ_TABLE_NAME = "v1"
BQ_TABLE_REFERENCE = f"{PROJECT_ID}.{BQ_DATASET_NAME}.{BQ_TABLE_NAME}"
batch_source = BigQuerySource(
table=BQ_TABLE_REFERENCE,
created_timestamp_column="created_timestamp",
timestamp_field="event_timestamp",
description=(
"Example of BigQuery table that contains item features per portal."
),
owner="VMIP",
tags={},
)
example_item_stats_per_portal_v1_fv = FeatureView(
name=f"{BQ_DATASET_NAME}_{BQ_TABLE_NAME}",
entities=[entity.name for entity in entities],
ttl=timedelta(weeks=52),
online=True,
batch_source=batch_source,
schema=fields,
description=(
"Example of feature view that contains item features per portal."
),
owner="VMIP",
tags={},
)
After running `feast apply` and retrieving the feature view definition with `feast feature-views describe example_item_stats_per_portal_v1`, I am getting:
spec:
  name: example_item_stats_per_portal_v1
  entities:
  - example_item
  - example_portal
  features:
  - name: listing_price_local_currency
    valueType: FLOAT
  - name: listing_currency
    valueType: STRING
  - name: item_color1_id
    valueType: INT32
  - name: item_color2_id
    valueType: INT32
  - name: first_visible_at
    valueType: INT64
  - name: promoted_until
    valueType: INT64
  - name: item_catalog_parent_id
    valueType: INT64
  - name: item_catalog_parent_id_2
    valueType: INT64
  - name: item_catalog_parent_id_3
    valueType: INT64
  - name: item_catalog_parent_id_4
    valueType: INT64
  - name: item_photo_count
    valueType: INT32
  - name: description_word_count
    valueType: INT32
  - name: item_id_7d_impressions_3600srt
    valueType: INT32
  - name: item_id_7d_clicks_3600srt
    valueType: INT32
  - name: item_id_7d_ctr_3600srt
    valueType: FLOAT
  - name: item_views_14d_hourly
    valueType: INT32
  - name: delay_international_visibility_by
    valueType: INT32
  ttl: 31449600s
...
The definition file has 25 features, but only 17 are returned from the feature repository. The missing ones are:
['item_catalog_parent_id_1', 'item_brand_id', 'item_catalog_id', 'item_country_id', 'item_user_id', 'item_language_id', 'item_size_id', 'item_status_id']
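For reference, the comparison above can be reproduced with a small, order-preserving diff. This is only a sketch: the two lists are copied by hand from the definition file and from the `describe` output in this report, and `missing_features` is a hypothetical helper name, not part of Feast.

```python
# Feature names as declared in the definition file (25 entries).
defined = [
    "listing_price_local_currency", "listing_currency",
    "item_color1_id", "item_color2_id",
    "first_visible_at", "promoted_until",
    "item_catalog_parent_id", "item_catalog_parent_id_1",
    "item_catalog_parent_id_2", "item_catalog_parent_id_3",
    "item_catalog_parent_id_4",
    "item_photo_count", "description_word_count",
    "item_id_7d_impressions_3600srt", "item_id_7d_clicks_3600srt",
    "item_id_7d_ctr_3600srt", "item_views_14d_hourly",
    "item_brand_id", "item_catalog_id", "item_country_id",
    "item_user_id", "item_language_id", "item_size_id",
    "item_status_id", "delay_international_visibility_by",
]

# Feature names listed by `feast feature-views describe` (17 entries).
registered = [
    "listing_price_local_currency", "listing_currency",
    "item_color1_id", "item_color2_id",
    "first_visible_at", "promoted_until",
    "item_catalog_parent_id", "item_catalog_parent_id_2",
    "item_catalog_parent_id_3", "item_catalog_parent_id_4",
    "item_photo_count", "description_word_count",
    "item_id_7d_impressions_3600srt", "item_id_7d_clicks_3600srt",
    "item_id_7d_ctr_3600srt", "item_views_14d_hourly",
    "delay_international_visibility_by",
]


def missing_features(defined_names, registered_names):
    """Return defined names absent from the registry, in definition order."""
    registered_set = set(registered_names)
    return [name for name in defined_names if name not in registered_set]


missing = missing_features(defined, registered)
print(len(defined), len(registered), len(missing))
print(missing)
```

Keeping the definition order (rather than using a plain set difference) makes it easier to see where in the schema the dropped features sit.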
I want to mention that I have the same situation (identical registered and missing features) with a similar feature view that is defined using `Feature` instead of `Field`.
Definition here:
from datetime import timedelta
from feast import BigQuerySource, Feature, FeatureView, ValueType
from marketplace_sandbox.config import BQ_POC_RERANKER_TABLE_REFERENCE
from marketplace_sandbox.entities import item, portal
from marketplace_sandbox.utils import generate_data_bq_query
FEATURE_VIEW_NAME = "item_stats_per_portal_v1"
entities = [item, portal]
features = [
Feature(
name="listing_price_local_currency",
dtype=ValueType.FLOAT,
labels={"owner": "VMIP"},
),
Feature(
name="listing_currency",
dtype=ValueType.STRING,
labels={"owner": "VMIP"},
),
Feature(
name="item_color1_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_color2_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="first_visible_at",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="promoted_until",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_parent_id",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_parent_id_1",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_parent_id_2",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_parent_id_3",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_parent_id_4",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_photo_count",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="description_word_count",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_id_7d_impressions_3600srt",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_id_7d_clicks_3600srt",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_id_7d_ctr_3600srt",
dtype=ValueType.FLOAT,
labels={"owner": "VMIP"},
),
Feature(
name="item_views_14d_hourly",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_brand_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_catalog_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_country_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_user_id",
dtype=ValueType.INT64,
labels={"owner": "VMIP"},
),
Feature(
name="item_language_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_size_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="item_status_id",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
Feature(
name="delay_international_visibility_by",
dtype=ValueType.INT32,
labels={"owner": "VMIP"},
),
]
batch_source = BigQuerySource(
name=f"{BQ_POC_RERANKER_TABLE_REFERENCE}.{FEATURE_VIEW_NAME}",
query=generate_data_bq_query(
features=features,
entities=entities,
tables=[BQ_POC_RERANKER_TABLE_REFERENCE],
),
event_timestamp_column="event_time",
)
item_stats_per_portal_fv = FeatureView(
name=FEATURE_VIEW_NAME,
entities=[entity.name for entity in entities],
ttl=timedelta(weeks=52),
features=features,
batch_source=batch_source,
owner="VMIP",
)
Steps to reproduce
- Create a definition file. It can be similar to the one given in Current Behavior.
- Run `feast apply`.
- Describe the feature view using `feast feature-views describe <FEATURE VIEW NAME>`.
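The last step can also be checked programmatically instead of reading the `describe` output by eye. The sketch below assumes the Python SDK exposes `FeatureStore.get_feature_view(name)` and that the returned view has a `features` list whose items carry a `name` attribute, which I believe holds for v0.20 but have not verified against every version. The helper names `registered_feature_names` and `find_unregistered` are hypothetical.

```python
from typing import Iterable, List


def registered_feature_names(store, view_name: str) -> List[str]:
    """Return the feature names the registry holds for one feature view.

    Assumption: `store` behaves like feast.FeatureStore, i.e. it offers
    get_feature_view(name) returning an object with a `features` list
    whose items expose a `name` attribute.
    """
    view = store.get_feature_view(view_name)
    return [feature.name for feature in view.features]


def find_unregistered(store, view_name: str, defined: Iterable[str]) -> List[str]:
    """Return defined feature names that are absent from the registry."""
    registered = set(registered_feature_names(store, view_name))
    return [name for name in defined if name not in registered]


# Usage sketch (requires Feast and a repository where `feast apply` was run;
# `fields` is the list from the definition file in Current Behavior):
#   from feast import FeatureStore
#   store = FeatureStore(repo_path=".")
#   print(find_unregistered(
#       store,
#       "example_item_stats_per_portal_v1",
#       [field.name for field in fields],
#   ))
```

An empty list means every defined feature reached the registry; anything else reproduces the problem described here.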
Specifications
- Version: v0.20.0
- Platform: GCP
- Subsystem:
Possible Solution
Top GitHub Comments
Closing this issue that is apparently only on our forked repo, sorry for the false alarm.
I haven't looked into why it behaves like that, i.e. why not all features of the feature view are pushed.
@felixwang9817 I tried installing `feast-dev/feast` from the `master` branch HEAD, and it behaves the same way (not all features are selected).
I also noticed a git history difference: I can't properly merge the `master` branch with `v0.20-branch`. The merge shows weird conflicts, and some files are not modified at all, keeping what is on `master`.
I tested with the tags `v0.20.0` and `v0.20.1`, and they seem to work properly.
So I am a bit confused about why some commits are pushed directly to the `master` branch and others to the dedicated minor release branch. Is it normal that the `master` HEAD doesn't work properly until you add a commit that is tagged?