Feast BigQuery offline store gets wrong argument types for `event_timestamp`
Expected Behavior
Retrieving training data from a BigQuery offline store with the get_historical_features method of the Python SDK should succeed.
Current Behavior
The following error is raised when running get_historical_features:
Traceback (most recent call last):
File "get_offline_data.py", line 22, in <module>
"item_user_stats_v1:is_high_risk_seller",
File "/usr/local/lib/python3.7/site-packages/feast/infra/offline_stores/offline_store.py", line 77, in to_df
features_df = self._to_df_internal()
File "/usr/local/lib/python3.7/site-packages/feast/infra/offline_stores/bigquery.py", line 290, in _to_df_internal
df = self._execute_query(query).to_dataframe(create_bqstorage_client=True)
File "/usr/local/lib/python3.7/site-packages/feast/usage.py", line 280, in wrapper
raise exc.with_traceback(traceback)
File "/usr/local/lib/python3.7/site-packages/feast/usage.py", line 269, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/feast/infra/offline_stores/bigquery.py", line 357, in _execute_query
block_until_done(client=self.client, bq_job=bq_job, timeout=timeout)
File "/usr/local/lib/python3.7/site-packages/feast/infra/offline_stores/bigquery.py", line 415, in block_until_done
raise bq_job.exception()
google.api_core.exceptions.BadRequest: 400 No matching signature for operator <= for argument types: TIMESTAMP, DATETIME. Supported signature: ANY <= ANY at [78:13]
The error occurs in the generated query where event timestamps are compared (last line below):
item_user_stats_v1__base AS (
SELECT
subquery.*,
entity_dataframe.entity_timestamp,
entity_dataframe.item_user_stats_v1__entity_row_unique_id
FROM item_user_stats_v1__subquery AS subquery
INNER JOIN item_user_stats_v1__entity_dataframe AS entity_dataframe
ON TRUE
AND subquery.event_timestamp <= entity_dataframe.entity_timestamp
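The type mismatch can be illustrated on the pandas side alone. My understanding (an assumption, not confirmed in this issue) is that google-cloud-bigquery 3.x uploads timezone-naive pandas datetime columns as BigQuery DATETIME, while timezone-aware columns become TIMESTAMP, so a naive entity_timestamp ends up being compared against a TIMESTAMP event_timestamp:

```python
import pandas as pd

# A timezone-naive datetime column (dtype datetime64[ns]) would be
# uploaded as DATETIME, while the feature table's event_timestamp
# is a TIMESTAMP -- hence "No matching signature for operator <=".
naive = pd.Series(pd.to_datetime(["2022-04-01 12:00:00"]))

# A timezone-aware column (dtype datetime64[ns, UTC]) would be
# uploaded as TIMESTAMP, matching event_timestamp.
aware = pd.Series(pd.to_datetime(["2022-04-01 12:00:00"], utc=True))

print(naive.dtype)  # datetime64[ns]
print(aware.dtype)  # datetime64[ns, UTC]
```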
Steps to reproduce
Use a Feast project with the GCP provider, where the offline store is BigQuery. After installing the feast[gcp] (v0.19.4) package you should get google-cloud-bigquery==3.0.1 as a dependency. I think it should also break with newer versions of google-cloud-bigquery as they become available.
Specifications
- Version: v0.19.4
- Platform: GCP
- Subsystem:
Possible Solution
Quick solution: add an upper bound of <3.0.0 to the google-cloud-bigquery dependency here.
A better solution: find out what is causing this event timestamp type mismatch and fix it 😊.
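Until the dependency is pinned or fixed, one possible workaround (a sketch under the assumption above that naive datetimes are the culprit; the column and key names are illustrative) is to make the entity dataframe's timestamps timezone-aware before passing it to get_historical_features:

```python
import pandas as pd

# Hypothetical entity dataframe; Feast expects an `event_timestamp`
# column plus entity key columns (`seller_id` here is made up).
entity_df = pd.DataFrame(
    {
        "seller_id": [101, 102],
        "event_timestamp": pd.to_datetime(
            ["2022-04-01 12:00:00", "2022-04-02 12:00:00"]
        ),
    }
)

# Localize naive timestamps to UTC so google-cloud-bigquery 3.x
# should upload the column as TIMESTAMP rather than DATETIME.
if entity_df["event_timestamp"].dt.tz is None:
    entity_df["event_timestamp"] = entity_df["event_timestamp"].dt.tz_localize("UTC")

print(entity_df["event_timestamp"].dtype)  # datetime64[ns, UTC]
```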
Issue Analytics
- Created a year ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
That’s very interesting. Yeah, I just need the fields used in the feature view. Can you also provide the output of the table schema, after getting the table using the Python API? Take a look at this code snippet for an example: https://googleapis.dev/python/bigquery/latest/usage/tables.html#getting-a-table
OK, it seems my proposed solution was valid for this particular issue. Thanks @achals 🙇