K8s launcher fails via job service
Expected Behavior
The following cell from the minimal_ride_hailing.ipynb notebook should work using the k8s Spark launcher, via the job service:
# get_historical_features will return immediately once the Spark job has been submitted successfully.
job = client.get_historical_features(feature_refs=[
    "driver_statistics:avg_daily_trips", "driver_statistics:conv_rate",
    "driver_statistics:acc_rate", "driver_trips:trips_today"
    ],
    entity_source=entities_with_timestamp)
Current Behavior
# get_historical_features will return immediately once the Spark job has been submitted successfully.
job = client.get_historical_features(feature_refs=[
    "driver_statistics:avg_daily_trips", "driver_statistics:conv_rate",
    "driver_statistics:acc_rate", "driver_trips:trips_today"
    ],
    entity_source=entities_with_timestamp)
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
<ipython-input-40-43e5c2d3cdd4> in <module>
4 "driver_statistics:acc_rate", "driver_trips:trips_today"
5 ],
----> 6 entity_source=entities_with_timestamp)
~/.local/lib/python3.7/site-packages/feast/client.py in get_historical_features(self, feature_refs, entity_source, output_location)
1069 output_location=output_location,
1070 ),
-> 1071 **self._extra_grpc_params(),
1072 )
1073 return RemoteRetrievalJob(
/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
824 state, call, = self._blocking(request, timeout, metadata, credentials,
825 wait_for_ready, compression)
--> 826 return _end_unary_response_blocking(state, call, False, None)
827
828 def with_call(self,
/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
727 return state.response
728 else:
--> 729 raise _InactiveRpcError(state)
730
731
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Exception calling application: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 29 Jan 2021 19:32:55 GMT', 'Transfer-Encoding': 'chunked'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"SparkApplication in version \"v1beta2\" cannot be handled as a SparkApplication: unmarshalerDecoder: Object 'Kind' is missing in '{\"metadata\": {\"labels\": {\"feast.dev/jobid\": \"feast-a6xrisxw\", \"feast.dev/type\": \"HISTORICAL_RETRIEVAL_JOB\"}, \"name\": \"feast-a6xrisxw\", \"namespace\": \"feast-dev\"}, \"spec\": {\"mainApplicationFile\": \"s3a://tmp-data-viaduct-ai/feast/staging/f61c0f705cd03fa561baf45da451f2b2b970c5de51f39920909f63ebabc6ac37.py\", \"arguments\": [\"--feature-tables\", \"W3siZmVhdHVyZXMiOiBbeyJuYW1lIjogImNvbnZfcmF0ZSIsICJ0eXBlIjogIkZMT0FUIn0sIHsibmFtZSI6ICJhdmdfZGFpbHlfdHJpcHMiLCAidHlwZSI6ICJJTlQzMiJ9LCB7Im5hbWUiOiAiYWNjX3JhdGUiLCAidHlwZSI6ICJGTE9BVCJ9XSwgInByb2plY3QiOiAiZGVmYXVsdCIsICJuYW1lIjogImRyaXZlcl9zdGF0aXN0aWNzIiwgImVudGl0aWVzIjogW3sibmFtZSI6ICJkcml2ZXJfaWQiLCAidHlwZSI6ICJJTlQ2NCJ9XSwgIm1heF9hZ2UiOiBudWxsLCAibGFiZWxzIjoge319LCB7ImZlYXR1cmVzIjogW3sibmFtZSI6ICJ0cmlwc190b2RheSIsICJ0eXBlIjogIklOVDMyIn1dLCAicHJvamVjdCI6ICJkZWZhdWx0IiwgIm5hbWUiOiAiZHJpdmVyX3RyaXBzIiwgImVudGl0aWVzIjogW3sibmFtZSI6ICJkcml2ZXJfaWQiLCAidHlwZSI6ICJJTlQ2NCJ9XSwgIm1heF9hZ2UiOiBudWxsLCAibGFiZWxzIjoge319XQ==\", \"--feature-tables-sources\", \"W3siZmlsZSI6IHsiZmllbGRfbWFwcGluZyI6IHt9LCAiZXZlbnRfdGltZXN0YW1wX2NvbHVtbiI6ICJkYXRldGltZSIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiY3JlYXRlZCIsICJkYXRlX3BhcnRpdGlvbl9jb2x1bW4iOiAiZGF0ZSIsICJwYXRoIjogInMzOi8vdG1wLWRhdGEtdmlhZHVjdC1haS9mZWFzdC9zdGFnaW5nL3Rlc3RfZGF0YS9kcml2ZXJfc3RhdGlzdGljcyIsICJmb3JtYXQiOiB7Impzb25fY2xhc3MiOiAiUGFycXVldEZvcm1hdCJ9fX0sIHsiZmlsZSI6IHsiZmllbGRfbWFwcGluZyI6IHt9LCAiZXZlbnRfdGltZXN0YW1wX2NvbHVtbiI6ICJkYXRldGltZSIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiY3JlYXRlZCIsICJkYXRlX3BhcnRpdGlvbl9jb2x1bW4iOiAiZGF0ZSIsICJwYXRoIjogInMzOi8vdG1wLWRhdGEtdmlhZHVjdC1haS9mZWFzdC9zdGFnaW5nL3Rlc3RfZGF0YS9kcml2ZXJfdHJpcHMiLCAiZm9ybWF0IjogeyJqc29uX2NsYXNzIjogIlBhcnF1ZXRGb3JtYXQifX19XQ==\", \"--entity-source\", 
\"eyJmaWxlIjogeyJmaWVsZF9tYXBwaW5nIjoge30sICJldmVudF90aW1lc3RhbXBfY29sdW1uIjogImV2ZW50X3RpbWVzdGFtcCIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiIiwgImRhdGVfcGFydGl0aW9uX2NvbHVtbiI6ICIiLCAicGF0aCI6ICJzM2E6Ly90bXAtZGF0YS12aWFkdWN0LWFpL2ZlYXN0L3N0YWdpbmcvNzU2ODE3ZTUtZDc0Mi00NzQwLWI4NjUtMTQ5MjhiMDgwNTc1IiwgImZvcm1hdCI6IHsianNvbl9jbGFzcyI6ICJQYXJxdWV0Rm9ybWF0In19fQ==\", \"--destination\", \"eyJmb3JtYXQiOiAicGFycXVldCIsICJwYXRoIjogInMzYTovL3RtcC1kYXRhLXZpYWR1Y3QtYWkvZmVhc3QvaGlzdG9yaWNhbC82YTdiMDMzZS02NThmLTQ1Y2UtODUzZi1kZDNhODI4ZDgxMzMifQ==\"], \"sparkConf\": {\"dev.feast.outputuri\": \"s3a://tmp-data-viaduct-ai/feast/historical/6a7b033e-658f-45ce-853f-dd3a828d8133\"}}}', error found in #10 byte of ...|8d8133\"}}}|..., bigger context ...|istorical/6a7b033e-658f-45ce-853f-dd3a828d8133\"}}}|...","reason":"BadRequest","code":400}
"
debug_error_string = "{"created":"@1611948775.375230564","description":"Error received from peer ipv4:100.64.125.189:6568","file":"src/core/lib/surface/call.cc","file_line":1061,"grpc_message":"Exception calling application: (400)\nReason: Bad Request\nHTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 29 Jan 2021 19:32:55 GMT', 'Transfer-Encoding': 'chunked'})\nHTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"SparkApplication in version \"v1beta2\" cannot be handled as a SparkApplication: unmarshalerDecoder: Object 'Kind' is missing in '{\"metadata\": {\"labels\": {\"feast.dev/jobid\": \"feast-a6xrisxw\", \"feast.dev/type\": \"HISTORICAL_RETRIEVAL_JOB\"}, \"name\": \"feast-a6xrisxw\", \"namespace\": \"feast-dev\"}, \"spec\": {\"mainApplicationFile\": \"s3a://tmp-data-viaduct-ai/feast/staging/f61c0f705cd03fa561baf45da451f2b2b970c5de51f39920909f63ebabc6ac37.py\", \"arguments\": [\"--feature-tables\", \"W3siZmVhdHVyZXMiOiBbeyJuYW1lIjogImNvbnZfcmF0ZSIsICJ0eXBlIjogIkZMT0FUIn0sIHsibmFtZSI6ICJhdmdfZGFpbHlfdHJpcHMiLCAidHlwZSI6ICJJTlQzMiJ9LCB7Im5hbWUiOiAiYWNjX3JhdGUiLCAidHlwZSI6ICJGTE9BVCJ9XSwgInByb2plY3QiOiAiZGVmYXVsdCIsICJuYW1lIjogImRyaXZlcl9zdGF0aXN0aWNzIiwgImVudGl0aWVzIjogW3sibmFtZSI6ICJkcml2ZXJfaWQiLCAidHlwZSI6ICJJTlQ2NCJ9XSwgIm1heF9hZ2UiOiBudWxsLCAibGFiZWxzIjoge319LCB7ImZlYXR1cmVzIjogW3sibmFtZSI6ICJ0cmlwc190b2RheSIsICJ0eXBlIjogIklOVDMyIn1dLCAicHJvamVjdCI6ICJkZWZhdWx0IiwgIm5hbWUiOiAiZHJpdmVyX3RyaXBzIiwgImVudGl0aWVzIjogW3sibmFtZSI6ICJkcml2ZXJfaWQiLCAidHlwZSI6ICJJTlQ2NCJ9XSwgIm1heF9hZ2UiOiBudWxsLCAibGFiZWxzIjoge319XQ==\", \"--feature-tables-sources\", 
\"W3siZmlsZSI6IHsiZmllbGRfbWFwcGluZyI6IHt9LCAiZXZlbnRfdGltZXN0YW1wX2NvbHVtbiI6ICJkYXRldGltZSIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiY3JlYXRlZCIsICJkYXRlX3BhcnRpdGlvbl9jb2x1bW4iOiAiZGF0ZSIsICJwYXRoIjogInMzOi8vdG1wLWRhdGEtdmlhZHVjdC1haS9mZWFzdC9zdGFnaW5nL3Rlc3RfZGF0YS9kcml2ZXJfc3RhdGlzdGljcyIsICJmb3JtYXQiOiB7Impzb25fY2xhc3MiOiAiUGFycXVldEZvcm1hdCJ9fX0sIHsiZmlsZSI6IHsiZmllbGRfbWFwcGluZyI6IHt9LCAiZXZlbnRfdGltZXN0YW1wX2NvbHVtbiI6ICJkYXRldGltZSIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiY3JlYXRlZCIsICJkYXRlX3BhcnRpdGlvbl9jb2x1bW4iOiAiZGF0ZSIsICJwYXRoIjogInMzOi8vdG1wLWRhdGEtdmlhZHVjdC1haS9mZWFzdC9zdGFnaW5nL3Rlc3RfZGF0YS9kcml2ZXJfdHJpcHMiLCAiZm9ybWF0IjogeyJqc29uX2NsYXNzIjogIlBhcnF1ZXRGb3JtYXQifX19XQ==\", \"--entity-source\", \"eyJmaWxlIjogeyJmaWVsZF9tYXBwaW5nIjoge30sICJldmVudF90aW1lc3RhbXBfY29sdW1uIjogImV2ZW50X3RpbWVzdGFtcCIsICJjcmVhdGVkX3RpbWVzdGFtcF9jb2x1bW4iOiAiIiwgImRhdGVfcGFydGl0aW9uX2NvbHVtbiI6ICIiLCAicGF0aCI6ICJzM2E6Ly90bXAtZGF0YS12aWFkdWN0LWFpL2ZlYXN0L3N0YWdpbmcvNzU2ODE3ZTUtZDc0Mi00NzQwLWI4NjUtMTQ5MjhiMDgwNTc1IiwgImZvcm1hdCI6IHsianNvbl9jbGFzcyI6ICJQYXJxdWV0Rm9ybWF0In19fQ==\", \"--destination\", \"eyJmb3JtYXQiOiAicGFycXVldCIsICJwYXRoIjogInMzYTovL3RtcC1kYXRhLXZpYWR1Y3QtYWkvZmVhc3QvaGlzdG9yaWNhbC82YTdiMDMzZS02NThmLTQ1Y2UtODUzZi1kZDNhODI4ZDgxMzMifQ==\"], \"sparkConf\": {\"dev.feast.outputuri\": \"s3a://tmp-data-viaduct-ai/feast/historical/6a7b033e-658f-45ce-853f-dd3a828d8133\"}}}', error found in #10 byte of ...|8d8133\"}}}|..., bigger context ...|istorical/6a7b033e-658f-45ce-853f-dd3a828d8133\"}}}|...","reason":"BadRequest","code":400}\n\n","grpc_status":2}"
>
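The 400 response above reports that the object 'Kind' is missing, and the body it quotes starts directly at metadata. Kubernetes object bodies carry top-level apiVersion and kind fields; for the Spark operator's v1beta2 API these are sparkoperator.k8s.io/v1beta2 and SparkApplication. The following is a minimal sketch of how such a body could be checked before submission. It is not Feast's code or the eventual fix, the helper names are hypothetical, and the sample body is a trimmed version of the one in the error message.

```python
# Group/version and kind of the Spark operator's SparkApplication resource.
SPARK_API_VERSION = "sparkoperator.k8s.io/v1beta2"
SPARK_KIND = "SparkApplication"


def missing_type_fields(resource: dict) -> list:
    """Return the top-level type fields that the object body lacks."""
    return [key for key in ("apiVersion", "kind") if key not in resource]


def with_type_fields(resource: dict) -> dict:
    """Return a copy of the body with apiVersion and kind filled in if absent."""
    patched = dict(resource)
    patched.setdefault("apiVersion", SPARK_API_VERSION)
    patched.setdefault("kind", SPARK_KIND)
    return patched


# Trimmed-down version of the body quoted in the error (hypothetical sample).
rejected = {
    "metadata": {"name": "feast-a6xrisxw", "namespace": "feast-dev"},
    "spec": {"arguments": []},
}

print(missing_type_fields(rejected))                    # ['apiVersion', 'kind']
print(missing_type_fields(with_type_fields(rejected)))  # []
```

A check like this only confirms what the API server already reported; where the launcher drops the two fields still has to be found in the job service itself.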
Steps to reproduce
pip install feast==0.9.0
Run the minimal_ride_hailing.ipynb notebook up to the cell that calls get_historical_features
Environment variable configuration for the job service:
FEAST_SPARK_LAUNCHER: "k8s"
FEAST_SPARK_STAGING_LOCATION: "s3a://my-bucket/feast/spark-staging/"
FEAST_SPARK_K8S_NAMESPACE: "feast-dev"
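When reproducing this, it can help to confirm that the job service process actually sees these three variables. The sketch below is a hypothetical helper, not part of Feast; it only reads the variable names listed above from the environment and reports which are unset or empty.

```python
import os

# Variable names are copied from the issue; the helper itself is hypothetical.
LISTED_VARS = (
    "FEAST_SPARK_LAUNCHER",
    "FEAST_SPARK_STAGING_LOCATION",
    "FEAST_SPARK_K8S_NAMESPACE",
)


def check_job_service_env(env=None):
    """Return the settings found and the names of any that are unset or empty."""
    env = os.environ if env is None else env
    found = {name: env[name] for name in LISTED_VARS if env.get(name)}
    missing = [name for name in LISTED_VARS if name not in found]
    return found, missing


found, missing = check_job_service_env()
print("set:", found)
print("missing:", missing)
```

Passing a plain dict instead of relying on os.environ makes the same check usable against a parsed deployment manifest.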
Specifications
- Version: 0.9.0
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 1
- Comments: 7 (4 by maintainers)
Top GitHub Comments
@beatgeek The SparkOp is unfortunately not providing very descriptive exceptions yet. Definitely something to work on. It’s worth debugging your actual operator by looking at the jobs it creates. It may be related to a missing service account, for example https://github.com/feast-dev/feast/tree/v0.9.3/tests
Thanks for raising this issue @jpugliesi. I’ll try to reproduce this problem today.