[sdk] failure launching Dataproc jobs
Environment
- KFP version: Vertex AI
- KFP SDK version: 1.7
- All dependencies version: kfp 1.8.2, kfp-pipeline-spec 0.1.11, kfp-server-api 1.7.0
Steps to reproduce
@dsl.pipeline(
    name='pipeline',
    pipeline_root='gs://bucket'
)
def pipeline():
    dataproc_submit_pyspark_op(
        project_id=project,
        region=region,
        cluster_name=cluster_name,
        main_python_file_uri='gs://bucket/valid_job.py',
    )
Expected result
Submission of a Dataproc job.
Materials and Reference
Error from the Vertex runner:
Logged at severity Error, 2021-10-05 02:41:50.279 PDT, workerpool0-0:
  File "/ml/kfp_component/google/dataproc/_submit_job.py", line 49, in submit_job
    client = DataprocClient()
  File "/ml/kfp_component/google/common/_utils.py", line 170, in __init__
    self._build_client()
TypeError: _build_client() takes 0 positional arguments but 1 was given
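The last line of the traceback is the message Python produces when a function that accepts no parameters is called as a bound method, because the instance is passed implicitly as the first argument. The sketch below reproduces that class of error with hypothetical classes (`BrokenClient`, `FixedClient`). It is not the actual `kfp_component` source, and the real cause in `_utils.py` may differ (for example, a decorator or dependency change that alters the method's signature).

```python
class BrokenClient:
    """Hypothetical stand-in; NOT the actual kfp_component code."""

    def __init__(self):
        # Called on the instance, so Python passes `self` implicitly.
        self._build_client()

    def _build_client():  # bug: the function accepts no parameters
        pass


class FixedClient:
    """Same shape, but the method accepts the implicit instance."""

    def __init__(self):
        self.built = False
        self._build_client()

    def _build_client(self):
        self.built = True


def reproduce():
    """Return the TypeError message raised when constructing BrokenClient."""
    try:
        BrokenClient()
    except TypeError as exc:
        return str(exc)
    return None


message = reproduce()
print(message)
```

The printed message ends with "takes 0 positional arguments but 1 was given", matching the error in the log, while `FixedClient()` constructs without raising.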
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
Issue Analytics
- Created 2 years ago
- Reactions: 3
- Comments: 6
Top GitHub Comments
Falling back to gcr.io/ml-pipeline/ml-pipeline-gcp:1.6.0 worked.

Fixed by https://github.com/kubeflow/pipelines/pull/7364
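One way the image fallback reported above might be applied is to rewrite the image reference in the component definition text before loading it. This is a minimal sketch under assumptions: the helper `pin_component_image` and the trimmed YAML are hypothetical, and the `1.7.0` tag in the example is an illustrative placeholder rather than a version confirmed in this issue.

```python
import re

# Image tag reported in the comments as a working fallback.
FALLBACK_IMAGE = "gcr.io/ml-pipeline/ml-pipeline-gcp:1.6.0"


def pin_component_image(component_yaml: str, image: str = FALLBACK_IMAGE) -> str:
    """Replace every ml-pipeline-gcp image reference in a component YAML text."""
    pattern = r"gcr\.io/ml-pipeline/ml-pipeline-gcp[:@][^\s'\"]+"
    # A callable replacement avoids any escape handling in `image`.
    return re.sub(pattern, lambda _match: image, component_yaml)


# Hypothetical, trimmed component definition used only for illustration.
example_yaml = """\
name: dataproc_submit_pyspark_job
implementation:
  container:
    image: gcr.io/ml-pipeline/ml-pipeline-gcp:1.7.0
"""

pinned = pin_component_image(example_yaml)
print(pinned)
```

With the KFP SDK v1, the pinned text could then be passed to `kfp.components.load_component_from_text` to build the op; whether the older image is compatible with a given pipeline is something to verify separately.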