[sdk] failure launching Dataproc jobs

Environment

  • KFP version: Vertex AI
  • KFP SDK version: 1.7
  • All dependencies version: kfp 1.8.2, kfp-pipeline-spec 0.1.11, kfp-server-api 1.7.0

Steps to reproduce

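# Assumptions (not shown in the original report): `dsl` comes from the KFP SDK,
# `dataproc_submit_pyspark_op` is the already-loaded Dataproc "submit PySpark job"
# component, and `project`, `region`, and `cluster_name` are defined by the caller.
from kfp.v2 import dsl

@dsl.pipeline(
  name='pipeline',
  pipeline_root='gs://bucket'
)
def pipeline():
  # Submit a PySpark job to an existing Dataproc cluster.
  dataproc_submit_pyspark_op(
    project_id=project,
    region=region,
    cluster_name=cluster_name,
    main_python_file_uri='gs://bucket/valid_job.py',
  )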

Expected result

Submission of a Dataproc job.

Materials and Reference

Error from the Vertex runner (workerpool0-0, 2021-10-05 02:41:50.279 PDT):

  File "/ml/kfp_component/google/dataproc/_submit_job.py", line 49, in submit_job
    client = DataprocClient()
  File "/ml/kfp_component/google/common/_utils.py", line 170, in __init__
    self._build_client()
TypeError: _build_client() takes 0 positional arguments but 1 was given

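For readers unfamiliar with this message: Python typically raises it when a method is called on an instance but its definition accepts no `self` parameter. The sketch below reproduces the same kind of error with hypothetical classes (`BrokenClient` and `FixedClient` are illustrative names, not the component's code). The issue does not confirm that this is the actual cause inside the component image.

```python
class BrokenClient:
    """Hypothetical stand-in: the method signature omits `self`."""

    def __init__(self):
        # Python passes the instance as the first positional argument here.
        self._build_client()

    def _build_client():  # bug: no `self` parameter
        pass


class FixedClient:
    """Same shape, with `self` added to the method signature."""

    def __init__(self):
        self._build_client()

    def _build_client(self):
        self.ready = True


def try_build(cls):
    """Instantiate `cls`; return the TypeError message, or None on success."""
    try:
        cls()
    except TypeError as exc:
        return str(exc)
    return None


broken_message = try_build(BrokenClient)
fixed_message = try_build(FixedClient)
print(broken_message)
print(fixed_message)
```

Depending on the Python version, the message may include the class name before `_build_client()`; the "takes 0 positional arguments but 1 was given" part is the same.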

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 6

Top GitHub Comments

1 reaction
montenegrodr commented, Jan 3, 2022

Falling back to gcr.io/ml-pipeline/ml-pipeline-gcp:1.6.0 worked.
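
The comment does not say how the image was pinned. One possible approach, sketched below under that assumption, is to rewrite the `image:` line of the component's YAML definition before loading it. `pin_component_image` and `SAMPLE_SPEC` are hypothetical names, and the sample spec is a trimmed stand-in rather than the real component file.

```python
import re

# Image tag reported as working in the comment above.
FALLBACK_IMAGE = "gcr.io/ml-pipeline/ml-pipeline-gcp:1.6.0"

# Matches a single `image: <value>` line while keeping its indentation.
_IMAGE_LINE = re.compile(r"^([ \t]*image:[ \t]*)\S+[ \t]*$", re.MULTILINE)


def pin_component_image(component_yaml: str, image: str = FALLBACK_IMAGE) -> str:
    """Return the component YAML text with its container image replaced."""
    pinned, count = _IMAGE_LINE.subn(lambda m: m.group(1) + image, component_yaml)
    if count != 1:
        raise ValueError(f"expected exactly one 'image:' line, found {count}")
    return pinned


# Hypothetical, trimmed component spec used only to exercise the helper.
SAMPLE_SPEC = """\
name: dataproc_submit_pyspark_job
implementation:
  container:
    image: gcr.io/ml-pipeline/ml-pipeline-gcp:SOME_NEWER_TAG
"""

pinned_spec = pin_component_image(SAMPLE_SPEC)
print(pinned_spec)
```

With the KFP v1 SDK, the resulting text could then be passed to `kfp.components.load_component_from_text` to obtain the op used in the pipeline.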

0 reactions
chensun commented, Mar 1, 2022
