Make region parameter required in Google Dataproc operators and hooks

Description

Deprecate the use of global as the default region value in Google Dataproc operators and hooks. Make the default value None, then check whether another value was passed. If not, raise a warning and set the value to global to preserve backward compatibility. An example of how this can be done: https://github.com/apache/airflow/blob/805781b024fdcc8e93d695443b49d96747c085bf/airflow/providers/google/cloud/hooks/bigquery.py#L81-L90

Also, it would make sense to make this change only in non-deprecated operators and hook methods, for example DataprocDeleteClusterOperator, DataprocInstantiateWorkflowTemplateOperator, and DataprocInstantiateInlineWorkflowTemplateOperator.
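
The approach described above (default to None, warn, then fall back to global) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the Airflow provider: `resolve_region` and `ExampleDataprocHook` are hypothetical names, the warning text is made up for the example, and the choice of `DeprecationWarning` is an assumption.

```python
import warnings
from typing import Optional

# Legacy default named in the issue; kept only for backward compatibility.
LEGACY_DEFAULT_REGION = "global"


def resolve_region(region: Optional[str] = None) -> str:
    """Return the region to use, warning when the legacy default is applied.

    Hypothetical helper: the name and the warning text are illustrative.
    """
    if region is None:
        warnings.warn(
            "No region was provided; falling back to 'global'. "
            "Relying on this default is deprecated, pass region explicitly.",
            DeprecationWarning,
            stacklevel=2,
        )
        return LEGACY_DEFAULT_REGION
    return region


class ExampleDataprocHook:
    """Hypothetical stand-in for a hook; not the real Airflow class."""

    def delete_cluster(
        self,
        cluster_name: str,
        project_id: str,
        region: Optional[str] = None,  # None instead of "global" as the default
    ) -> dict:
        # Resolve the region first so existing callers keep working
        # but see a deprecation warning.
        region = resolve_region(region)
        return {
            "cluster_name": cluster_name,
            "project_id": project_id,
            "region": region,
        }
```

Callers that already pass a region see no change. Callers that omit it still run against global but receive a warning, which leaves room to make the parameter strictly required in a later release.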

Use case / motivation

This parameter should be required by operators because running in any default (unexpected) region may be seen as undesirable behavior.

Related Issues

https://github.com/apache/airflow/pull/10673#discussion_r481057300

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
dmitrikuksik commented, Sep 6, 2020

I’ve made my first PR 💪 (https://github.com/apache/airflow/pull/10772). There was a submit method in the Dataproc hooks that is going to be deprecated, so I left it unchanged. I also skipped DataprocScaleClusterOperator in the Dataproc operators for the same reason, but I added the condition to DataprocJobBaseOperator. I can see that the operators that inherit from this class are going to be deprecated, but I guess they are still used to generate input for the new DataprocSubmitJobOperator.

0 reactions
turbaszek commented, Sep 4, 2020

Awesome @dmitrikuksik, I assigned you 👌

Renaming region -> location is not part of this issue. I just wanted to raise the question because it’s a recurring problem and there was no agreement.
