
Parameter optional_components=True doesn't create Dataproc cluster with web interface (Component Gateway)


Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

Google Cloud Composer-2.0.0-preview.5 Airflow-2.1.4

Apache Airflow version

2.1.4

Operating System

UNIX

Deployment

Composer

Deployment details

No response

What happened

I wrote the DAG below in Cloud Composer to create and delete a Dataproc cluster. Even with the option enable_component_gateway=True, the created cluster does not have Component Gateway enabled, so there is no access to the Jupyter notebook web interface as parameterized in the DAG. The optional components themselves are enabled, as shown in the images.

```python
from airflow.contrib.sensors.gcs_sensor import GoogleCloudStoragePrefixSensor
from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateClusterOperator,
    DataprocDeleteClusterOperator,
    ClusterGenerator,
)

yesterday = datetime.combine(datetime.today() - timedelta(1), datetime.min.time())

default_args = {
    'owner': 'teste3',
    'depends_on_past': False,
    'start_date': yesterday,
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'teste-dag-3',
    catchup=False,
    default_args=default_args,
    schedule_interval=None,
)

CLUSTER_GENERATOR = ClusterGenerator(
    project_id="sandbox-coe",
    cluster_name='teste-ge-{{ ds }}',
    num_masters=1,
    master_machine_type='n2-standard-8',
    worker_machine_type='n2-standard-8',
    worker_disk_size=500,
    master_disk_size=500,
    master_disk_type='pd-ssd',
    worker_disk_type='pd-ssd',
    image_version='1.5.56-ubuntu18',
    tags=['allow-dataproc-internal'],
    region='us-central1',
    zone='us-central1-f',
    storage_bucket='bucket-dataproc-ge',
    labels={'product': 'sample-label'},
    enable_component_gateway=True,  # this is not working
    optional_components=['JUPYTER', 'ANACONDA'],
).make()

create_cluster = DataprocCreateClusterOperator(
    dag=dag,
    task_id='start_cluster_example',
    cluster_name='teste-ge-{{ ds }}',
    project_id="sandbox-coe",
    cluster_config=CLUSTER_GENERATOR,
    region='us-central1',
)

stop_cluster_example = DataprocDeleteClusterOperator(
    dag=dag,
    task_id='stop_cluster_example',
    cluster_name='teste-ge-{{ ds }}',
    project_id='sandbox-coe',
    region='us-central1',
)  # stops a running dataproc cluster

create_cluster >> stop_cluster_example
```

[Screenshot: Captura de Tela 2022-02-24 às 14 46 35] [Screenshot: Captura de Tela 2022-02-24 às 14 46 45]

What you expected to happen

I expected that the created cluster would expose the web interfaces of the components I enabled, as in the image.

[Screenshot: Captura de Tela 2022-02-24 às 15 02 05]
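In the Dataproc API, Component Gateway corresponds to the `endpoint_config.enable_http_port_access` field of the cluster config. The following is only a sketch of a possible workaround, not something confirmed in this issue: it assumes the value passed as `cluster_config` is a plain dict in that shape. The helper name `with_component_gateway` and the sample dict are hypothetical.

```python
import copy


def with_component_gateway(cluster_config: dict) -> dict:
    """Return a copy of cluster_config that explicitly requests Component Gateway.

    Assumption: the dict follows the Dataproc ClusterConfig layout, where
    Component Gateway is controlled by endpoint_config.enable_http_port_access.
    """
    patched = copy.deepcopy(cluster_config)  # leave the caller's dict untouched
    patched.setdefault("endpoint_config", {})["enable_http_port_access"] = True
    return patched


# Hypothetical stand-in for the dict returned by ClusterGenerator(...).make()
generated = {
    "software_config": {
        "image_version": "1.5.56-ubuntu18",
        "optional_components": ["JUPYTER", "ANACONDA"],
    },
}

patched = with_component_gateway(generated)
```

The patched dict could then be passed as `cluster_config=` to `DataprocCreateClusterOperator` in place of the generated one. Copying first keeps the original config available for comparison.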

How to reproduce

Run the DAG above in Cloud Composer with DataprocCreateClusterOperator.
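To narrow down the problem without creating a cluster, one could inspect the generated config before it reaches the operator. This is a minimal sketch, assuming the config is a dict in the Dataproc ClusterConfig shape; the function name `component_gateway_requested` and both sample dicts are hypothetical and only illustrate the check.

```python
def component_gateway_requested(cluster_config: dict) -> bool:
    """Check whether the config asks Dataproc to enable Component Gateway.

    Assumption: Component Gateway is requested through
    endpoint_config.enable_http_port_access, as in the Dataproc API.
    """
    endpoint = cluster_config.get("endpoint_config") or {}
    return bool(endpoint.get("enable_http_port_access"))


# Hypothetical configs: one without the endpoint section, one with it.
config_without_gateway = {"software_config": {"optional_components": ["JUPYTER"]}}
config_with_gateway = {
    "software_config": {"optional_components": ["JUPYTER"]},
    "endpoint_config": {"enable_http_port_access": True},
}

missing = component_gateway_requested(config_without_gateway)
present = component_gateway_requested(config_with_gateway)
```

If the check returns `False` for the dict produced in the DAG, the flag was lost before the request was sent, which would match the behavior described above.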

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

aoelvp94 commented, Mar 8, 2022 (1 reaction)

Yeah, the new version works for me. Thanks for the update, @ThiagoPositeli!

aoelvp94 commented, Mar 3, 2022 (1 reaction)

🙏 🙏 same problem here 😢
