Logs for EmrStepSensor

Description

Add a feature to EmrStepSensor that surfaces the Spark task URL and logs once the task has finished executing.

Use case/motivation

After starting an EMR step with EmrAddStepsOperator, we generally use an EmrStepSensor to track the status of the step. The sensor has the step ID and the cluster (job flow) ID, and it pokes the step at a regular interval:

[2022-04-26, 22:07:43 UTC] {base_aws.py:100} INFO - Retrieving region_name from Connection.extra_config['region_name']
[2022-04-26, 22:07:44 UTC] {emr.py:316} INFO - Poking step s-123ABC123ABC on cluster j-123ABC123ABC
[2022-04-26, 22:07:44 UTC] {emr.py:74} INFO - Job flow currently PENDING
[2022-04-26, 22:08:44 UTC] {emr.py:316} INFO - Poking step s-123ABC123ABC on cluster j-123ABC123ABC
[2022-04-26, 22:08:44 UTC] {emr.py:74} INFO - Job flow currently PENDING
[2022-04-26, 22:09:44 UTC] {emr.py:316} INFO - Poking step s-123ABC123ABC on cluster j-123ABC123ABC
[2022-04-26, 22:09:44 UTC] {emr.py:74} INFO - Job flow currently COMPLETED
[2022-04-26, 22:09:44 UTC] {base.py:251} INFO - Success criteria met. Exiting.
[2022-04-26, 22:09:44 UTC] {taskinstance.py:1288} INFO - Marking task as SUCCESS. dag_id=datapipeline_sample, task_id=calculate_pi_watch_step, execution_date=20220426T220739, start_date=20220426T220743, end_date=20220426T220944
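
The poke cycle in the log above can be sketched in plain Python. This is an illustrative sketch only, not the Airflow implementation: `wait_for_step` and `get_state` are hypothetical names, and the state sets are an assumption based on the EMR step states (PENDING, RUNNING, COMPLETED, CANCELLED, FAILED, INTERRUPTED). In real use, `get_state` might wrap boto3's `describe_step` call and read `["Step"]["Status"]["State"]` from the response.

```python
import time
from typing import Callable

# Assumed terminal states for an EMR step (hypothetical constants for this sketch).
SUCCESS_STATES = {"COMPLETED"}
FAILED_STATES = {"CANCELLED", "FAILED", "INTERRUPTED"}


def wait_for_step(
    get_state: Callable[[], str],
    poke_interval: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until the step reaches a terminal state.

    get_state is a hypothetical callable that returns the current step state.
    Returns the final state on success and raises RuntimeError on failure.
    """
    while True:
        state = get_state()
        print(f"Job flow currently {state}")
        if state in SUCCESS_STATES:
            return state
        if state in FAILED_STATES:
            raise RuntimeError(f"EMR step ended in state {state}")
        # Not finished yet: wait before poking again.
        sleep(poke_interval)


if __name__ == "__main__":
    # Simulated state sequence matching the log excerpt above.
    states = iter(["PENDING", "PENDING", "COMPLETED"])
    wait_for_step(lambda: next(states), poke_interval=0)
```

The loop only reports the state, which is the limitation this issue describes: nothing in the poke cycle points the user to the step's logs.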

Once the task completes, only its final status is displayed. A user who wants to review the task's logs has to go through a multistep process to retrieve the job logs from the EMR cluster.

It would be a great addition if EmrStepSensor exposed the log URL, and possibly relayed the logs themselves into the Airflow task log, once the step finishes. This would be especially handy when many tasks fail at once, and it would make for a much better user experience.
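
As a rough illustration of what "adding the log URL" could look like, the sketch below builds the S3 locations of a step's log files. It is not the change that was made in Airflow: `emr_step_log_uris` is a hypothetical helper, and it assumes the cluster was created with an S3 `LogUri` and that EMR writes step logs under `<LogUri>/<cluster-id>/steps/<step-id>/`. The `LogUri` value itself could be read from boto3's `describe_cluster` response (`["Cluster"]["LogUri"]`), which may use the `s3n://` scheme.

```python
from typing import Dict


def emr_step_log_uris(log_uri: str, cluster_id: str, step_id: str) -> Dict[str, str]:
    """Return assumed S3 URIs for a step's controller, stdout and stderr logs.

    Hypothetical helper: assumes step logs are archived as gzip files under
    <log_uri>/<cluster_id>/steps/<step_id>/.
    """
    base = log_uri.rstrip("/")
    # Normalise the legacy s3n:// scheme to s3:// for readability.
    if base.startswith("s3n://"):
        base = "s3://" + base[len("s3n://"):]
    prefix = f"{base}/{cluster_id}/steps/{step_id}"
    return {name: f"{prefix}/{name}.gz" for name in ("controller", "stdout", "stderr")}


if __name__ == "__main__":
    # Example bucket name is made up; IDs are the placeholders from the log excerpt.
    uris = emr_step_log_uris("s3n://my-bucket/logs/", "j-123ABC123ABC", "s-123ABC123ABC")
    for name, uri in uris.items():
        print(f"{name}: {uri}")
```

A sensor could log these URIs when the step reaches a terminal state, so that a user investigating a failure has a direct pointer to the logs.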

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

1 reaction
syedahsn commented, Nov 4, 2022

Can I be assigned this issue? Thanks @eladkal 😃

0 reactions
shubham22 commented, Nov 4, 2022

Oh, TIL! @syedahsn you know what to do : )

Top Results From Across the Web

Source code for airflow.contrib.sensors.emr_step_sensor
class EmrStepSensor(EmrBaseSensor): """ Asks for the state of the step ... EmrHook(aws_conn_id=self.aws_conn_id).get_conn() self.log.info('Poking step ...

How do I troubleshoot a failed Spark step in Amazon EMR?
For Spark jobs submitted with --deploy-mode cluster: Check the step logs to identify the application ID. Then, check the application master logs ......

Airflow EMR execute step from Sensor - Stack Overflow
I made the following DAG in airflow where I am executing a set of EMRSteps to run my pipeline. default_args = { 'owner':...

Bet you didn't know this about Airflow! | by Jyoti Dhiman
Using EMRStepSensor tasks, we wait for the submitted job to complete and the state of this task indicates whether the job is in...

test_dag.py - Amazon AWS
... from airflow.contrib.sensors.emr_step_sensor import EmrStepSensor from ... emr = EmrHook(aws_conn_id=self.aws_conn_id).get_conn() self.log.info('Poking ...