EmrEtlRunner: add backend for a `generate` command

Two sub-command options:

  • generate emr-cluster, which produces a Dataflow Runner-compatible EMR cluster config in Avro format
  • generate playbook, which produces a Dataflow Runner-compatible playbook for running the job

Let’s take these in turn:

generate emr-cluster

This command will use the cluster specification in the config.yml to generate a Dataflow Runner-compatible EMR cluster config.
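As a rough illustration of that mapping, here is a minimal Ruby sketch. It is not the implementation that shipped: the function name `generate_emr_cluster` is hypothetical, the config.yml keys and output field names are assumptions based on the commonly documented EmrEtlRunner and Dataflow Runner layouts, and the schema URI should be checked against the Dataflow Runner documentation before use.

```ruby
require 'json'
require 'yaml'

# Assumed schema URI for a Dataflow Runner cluster config; verify before use.
CLUSTER_SCHEMA = 'iglu:com.snowplowanalytics.dataflowrunner/ClusterConfig/avro/1-1-0'

# Hypothetical helper: maps the aws.emr section of a parsed config.yml
# (string keys, as produced by YAML.safe_load) to a cluster config hash.
def generate_emr_cluster(config)
  emr = config.fetch('aws').fetch('emr')
  jobflow = emr.fetch('jobflow')
  {
    'schema' => CLUSTER_SCHEMA,
    'data' => {
      'name' => jobflow.fetch('job_name'),
      'region' => emr.fetch('region'),
      'roles' => {
        'jobflow' => emr.fetch('jobflow_role'),
        'service' => emr.fetch('service_role')
      },
      'ec2' => {
        'amiVersion' => emr.fetch('ami_version'),
        'keyName' => emr.fetch('ec2_key_name'),
        'instances' => {
          'master' => { 'type' => jobflow.fetch('master_instance_type') },
          'core' => {
            'type' => jobflow.fetch('core_instance_type'),
            'count' => jobflow.fetch('core_instance_count')
          }
        }
      }
    }
  }
end

# Illustrative input only; all values are placeholders.
sample_config = YAML.safe_load(<<~CONFIG)
  aws:
    emr:
      ami_version: "5.5.0"
      region: eu-west-1
      jobflow_role: EMR_EC2_DefaultRole
      service_role: EMR_DefaultRole
      ec2_key_name: example-key
      jobflow:
        job_name: Snowplow ETL
        master_instance_type: m1.medium
        core_instance_type: m1.medium
        core_instance_count: 2
CONFIG

cluster = generate_emr_cluster(sample_config)
puts JSON.pretty_generate(cluster)
```

Using `fetch` throughout means a missing key in config.yml fails loudly with a `KeyError` rather than silently emitting a cluster config with `nil` fields.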

generate playbook

This command will combine the config.yml, the contents of the --enrichments folder, and any relevant command-line arguments (such as --skip staging) to generate a Dataflow Runner-compatible playbook that runs the identical job via Dataflow Runner.
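The effect of a --skip argument on the generated playbook could be sketched as below. This is only an illustration under stated assumptions: `generate_playbook`, the step names in `ALL_STEPS`, the jar paths, and the schema URI are all hypothetical placeholders, and handling of the --enrichments folder is left out for brevity.

```ruby
require 'json'

# Assumed schema URI for a Dataflow Runner playbook; verify before use.
PLAYBOOK_SCHEMA = 'iglu:com.snowplowanalytics.dataflowrunner/PlaybookConfig/avro/1-0-1'

# Hypothetical, simplified list of job steps in execution order.
ALL_STEPS = %w[staging enrich shred archive_raw].freeze

# Hypothetical helper: builds a playbook hash containing every step
# that was not named in the skip list.
def generate_playbook(region:, skip: [])
  unknown = skip - ALL_STEPS
  raise ArgumentError, "unknown steps: #{unknown.join(', ')}" unless unknown.empty?

  steps = (ALL_STEPS - skip).map do |name|
    {
      'type' => 'CUSTOM_JAR',
      'name' => name,
      'actionOnFailure' => 'CANCEL_AND_WAIT',
      'jar' => "s3://example-bucket/jars/#{name}.jar", # placeholder path
      'arguments' => []
    }
  end

  { 'schema' => PLAYBOOK_SCHEMA, 'data' => { 'region' => region, 'steps' => steps } }
end

# Equivalent in spirit to passing --skip staging on the command line.
playbook = generate_playbook(region: 'eu-west-1', skip: ['staging'])
puts JSON.pretty_generate(playbook)
```

Rejecting unknown step names up front keeps a typo in the skip list from silently producing a playbook that still runs the step the user meant to skip.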

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

1 reaction
alexanderdean commented, Feb 14, 2017

Okay cool - assigned you to the 0.2.0 milestone per our convo a little earlier…

1 reaction
alexanderdean commented, Feb 13, 2017

Hi @BenFradet - you raise good points. I think it’s okay then to write this in Ruby, same as the rest of the codebase…
