EmrEtlRunner: add S3 bootstrap action removing empty $folder$ files

This job step will be responsible for keeping S3 in a clean state, namely by removing the empty *$folder$ files that the S3DistCp routine leaves behind.
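
As a rough illustration of what such a cleanup could look like, here is a minimal Python sketch. It is not EmrEtlRunner's implementation: the function names, the `_$folder$` suffix constant, and the use of boto3 are all assumptions made for this example. It selects only zero-byte keys ending in the marker suffix and defaults to a dry run.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed marker suffix; the issue itself only refers to "*$folder$" files.
FOLDER_SUFFIX = "_$folder$"


def empty_folder_markers(objects: Iterable[Tuple[str, int]]) -> Iterator[str]:
    """Yield keys that end with the marker suffix and are zero bytes long."""
    for key, size in objects:
        if key.endswith(FOLDER_SUFFIX) and size == 0:
            yield key


def delete_folder_markers(bucket: str, prefix: str = "", dry_run: bool = True) -> List[str]:
    """List (and, unless dry_run, delete) empty marker keys under a prefix.

    Hypothetical helper: requires boto3 and AWS credentials at call time.
    """
    import boto3  # imported lazily so the pure filter above has no dependencies

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    matched: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        contents = page.get("Contents", [])
        keys = list(empty_folder_markers((o["Key"], o["Size"]) for o in contents))
        if keys and not dry_run:
            # Each listing page holds at most 1000 keys, which is also the
            # per-request limit of delete_objects.
            s3.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": k} for k in keys]},
            )
        matched.extend(keys)
    return matched
```

Calling `delete_folder_markers("my-bucket", "enriched/")` with a placeholder bucket name only reports the matching keys; passing `dry_run=False` performs the deletion. Checking the size as well as the suffix avoids touching a non-empty object that merely happens to share the suffix.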

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 5
  • Comments: 13 (10 by maintainers)

Top GitHub Comments

2 reactions
BenFradet commented, Sep 26, 2017

Will investigate whether s3distcp is even capable of removing those files.

1 reaction
sebastianvillarroel commented, Aug 1, 2021

Top Results From Across the Web

Delete empty files with the "_$folder$" suffix in S3 buckets
Can I safely delete the empty files with the _$folder$ suffix that appear in my Amazon S3 bucket when I use Amazon EMR...

Configure EmrEtlRunner | Snowplow Documentation
The EmrEtlRunner makes use of Amazon Elastic Mapreduce (EMR) to process the raw log files and output the cleaned, enriched Snowplow events table ......

EmrEtlRunner returns 403 error. - Google Groups
Just ran the EmrEtlRunner with using pre-defined Snowplow enrichments but received 403 : ... Access denied trying to read bootstrap action file 's3://files....

Avoid creation of _$folder$ keys in S3 with hadoop (EMR)
You can safely delete any empty files with the ... destination prefixes in S3. set -ex RPM=bootstrap-actions/s3-dist-cp-2.2.0/s3-dist-cp-2.2 ...

Bootstrap Action & Managing secrets in AWS EMR PySpark job
Bootstrap Action: Basically, Bootstrap action is used to install required packages before the cluster is created ...
