
[Feature Request] Support for s3fs

See original GitHub issue

🚀 Feature Request

Hi, I’ve been using Hydra for about six months, so this is more of a question than a feature request, since I don’t know for sure whether this is already supported. I’d like to set hydra.run.dir to an S3 path (e.g. s3://<bucket_name>/<output_dir>) and have Hydra write all outputs to that S3 bucket. If a user already has the AWS CLI configured, then (maybe?) this functionality wouldn’t be too hard to support using something like the s3fs package.

Motivation

I’d like to run my program on AWS Batch, so I need a way to pipe the program’s output to S3, similar to how Hydra already captures all output in a single directory.

Pitch

What would be ideal for me as a user would be for Hydra to automatically determine whether to use the local filesystem or s3fs, based on whether I override hydra.run.dir with a directory with an s3:// prefix.
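As a rough illustration of the dispatch described in the pitch, the backend choice could hinge on the URL scheme of the configured run directory. This is a hypothetical sketch, not Hydra’s actual API; `pick_backend` and the backend names are made up for illustration.

```python
from urllib.parse import urlparse

def pick_backend(run_dir: str) -> str:
    """Choose a storage backend from the scheme of hydra.run.dir.

    Hypothetical helper: Hydra does not expose this today. The idea is
    that an s3:// prefix would select an s3fs-backed writer, and
    anything else would fall back to the local filesystem.
    """
    scheme = urlparse(run_dir).scheme
    return "s3fs" if scheme == "s3" else "local"

print(pick_backend("s3://my-bucket/outputs/2020-06-30"))  # s3fs
print(pick_backend("outputs/2020-06-30"))                 # local
```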

The immediate alternative would be to periodically copy the contents of the output directory to S3 within my program.
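That “periodically copy to S3” workaround can be sketched as a small sync loop. In this sketch the upload callable is injected so the code stays self-contained; in practice it would wrap something like boto3’s `upload_file`, and the bucket and prefix names below are placeholders.

```python
from pathlib import Path
from typing import Callable, Dict

def sync_dir(src: Path, upload: Callable[[Path, str], None],
             seen: Dict[str, float]) -> None:
    """Upload files under src that are new or modified since the last pass.

    seen maps a relative key to the mtime it was last uploaded with, so
    unchanged files are skipped on subsequent passes.
    """
    for path in src.rglob("*"):
        if path.is_file():
            key = path.relative_to(src).as_posix()
            mtime = path.stat().st_mtime
            if seen.get(key) != mtime:
                upload(path, key)
                seen[key] = mtime

# Hypothetical boto3-backed uploader (bucket/prefix are placeholders):
#   s3 = boto3.client("s3")
#   upload = lambda p, key: s3.upload_file(str(p), "my-bucket", f"outputs/{key}")
# Periodic driver inside the program:
#   seen = {}
#   while running:
#       sync_dir(Path("outputs"), upload, seen)
#       time.sleep(30)
```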

I’d be willing to open a pull request but I would probably need some guidance on how to get started.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

3 reactions
omry commented, Jun 30, 2020

Hi @samuelstanton! I have plans to abstract the working directory in Hydra, probably in 1.1. This will open the path for a plugin that supports S3 as the working-directory backend.

I am keeping this open as a reminder of the desire for S3 support. In the short term you are on your own; one thing you can consider is mounting your S3 bucket locally using something like s3fs-fuse and configuring the Hydra sweep/run dir to point at that mount point.

2 reactions
oliversssf2 commented, Jun 15, 2022

Maybe consider adopting fsspec, so that not only S3 but also Azure, GCS, and many other file systems can be used?


