question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

upload directories

See original GitHub issue

Sometimes I’m working on a python package which is just a directory with contents constantly being altered. I just want to send that directory to my dask workers without making a zipfile or egg or whatever. Currently I use these functions to get the job done:

def fn_to_targz_string(fn):
    with io.BytesIO() as bt:
        with tarfile.open(fileobj=bt,mode='w:gz') as tf:
            tf.add(fn,arcname=os.path.basename(fn))
        bt.seek(0)
        s=bt.read()
    return s

def extract_targz_string(s,*args,**kwargs):
    import io,tarfile
    with io.BytesIO() as bt:
        bt.write(s)
        bt.seek(0)
        with tarfile.open(fileobj=bt,mode='r:gz') as tf:
            tf.extractall(*args,**kwargs)

If we used tarfile in this way for Client.upload_file, it wouldn’t matter whether the user wanted to upload a file or a directory. It should just work.

I believe the only necessary changes would be replacing the beginning of Client._upload_file with something like fn_to_targz_string (shown above) and the beginning of Worker.upload_file with something like extract_targz_string (shown above). If you wanted to be complete, you could also add something to the belly of Worker.upload_file so that the module would be properly inserted into names_to_import.

Issue Analytics

  • State:open
  • Created 7 years ago
  • Reactions:2
  • Comments:15 (6 by maintainers)

github_iconTop GitHub Comments

3reactions
GreatYYXcommented, Oct 7, 2018

any update? I found this PR https://github.com/dask/distributed/pull/939 but it was closed. I’m wondering if there’s a way to upload a directory and import only importable files in that directory.

2reactions
sonicxmlcommented, Nov 13, 2020

After thinking about this more, it does seem that having a separate upload_dir function would be useful beyond just clarity and ergonomics. In my case, for example, the code is set up like:

src/
  foo/
  bar/
  baz/
    a.py
    b.py

and Python files import each other using import src.baz.a. When I upload files to dask workers, I want to only upload the baz folder but preserve the src/baz hierarchy.

I was thinking an API like def upload_dir(path_to_dir, path_prefix="") (kwarg naming tbd) would be helpful in this case. To go through some examples:

  • upload_dir('src/baz') would upload just the baz directory and add baz to the Python path on the workers
  • upload_dir('src/baz', path_prefix='src/') would upload the bazdirectory to asrc/bazdirectory on the workers and addsrc/baz` to the python path

Edit:

On second thought, it might even be more intuitive to have upload_dir('src/baz') upload the contents of the baz folder with no parent directory, and upload_dir('src/baz', path_prefix='src/baz') would upload the contents of baz into a ‘src/baz’ folder on the remote workers.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Upload files and folders to Google Drive - Android
On your Android phone or tablet, open the Google Drive app. Tap Add Add question . Tap Upload. Find and tap the files...
Read more >
Directory Upload Example
There are two ways we can upload files and directories: a file input with the "directory" attribute, or an element with a drop...
Read more >
Upload files and folders to a library - Microsoft Support
Open the OneDrive or SharePoint site library. · On your computer select Start · Navigate to the folder with the documents that you...
Read more >
Using web interface only, how to upload folders or directories
Described question : using gitlab accross web interface only, it is possible to upload files, to create folder. But it seems not possible...
Read more >
Upload Directories — ExpressionEngine 7 Documentation
Upload Directories. An Upload Directory is a representation of a location where the files are being stored. The files can be stored on...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found