Upload file to new workers only


Currently, upload_file only sends files to the workers that are connected to the scheduler at the moment it is called. A worker that connects to the scheduler afterwards does not receive the files that were sent with upload_file earlier.

This is a problem because we then cannot assume that every worker has the file available.

  1. One simple workaround is to store the identity of the connected workers just before calling upload_file and, from then on, only send jobs requiring this file to those workers. But this prevents scaling, because workers that join later can never run those jobs.

  2. Another possibility would be to register a SchedulerPlugin and use its add_worker event function to resend the file with upload_file whenever a new worker joins. If this solution is preferred, it would be good for upload_file to accept a worker parameter, so that we can target the newly connected worker and avoid re-uploading the file to workers that already have it.

  3. As an extension of solution 2, add a future_workers bool parameter to upload_file and extend upload_file to automatically register the SchedulerPlugin described above.

  4. Other ideas?
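
Workaround 1 above could look roughly like the following. This is only a minimal sketch, not something from the issue itself: the helper names upload_and_snapshot and submit_pinned are hypothetical, client is assumed to be a distributed.Client, and the sketch assumes Client.nthreads() is an acceptable way to list the currently connected worker addresses and that the workers= keyword of Client.submit restricts where a task may run.

```python
def snapshot_workers(client):
    """Return the addresses of the workers connected right now."""
    # Client.nthreads() returns a dict keyed by worker address.
    return sorted(client.nthreads())


def upload_and_snapshot(client, path):
    """Record the connected workers, then upload the file to them.

    The snapshot is taken before the upload, so every recorded worker
    was connected when the file was sent. Workers that join in between
    are simply not recorded.
    """
    workers = snapshot_workers(client)
    client.upload_file(path)
    return workers


def submit_pinned(client, workers, func, *args, **kwargs):
    """Submit a task that may only run on the recorded workers."""
    return client.submit(func, *args, workers=workers, **kwargs)
```

The list returned by upload_and_snapshot has to be kept around and passed to every later submit_pinned call, which is exactly why this approach does not scale: workers that join afterwards are never used for these jobs.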

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

jrbourbeau commented, Mar 7, 2020 (2 reactions)

@H4dr1en FWIW I’ve used the following WorkerPlugin in the past:


import os
import asyncio
from distributed.diagnostics.plugin import WorkerPlugin

class UploadFile(WorkerPlugin):
    """ Upload local package to workers

    Parameters
    ----------
    filepath : str, List[str]
        Filename of .py, .egg or .zip file to send to workers

    Examples
    --------
    Upload a local Python module, for example ``foo.py``, to workers:

    >>> plugin = UploadFile("foo.py")
    >>> client.register_worker_plugin(plugin)
    """

    name = "upload_file"

    def __init__(self, filepath):
        if isinstance(filepath, str):
            filepath = [filepath]

        self.data = {}
        for file_ in filepath:
            with open(file_, "rb") as f:
                filename = os.path.basename(file_)
                self.data[filename] = f.read()

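    async def setup(self, worker):
        # setup() runs on each worker when the plugin is registered and
        # again on every worker that connects later, so new workers also
        # receive the files.
        # Note: the comm argument matches the Worker.upload_file signature
        # at the time of this comment; check your installed version.
        responses = await asyncio.gather(
            *[
                worker.upload_file(comm=None, filename=filename, data=data, load=True)
                for filename, data in self.data.items()
            ]
        )

        # Check that each worker reports having written every byte.
        assert all(
            len(data) == r["nbytes"] for r, data in zip(responses, self.data.values())
        )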

It may or may not serve as a good starting point for your use case.
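
To check whether an uploaded module is actually importable everywhere, something like the following might help. It is a hedged sketch rather than part of the plugin above: workers_missing_module and _has_module are hypothetical helper names, client is assumed to be a distributed.Client, and the sketch relies on Client.run executing a function on every connected worker and returning a dict keyed by worker address.

```python
import importlib.util


def _has_module(module_name):
    """Runs on a worker: True if the top-level module can be found there."""
    return importlib.util.find_spec(module_name) is not None


def workers_missing_module(client, module_name):
    """Return the addresses of workers that cannot find module_name."""
    # Client.run calls the function once per connected worker.
    results = client.run(_has_module, module_name)
    return sorted(addr for addr, found in results.items() if not found)
```

For the docstring example above, calling workers_missing_module(client, "foo") after a new worker has joined would be expected to return an empty list if the plugin did its job on that worker.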

MatthewLennie commented, Aug 25, 2020 (0 reactions)

@mrocklin It’s a bit above my skill level as it currently stands, having just transitioned from mechanical engineering to SE. Interested yes, capable not yet. Sorry to be no help yet.
