Run local code remotely on a worker

I find myself often wanting to run code on a worker, rather than on my local client. This happens in a few settings:

  1. My workers have access to a data store that I don’t, so I need to call something like dd.read_parquet remotely (cc @martindurant @jcrist)
  2. My workers are far away from my client, so client-heavy operations like joblib or Hyperband incur a serious bottleneck from client-scheduler communication (cc @stsievert )
  3. My workers have hardware or libraries like GPUs/RAPIDS that I don’t have locally (cc @quasiben @kkraus14)

Today I can do this by writing a function and submitting it as a task:

def f():
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    return df.sum().compute()

result = client.submit(f).result()
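
The submit-then-result pattern above can be wrapped in a small helper. This is only a rough sketch: `run_remotely` and `column_total` are hypothetical names, not part of Dask. It assumes nothing more than a client with a `submit(...)` method that returns a future exposing `result()`, which `distributed.Client` and the standard-library executors both provide. A `ThreadPoolExecutor` stands in for the Dask client so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def run_remotely(client, func, *args, **kwargs):
    """Submit func through client and block until its result is available.

    `client` is any object with a submit(...) method that returns a future
    exposing result(). This helper is hypothetical, not a Dask API.
    """
    future = client.submit(func, *args, **kwargs)
    return future.result()


def column_total(values):
    # Placeholder for the remote work, e.g. reading a dataset and summing it.
    return sum(values)


# Assumption for the demo: a thread pool stands in for a Dask client.
with ThreadPoolExecutor(max_workers=1) as pool:
    total = run_remotely(pool, column_total, [1, 2, 3])
```

With a real cluster, passing a `distributed.Client` in place of `pool` should behave the same way, with the function body executing on a worker.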

It might make sense to provide syntax around this to make it more magical (or it might not). We might do something like the following:

with dask.distributed.remote as result:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

I know that @eriknw has done magic like this in the past, so we could enlist his help. However, we may not want to do this, given how magical and novel the behavior would be.
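
For comparison, a less magical way to get similar ergonomics is a decorator that submits the function as soon as it is defined and binds the name to the result. This is a hypothetical sketch, not something Dask or afar provides. The `remote` decorator is an invented name, and a `ThreadPoolExecutor` again stands in for the Dask client.

```python
from concurrent.futures import ThreadPoolExecutor


def remote(client):
    """Return a decorator that runs the decorated function via client at once.

    The decorated name ends up bound to the function's result, not to the
    function itself. Hypothetical helper, not a Dask or afar API.
    """
    def decorator(func):
        return client.submit(func).result()
    return decorator


# Assumption for the demo: a thread pool stands in for a Dask client.
with ThreadPoolExecutor(max_workers=1) as pool:

    @remote(pool)
    def result():
        values = [1, 2, 3]  # placeholder for dask_cudf.read_parquet("s3://...")
        return sum(values)
```

After the block runs, `result` holds the computed value rather than a function. That rebinding is itself a little surprising, which is the same trade-off the `with` syntax raises.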

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 35 (35 by maintainers)

Top GitHub Comments

1 reaction
jrbourbeau commented, Aug 30, 2021

Ah great, I look forward to taking them for a spin. Thanks @eriknw!

1 reaction
eriknw commented, Aug 30, 2021

Update: afar 0.5 now supports IPython magics!

%load_ext afar

%%afar
import dask_cudf
df = dask_cudf.read_parquet("s3://...")
result = df.sum().compute()

instead of the original:

def f():
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    return df.sum().compute()

result = client.submit(f).result()

More examples:

%%afar x, y  # save both x and y as Dask Futures
x = 1
y = x + 1

z = %afar x + y

or

%afar z = x + y

I think this is starting to get pretty nice.
