
`distributed.print()` breaks easily because of stringified kwargs


Describe the issue: distributed.print() is a great idea, but shouldn’t it pickle its kwargs for proper deserialization by the client instead of stringifying them? My understanding (and the apparent intention from looking at the implementation in worker.py and client.py) is for distributed.print() to be a drop-in replacement for builtins.print() which workers can use to print stuff back to the client session, but by not truly serializing/deserializing the arguments, it breaks as a drop-in replacement.
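
To make the distinction concrete, here is a minimal sketch that does not touch distributed at all. The helper names `stringify_kwargs` and `pickle_roundtrip` are hypothetical and only mimic the two strategies being compared; they are not the library's actual code.

```python
import pickle


def stringify_kwargs(kwargs):
    # Hypothetical stand-in for stringifying: every value becomes its str() form.
    return {key: str(value) for key, value in kwargs.items()}


def pickle_roundtrip(kwargs):
    # Hypothetical stand-in for serializing on one side and deserializing on the other.
    return pickle.loads(pickle.dumps(kwargs))


original = {"file": None, "end": None, "flush": False}
stringified = stringify_kwargs(original)
roundtripped = pickle_roundtrip(original)

# Stringifying loses the types; a pickle round trip preserves them.
print(stringified)
print(roundtripped)
```

After stringification, `None` and `False` are indistinguishable from the strings "None" and "False", which is why the receiving side cannot recover the original arguments.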

Minimal Complete Verifiable Example:

from dask import distributed
from distributed import print as dask_print
client = distributed.Client()

# built-in print() works fine as expected
print("hello 1", file=None)

# dask_print works fine from the client session
dask_print("hello 2", file=None)

def do_print():
    dask_print("hello 3", file=None)
    
# this demonstrates the bug.
# it raises `AttributeError: 'str' object has no attribute 'write'`
# because `file=None` has become `file="None"`!
client.submit(do_print)

I’m sure you can think of other ways that this breaks. For example, print(..., end=None) becomes print(..., end="None") (hehe) and print(..., flush=False) becomes print(..., flush="False") (so it WILL flush).
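
These failure modes can be reproduced with the built-in print() alone, by passing the stringified values directly. This is only a sketch of what the receiving side would see under the assumption that each kwarg arrives as its str() form; the variable names are illustrative.

```python
import io

buffer = io.StringIO()

# end=None means "use the default newline"; end="None" appends the literal text.
print("hello", end=None, file=buffer)
print("hello", end="None", file=buffer)
output = buffer.getvalue()

# file="None" is a str, which has no write() method.
try:
    print("hello", file="None")
    file_error = None
except AttributeError as exc:
    file_error = str(exc)

# Any non-empty string is truthy, so flush="False" still requests a flush.
flush_requested = bool("False")
```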

Environment:

Dask version: Tested with 2022.9.1 but master still looks to be affected.
Python version: 3.10.x
Operating System: Linux
Install method (conda, pip, source): pip


Top GitHub Comments

maxbane commented, Oct 7, 2022

I’ve submitted PR #7129 – we can continue the discussion over there and eventually close this issue if it looks good.

maxbane commented, Oct 6, 2022

Cool, yeah, I’ll take a crack at it.
