`distributed.print()` breaks easily because of stringified kwargs
Describe the issue:
`distributed.print()` is a great idea, but shouldn’t it pickle its kwargs for proper deserialization by the client instead of stringifying them? My understanding (and the apparent intention, judging from the implementation in `worker.py` and `client.py`) is that `distributed.print()` is meant to be a drop-in replacement for `builtins.print()` that workers can use to print back to the client session. Because the arguments are not truly serialized and deserialized, it breaks as a drop-in replacement.
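To make the suggestion concrete, here is a minimal sketch of what pickling the kwargs could look like. It is not the actual code in `worker.py` or `client.py`; `pack_print_call` and `replay_print_call` are hypothetical names, and the sketch assumes the client chooses the output stream itself, since real file objects such as `sys.stdout` generally cannot be pickled.

```python
import io
import pickle


def pack_print_call(*args, **kwargs):
    """Hypothetical worker-side helper: serialize a print() call to bytes."""
    # Positional args are stringified (print() would do that anyway);
    # keyword args are pickled so they keep their real Python types.
    return pickle.dumps({"args": [str(a) for a in args], "kwargs": kwargs})


def replay_print_call(payload, file):
    """Hypothetical client-side helper: rebuild the call and run print()."""
    msg = pickle.loads(payload)
    kwargs = msg["kwargs"]
    # Assumption: the client decides where output goes, so any `file`
    # sent by the worker is dropped here.
    kwargs.pop("file", None)
    print(*msg["args"], file=file, **kwargs)
    return kwargs


payload = pack_print_call("hello 3", end=None, flush=False)
buffer = io.StringIO()
restored = replay_print_call(payload, file=buffer)
# `restored` still holds None and False, not the strings "None" and "False"
```

With this round trip, `end=None` falls back to the default newline and `flush=False` stays falsy, which is what a drop-in replacement for `builtins.print()` would need.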
Minimal Complete Verifiable Example:
from dask import distributed
from distributed import print as dask_print
client = distributed.Client()
# built-in print() works fine as expected
print("hello 1", file=None)
# dask_print works fine from the client session
dask_print("hello 2", file=None)
def do_print():
    dask_print("hello 3", file=None)

# This demonstrates the bug.
# It raises `AttributeError: 'str' object has no attribute 'write'`
# because `file=None` has become `file="None"`!
client.submit(do_print)
I’m sure you can think of other ways that this breaks. For example, `print(..., end=None)` becomes `print(..., end="None")` (hehe) and `print(..., flush=False)` becomes `print(..., flush="False")` (so it WILL flush, because a non-empty string is truthy).
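The failure modes above can be reproduced without a cluster. The snippet below is only an illustration: `stringify_kwargs` is a hypothetical stand-in for the stringification step, not the function distributed actually uses.

```python
import io


def stringify_kwargs(**kwargs):
    """Hypothetical stand-in for the stringification described above."""
    return {key: str(value) for key, value in kwargs.items()}


broken = stringify_kwargs(file=None, end=None, flush=False)

# flush="False" is a non-empty string, so print() treats it as true.
flush_is_truthy = bool(broken["flush"])

# end="None" is appended literally instead of the default newline.
buffer = io.StringIO()
print("hello", end=broken["end"], file=buffer)

# file="None" is a str, which has no write() method.
try:
    print("hello 3", file=broken["file"])
    error = None
except AttributeError as exc:
    error = str(exc)
```

After this runs, `buffer` contains `helloNone` and `error` holds the same `AttributeError` message as in the example above.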
Environment:
- Dask version: Tested with 2022.9.1, but master still looks to be affected.
- Python version: 3.10.x
- Operating System: Linux
- Install method (conda, pip, source): pip
Top GitHub Comments
I’ve submitted PR #7129 – we can continue the discussion over there and eventually close this issue if it looks good.
Cool, yeah, I’ll take a crack at it.