pure delayed gives wrong result for dataclass methods
What happened:
from dask import delayed
from dataclasses import dataclass, field

@dataclass(frozen=True)
class A:
    param: float = field(repr=False)

    def get_param(self):
        return self.param

    def get_delayed_param(self, *args, **kwargs):
        return delayed(self.get_param, pure=True)(*args, **kwargs)

(A(1).get_delayed_param() - A(0).get_delayed_param()).compute()
Out[2]: 0
This is an incorrect result. Apparently, A(1).get_delayed_param().key is erroneously the same as A(0).get_delayed_param().key. It seems that tokenize fails and falls back on something that gives the same result for both, since str(A(1)) == str(A(0)).
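The string collision mentioned above can be checked without dask. The following is a minimal standard-library sketch. It does not show what tokenize does internally, and it is not a claim about where the fallback happens. It only confirms that field(repr=False) makes the two instances indistinguishable by str() while they still differ by equality and by field values.

```python
from dataclasses import dataclass, field, astuple


@dataclass(frozen=True)
class A:
    # repr=False removes the field from the generated __repr__ only;
    # it is still used by the generated __eq__ and __hash__.
    param: float = field(repr=False)


a1, a0 = A(1), A(0)

# The string forms collide because the only field is hidden from the repr.
print(str(a1), str(a0))
print(str(a1) == str(a0))

# The instances themselves are still distinguishable.
print(a1 == a0)
print(astuple(a1), astuple(a0))
```

Any key derived only from the string form of such an object would therefore be identical for A(1) and A(0), whereas a key derived from the field values would not.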
What you expected to happen: The correct result is obtained without delayed:
A(1).get_param() - A(0).get_param()
Out[3]: 1
One also gets the correct result when not using dataclasses:
class B:
    def __init__(self, param):
        self.param = param

    def __repr__(self):
        return 'B()'

    def get_param(self):
        return self.param

    def get_delayed_param(self, *args, **kwargs):
        return delayed(self.get_param, pure=True)(*args, **kwargs)

(B(1).get_delayed_param() - B(0).get_delayed_param()).compute()
Out[4]: 1
Here, we get the correct result even though str(B(1)) == str(B(0)).
Environment:
- Dask version: 2021.10.0
- Python version: 3.9.4.final.0
- Operating System: Windows 10
- Install method (conda, pip, source): pip
Yeah this slipped off my list. I’d be pleased for you to take it over!
Closed via https://github.com/dask/dask/pull/8527