pure delayed gives wrong result for dataclass methods

What happened:


from dask import delayed
from dataclasses import dataclass, field


@dataclass(frozen=True)
class A:
    param: float = field(repr=False)

    def get_param(self):
        return self.param

    def get_delayed_param(self, *args, **kwargs):
        return delayed(self.get_param, pure=True)(*args, **kwargs)

(A(1).get_delayed_param() - A(0).get_delayed_param()).compute()
Out[2]: 0

This result is incorrect: the difference should be 1.

Apparently, A(1).get_delayed_param().key is erroneously the same as A(0).get_delayed_param().key. It seems that tokenize fails and falls back on something that yields the same result for both instances, since str(A(1)) == str(A(0)): the only field is declared with repr=False, so it is left out of the string.
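
To illustrate the suspected cause without involving dask, here is a minimal sketch using only the standard library. It shows that the two instances have identical string representations even though they are unequal, and that a token built from the field values would tell them apart. The helper field_token is hypothetical and written only for this illustration; it is not how dask computes keys.

```python
from dataclasses import dataclass, field, fields
import hashlib


@dataclass(frozen=True)
class A:
    param: float = field(repr=False)


def field_token(obj):
    """Hypothetical helper (not part of dask): hash the class name and all field values."""
    parts = [type(obj).__qualname__]
    parts += [f"{f.name}={getattr(obj, f.name)!r}" for f in fields(obj)]
    return hashlib.sha1("|".join(parts).encode()).hexdigest()


# repr=False hides the only field, so both instances render identically.
same_repr = repr(A(1)) == repr(A(0))

# A token that includes the field values differs between the two instances.
same_token = field_token(A(1)) == field_token(A(0))

print(same_repr, same_token)  # True False
```

Any key derived only from the string representation would therefore collide for A(1) and A(0), which is consistent with the identical keys observed above.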

What you expected to happen:

The correct result is obtained without delayed:

A(1).get_param() - A(0).get_param()
Out[3]: 1

One also gets the correct result when using a plain class instead of a dataclass:


class B:

    def __init__(self, param):
        self.param = param

    def __repr__(self):
        return 'B()'

    def get_param(self):
        return self.param

    def get_delayed_param(self, *args, **kwargs):
        return delayed(self.get_param, pure=True)(*args, **kwargs)

(B(1).get_delayed_param() - B(0).get_delayed_param()).compute()
Out[4]: 1

Here, we get the correct result even though str(B(1)) == str(B(0)).

Environment:

  • Dask version: 2021.10.0
  • Python version: 3.9.4.final.0
  • Operating System: Windows 10
  • Install method (conda, pip, source): pip

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

jsignell commented, Dec 1, 2021

Yeah this slipped off my list. I’d be pleased for you to take it over!

