Unconditional gather of operands causes unexpected exception.

Description

This may sound a little ridiculous, but bear with me for a minute. It seems that there is no way for a flow to reference an object that starts out serializable but later becomes unserializable. I think this is because all task results are eventually gathered and returned by Flow.run(). The gather raises an exception even if the mutated object is never referenced in the flow after the mutation. It seems to happen regardless of how the object entered the flow: constructed in the flow context, created in a task, or passed in as a Parameter.

In Prefect 0.7.1 this does not raise an exception if the object is constructed within the flow context. In that case the Flow contains a task: <Task: Constant[Mutator]>. Running with 0.9.2 there is no corresponding task, which may violate the goal of avoiding implicit dependencies?

This could be a crazy edge case or an unanticipated use case, but it has some interesting implications even for ordinary cases. For example, the default behavior is to gather all operands to the submitting machine, which I’ve found really convenient for debugging, until it crashes the node that submitted the job in the first place!

Expected Behavior

Don’t gather the object after it has been mutated?
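Short of changing what Flow.run() gathers, one possible way to sidestep the failure is to make the object itself tolerant of being pickled after mutation. The sketch below is an assumption for illustration, not something proposed in the thread: PicklableMutator is a hypothetical variant of the Mutator class from the reproduction that defines __getstate__ so the generator is dropped whenever the instance is serialized.

```python
import pickle


def gen_gen():
    yield 20
    yield 30


class PicklableMutator:
    """Hypothetical variant of Mutator that drops its generator when pickled."""

    def __init__(self):
        self.value = 1
        self.generator = None

    def mutate(self):
        # Same mutation as in the reproduction: attach a live generator.
        self.generator = gen_gen()

    def __getstate__(self):
        # Copy the instance dict and blank out the attribute pickle cannot handle.
        state = self.__dict__.copy()
        state["generator"] = None
        return state


if __name__ == "__main__":
    m = PicklableMutator()
    m.mutate()
    # The round trip succeeds, but the generator is not restored.
    restored = pickle.loads(pickle.dumps(m))
    print(restored.value, restored.generator)
```

The trade-off is that whoever receives the gathered copy sees generator set to None, which is only acceptable when the generator is worker-local state that nothing downstream needs.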

Reproduction

#!/usr/bin/env python

import sys

# Flow specific
from prefect import task, Flow
from prefect.engine.executors import DaskExecutor
from site_specific import get_dask_client


def start_executor():
    client = get_dask_client()
    executor = DaskExecutor(client.scheduler_info()['address'], debug=True)
    return executor


def gen_gen():
    yield 20
    yield 30


class Mutator(object):
    def __init__(self):
        # Only picklable state at construction time
        self.value = 1

    def mutate(self):
        # A generator attribute makes the instance unpicklable
        self.generator = gen_gen()


@task
def mutate(mutated):
    mutated.mutate()
    return True


@task
def make_mutator():
    return Mutator()

def main():
    with Flow("mutate_flow") as f:
        # This also raises the exception
        # mutator = Mutator()
        mutator = make_mutator()
        mutated = mutate(mutator)

    f.run(executor=start_executor())


if __name__ == "__main__":
    sys.exit(main())

This gives us the following stack trace:

[2020-02-21 14:18:24,410] INFO - prefect.FlowRunner | Starting flow run.
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "lib/python3.7/site-packages/distributed/protocol/core.py", line 124, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "lib/python3.7/site-packages/distributed/protocol/serialize.py", line 268, in deserialize
    return loads(header, frames)
  File "lib/python3.7/site-packages/distributed/protocol/serialize.py", line 80, in serialization_error_loads
    raise TypeError(msg)
TypeError: Could not serialize object of type Success
Traceback (most recent call last):
  File "lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
_pickle.PicklingError: Can't pickle <class '__main__.Mutator'>: attribute lookup Mutator on __main__ failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.7/site-packages/distributed/protocol/serialize.py", line 191, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "lib/python3.7/site-packages/distributed/protocol/serialize.py", line 58, in pickle_dumps
    return {"serializer": "pickle"}, [pickle.dumps(x)]
  File "lib/python3.7/site-packages/distributed/protocol/pickle.py", line 51, in dumps
    return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 1125, in dumps
    cp.dump(obj)
  File "lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 482, in dump
    return Pickler.dump(self, obj)
  File "lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "lib/python3.7/pickle.py", line 890, in _batch_setitems
    save(v)
  File "lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
TypeError: can't pickle generator objects
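
The final TypeError at the bottom of the trace can be reproduced without Prefect or Dask. The following is a minimal sketch using only the standard library; types.SimpleNamespace stands in for the Mutator class, and can_pickle is a hypothetical helper introduced here for illustration. It shows the same object passing pickle before the mutation and failing afterwards (the exact error message varies by Python version).

```python
import pickle
from types import SimpleNamespace


def gen_gen():
    yield 20
    yield 30


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Stand-in for Mutator(): plain attribute storage, picklable when created.
m = SimpleNamespace(value=1)
before = can_pickle(m)

# Equivalent of Mutator.mutate(): attach a live generator.
m.generator = gen_gen()
after = can_pickle(m)

print("picklable before mutation:", before)
print("picklable after mutation:", after)
```

Any step that later serializes the whole object, such as gathering task results back to the submitting process, hits the second case even if no task reads the object again.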

Environment

I’m using prefect 0.9.1 and distributed 2.10.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

markkoob commented, Mar 9, 2020 (1 reaction)

That sounds like a good solution to me. Thanks! Feel free to close this as not-a-bug or whatever makes sense to you!

cicdw commented, Mar 9, 2020 (0 reactions)

I’ll close, but feel free to comment back here if you run into further issues!
