[BUG] [flytekit] Expected behavior for returning FlyteSchema inside a dataclass

Describe the bug

Hey, we have been running into an issue: when we try to return a FlyteSchema within a dataclass, we get the following error (flytekit 0.21.5):

{"asctime": "2021-09-24 12:52:26,103", "name": "flytekit", "levelname": "INFO", "message": "Task executed successfully in user level, outputs: SomeResult(some_int=1, some_schema=<flytekit.types.schema.types.FlyteSchema.__class_getitem__.<locals>._TypedSchema object at 0x7f94c63e2700>)"}
INFO:flytekit:Task executed successfully in user level, outputs: SomeResult(some_int=1, some_schema=<flytekit.types.schema.types.FlyteSchema.__class_getitem__.<locals>._TypedSchema object at 0x7f94c63e2700>)
ERROR:root:!! Begin System Error Captured by Flyte !!
ERROR:root:Traceback (most recent call last):

      File "/usr/local/lib/python3.8/dist-packages/flytekit/common/exceptions/scopes.py", line 165, in system_entry_point
        return wrapped(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/flytekit/core/base_task.py", line 546, in dispatch_execute
        raise AssertionError(f"failed to convert return value for var {k}") from e

Message:

    failed to convert return value for var o0

SYSTEM ERROR! Contact platform administrators.
ERROR:root:!! End Error Captured by Flyte !!

We brought this down to a minimal example that triggers the error:

import pandas as pd

from dataclasses import dataclass
from dataclasses_json import dataclass_json
from flytekit import kwtypes, task, workflow
from flytekit.types.schema import FlyteSchema

SomeSchema = FlyteSchema[kwtypes(some_str=str)]


@dataclass_json
@dataclass
class SomeResult:
    some_int: int
    some_schema: SomeSchema


@task
def some_task() -> SomeResult:
    some_schema = SomeSchema()
    some_df = pd.DataFrame(data={"some_str": ["a", "b", "c"]})
    some_schema.open().write(some_df)

    return SomeResult(some_int=1, some_schema=some_schema)


@workflow
def some_workflow() -> SomeResult:
    return some_task()

Our main question is: is it expected that a FlyteSchema cannot be used this way, and if so, is there a nice way to do it otherwise? We’re a bit unsure at the moment, so we’re not saying there is actually anything wrong with flytekit 😅

We’re trying to return a FlyteDirectory together with a FlyteSchema in one object, so that the FlyteDirectory gets uploaded to its remote_directory and we can still use the other element of the object for other operations. A way to force an upload of a FlyteDirectory without returning it would also be a viable solution for us.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions:2
  • Comments:6 (5 by maintainers)

Top GitHub Comments

4 reactions
kumare3 commented, Sep 24, 2021

@mdjong1 seems the welcome bot is confused haha.

On the other hand:

  1. Look forward to the next release; it has a huge debugging improvement. All errors are prettified and more actionable. We would appreciate feedback.

  2. Dataclasses are passed as opaque JSON in the system today, so it’s not currently possible to embed higher-level Flyte objects in them. That said, this seems to be one of the top requests from users. We are thinking of a smart way to achieve this while keeping type safety and the experience high class. Off the top of my head, I feel we can make it happen within dataclasses, but the experience may be hard to translate to the UI.

Cc @pingsutw / @eapolinario yet another one
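
As a rough illustration of the "opaque JSON" point in the comment above, the sketch below uses only the Python standard library. It is not flytekit's actual conversion code. `RemoteTable` is a hypothetical stand-in for a richer object such as a FlyteSchema, the `s3://` URI is made up, and `to_opaque_json` is an assumed helper name. It shows that a dataclass of plain values round-trips to JSON, while one that embeds a non-JSON object does not.

```python
import json
from dataclasses import asdict, dataclass


class RemoteTable:
    """Hypothetical stand-in for a rich object such as a FlyteSchema.

    It holds a handle to data stored elsewhere rather than plain JSON values.
    """

    def __init__(self, uri: str) -> None:
        self.uri = uri


@dataclass
class PlainResult:
    some_int: int
    some_str: str


@dataclass
class RichResult:
    some_int: int
    some_table: RemoteTable


def to_opaque_json(obj) -> str:
    """Serialize a dataclass instance to a JSON string (assumed helper)."""
    return json.dumps(asdict(obj))


# A dataclass that contains only JSON-friendly values serializes fine.
plain_json = to_opaque_json(PlainResult(some_int=1, some_str="a"))
print(plain_json)

# A dataclass that embeds a richer object cannot be turned into plain JSON.
rich_error = None
try:
    to_opaque_json(RichResult(some_int=1, some_table=RemoteTable("s3://bucket/table")))
except TypeError as exc:
    rich_error = exc
    print(f"cannot serialize: {exc}")
```

Under that reading, one option is to keep such objects out of the dataclass and handle them separately; whether that fits a given flytekit version is something to check against its documentation.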

1 reaction
kumare3 commented, Oct 12, 2021

@nicklofaso thank you for the comment. Some other community members have implemented something similar.

