[BUG] [flytekit] Expected behavior for returning FlyteSchema inside a dataclass
Describe the bug
Hey, we have been running into an issue where we’re trying to return a FlyteSchema within a dataclass, and it gives us the following error (flytekit 0.21.5):
{"asctime": "2021-09-24 12:52:26,103", "name": "flytekit", "levelname": "INFO", "message": "Task executed successfully in user level, outputs: SomeResult(some_int=1, some_schema=<flytekit.types.schema.types.FlyteSchema.__class_getitem__.<locals>._TypedSchema object at 0x7f94c63e2700>)"}
INFO:flytekit:Task executed successfully in user level, outputs: SomeResult(some_int=1, some_schema=<flytekit.types.schema.types.FlyteSchema.__class_getitem__.<locals>._TypedSchema object at 0x7f94c63e2700>)
ERROR:root:!! Begin System Error Captured by Flyte !!
ERROR:root:Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flytekit/common/exceptions/scopes.py", line 165, in system_entry_point
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/flytekit/core/base_task.py", line 546, in dispatch_execute
    raise AssertionError(f"failed to convert return value for var {k}") from e
Message:
failed to convert return value for var o0
SYSTEM ERROR! Contact platform administrators.
ERROR:root:!! End Error Captured by Flyte !!
We brought this down to a minimal example that triggers the error:
from dataclasses import dataclass

import pandas as pd
from dataclasses_json import dataclass_json
from flytekit import kwtypes, task, workflow
from flytekit.types.schema import FlyteSchema

SomeSchema = FlyteSchema[kwtypes(some_str=str)]


@dataclass_json
@dataclass
class SomeResult:
    some_int: int
    some_schema: SomeSchema


@task
def some_task() -> SomeResult:
    some_schema = SomeSchema()
    some_df = pd.DataFrame(data={"some_str": ["a", "b", "c"]})
    some_schema.open().write(some_df)
    return SomeResult(some_int=1, some_schema=some_schema)


@workflow
def some_workflow() -> SomeResult:
    return some_task()
Our main question is: is it expected that a FlyteSchema cannot be used this way, and if so, is there a nice way to do it otherwise? We’re a bit unsure at the moment, so we’re not saying there actually is anything wrong with flytekit 😅
We’re trying to return both a FlyteDirectory and a FlyteSchema in one object, so that the FlyteDirectory gets uploaded to its remote_directory and we can still use the other element in the object for other operations. Having a way to force an upload of a FlyteDirectory without returning it would also be a viable solution for us.
Issue Analytics
- Created 2 years ago
- Reactions:2
- Comments:6 (5 by maintainers)
@mdjong1 seems the welcome bot is confused haha.
On the other hand, look forward to the next release: it has a huge debugging improvement. All errors are prettified and more actionable. We would appreciate feedback.
So dataclasses are passed through the system as opaque JSON today, which means it’s not currently possible to embed higher-level Flyte objects in them. That being said, this seems to be one of the top requests from most users. We are thinking of a smart way of achieving this while keeping type safety and the user experience first-class. Off the top of my head, I feel we can make it happen within dataclasses, but the experience may be hard to translate to the UI.
Cc @pingsutw / @eapolinario yet another one
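The point above about dataclasses being passed as opaque JSON can be illustrated with a minimal standard-library sketch. This is only an analogy and not flytekit's actual conversion code: `FakeSchema`, `PlainResult`, `RichResult`, and `to_opaque_json` are hypothetical names introduced here, and `FakeSchema` merely stands in for a richer object such as a FlyteSchema. It shows that a dataclass holding only primitive fields serializes cleanly, while one holding a richer object does not.

```python
import json
from dataclasses import asdict, dataclass


class FakeSchema:
    """Hypothetical stand-in for a rich object such as FlyteSchema (not flytekit code)."""

    def __init__(self, uri: str) -> None:
        self.uri = uri


@dataclass
class PlainResult:
    # Only JSON-friendly primitives.
    some_int: int
    some_str: str


@dataclass
class RichResult:
    # Mixes a primitive with an object a generic JSON encoder knows nothing about.
    some_int: int
    some_schema: FakeSchema


def to_opaque_json(value) -> str:
    """Serialize a dataclass instance with the plain JSON encoder (assumed analogy)."""
    return json.dumps(asdict(value))


# Primitive-only dataclass: serializes without trouble.
plain_json = to_opaque_json(PlainResult(some_int=1, some_str="a"))

# Dataclass with an embedded rich object: the encoder raises TypeError.
try:
    to_opaque_json(RichResult(some_int=1, some_schema=FakeSchema("s3://bucket/path")))
    rich_error = None
except TypeError as exc:
    rich_error = str(exc)

print(plain_json)
print(rich_error)
```

In this sketch the failure surfaces as a `TypeError` from the `json` module. In the report above it surfaces instead as flytekit's "failed to convert return value" error, so the two should be read as similar in spirit rather than as the same code path.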
@nicklofaso thank you for the comment. Some other community members have implemented something similar