Deserialize traceback from stack trace string in Temporal failures if able

Is your feature request related to a problem? Please describe.

~~We chain errors when converting from failures by setting __cause__, but there is a report that the chained errors are not logged like normally chained errors~~

EDIT: We don’t rehydrate the traceback from the stack trace string

Describe the solution you’d like

~~Make sure we log chained errors normally and write a test to ensure it~~

EDIT: Java parses its string stack trace back into stack trace elements, so we should too. See https://github.com/ionelmc/python-tblib/blob/dd926c1e5dc5bbe5e1fc494443bbac8970c7d3ee/src/tblib/__init__.py#L200 for an example of how to do this.
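As a rough sketch of what such parsing could look like (this is neither the tblib implementation nor what the SDK does), the snippet below uses only the standard library to turn the `File "...", line N, in name` lines of a Python-formatted stack trace string into `traceback.FrameSummary` entries. The helper name `parse_stack_trace` and the regular expression are assumptions made for illustration, and real stack strings may use layouts this pattern does not cover.

```python
import re
import traceback
from typing import List

# Matches the frame header lines that Python's traceback module emits,
# e.g. '  File "/path/to/file.py", line 100, in my_function'.
_FRAME_RE = re.compile(
    r'^\s*File "(?P<file>.+)", line (?P<line>\d+), in (?P<name>.+)$'
)


def parse_stack_trace(stack: str) -> traceback.StackSummary:
    """Hypothetical helper: rebuild frame summaries from a stack trace string."""
    frames: List[traceback.FrameSummary] = []
    lines = stack.splitlines()
    i = 0
    while i < len(lines):
        match = _FRAME_RE.match(lines[i])
        i += 1
        if not match:
            continue  # Skip anything that is not a frame header.
        source = None
        # The line after a header, if it is not another header, is the source text.
        if i < len(lines) and not _FRAME_RE.match(lines[i]):
            source = lines[i].strip()
            i += 1
        frames.append(
            traceback.FrameSummary(
                match["file"],
                int(match["line"]),
                match["name"],
                lookup_line=False,  # Do not try to read the file from local disk.
                line=source,
            )
        )
    return traceback.StackSummary.from_list(frames)
```

Calling `"".join(parse_stack_trace(stack).format())` renders the frames back to text. Note that this only yields printable frame summaries. Producing a real traceback object that could be assigned to `__traceback__` takes more work, which appears to be the part the linked tblib code addresses.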

cretz commented, Jul 14, 2022

I have opened #75. In addition to other things we have suggested, it contains a test that appends the stack to all Temporal failure errors. Basically you can use this helper:

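from typing import Optional

from temporalio.exceptions import FailureError


def append_temporal_stack(exc: Optional[BaseException]) -> None:
    # Walk the chain of causes, starting from the outermost error
    while exc:
        # Only append if it isn't already there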
        if (
            isinstance(exc, FailureError)
            and exc.failure
            and exc.failure.stack_trace
            and len(exc.args) == 1
            and "\nStack:\n" not in exc.args[0]
        ):
            exc.args = (f"{exc}\nStack:\n{exc.failure.stack_trace.rstrip()}",)
        exc = exc.__cause__

I am hesitant to add it as a supported public utility at the moment (it’s very simple, only a couple of lines if the formatter didn’t break it out).

Since we cannot put the traceback back on the error and we don’t want the stack to be on every string representation of the error, opt-in on the user’s part is the only way.

I researched custom logging formatters, adapters, etc., and it boiled down to just being easier to alter the exceptions in the chain than to alter the logging calls. The traceback.format_exception and similar calls used by logging and others handle the chain for you, so you can’t add the stack after the fact there. And I don’t want to recreate my own chaining string formatter because I want to reuse Python’s. I would have preferred shallow copying all the exceptions (e.g. copy.copy()) and only adding the stack to the shallow copies, but that had problems maintaining the chain too. Same with customizing the log record and other approaches.

So basically, altering the exception is easiest. I chose to put the stack after the message instead of before because the internal Python exception formatter always puts the exception class name first, which means I can’t inject anything before that.

Using that helper above, given the following:

    try:
        raise ValueError("error1")
    except Exception as err:
        raise RuntimeError("error2") from err

Once serialized and sent back from Temporal, after running through append_temporal_stack, your output might look like:

temporalio.exceptions.ApplicationError: ValueError: error1
Stack:
  File "/path/to/file.py", line 100, in my_function
    raise ValueError("error1")

The above exception was the direct cause of the following exception:

temporalio.exceptions.ApplicationError: RuntimeError: error2
Stack:
  File "/path/to/file.py", line 201, in my_function
    raise RuntimeError("error2") from err

Which I think is about the best we can do.
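The effect can be tried without a Temporal server. The following is a minimal sketch under stated assumptions: `StandInFailureError` is a hypothetical stand-in class with a plain `stack_trace` attribute (it is not an SDK type), and `append_stack` mirrors the helper above against that stand-in. It shows that once the message is altered, Python's own `traceback.format_exception` carries the stack text through the whole `__cause__` chain.

```python
import traceback
from typing import Optional


class StandInFailureError(Exception):
    """Hypothetical stand-in for a Temporal failure error; not an SDK class."""

    def __init__(self, message: str, stack_trace: str) -> None:
        super().__init__(message)
        self.stack_trace = stack_trace


def append_stack(exc: Optional[BaseException]) -> None:
    # Walk the __cause__ chain and fold each stored stack string into the message.
    while exc:
        if (
            isinstance(exc, StandInFailureError)
            and exc.stack_trace
            and len(exc.args) == 1
            and "\nStack:\n" not in exc.args[0]
        ):
            exc.args = (f"{exc}\nStack:\n{exc.stack_trace.rstrip()}",)
        exc = exc.__cause__


def format_chain(exc: BaseException) -> str:
    # Python's own formatter walks __cause__ and prints each message in turn.
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


inner = StandInFailureError(
    "ValueError: error1",
    '  File "/path/to/file.py", line 100, in my_function\n',
)
outer = StandInFailureError(
    "RuntimeError: error2",
    '  File "/path/to/file.py", line 201, in my_function\n',
)
outer.__cause__ = inner  # Same link that `raise ... from ...` would set.

append_stack(outer)
formatted = format_chain(outer)
print(formatted)
```

Because the check for `"\nStack:\n"` guards the rewrite, calling `append_stack` a second time on the same chain leaves the messages unchanged.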

nathanielobrown commented, Jul 7, 2022

> After internal discussions, I can change those default worker-side log levels from debug to warning (but doesn’t change the fact that client-side is missing stack traces on their logs by default).

I like it!

I guess I personally don’t care so much that the stack traces I see are client side vs. just seeing stack traces by default when testing, but I can see how it’s important to have both. Well actually, it might get a little weird if we have both (duplicate output for testing case) but it’s important to at least have both options!

You and the rest of the team are doing such a great job of nailing all the details, I really enjoy following along. Thank you for engaging with my questions/issues.