Should streamz handle exceptions?


What should we do when a stream receives bad data that causes an exception to be raised? For example:

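from streamz import Stream

def foo(x):
    # Bad data (None) raises instead of being processed
    if x is None:
        raise Exception
    else:
        return x + 1

s = Stream()
s2 = s.map(foo)
s2.sink(print)

s.emit(1)     # prints 2
s.emit(None)  # foo raises; the exception propagates out of emit
s.emit(2)     # not reached unless the exception above is handled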

Here, foo is a point of vulnerability in the stream, where it may or may not cause the whole stream architecture to halt.

Is it worth trying to incorporate some quiet exception handling? I am not sure exactly how to tackle this so I’m being a little vague at this point. I can think of many ways of doing this. Here are a few:

  1. Catch the exceptions and emit them somehow (this will require defining a data type). Alternatively, we could emit nothing (and perhaps sink errors to a global list), but this may have unintended synchronization consequences for the user.
  2. Catch the exceptions in s.emit. Note that in this case the origin of the exception may be harder to find.

I’ll think about it, but I would like to hear opinions from @mrocklin and @CJ-Wright (who has already handled this in his streams extension). My current method is to wrap all mapped functions to look for exceptions, and return a document that flags the document as having encountered an exception. This works in my subclassed module only though. It would be nice to unify this I think.
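A minimal sketch of the wrapping approach described above, written in plain Python so it runs without streamz. It is not the code from either extension module: the names Failed and guard are hypothetical, and the list comprehension only stands in for a stream.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


@dataclass
class Failed:
    """Hypothetical marker for an element whose processing raised."""
    value: Any        # the input that triggered the error
    error: Exception  # the exception that was caught


def guard(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap func so it returns a Failed record instead of raising."""
    @wraps(func)
    def wrapper(x: Any) -> Any:
        if isinstance(x, Failed):
            return x  # already failed upstream: pass it through untouched
        try:
            return func(x)
        except Exception as exc:
            return Failed(value=x, error=exc)
    return wrapper


def foo(x):
    if x is None:
        raise Exception("bad data")
    return x + 1


# Stand-in for a stream: apply the guarded function to each element.
results = [guard(foo)(x) for x in (1, None, 2)]
```

With streamz, the wrapped function would presumably be passed to map in place of the original, as in s.map(guard(foo)), so that a bad element travels downstream as a flagged record rather than halting the stream.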

(Note: exceptions can occur not just in map but also in other nodes such as filter. Nodes like zip may also want to be exception-aware, i.e. recognize that something passing through is bad data and pass that information on.)

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
jdye64 commented, Dec 3, 2019

or sending the data + exception to another pipeline.

This is exactly how we handle exceptions in Apache NiFi and the feedback from community members seems to almost always be positive.
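A rough sketch of that routing idea in plain Python. It uses neither NiFi's nor streamz's API: route, good and errors are hypothetical names, and the two lists stand in for the normal pipeline and the error pipeline.

```python
from typing import Any, Callable, List, Tuple


def route(
    func: Callable[[Any], Any],
    on_ok: Callable[[Any], None],
    on_error: Callable[[Tuple[Any, Exception]], None],
) -> Callable[[Any], None]:
    """Send func's result to on_ok, or (input, exception) to on_error."""
    def handler(x: Any) -> None:
        try:
            result = func(x)
        except Exception as exc:
            on_error((x, exc))  # the data and the exception travel together
            return
        # Called outside the try block so that errors raised further
        # downstream are not attributed to func.
        on_ok(result)
    return handler


def foo(x):
    if x is None:
        raise ValueError("bad data")
    return x + 1


good: List[int] = []                      # stand-in for the normal pipeline
errors: List[Tuple[Any, Exception]] = []  # stand-in for the error pipeline

handle = route(foo, good.append, errors.append)
for item in (1, None, 2):
    handle(item)
```

After the loop, good holds the successfully processed values and errors holds one (input, exception) pair, so processing continues past the bad element while the failure remains available for inspection.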

1 reaction
mrocklin commented, Oct 10, 2017

If you’re curious, I recommend looking at what ReactiveX does here.

On Tue, Oct 10, 2017 at 11:26 AM, Christopher J. Wright <notifications@github.com> wrote:

This reminds me a little of the Try-Monads talk from PyGotham https://2017.pygotham.org/talks/try-monads-with-big-data-using-pyspark/ (not that I think we should necessarily use Try-Monads).


