Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

How to get the last good value on the stream when an error occurs?

See original GitHub issue

I’m trying to do some error reporting, and I’d like to have both the error and the input value that caused the error simultaneously so that I can publish both.

Something like:

_([1, 2, 3, 4])
  .map((x) => {
    if (x > 2) throw new Error('Too big!');
    return x + 10;
  })
  .consume((err, x, push, next) => {
    if (err) {
      console.log(err.message); // Should be 'Too big!'
      console.log(x); // Should be 3: not 2, not 12, and not 13
      publishErrorEvent({ err, x });
    } else {
      next();
    }
  });

The problem is, it looks like .map will always push x === undefined into the stream whenever an error is thrown.
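For example, here’s a minimal sketch of what I’m seeing (assuming `_` is Highland’s stream constructor, and following the handler shape from the consume docs):

const _ = require('highland');

_([1, 2, 3, 4])
  .map((x) => {
    if (x > 2) throw new Error('Too big!');
    return x + 10;
  })
  .consume((err, x, push, next) => {
    if (err) {
      // The error arrives, but the input that caused it does not.
      console.log(err.message, x); // 'Too big!' undefined
      next();
    } else if (x === _.nil) {
      push(null, _.nil); // pass along the end-of-stream marker
    } else {
      push(null, x);
      next();
    }
  })
  .each(console.log); // 11, 12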

Is there a way to get the last valid value in the stream? I’m really hoping I don’t have to cache it to local state or something like that. I’ve tried using latest and last, but those don’t seem to do what I’m looking for.

Note: I’m trying to do this for some arbitrary pipelines, so I can’t just publish the error in the .map

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

4 reactions
eccentric-j commented, May 7, 2020

I see. That is a bit trickier, but I do have 3 possible solutions. I’ll explain the solutions first, then show how they fit into the test code.

I’m not sure this is really the best approach to the problem. This seems to be the exact use case for flatMap, where for every input (your x) you return a stream of 0, 1, or infinitely many values; map().sequence(), parallel, merge, or mergeWithLimit work similarly. This would let you very easily tie inputs to outputs while also controlling how many items are processed at once.
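For instance, here’s a rough sketch of that flatMap idea (the per-item try/catch and the .x property on the error are just one way to do it, not something from the original snippet):

const stream = require('highland');

stream([1, 2, 3, 4])
  .flatMap((x) => {
    try {
      if (x > 2) throw new Error('Too big!');
      return stream.of(x + 10); // one output per input
    } catch (err) {
      err.x = x; // keep the offending input on the error
      return stream.fromError(err); // a stream that only emits this error
    }
  })
  .errors((err) => {
    console.log(`${err.message} -> ${err.x}`); // 'Too big! -> 3', then 'Too big! -> 4'
  })
  .each(console.log); // 11, 12

Because each input becomes its own little stream, the error and the input that caused it never get separated.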

Another even simpler option is to not make the stream responsible for relating the inputs to outputs and instead rely on logging to make that connection. If it’s just for reporting purposes then you could try something like:

const stream = require('highland');

console.log('\nUsing a through stream');
const pipeline = (s) => s
  .map((x) => {
    if (x > 2) {
      throw new Error('Too big!');
    }

    return x + 10;
  });

stream([1, 2, 3, 4])
  .tap(console.log) // record input
  .through(pipeline)
  .errors((err, push) => {
    console.log(`${err.name}: ${err.message}`); // Record errors
  })
  .each(console.log);

This way the stream doesn’t need to track the required state to do what you’re looking for. However, this can certainly be done if the recommended paths above don’t apply.

/**
 * catchPipelineError
 * Relates incoming inputs to errors in the output by means of storing minimal
 * state. Likely the most performant of the options shown here.
 * Takes a function to operate on a stream, just like the through method.
 * Returns a stream of values or emits errors with a .x prop for last input.
 */
function catchPipelineError (fn) {
  let lastX = null;

  return source => source
    .tap(x => {
      lastX = x;
    })
    .through(fn)
    // update the error and emit it again
    .errors((err, push) => {
      err.x = lastX;
      push(err);
    });
}

There is a bit of state but if you look at the source of latest or last, this is more or less what they do.

This should cover most cases, but there is potential for issues if, say, the pipeline uses more complex async steps where inputs don’t match the order of outputs within the target pipeline.
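To make that concrete, here’s a hypothetical pipeline (not from the original thread) where the lastX bookkeeping above can point at the wrong input, because parallel pulls several inputs before the failing item’s error surfaces:

const stream = require('highland');

// Small helper: run fn after ms milliseconds inside its own stream.
const delay = (ms, fn) => stream((push) => {
  setTimeout(() => {
    try {
      push(null, fn());
    } catch (err) {
      push(err);
    }
    push(null, stream.nil);
  }, ms);
});

const asyncPipeline = (s) => s
  .map((x) => delay(x === 3 ? 50 : 0, () => {
    if (x > 2) throw new Error('Too big!');
    return x + 10;
  }))
  .parallel(4); // all four inputs are pulled before item 3 fails

stream([1, 2, 3, 4])
  .through(catchPipelineError(asyncPipeline)) // the lastX variant above
  .errors((err) => console.log(`${err.message} -> ${err.x}`)) // likely '-> 4', even for item 3
  .each(console.log); // 11, 12

The zip-based variant below sidesteps this by pairing inputs with outputs instead of remembering a single value.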

/**
 * catchPipelineError
 * Relates incoming stream inputs to output of a pipeline by joining streams
 * Takes a function to transform the stream, just like the through method.
 * Returns a stream of values or emits errors with a .x prop for last input.
 */
function catchPipelineError (fn) {
  return source => {
    const inputs = source.observe().latest();

    // Normalize the outputs.
    const outputs = fn(source)
      .map(x => ({ err: null, x }))
      .errors((err, push) => {
        push(null, { err, x: null });
      });

    return outputs
      // This order ensures we read from inputs whenever there is a new output
      .zip(inputs)
      .map(([ { err, x }, input ]) => {
        if (err) {
          err.x = input;
          throw err;
        }

        return x;
      });
  };
}

This one is more stream-focused, using zip to relate inputs to outputs of the target pipeline. It should hold up to asynchronous pipelines better, but as I said, there are still some unknowns.

/**
 * catchPipelineError
 * Relates incoming stream inputs to output of a pipeline by joining streams
 * Takes a function to transform the stream, just like the through method.
 * Returns a stream of values or emits errors with a .x prop for last input.
 */
function catchPipelineError (fn) {
  const ok = x => ({ ok: true, x });
  const error = x => ({ ok: false, x });
  const isErr = either => either && either.ok === false;

  return source => {
    const inputs = source.observe().latest();

    // Normalize the outputs.
    return source
      .through(fn)
      .map(ok)
      .errors((err, push) => {
        push(null, error(err));
      })
      .zip(inputs)
      .flatMap(([ either, input ]) => {
        if (isErr(either)) {
          const err = either.x;
          err.x = input;

          // Much better than throwing errors in map, for instance
          return stream.fromError(err);
        }

        // Return a stream with a single value
        return stream.of(either.x);
      });
  };
}

The big difference here is that we’re not throwing errors in a map function but using flatMap to properly map incoming values to either an error stream or a single-value stream. To me this is the most robust solution, but given the nature of async pipelines there may still be some edge cases.

As for using it: pretty much like catchErrors, but it works with the through function.

console.log('\nUsing a through stream');
const pipeline = (s) => s
  .map((x) => {
    if (x > 2) {
      throw new Error('Too big!');
    }

    return x + 10;
  });

stream([1, 2, 3, 4])
  .through(catchPipelineError(pipeline))
  .errors((err, push) => {
    console.log(`${err.name}: ${err.message} -> ${err.x}`);
  })
  .each(console.log);

Let me know if there are more nuances that this still doesn’t cover.

2 reactions
eccentric-j commented, May 7, 2020

I do my best to read these issues carefully so that I understand what you’re trying to do, but from time to time I may misunderstand, so it may take a couple of tries.

So the problem with the approach above, as you noticed, is that the x value that breaks is never pushed downstream. I would reach for a general higher-order function to wrap your mapping function, like:

const stream = require("highland");

/**
 * catchError
 * Wraps a function in try...catch so it can report the value that caused the
 * processing error.
 *
 * Takes a function that will receive a value to process and returns the
 * processed value.
 *
 * Returns a function that takes a value to process which will return either
 * the processed value or throw an object with the error and the value that caused it.
 */
function catchError (fn) {
  return (x) => {
    try {
      return fn(x);
    }
    catch (err) {
      throw { err, x };
    }
  };
}

// Example
stream([1, 2, 3, 4])
  .map(catchError(x => {
    if (x > 2) {
      throw new Error("Too big!");
    }

    return x + 10;
  }))
  .each(console.log);

The added bonus is that it can be used with Highland’s map, filter, each, etc… as well as JS’s array map, filter, and forEach functions.
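For example (a hypothetical usage, not from the thread), the same wrapper dropped into a plain array map:

try {
  const result = [1, 2, 3, 4].map(catchError((x) => {
    if (x > 2) throw new Error('Too big!');
    return x + 10;
  }));
  console.log(result);
} catch (wrapped) {
  // The thrown object carries both the error and the input that caused it.
  console.log(wrapped.err.message, wrapped.x); // 'Too big!' 3
}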

Read more comments on GitHub >

Top Results From Across the Web

How can I pass last stream value to error handler?
Just add it as a property of the Error object before you throw it or in wrapper code like...
Read more >
Error handling in RxJS. or how to fail not with Observables
The simplest way to handle an error on a stream — is to turn an error into another stream. Another stream — another...
Read more >
Error Handling in Streams - Documentation - Akka
recover recover allows you to emit a final element and then complete the stream on an upstream failure. Deciding which exceptions should be...
Read more >
23.5 — Stream states and input validation - Learn C++
If an error occurs and a stream is set to anything other than goodbit, further stream operations on that stream will be ignored....
Read more >
Troubleshoot throughput error in Amazon Kinesis Data Streams
Divide the SampleCount value by 60 seconds to calculate the average number of GetRecords calls made per second for each shard. If the...
Read more >
