Handling failures, retrying gracefully, and configuring timeouts and retries

Hi, I’m a mobile developer who uses DirectLineJS in our React Native application. This issue outlines a few topics around failures, retries, and configuration in DirectLineJS. For most of them I’m asking what the recommended approach is, whether a feature exists, or whether there is an underlying bug. Overall, my intent is to make our app more robust and have it handle failure more gracefully, and I’m looking for some guidance.

As a preface, we use Redux + Redux Thunk + React Native + DirectLine in our application. I’m not too familiar with RxJS, so bear with me. As I learn more about RxJS, I’m finding that it may help us solve some of the issues below. For example, we could adopt redux-observable, but since DirectLine uses RxJS I’m hoping I can instead chain RxJS functions onto it, i.e. .retries() and .delay(), or configure DirectLineJS, rather than change our architecture.

EDIT: As I learn more, I think adding redux-observable, redux-saga, or redux-logic would help us a lot.

Retries

I use postActivity to send messages to the bot, but when the id is equal to “retry”, how should I gracefully handle this case? What does “retry” really mean here? From my understanding, it’s a recoverable error case used for token expiration, i.e. a 403? Code in question:

 directLine.postActivity(giftedMessageToBotMessage(message)).subscribe(
    id => {
      if (id === "retry") {
        dispatch(
          messageFailed(message, I18n.t("messageFailedToSend"), "message retry")
        );
        return;
      }
...

Reconnect

What’s the recommended way to handle token expiration? Should I do the following, i.e. reconnect when the connectionStatus is ConnectionStatus.ExpiredToken? And what happens if the reconnect fails?

 directLine.connectionStatus$.subscribe(connectionStatus => {
    dispatch(updateConnectionStatus(connectionStatus));
    if (connectionStatus === ConnectionStatus.ExpiredToken) {
      directLine.reconnect(conversation.token);
    }
  });

I currently do this, but sometimes I get a “retry” id from my postActivity without the connectionStatus changing to ConnectionStatus.ExpiredToken. Is this a bug? I thought “retry” and ConnectionStatus.ExpiredToken were basically equivalent.

Configure Timeout

I want to control the timeout of a postActivity, because I want the user to see the failure within 10 seconds of sending a simple message (it’s a better user experience than the app hanging). If the message fails, the user can click the red text below the message to attempt to send it again. As a naive solution I can use setTimeout to check the status of the message, but this is ugly and not very robust. I’m looking for a better way to configure the timeout of postActivity and cancel the subscription. I’m hoping I can do something like this: directLine.postActivity(message).subscribe(...).timeout(5000).doSomething(). Is this possible?

Naive code example:

export function sendMessageToBot(message: GiftedMessage, dispatch: Dispatch) {
  let received = false;
  directLine.postActivity(giftedMessageToBotMessage(message)).subscribe(
    id => {
      if (id === "retry") {
        dispatch(
          messageFailed(message, I18n.t("messageFailedToSend"), "message retry")
        );
        return;
      }
      received = true;
      dispatch(updateMessage({ ...message, sent: true, received: true }));
    },
    error => {
      dispatch(messageFailed(message, I18n.t("messageFailedToSend"), error));
    }
  );
  dispatch(updateMessage({ ...message, sent: true }));
  // If we haven't heard from the bot in 10 seconds assume the message failed to send
  setTimeout(() => {
    if (received === false) {
      dispatch(
        messageFailed(message, I18n.t("messageFailedToSend"), "message timeout")
      );
    }
  }, 10000);
}

Configure Retries

Similar to the above, I want to configure the retries for postActivity. So something like this: directLine.postActivity(message).subscribe(...).retries(5). Can I do this?

DirectLine Activity Stream Failure

Listening for messages via the directLine.activity$ stream is a point of failure in our application. We use axios to fetch a token from the server (a point of failure). We then send a message to the bot to let it know to start a conversation (a point of failure). We then listen to the activity stream (a point of failure). We want to retry all of this with an exponentially increasing delay.

I’m not sure why, but we need to manually send a message to the bot to let it know to start a conversation. In the bot emulator this is not the case; the bot just starts talking. I’m wondering why this might be?

This is the code I want to simplify:

export const listenForMessages = () => async (dispatch: Dispatch) => {
  // Create the conversation which includes the DirectLine token from the server
  const conversation = await createConversation();
  dispatch(updateConversation(conversation));

  // Create DirectLine and send a message to let the bot know we are here
  directLine = await createDirectline(conversation.token);
  dispatch(sendMessages([createJoinMessage()], true));

  directLine.activity$
    .filter(activity => activity.from.id === env.BOT_ID)
    .subscribe((botMessage: BotMessage) => {
      const newMessage: GiftedMessage = botMessageToGiftedMessage(botMessage);
      dispatch(addMessage(newMessage));
    });

  directLine.connectionStatus$.subscribe(connectionStatus => {
    dispatch(updateConnectionStatus(connectionStatus));
    if (connectionStatus === ConnectionStatus.ExpiredToken) {
      reconnect(conversation.token);
    }
  });
};

@billba Any guidance or recommendations would be greatly appreciated 😄

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 13 (5 by maintainers)

Top GitHub Comments

billba commented, Jul 25, 2018 (3 reactions)

I cannot tell a lie; I do rock.

billba commented, Jul 12, 2018 (2 reactions)

Hi @watadarkstar,

I’m excited to hear that you’re using DirectLineJS in React Native. I’ll try to help as best I can, but I am no longer the person maintaining DLJS. cc: @compulim who might have additional thoughts.

Retries

First of all, mea culpa for this awkward choice of return value. It’s one of those things I meant to fix later and never got around to. “retry” is meant to encompass situations where the post failed due to issues around connectivity, including token expiration. In this case DLJS doesn’t attempt to automatically retry because it doesn’t have enough context. You might have just gotten on a plane with no wifi, or entered an underground bunker due to alien invasion, or your personal connectivity is fine but something is broken between the client and your bot.

In Web Chat we leave it up to the user to retry when they think things might be working better.
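
One way to keep that manual retry explicit in app code is to turn the raw id into a tagged result before dispatching anything. This is only a sketch: PostOutcome and interpretPostResult are hypothetical names, not part of DirectLineJS, and the only behaviour assumed is the one described in this thread, namely that postActivity emits the string "retry" instead of an activity id when the post did not go through.

```typescript
// Hypothetical result type: either the post was accepted, or the user
// (or app) may try again later.
type PostOutcome =
  | { kind: "sent"; activityId: string }
  | { kind: "retryable" };

// Maps the value emitted by postActivity onto the tagged result above.
// Assumption: "retry" is the only sentinel value; anything else is an id.
function interpretPostResult(id: string): PostOutcome {
  return id === "retry"
    ? { kind: "retryable" }
    : { kind: "sent", activityId: id };
}
```

A subscriber could then switch on kind, dispatching a failure with a "tap to resend" affordance for "retryable" and marking the message as received for "sent".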

Reconnect

As mentioned above, there are other situations where postActivity might return “retry”, have a look here.

If your token has expired then you need to get a new one to pass to reconnect. reconnect itself doesn’t actually do any network operations, it just sets up the token for the next Direct Line operation.
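
The order described above (fetch new credentials first, then call reconnect) could be sketched roughly as follows. Reconnectable and refreshThenReconnect are hypothetical names introduced for illustration, and the sketch deliberately makes no assumption about what the conversation object contains; check the DirectLineJS version you use for the exact argument reconnect expects.

```typescript
// Stand-in for the one DirectLineJS method used here. The type parameter
// keeps the sketch agnostic about the shape of a conversation object.
interface Reconnectable<C> {
  reconnect(conversation: C): void;
}

// Hypothetical helper: fetch fresh credentials, then hand them to the client.
// Resolves to false when fetching or reconnecting throws, so the caller can
// show an error or schedule another attempt.
async function refreshThenReconnect<C>(
  client: Reconnectable<C>,
  fetchFreshConversation: () => Promise<C>
): Promise<boolean> {
  try {
    const conversation = await fetchFreshConversation();
    client.reconnect(conversation);
    return true;
  } catch (error) {
    return false;
  }
}
```

This would typically be called from the connectionStatus$ subscriber when the status is ConnectionStatus.ExpiredToken, with fetchFreshConversation wrapping whatever server call the app already uses to obtain a token.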

Configure Timeout

10 seconds is probably too quick for a timeout unless you know your connectivity is excellent and your bot is very quick.

DLJS uses a single timeout value for all operations. Here’s where it’s used in postActivity.

You can create a customized version to change this value, or to use different timeout values for different operations.
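
If forking DLJS is not an option, the app can at least bound how long it waits before showing a failure. Note that subscribe returns a Subscription, so nothing can be chained after it; any time limit has to be applied before subscribing or outside the stream. The sketch below takes the second route with a plain Promise. withTimeout is a hypothetical helper, not a DLJS API, and it does not change DLJS's internal timeout or cancel the underlying request; it only stops the caller from waiting.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms`
// milliseconds, otherwise settles with the same value or error as `work`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error("timed out after " + ms + " ms")),
      ms
    );
    work.then(
      value => {
        clearTimeout(timer); // settled in time, so cancel the pending rejection
        resolve(value);
      },
      error => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}
```

Because a promise settles only once, a late success after the timeout is ignored by the caller, which avoids the double dispatch that a bare setTimeout check can produce.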

Configure Retries

Similarly, you’d probably need to create a customized version to add retries to postActivity. However see above for reasons why I didn’t build in retries in the first place.
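
For readers who decide to retry in app code despite those caveats, a minimal sketch of retrying with an exponentially growing delay might look like this. backoffDelayMs and retryWithBackoff are hypothetical helpers, and the base delay and cap are arbitrary illustrative values. Since postActivity reports "retry" as a value rather than an error, the operation passed in would have to throw in that case for the loop to try again.

```typescript
// Delay before the next attempt: base, 2x base, 4x base, ... up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Runs `operation` up to `maxAttempts` times, sleeping between failures.
// `sleep` is injectable so the waiting strategy can be replaced in tests.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void> = ms =>
    new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

The same helper could wrap the token fetch mentioned in the question, since it only needs a function that returns a promise.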

DirectLine Activity Stream Failure

Your bot should get a conversationUpdate activity as soon as you call startConversation. The only reason you’d need to manually send an activity is if you want to access the user id, which is only sent as part of an activity.

I’m not sure if I agree that subscribing to activity$ is a point of failure, in that I’m not sure it ever actually fails. I think that in both the websocket and polling cases it will wait forever for messages from the bot, including waiting for you to fix the connection/token as necessary. I’d like to know if this isn’t the case.


You are pushing on several edges of DLJS, which is great, but it also means you may be experiencing issues that others have not. I’ll stay on this thread and try to help out where I can.
