[BUG] [FlyteAdmin] Notifications SQS subscriber stops processing messages when "connection reset by peer"

Describe the bug: The notifications SQS subscriber stopped processing messages.

Expected behavior: The subscriber reconnects gracefully for as long as the application is running.

Flyte component

  • Overall
  • Flyte Setup and Installation scripts
  • Flyte Documentation
  • Flyte communication (slack/email etc)
  • FlytePropeller
  • FlyteIDL (Flyte specification language)
  • Flytekit (Python SDK)
  • FlyteAdmin (Control Plane service)
  • FlytePlugins
  • DataCatalog
  • FlyteStdlib (common libraries)
  • FlyteConsole (UI)
  • Other

To Reproduce Steps to reproduce the behavior:

  1. Run Flyte with notifications enabled
  2. Wait until the subscriber hits the error shown in the logs

Environment Flyte component

  • Sandbox (local or on one machine)
  • Cloud hosted
    • AWS
    • GCP
    • Azure
  • Baremetal
  • Other

Additional context Logs:

{"json":{"src":"base.go:103"},"level":"error","msg":"error with starting processor err: [RequestError: send request failed\ncaused by: Post https://sqs.us-east-1.amazonaws.com/: read tcp 10.200.8.116:59882-\u003e52.46.137.144:443: read: connection reset by peer] ","ts":"2020-07-03T10:35:10Z"}
{"json":{"src":"processor.go:113"},"level":"warning","msg":"The stream for the subscriber channel closed with err: RequestError: send request failed\ncaused by: Post https://sqs.us-east-1.amazonaws.com/: read tcp 10.200.8.116:59882-\u003e52.46.137.144:443: read: connection reset by peer","ts":"2020-07-03T10:35:10Z"}

I guess the solution will be similar to this one: https://github.com/lyft/flyteadmin/pull/92

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

1 reaction
kumare3 commented, Jul 7, 2020

@rstanevich I think we found a very good way of solving this problem. @katrogan will merge the PR soon. Thank you for raising the issue.

0 reactions
kumare3 commented, Jul 7, 2020

it is merged and will be part of the next release
