bgpalerter seems to have lost visibility ~ April 8 - RIS issues ?

It took me a few days to report this, as I wanted to make sure it was not some local issue:

Since April 8, ~12:00 UTC, I have not seen any monitored events being triggered. The RIPE service status page indicates all is well, so RIS Live should be functioning.

Running on RHEL 8 as a systemd service. About a year in prod. Worked fine after updating to v1.27.1.

I checked that notifications work with the -t flag. They do. It spams Email and Telegram when I use the -t flag. I checked that the process has sufficient resources and permissions - all good. I checked bgpalerter’s reports.log and it is indeed empty but I know I created plenty of mayhem “events” 😛

I then tried creating a new prefixes.yml list and config.yml by stopping the service, renaming the existing ones, and executing the binary manually once with bgpalerter-linux-x64 generate -a ASN -o prefixes.yml -i -m

This completed without errors and created sensible files. I restarted the service, withdrew a monitored prefix and ran tail -f reports.log. Nothing.

I hijacked my prefix from a lab ASN. Nothing.

Final sanity check before reaching out: I spun up a new Ubuntu server, installed docker and created the bgpalerter docker container. Created config, started it, withdrew a prefix. This instance, too, does not “see” an event.

I do see this in error.log of the original prod instance, around the time things stopped working:

2021-04-10T14:34:44+00:00 info: ris connector connected
2021-04-10T14:39:45+00:00 info: ris connector connected
2021-04-10T15:05:15+00:00 error: Error: Unexpected server response: 500
2021-04-10T15:05:15+00:00 error: It was not possible to establish a connection with RIPE RIS
2021-04-10T15:06:20+00:00 error: Error: Unexpected server response: 500
2021-04-10T15:06:20+00:00 error: It was not possible to establish a connection with RIPE RIS
2021-04-10T15:07:25+00:00 info: ris connector connected

But these 500 responses have occurred from time to time in the past. Of note is that during my testing of prefix withdrawals, the last entry in error.log indicated that we were connected at that time: info: ris connector connected
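To cross-check a test window against the log, something like the following could be used. It is only a sketch that assumes the "TIMESTAMP LEVEL: MESSAGE" layout seen in the excerpt above; the helper names `parse_line` and `connection_state_at` are made up for illustration and are not part of BGPalerter.

```python
from datetime import datetime

# Lines copied from the error.log excerpt above.
LOG_LINES = [
    "2021-04-10T14:34:44+00:00 info: ris connector connected",
    "2021-04-10T14:39:45+00:00 info: ris connector connected",
    "2021-04-10T15:05:15+00:00 error: Error: Unexpected server response: 500",
    "2021-04-10T15:05:15+00:00 error: It was not possible to establish a connection with RIPE RIS",
    "2021-04-10T15:06:20+00:00 error: Error: Unexpected server response: 500",
    "2021-04-10T15:06:20+00:00 error: It was not possible to establish a connection with RIPE RIS",
    "2021-04-10T15:07:25+00:00 info: ris connector connected",
]


def parse_line(line):
    """Split 'TIMESTAMP LEVEL: MESSAGE' into (datetime, level, message)."""
    stamp, rest = line.split(" ", 1)
    level, message = rest.split(": ", 1)
    return datetime.fromisoformat(stamp), level, message


def connection_state_at(lines, when):
    """Return the last logged connector state at `when`.

    True means the most recent relevant entry was a successful connect,
    False means it was a failed connection attempt, None means no entry yet.
    """
    state = None
    for line in lines:
        ts, _level, message = parse_line(line)
        if ts > when:
            break
        if message == "ris connector connected":
            state = True
        elif "not possible to establish a connection" in message:
            state = False
    return state


if __name__ == "__main__":
    probe = datetime.fromisoformat("2021-04-10T15:08:00+00:00")
    print(connection_state_at(LOG_LINES, probe))
```

Note that a "connected" state here only reflects what was logged. It says nothing about whether messages were actually flowing over the socket, which is exactly the gap described in this report.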

Did RIPE change anything in RIS Live that could be breaking this for me?

Unrelated: I did get some 530 responses from Cloudflare when using cloudflare as my vrpProvider for RPKI, which broke RPKI detection, but I have since switched back to ntt and it has worked fine.

Top GitHub Comments

massimocandela commented, May 30, 2021 (1 reaction)

As promised, in addition to the fix on the RIS side reported above, in the next release of BGPalerter there will be a check for silent socket sessions.
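For readers wondering what a check for silent sessions might look like in principle, here is a minimal sketch. It is not BGPalerter's implementation; the class name `SilenceWatchdog`, the threshold, and the injectable clock are all assumptions made for illustration. The idea is simply to record when the last message arrived and to flag the session once the gap exceeds a limit, even though the socket still reports as connected.

```python
import time


class SilenceWatchdog:
    """Flags a connection that is open but has delivered nothing for too long."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        # `clock` is injectable so the logic can be exercised without waiting.
        self.max_silence = max_silence_seconds
        self.clock = clock
        self.last_seen = clock()

    def on_message(self):
        """Call this whenever any message arrives on the socket."""
        self.last_seen = self.clock()

    def is_silent(self):
        """True if no message has arrived within the allowed window."""
        return self.clock() - self.last_seen > self.max_silence


if __name__ == "__main__":
    fake_now = [0.0]
    watchdog = SilenceWatchdog(60, clock=lambda: fake_now[0])
    fake_now[0] = 90.0
    print(watchdog.is_silent())  # the session has been quiet for 90 s
```

A caller would typically poll `is_silent()` on a timer and reconnect or raise an alert when it returns True. The right threshold depends on how chatty the subscribed prefixes normally are.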

massimocandela commented, Apr 11, 2021 (1 reaction)

Yes. I was able to reproduce your issue, and I contacted the main developer behind RIS, who did some digging. Somebody was flooding the service with connections (now banned); as a result, new legitimate connections were slow to be served. You spotted this because you were one of the unlucky ones; we were already connected, so we did not notice.

We are planning some improvements, including monitoring for missing or delayed messages in both BGPalerter and RIS. You will see a PR linked to this issue soon. In the meantime, a new rule limiting the number of connections per user has been set in RIS (since one connection can have unlimited subscriptions to prefixes, there is no reason at all to open multiple connections…just a lack of reading-the-doc skills).

Thanks for reporting this!!
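To illustrate the point about one connection carrying many subscriptions, the sketch below builds one subscribe message per prefix, all meant to be sent over the same WebSocket. The `ris_subscribe` message shape and the `prefix` and `moreSpecific` fields are written from my reading of the RIS Live documentation and should be checked against it; `build_subscriptions` is a made-up helper, the prefixes are documentation ranges, and no connection is actually opened here.

```python
import json


def build_subscriptions(prefixes, more_specific=True):
    """Return one JSON subscribe message per prefix.

    All messages are intended for a single shared WebSocket connection,
    rather than opening a new connection for each prefix.
    """
    messages = []
    for prefix in prefixes:
        payload = {
            "type": "ris_subscribe",  # assumed message type, per the RIS Live docs
            "data": {"prefix": prefix, "moreSpecific": more_specific},
        }
        messages.append(json.dumps(payload))
    return messages


if __name__ == "__main__":
    # Documentation prefixes used as placeholders.
    for message in build_subscriptions(["192.0.2.0/24", "2001:db8::/32"]):
        print(message)
```

Each string would then be sent in turn on the one open socket, which keeps a client well within any per-user connection limit.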
