Thread pool gets blocked inside rate limit code
A while back I spent several days debugging a strange issue in my bot (~15k guilds, relies on reactions quite a lot), where the worker threads would get blocked one by one until the entire bot crashed once they were all blocked. After… too much digging around, I figured out all the threads were getting blocked inside the EnterAsync function in Discord.Net.Rest/Net/Queue/RequestQueueBucket.cs.
I’m not sure exactly what was going wrong, but the while loop never exited even as the rate limit bucket hit its reset time, and it just sat and spun forever. I “solved” the issue by adding a return from the method if the current time is past the reset time, but I’m not sure if this is the correct solution, hence the issue rather than a PR. This gets easier to reproduce locally by setting the min/max thread count to 1, and firing off some reaction-based commands (they seem to make it occur faster, unsure if specifically those, though). At default ThreadPool settings, it only seems to happen at a relatively large scale (around the ~15k guilds previously mentioned).
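To make the workaround concrete, here is a minimal sketch in Python of a wait loop that gives up once the reset time has passed. It is a simplified model only, not Discord.Net's code: the Bucket class, the enter function and the injectable now/sleep parameters are all hypothetical names chosen for illustration.

```python
import time


class Bucket:
    """Toy stand-in for a rate limit bucket (hypothetical, not Discord.Net's class)."""

    def __init__(self, remaining, reset_at):
        self.remaining = remaining  # requests left in the current window
        self.reset_at = reset_at    # timestamp at which the window resets


def enter(bucket, now=time.monotonic, sleep=time.sleep):
    """Wait for a free slot; return how many times the loop body ran.

    The early return models the workaround described above: once the
    current time is past the reset time, stop waiting instead of looping.
    """
    iterations = 0
    while bucket.remaining <= 0:
        iterations += 1
        wait = bucket.reset_at - now()
        if wait <= 0:
            # Reset time already passed: leave the method rather than spin.
            return iterations
        sleep(wait)
    bucket.remaining -= 1
    return iterations
```

With a clock that is already past reset_at, enter returns after a single pass. Without the early return, the same call would loop forever in this model, because nothing inside the loop ever refills remaining.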
This is the single-line patch I made on my personal fork: https://github.com/xSke/Discord.Net/commit/2aaa5e7ce50f056969e4b000386fc9bc336880a5
I hope someone more well-versed with D.NET internals can help figure out what’s actually going on, and solve the issue in the main repository. If it helps, the server was running .NET Core 3.1 on Ubuntu 18.04 inside a Docker container.
Issue Analytics
- Created 4 years ago
- Reactions:3
- Comments:11 (4 by maintainers)
Top GitHub Comments
So basically: When you’re executing a lot of REST requests in a single bucket, you’ll end up in this loop:
Which is fine. But the problem is that, at some point, it wants to sleep a negative number of milliseconds. Luckily, there’s a check so that we don’t run into an exception:
(line 205 in Discord.Net.Rest.Net.Queue.RequestQueueBucket.cs)
But the problem is that now there is no delay at all and the while loop goes on and on as seen in my screenshot. So for testing purposes, I changed the code to this:
And the high CPU load is fixed. But I cannot say that this is the correct solution / intended behaviour. Maybe the author of the RequestQueueBucket code can look over this?
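The changed code is not preserved in this thread, so the following Python sketch only models the general effect and should not be read as the actual patch. It counts loop passes over a simulated time window, first when a non-positive delay is simply skipped, and then when a fallback delay is applied. The function name, the pass_cost_us parameter and the 10 ms floor are all assumptions made for illustration.

```python
def count_passes(reset_us, start_us, stop_us, floor_us=0, pass_cost_us=10):
    """Count loop passes between start_us and stop_us (simulated microseconds).

    floor_us=0 models the guard that skips sleeping when the computed
    delay is not positive. A larger floor_us models a fallback delay.
    pass_cost_us is an assumed CPU cost for one pass through the loop.
    """
    now = start_us
    passes = 0
    while now < stop_us:
        passes += 1
        delay = reset_us - now      # goes negative once the reset time has passed
        if delay < floor_us:
            delay = floor_us        # never sleep less than the floor
        now += delay + pass_cost_us
    return passes


# 100 ms of simulated time after the bucket's reset time has been reached
spinning = count_passes(reset_us=0, start_us=0, stop_us=100_000)
throttled = count_passes(reset_us=0, start_us=0, stop_us=100_000, floor_us=10_000)
print(spinning, throttled)  # prints "10000 10"
```

In this model the guard alone lets the loop run thousands of times in a tenth of a second, which fits the high CPU load described above, while any non-zero fallback delay caps the number of passes. Whether a fallback delay is the intended behaviour for the real bucket is exactly the open question raised in the comment.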
(If this is still an issue, feel free to comment to let me know)