question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[BUG] - apscheduler skipping alerts

See original GitHub issue

Firstly, thanks for maintaining the project.

Elastalert version - latest Python version - Python 3.8.5 OS - Ubuntu 20.04.1 LTS

Problem description. - This problem comes from the original elastalert. We noticed that amount of rules actually being run by Elastalert was different every time it ran - this was viewed in the Elastalert Elasticsearch index. We never had this issue with a “small” amount of rules and only noticed it when a large set of rules was loaded.

In the Elastalert logs you would see this intermittently:

May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:03 UTC)" was missed by 0:00:02.944895
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:05 UTC)" was missed by 0:00:02.912215
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:06 UTC)" was missed by 0:00:02.827846
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:05 UTC)" was missed by 0:00:02.758194
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:07 UTC)" was missed by 0:00:02.758226
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:08 UTC)" was missed by 0:00:02.617983
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:06 UTC)" was missed by 0:00:02.407513
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:07 UTC)" was missed by 0:00:02.351592
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:05 UTC)" was missed by 0:00:02.262315
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:05 UTC)" was missed by 0:00:02.244299
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:04 UTC)" was missed by 0:00:02.242278
May 24 09:27:06 minimdr python[2498368]: WARNING:apscheduler.executors.default:Run time of job "ElastAlerter.handle_rule_execution (trigger: interval[0:08:00], next run at: 2021-05-24 09:35:04 UTC)" was missed by 0:00:02.237550

We modified elastalert.py and added misfire_grace_time to job as a hack to ensure all the rules runs. The parameter was found here : https://apscheduler.readthedocs.io/en/stable/modules/job.html

This is the result of change: image

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:13 (10 by maintainers)

github_iconTop GitHub Comments

3reactions
jertelcommented, May 25, 2021

Yes, I think both options would be useful:

  1. Customize number of scheduler task threads
  2. Customize misfire grace time (seconds)
1reaction
markus-nclosecommented, May 25, 2021

Yeah, like I mentioned initially, 95% of our queries is very basic. Unfortunately we have a few (~15 alerts) that run regex and take 20-40s which skews the numbers, all the rest runs for about 0.1-0.6. I will quickly do a test with an Elastic instance with no data to ensure searching time is not the issue.

Read more comments on GitHub >

github_iconTop Results From Across the Web

apscheduler.scheduler:Execution of job skipped: maximum ...
But after a period of time, i get apscheduler.scheduler:Execution of job skipped: maximum number of running instances reached error.
Read more >
WARNING:apscheduler.scheduler:Execution of job skipped
I used the proc.terminate() to stop the execution of my method. So that the instance of the 1st thread is terminated before a...
Read more >
APScheduler Documentation - Read the Docs
Note: If the execution of a job is delayed due to no threads or processes being available in the pool, the executor may...
Read more >
Frequency interval based scheduling and maximum number ...
I have measured the time it takes to execute the function that the scheduler has to execute, which floats by average around 0.05...
Read more >
User guide — APScheduler 3.9.1 documentation
Missed job executions and coalescing​​ Sometimes the scheduler may be unable to execute a scheduled job at the time it was scheduled to...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found