
I set a job executor per job and set max_instances=3, but I got EVENT_JOB_MISSED

See original GitHub issue

My code:

# Imports needed for the snippet below (not shown in the original question)
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
from apscheduler.schedulers.blocking import BlockingScheduler

# scheduler_db and sched_logger are defined elsewhere (not shown)
job_stores = {
    'default': SQLAlchemyJobStore(url='sqlite:///{db}'.format(db=scheduler_db))
}
job_executors = {
    'default': ThreadPoolExecutor(1),
    'cpu': ProcessPoolExecutor(1),
    'disk': ProcessPoolExecutor(1),
    'node': ProcessPoolExecutor(1),
    'utilization': ProcessPoolExecutor(1),
    'check_cpu': ProcessPoolExecutor(2)
}
job_defaults = {
    # 'misfire_grace_time': 0,
    'max_instances': 3
}
scheduler = BlockingScheduler(
    logger=sched_logger,
    jobstores=job_stores,
    executors=job_executors,
    job_defaults=job_defaults
)

then add the jobs:

    if "cpu_collect" not in _job_ids:
        cpu_config = crontab_config.get('cpu_collect', {})
        minute = cpu_config.get('minute', 0)
        hour = cpu_config.get('hour', 4)

        scheduler.add_job(cpu_manage, id="cpu_collect", trigger='cron', minute=minute, hour=hour,
                          executor='cpu', args=("tasks.cpu_collect.log.cpu_logger",))

    if "node_collect" not in _job_ids:
        # executed every hour
        scheduler.add_job(node_manage, id='node_collect', trigger='cron', minute=0,
                          executor='node', args=('tasks.cpu_collect.log.node_logger',))

    if "utilization_collect" not in _job_ids:
        # executed every 10 minutes
        utilization_config = crontab_config.get('utilization', {})
        minute = utilization_config.get('minute', '*/10')
        hour = utilization_config.get('hour', None)
        scheduler.add_job(utilization_manage, id='utilization_collect', trigger='cron',
                          minute=minute, hour=hour,
                          executor='utilization', args=('tasks.cpu_collect.log.node_logger',))
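
The "The scheduler worked :)" and "Job ID: ..., Code ID: ..." lines in the log below come from a job-event listener that is not shown in the question. A hypothetical reconstruction of such a listener, assuming APScheduler 3.x (the function name and messages are guesses based on the log output):

from apscheduler.events import (EVENT_JOB_SUBMITTED, EVENT_JOB_EXECUTED,
                                EVENT_JOB_MISSED)

# Map event codes back to their constant names for logging
_EVENT_NAMES = {
    EVENT_JOB_SUBMITTED: 'EVENT_JOB_SUBMITTED',
    EVENT_JOB_EXECUTED: 'EVENT_JOB_EXECUTED',
    EVENT_JOB_MISSED: 'EVENT_JOB_MISSED',
}

def job_listener(event):  # hypothetical; the real listener is not in the question
    sched_logger.info("The scheduler worked :)")
    sched_logger.info("Job ID: %s, Code ID: %s", event.job_id, _EVENT_NAMES[event.code])

scheduler.add_listener(job_listener,
                       EVENT_JOB_SUBMITTED | EVENT_JOB_EXECUTED | EVENT_JOB_MISSED)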

I want the cpu_collect job to run at 04:00; it runs for about an hour. node_collect runs for about 1 minute, and utilization_collect for about 1 minute as well.

But the scheduler’s logger shows:

2018-03-18 03:14:19,658 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2018-03-18 03:14:35.663200+08:00 (in 58.531782 seconds)
2018-03-18 03:15:18,214 - apscheduler.scheduler - DEBUG - Looking for jobs to run
2018-03-18 03:15:18,222 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 03:15:18,223 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_MISSED
2018-03-18 03:15:18,285 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 03:15:18,285 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 03:15:18,285 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2018-03-18 03:15:35.663200+08:00 (in 17.448571 seconds)
2018-03-18 03:15:35,734 - apscheduler.scheduler - DEBUG - Looking for jobs to run
2018-03-18 03:15:35,740 - apscheduler.scheduler - INFO - Tick time 1521314135.738131
2018-03-18 03:15:35,740 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 03:15:35,740 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_EXECUTED
2018-03-18 03:15:35,781 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 03:15:35,781 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 03:15:35,781 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2018-03-18 03:16:35.663200+08:00 (in 59.928309 seconds)
2018-03-18 03:17:12,275 - apscheduler.scheduler - DEBUG - Looking for jobs to run
------------------------------------- look here ---------------------------------------------
2018-03-18 05:08:51,051 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:21:12,155 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_MISSED
2018-03-18 05:50:58,405 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:50:58,513 - apscheduler.scheduler - INFO - Job ID: utilization_collect, Code ID: EVENT_JOB_MISSED
2018-03-18 05:50:58,832 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:50:58,832 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:50:58,833 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:50:58,833 - apscheduler.scheduler - INFO - Job ID: utilization_collect, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:50:58,833 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2018-03-18 03:21:35.663200+08:00 (in 15.297820 seconds)
2018-03-18 05:51:14,193 - apscheduler.scheduler - DEBUG - Looking for jobs to run
2018-03-18 05:51:14,210 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,210 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_MISSED
2018-03-18 05:51:14,249 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,250 - apscheduler.scheduler - INFO - Job ID: utilization_collect, Code ID: EVENT_JOB_MISSED
2018-03-18 05:51:14,279 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,280 - apscheduler.scheduler - INFO - Job ID: node_collect, Code ID: EVENT_JOB_MISSED
2018-03-18 05:51:14,400 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - Job ID: utilization_collect, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - Job ID: node_collect, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - The scheduler worked :)
---------------------------- cpu_collect submitted at 05:51, why? -------------------------------
2018-03-18 05:51:14,401 - apscheduler.scheduler - INFO - Job ID: cpu_collect, Code ID: EVENT_JOB_SUBMITTED
2018-03-18 05:51:14,401 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2018-03-18 05:51:35.663200+08:00 (in 21.469705 seconds)
2018-03-18 05:51:14,412 - apscheduler.scheduler - INFO - The scheduler worked :)
---------------------------- and MISSED, why? ----------------------------------------------
2018-03-18 05:51:14,413 - apscheduler.scheduler - INFO - Job ID: cpu_collect, Code ID: EVENT_JOB_MISSED
2018-03-18 05:51:35,872 - apscheduler.scheduler - DEBUG - Looking for jobs to run
2018-03-18 05:51:35,896 - apscheduler.scheduler - INFO - Tick time 1521323495.876146
2018-03-18 05:51:35,896 - apscheduler.scheduler - INFO - The scheduler worked :)
2018-03-18 05:51:35,897 - apscheduler.scheduler - INFO - Job ID: heart_beat, Code ID: EVENT_JOB_EXECUTED
2018-03-18 05:51:35,932 - apscheduler.scheduler - INFO - The scheduler worked :)

The last log line before the gap is at 03:17:12, then the next line suddenly jumps to 05:08:51, and the log says my job was only submitted at 05:51. Why? And why is it reported as missed right after being submitted?

I thought each job runs in its own single-worker executor, and with max_instances = 3 (in practice only one instance ever runs), so why are my jobs marked as missed? Did I configure something wrong, or is there some other reason?

Thanks!

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

1 reaction
agronholm commented, Oct 7, 2019

APScheduler 4.0 will have a much more generous default for misfire_grace_time which should make this less of an issue.
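
Until then, on APScheduler 3.x the grace period can be raised explicitly. A minimal sketch, assuming 3.x; the one-hour value is illustrative, not a figure from this thread:

from apscheduler.schedulers.blocking import BlockingScheduler

# misfire_grace_time is how many seconds late a run may start before it is
# reported as EVENT_JOB_MISSED (3.x defaults to 1 second); coalesce collapses
# a backlog of missed runs into a single run.
scheduler = BlockingScheduler(
    job_defaults={
        'misfire_grace_time': 3600,  # illustrative value
        'coalesce': True,
        'max_instances': 3
    }
)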

1 reaction
agronholm commented, Oct 7, 2019

Yeah, I agree it’s not obvious from the documentation. The correct value is None. I had to look it up myself here.
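
For reference, a minimal sketch of where that value would go, reusing the cpu_collect job from the question and assuming APScheduler 3.x. misfire_grace_time=None disables the lateness check for that job, so a late wakeup no longer produces EVENT_JOB_MISSED:

# Per-job override (None means "run no matter how late the scheduler wakes up")
scheduler.add_job(cpu_manage, id='cpu_collect', trigger='cron', hour=4, minute=0,
                  executor='cpu', misfire_grace_time=None,
                  args=('tasks.cpu_collect.log.cpu_logger',))

# Or apply it to every job through the scheduler-wide defaults:
# job_defaults = {'misfire_grace_time': None, 'max_instances': 3}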

Read more comments on GitHub >

Top Results From Across the Web

  • Spark max number of executor to 1 - Cloudera Community
    Hi everybody, i'm submitting jobs to a Yarn cluster via SparkLauncher. ... Now, i'd like to have only 1 executor for each job...
  • Job Scheduling - Spark 2.0.0 Documentation
    A Spark application with dynamic allocation enabled requests additional executors when it has pending tasks waiting to be scheduled. This condition necessarily ...
  • python apscheduler - skipped: maximum number of running ...
    It means that the task is taking longer than one second and by default only one concurrent execution is allowed for a given...
  • Troubleshoot AWS Glue job failing with the error "Container ...
    My AWS Glue extract, transform, and load (ETL) job fails with the error "Container killed by YARN for exceeding memory limits".
  • The Job Executor: What Is Going on in My Process Engine?
    A job is available, but not acquired for execution or only with great delay. Identifying the problem: Set the loggers org.camunda.bpm.engine.
