Parallel cronjobs spawned without configuring


Today django-cron started a runcron and, while that run was still in progress, started a second instance and then a third. Is something wrong with the cache lock?

Looking at my cron job logs, it could also be that django-cron marked the job as finished before it had actually ended.

My code looks something like this:

from django_cron import CronJobBase, Schedule
import functions

class Foo(CronJobBase):
    RUN_EVERY_MINS = 360
    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)
    code = 'bar.foo'
    
    def do(self):
        do_something_for_5_mins()
        do_another_thing_for_1_hour()

It seems that after do_something_for_5_mins(), django-cron removed the lock and allowed another instance to run?

I thought I’d attempt a custom lock:

from django.db import transaction

@transaction.atomic
def set_lock_payment_run(name):
    # update() returns the number of rows changed; 0 means the lock was already taken.
    lock = app.models.Lock.objects.select_for_update().filter(job_name=name, lock=False).update(lock=True)
    if not lock:
        logger.fatal('lock was True while cron run was initiated.')
        raise ConcurrencyError

and

@transaction.atomic
def release_lock_payment_run(name):
    lock = app.models.Lock.objects.select_for_update().filter(job_name=name, lock=True).update(lock=False)
    if not lock:
        logger.fatal('lock was False while cron run was ended.')
        raise ConcurrencyError

These locks can be used to avoid parallel runs.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
kukosk commented, Jul 26, 2018

What caching backend are you guys using? It could be that you’re using LocMemCache which is per-process.
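
For reference, a sketch of what a shared cache might look like in settings.py, assuming a Memcached server reachable by every process that runs cron jobs. The backend choice and the `LOCATION` address are assumptions, not something stated in the issue. A shared backend only addresses the per-process problem; it does not by itself rule out races between runs that start at the same instant.

```python
# settings.py (fragment)
# Assumption: a Memcached server at 127.0.0.1:11211 shared by all workers.
# PyMemcacheCache ships with Django 3.2+; older versions used other
# Memcached backend classes.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    }
}
```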

0 reactions
kaulgud commented, Sep 26, 2019

I ran into a similar issue recently. In our case the job gets triggered from multiple web servers in production at exactly the same time. Our caching backend is django.core.cache.backends.db.DatabaseCache, and all web servers share a common cache database (SQL Server). The issue with DatabaseCache is that it doesn’t have a database-level unique constraint on the cache key. Django enforces cache key uniqueness in code, so we often ran into a race condition.

We ended up creating a custom lock.

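from django.db import IntegrityError, models, transaction
from django_cron.backends.lock.base import DjangoCronJobLock

class CustomCronJobLock(DjangoCronJobLock):

    def lock(self):
        # The primary key on job_name makes a second insert fail, so only one caller gets the lock.
        try:
            with transaction.atomic():
                CronJobLock.objects.create(job_name=self.job_name)
                return True
        except IntegrityError:
            return False

    def release(self):
        with transaction.atomic():
            CronJobLock.objects.filter(job_name=self.job_name).delete()

class CronJobLock(models.Model):
    job_name = models.CharField(max_length=255, primary_key=True)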

The job_name column has a primary key constraint in the database.
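
The idea behind this lock, letting the database's primary key decide who wins, can be tried outside Django. A minimal sketch using the standard library's sqlite3 module in place of SQL Server and the ORM; the table name `cron_job_lock` and the functions `acquire` and `release` are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory database standing in for the shared production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cron_job_lock (job_name TEXT PRIMARY KEY)")


def acquire(job_name):
    """Return True if this caller got the lock, False if it is already held."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cron_job_lock (job_name) VALUES (?)", (job_name,)
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a second row for the same job.
        return False


def release(job_name):
    with conn:
        conn.execute("DELETE FROM cron_job_lock WHERE job_name = ?", (job_name,))
```

Because the insert either succeeds or violates the constraint, there is no separate check-then-write step for two callers to interleave. django-cron's documentation describes a DJANGO_CRON_LOCK_BACKEND setting for choosing the lock class; pointing it at a class like CustomCronJobLock is presumably how such a lock gets wired in, though the comment does not show that step.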
