Parallel cronjobs spawned without configuring

Today django-cron started a runcron and, while that run was still in progress, started a second instance and then a third. Could something be wrong with the cache lock?

Looking at my cron job logs, it could also be that django-cron marked the job as finished before it had actually finished.

My code looks something like this:

from django_cron import CronJobBase, Schedule
from functions import do_something_for_5_mins, do_another_thing_for_1_hour

class Foo(CronJobBase):
    RUN_EVERY_MINS = 360
    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)
    code = 'bar.foo'

    def do(self):
        do_something_for_5_mins()
        do_another_thing_for_1_hour()

It seems that after do_something_for_5_mins(), django-cron removed the lock and allowed another instance to run?

I thought I’d attempt a custom lock:

from django.db import transaction

@transaction.atomic
def set_lock_payment_run(name):
    # update() returns the number of rows changed; 0 means the lock was already taken.
    lock = app.models.Lock.objects.select_for_update().filter(job_name=name, lock=False).update(lock=True)
    if not lock:
        logger.fatal('lock was True while cron run was initiated.')
        raise ConcurrencyError

and

@transaction.atomic
def release_lock_payment_run(name):
    lock = app.models.Lock.objects.select_for_update().filter(job_name=name, lock=True).update(lock=False)
    if not lock:
        logger.fatal('lock was False while cron run was ended.')
        raise ConcurrencyError

These locks can be used to avoid parallel runs: set the lock when the job starts and release it when the job ends.
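
One thing to watch with a set/release pair like this is that the lock stays set if the job raises before the release call. A minimal sketch of one way to guard against that, assuming the lock functions raise when the lock is already held. The names exclusive_run, fake_acquire and fake_release are made up for this example; the fake functions only stand in for the database-backed ones above so the snippet runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def exclusive_run(name, acquire, release):
    """Hold a lock for the duration of the block and always release it."""
    acquire(name)  # expected to raise if the lock is already held
    try:
        yield
    finally:
        release(name)  # runs even when the job body raises


# Stand-in lock functions (hypothetical) so the sketch runs without a database.
held = set()


def fake_acquire(name):
    if name in held:
        raise RuntimeError("lock already held: " + name)
    held.add(name)


def fake_release(name):
    held.discard(name)


events = []
try:
    with exclusive_run("payment_run", fake_acquire, fake_release):
        events.append("running")
        raise ValueError("job failed")  # simulate a crash inside the job
except ValueError:
    events.append("failed")
```

In the real job, set_lock_payment_run and release_lock_payment_run would be passed in place of the fake functions.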

Top GitHub Comments

kukosk commented, Jul 26, 2018

What caching backend are you guys using? It could be that you’re using LocMemCache, which is per-process, so a lock set in one process is not visible to the others.
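
To illustrate why a per-process cache cannot act as a lock across processes, here is a toy model in plain Python. ToyCache and try_lock are invented for this sketch and only mimic add-if-absent behaviour; they are not Django or django-cron code.

```python
class ToyCache:
    """Hypothetical in-memory cache; each instance models one process's private store."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        """Store value only if key is absent; return True when it was stored."""
        if key in self._data:
            return False
        self._data[key] = value
        return True


def try_lock(cache, job_name):
    """Return True if this caller obtained the lock for job_name."""
    return cache.add("lock:" + job_name, True)


# Per-process caches: each worker sees only its own store, so both "win".
worker_a_cache, worker_b_cache = ToyCache(), ToyCache()
per_process = [try_lock(worker_a_cache, "bar.foo"), try_lock(worker_b_cache, "bar.foo")]

# Shared cache: the second worker sees the first worker's key and backs off.
shared_cache = ToyCache()
shared = [try_lock(shared_cache, "bar.foo"), try_lock(shared_cache, "bar.foo")]
```

If this is the cause, pointing the cache at a backend shared by all processes would be the thing to try.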

kaulgud commented, Sep 26, 2019

I ran into a similar issue recently. In our case the job gets triggered from multiple web servers in production at exactly the same time. Our caching backend is django.core.cache.backends.db.DatabaseCache and all web servers share a common cache database (SQL Server). The issue with DatabaseCache is that it doesn’t have a database-level unique constraint on the cache key. Django handles cache key uniqueness in code, so we often ran into a race condition.
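
The kind of race described here can be shown with a hand-written interleaving. This is a toy model, not Django's actual DatabaseCache code: rows, key_exists and insert are made-up stand-ins for a table without a unique constraint and an application-level "check, then write".

```python
# Toy table with no unique constraint: nothing stops duplicate keys.
rows = []


def key_exists(key):
    """Application-level check, done before the write."""
    return key in rows


def insert(key):
    """Unconditional write; the toy table accepts duplicates."""
    rows.append(key)


# Both servers run the check before either one writes.
server_a_saw_lock = key_exists("bar.foo")
server_b_saw_lock = key_exists("bar.foo")

acquired = []
if not server_a_saw_lock:
    insert("bar.foo")
    acquired.append("a")
if not server_b_saw_lock:
    insert("bar.foo")
    acquired.append("b")
```

Both servers end up believing they hold the lock, because the check and the write are two separate steps.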

We ended up creating a custom lock.

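from django.db import IntegrityError, models, transaction
from django_cron.backends.lock.base import DjangoCronJobLock


class CronJobLock(models.Model):
    # The primary key makes the database reject a second row for the same job.
    job_name = models.CharField(max_length=255, primary_key=True)


class CustomCronJobLock(DjangoCronJobLock):

    def lock(self):
        try:
            with transaction.atomic():
                CronJobLock.objects.create(job_name=self.job_name)
            return True
        except IntegrityError:
            # A row for this job already exists, so another run holds the lock.
            return False

    def release(self):
        with transaction.atomic():
            CronJobLock.objects.filter(job_name=self.job_name).delete()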

The job_name column has a primary key constraint in the database, so a second insert for the same job fails and the lock is not acquired.
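
The same idea can be tried outside Django. Below is a minimal sketch using the standard-library sqlite3 module with an in-memory database; the table name cron_job_lock and the acquire/release helpers are assumptions made for this example, not part of django-cron.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cron_job_lock (job_name TEXT PRIMARY KEY)")


def acquire(job_name):
    """Try to insert the lock row; the primary key rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cron_job_lock (job_name) VALUES (?)", (job_name,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def release(job_name):
    """Delete the lock row so the next run can acquire it."""
    with conn:
        conn.execute("DELETE FROM cron_job_lock WHERE job_name = ?", (job_name,))


first = acquire("bar.foo")
second = acquire("bar.foo")  # rejected while the first lock is held
release("bar.foo")
third = acquire("bar.foo")   # succeeds again after release
```

Because the database itself enforces uniqueness, there is no gap between checking for the lock and taking it.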