I have a task that needs to poll a 3rd party API. What's a good way to introduce backoff?

this is my tasks.py

create_digitalocean_server_task = db_task()(create_digitalocean_server)

this is my utils.py

def create_digitalocean_server(server_id: int, dict_data: dict = None):
    """Create digitalocean server

    Note: after changing this function's params, restart huey or the changes won't stick

    :param server_id: ID of the Server object to update with the digitalocean id
    :param dict_data: A dictionary of data args for creating
        DigitalOcean Droplet, defaults to {}

    :return: DigitalOcean droplet object
    :rtype: droplet
    """

    droplet = digitalocean.Droplet(
        token=data["token"],
        # ...
    )
    droplet.create()

    # ... skip all the intermediate stuff

    actions = droplet.get_actions()
    action_status = "not completed"
    while action_status != "completed":
        # usually it's just one action inside actions
        for action in actions:
            # the same action id so can just load it
            action.load()
            # Once it shows "completed", droplet is up and running
            action_status = action.status
            if action_status == "completed":
                droplet = droplet.load()
                break

    # ... skip all the intermediate stuff

    return droplet

At a high level, my Django app has a form that the user fills in. The form is then POSTed to my Django app, which triggers create_digitalocean_server_task(#...)

The form's POST request and response end there, so the user can go do other stuff.

In the meantime, the long running task will try to create a DigitalOcean server via DigitalOcean’s API.

Unfortunately, creating the server is itself a long-running operation on DigitalOcean’s end. To know when server creation is done, my app needs to poll DigitalOcean’s API repeatedly, inside the task, until action.status == "completed".

I want to introduce backoff for polling the 3rd party API. Currently, as you can see from the code block above, there’s only 1 task, which keeps polling in a while loop until I get “completed” from the DigitalOcean action API; then the task finishes processing.
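
One minimal way to add backoff without changing the task structure is to sleep between polls and lengthen the sleep each time. The sketch below assumes a hypothetical `get_status` callable standing in for `action.load()` followed by reading `action.status`; `poll_with_backoff` and all of its parameters are illustrative names, not part of huey or python-digitalocean. The task still occupies a worker for the whole wait.

```python
import time


def poll_with_backoff(get_status, base_delay=2.0, factor=2.0, max_delay=60.0,
                      max_tries=10, sleep=time.sleep):
    """Call get_status() until it returns "completed" or max_tries is reached.

    Returns the number of calls made on success, or None if every try failed.
    """
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        if get_status() == "completed":
            return attempt
        if attempt < max_tries:
            # Wait before the next poll, then grow the delay up to max_delay.
            sleep(delay)
            delay = min(delay * factor, max_delay)
    return None


# Demo with a fake status source; sleeps are recorded instead of performed.
statuses = iter(["in-progress", "in-progress", "completed"])
slept = []
tries = poll_with_backoff(lambda: next(statuses), sleep=slept.append)
print(tries, slept)  # 3 [2.0, 4.0]
```

Passing `sleep` as a parameter is only there to make the function easy to exercise without real waiting; in the task itself the default `time.sleep` would be used.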

I am unsure which approach to take to introduce backoff, and how to implement it:

  1. The huey worker runs the task, which first calls the DigitalOcean create server API and then checks the action API once. The task is marked completed regardless of the status returned. If the status is not “completed”, the task creates a new task and enqueues it into Redis (I use Redis with Huey) with a delay (say, scheduled to run in 10 seconds). This new task also checks just once and finishes regardless of what the DigitalOcean action API says, but if the answer is not “completed” it creates another task and enqueues it with a greater delay. And so on until the status turns “completed” or x tries have been made. I will store the number of tries attempted somewhere in the database.

  2. The original task itself is not marked as “SIGNAL_COMPLETE” as long as the DigitalOcean action API doesn’t return “completed”. The same task is added back to the queue with a delay that increases exponentially.

Approach 1 keeps spawning a new task with increasing delay until DigitalOcean says “completed” or after x tries.

Approach 2 keeps re-enqueuing the same task with increasing delay until DigitalOcean says “completed” or after x tries.

Both approaches will store the tasks in the database and that’s how the app can keep track.
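
Either approach needs a rule for how the delay grows with the number of tries stored in the database. A small sketch of a capped exponential schedule follows; `backoff_delay` and its default values (10-second base, doubling, 5-minute cap) are assumptions chosen for illustration.

```python
def backoff_delay(attempt, base=10, factor=2, cap=300):
    """Delay in seconds before retry number `attempt` (1-based), capped at `cap`."""
    return min(base * factor ** (attempt - 1), cap)


# Delays for the first seven tries: start at 10 seconds, double each time,
# and never exceed 5 minutes.
schedule = [backoff_delay(n) for n in range(1, 8)]
print(schedule)       # [10, 20, 40, 80, 160, 300, 300]
print(sum(schedule))  # 910
```

The cap keeps a long-running creation from pushing the next check arbitrarily far into the future, and the sum gives a rough upper bound on how long the app waits before giving up after x tries.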


@signal()
def all_signal_handler(signal, task, exc=None):
    status = signal.upper()
    job_id = task.id

    if settings.DEBUG:
        print(f"{signal} - {task.id}")

    HueyJob.objects.update_or_create(
        job_id=job_id,
        defaults={"status": status},
    )

My questions are:

  1. What is the best approach to achieve backoff in the task itself? Approach 1 or 2 or some other 3rd option?
  2. I am unsure how to spawn a new task at the end of the current task if Approach 1 is best.
  3. I am unsure how to re-enqueue the current task at the end of its execution if Approach 2 is best.
  4. For both 1 and 2, I am unsure how to add the delay, whether spawning a new task or re-enqueuing the current one.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
coleifer commented, Apr 21, 2022

I don’t think it’s necessarily correct to try and re-enqueue the same task over and over again, unless you want to try raising a RetryTask exception. See https://huey.readthedocs.io/en/latest/guide.html#retrying-tasks-that-fail - specifically:

It is also possible to explicitly retry a task from within the task, by raising a RetryTask exception. When this exception is used, the task will be retried regardless of whether it was declared with retries. Similarly, the task’s remaining retries (if they were declared) will not be affected by raising RetryTask.

0 reactions
simkimsia commented, Apr 27, 2022

Sorry for the late reply; I needed time to grok this.

unless you want to try raising a RetryTask exception. See huey.readthedocs.io/en/latest/guide.html#retrying-tasks-that-fail - specifically:

Thank you 🙇🏻‍♂️ This helped.

This comment also helped: https://github.com/coleifer/huey/issues/286#issuecomment-356818549

It made me realize that I needed to set blocking=False for the spawned task that keeps checking the 3rd party API; otherwise the creation task gets stuck.

In case anyone is interested:


from huey.contrib.djhuey import db_task
from huey.exceptions import RetryTask


def is_create_digitalocean_server_complete(server_id: int, token: str, droplet_id: int, parent_id: str, task=None):
    # do stuff...

    # check the 3rd party API once per run; if the action is not done yet,
    # the RetryTask raised at the end makes huey run this task again
    actions = droplet.get_actions()
    action_status = "not completed"
    # usually it's just one action inside actions
    for action in actions:
        # the same action id so can just load it
        action.load()
        # Once it shows "completed", droplet is up and running
        action_status = action.status
        if action_status == "completed":
            droplet = droplet.load()
        # we break no matter what
        break

    if action_status == "completed":
        # .. do stuff
        return droplet

    raise RetryTask

# this is key as need to set blocking as False and retries attempt
is_create_digitalocean_server_complete_task = db_task(blocking=False, retries=10, retry_delay=10, context=True)(is_create_digitalocean_server_complete)

def create_digitalocean_server(server_id: int, dict_data: dict = None, task=None):
    """Create digitalocean server

    Note: after changing this function's params, restart huey or the changes won't stick

    :param server_id: ID of the Server object to update with the digitalocean id
    :param dict_data: A dictionary of data args for creating
        DigitalOcean Droplet, defaults to {}

    :return: DigitalOcean droplet object
    :rtype: droplet
    """
    # do stuff ....

    # spawn the task to do the checks
    is_create_digitalocean_server_complete_task(
        server_id=server_id,
        token=data["token"],
        droplet_id=droplet.id,
        parent_id=task.id,
    )

    return droplet

create_digitalocean_server_task = db_task(context=True)(create_digitalocean_server)
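
The solution above retries at a fixed `retry_delay` of 10 seconds. To make the delay grow, one option is Approach 1 from the question: check once, and if the droplet is not ready, schedule a fresh check with a longer delay. The sketch below keeps that logic independent of huey so it can be run on its own; `check_once`, `get_status`, and `reschedule` are hypothetical names. In a huey task, `reschedule` could wrap the task's `schedule(kwargs=..., delay=...)` method, which enqueues the task to run after `delay` seconds.

```python
def check_once(get_status, reschedule, attempt=1, base=10, cap=300, max_tries=10):
    """Poll once; if not done, hand the next attempt to `reschedule`.

    Returns "completed", "rescheduled", or "gave up".
    """
    if get_status() == "completed":
        return "completed"
    if attempt >= max_tries:
        return "gave up"
    # Capped exponential delay: 10, 20, 40, ... seconds, at most `cap`.
    delay = min(base * 2 ** (attempt - 1), cap)
    reschedule(attempt=attempt + 1, delay=delay)
    return "rescheduled"


# Toy stand-in for the task queue: pending checks are stored with their
# delay and run immediately instead of waiting.
statuses = iter(["new", "in-progress", "in-progress", "completed"])
pending = [(1, 0)]  # (attempt, delay); the first check runs right away
delays = []
result = None


def reschedule(attempt, delay):
    pending.append((attempt, delay))


while pending:
    attempt, delay = pending.pop(0)
    delays.append(delay)
    result = check_once(lambda: next(statuses), reschedule, attempt=attempt)

print(delays)  # [0, 10, 20, 40]
print(result)  # completed
```

Passing the attempt number along with each scheduled check means the count does not have to be read back from the database to compute the next delay, although it can still be stored there for tracking.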