I have a task that needs to poll a 3rd party API. What's a good way to introduce backoff?
This is my tasks.py:
create_digitalocean_server_task = db_task()(create_digitalocean_server)
this is my utils.py
def create_digitalocean_server(server_id: int, dict_data: dict = None):
    """Create digitalocean server
    Note when changing this function params need to restart the huey else the changes won't stick
    :param server: the Server object that needs to update with the digitalocean id
    :param dict_data: A dictionary of data args for creating
        DigitalOcean Droplet, defaults to {}
    :return: DigitalOcean droplet object
    :rtype: droplet
    """
    droplet = digitalocean.Droplet(
        token=data["token"],
        # ...
    )
    droplet.create()
    # ... skip all the intermediate stuff
    actions = droplet.get_actions()
    action_status = "not completed"
    while action_status != "completed":
        # usually it's just one action inside actions
        for action in actions:
            # the same action id so can just load it
            action.load()
            # Once it shows "completed", droplet is up and running
            action_status = action.status
            if action_status == "completed":
                droplet = droplet.load()
                break
    # ... skip all the intermediate stuff
    return droplet
So from a high level, I have a form on my Django app that the user fills in with data. The form is then POSTed to my Django app, which triggers a create_digitalocean_server_task(#...)
The form POST request and response end there. Now the user can go do other stuff.
In the meantime, the long-running task will try to create a DigitalOcean server via DigitalOcean’s API.
The unfortunate thing is that creating the server is itself a long-running operation on DigitalOcean’s end. In order to know when it is done, my app needs to poll DigitalOcean’s API repeatedly in the task until action.status == "completed".
I want to introduce backoff for polling the 3rd party API. Currently, as you can see from the code block above, there’s only 1 task, which keeps polling in a while loop until it gets “completed” from the DigitalOcean action API; then the task finishes processing.
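As an illustration of what backoff inside the existing loop could look like, here is a minimal sketch using only the standard library. The names backoff_delays and poll_until, and all the default values, are hypothetical and not part of huey or python-digitalocean. The check callable stands in for "reload the action and compare its status".

```python
import random
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 2.0, factor: float = 2.0,
                   max_delay: float = 60.0, max_tries: int = 8,
                   jitter: float = 0.0) -> Iterator[float]:
    """Yield max_tries delays: base, base*factor, ... each capped at max_delay."""
    delay = base
    for _ in range(max_tries):
        capped = min(delay, max_delay)
        # Optional jitter adds up to jitter * capped seconds; 0.0 disables it.
        yield capped + random.uniform(0, jitter * capped)
        delay *= factor


def poll_until(check: Callable[[], bool], delays: Iterator[float],
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() until it returns True, sleeping between attempts.

    Returns False if the delays run out before check() succeeds.
    """
    if check():
        return True
    for delay in delays:
        sleep(delay)
        if check():
            return True
    return False
```

In the task above, check would be a small function that calls action.load() and returns action.status == "completed". The trade-off is that the worker stays occupied while it sleeps, which is what the two approaches below try to avoid.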
I am unsure which approach to take to introduce backoff, and how to implement it:
1. The huey worker runs the task, which first calls the DigitalOcean create server API and then checks once against the action API. Regardless of whether the status is completed, the task is marked completed. If the status is not completed, though, this task creates a new task and enqueues it into Redis (I use Redis with Huey) with a delay (like 10 seconds scheduled to run). This new task just checks once and finishes regardless of what the DigitalOcean action API says. But if it doesn’t say “completed”, it creates a new task and enqueues it with a greater delay. And so on until the status turns “completed” or after x tries. I will store the number of tries attempted somewhere in the database.
2. The original task itself is not marked as “SIGNAL_COMPLETE” as long as the DigitalOcean action API doesn’t return “completed”. The same task is added back to the queue with a delay that increases exponentially.
Approach 1 keeps spawning a new task with increasing delay until DigitalOcean says “completed” or after x tries.
Approach 2 keeps re-enqueuing the same task with increasing delay until DigitalOcean says “completed” or after x tries.
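A rough sketch of the check-once-then-reschedule logic in Approach 1 might look like the following. To keep it runnable without huey, the scheduler is passed in as a plain callable. With huey, the assumption is that it would be the schedule(args=..., delay=...) method of a task, here given the hypothetical name check_action_task. MAX_TRIES, BASE_DELAY and MAX_DELAY are illustrative values only.

```python
from typing import Callable

MAX_TRIES = 6     # hypothetical cap on the number of checks
BASE_DELAY = 10   # seconds before the first re-check, as in the description
MAX_DELAY = 300   # hypothetical ceiling for the delay


def next_delay(tries: int) -> int:
    """Delay after `tries` earlier re-checks: 10s, 20s, 40s, ... capped."""
    return min(BASE_DELAY * 2 ** tries, MAX_DELAY)


def check_action_once(action_id: int, tries: int,
                      get_status: Callable[[int], str],
                      schedule: Callable[..., None]) -> str:
    """Check the action once, then finish or schedule exactly one more check."""
    status = get_status(action_id)
    if status == "completed":
        return "completed"
    if tries + 1 >= MAX_TRIES:
        return "gave up"
    # With huey this would be something like (assumed task name):
    #   check_action_task.schedule(args=(action_id, tries + 1), delay=...)
    schedule(args=(action_id, tries + 1), delay=next_delay(tries))
    return "rescheduled"
```

Because the try count travels in the task arguments, each run returns right away and the worker is free between checks. The count could equally be stored in the database, as described above.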
Both approaches will store the tasks in the database and that’s how the app can keep track.
@signal()
def all_signal_handler(signal, task, exc=None):
    status = signal.upper()
    job_id = task.id
    if settings.DEBUG:
        print(f"{signal} - {task.id}")
    HueyJob.objects.update_or_create(
        job_id=job_id,
        defaults={"status": status},
    )
My questions are:
- what is the best approach to achieve backoff in the task itself? Approach 1 or 2 or some other 3rd option?
- I am unsure how to spawn a new task from the ending of the current task if Approach 1 is best.
- I am unsure how to re-enqueue the current task at the end of its execution if Approach 2 is best.
- For both 1 and 2, I am unsure how to add the delay, regardless of whether I spawn a new task or re-enqueue the current one.
Top GitHub Comments
I don’t think it’s necessarily correct to try and re-enqueue the same task over and over again, unless you want to try raising a RetryTask exception. See https://huey.readthedocs.io/en/latest/guide.html#retrying-tasks-that-fail - specifically:

Sorry for the late reply, as I needed time to grok this.
Thank you 🙇🏻♂️ This helped.
Also this https://github.com/coleifer/huey/issues/286#issuecomment-356818549 helped as well.
That comment made me realize that I needed to set blocking=False for the spawned task that keeps checking the 3rd party API; otherwise the creation task will get stuck.
In case there’s people who might be interested: