
Bug: timeoutSeconds / responseTimeoutSeconds not making task re-scheduled


Hello,

I tried to use responseTimeoutSeconds, but the task never gets rescheduled for another round of processing.

I also tried timeoutSeconds instead of responseTimeoutSeconds and got the same behavior. Note also that when I used timeoutSeconds and looked at the task's JSON in the UI, “timeoutSeconds” did not appear there. In addition, the docs say “Time in milliseconds” even though the name suggests seconds, so one of them is probably wrong or misleading.
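The Conductor documentation describes responseTimeoutSeconds as "Must be greater than 0 and less than timeoutSeconds." A pre-flight check along those lines might look like the sketch below. The helper name `check_timeouts` is hypothetical, and the sketch deliberately makes no assumption about how the server treats a task definition that omits timeoutSeconds; it only reports that the rule cannot be verified.

```python
from typing import Any, Dict, List


def check_timeouts(task_def: Dict[str, Any]) -> List[str]:
    """Return readable problems with the timeout fields (hypothetical helper)."""
    problems: List[str] = []
    response = task_def.get("responseTimeoutSeconds")
    total = task_def.get("timeoutSeconds")

    # Documented rule, part 1: responseTimeoutSeconds must be greater than 0.
    if response is None:
        problems.append("responseTimeoutSeconds is not set")
    elif response <= 0:
        problems.append("responseTimeoutSeconds must be greater than 0")

    # Documented rule, part 2: it must also be less than timeoutSeconds.
    if total is None:
        problems.append("timeoutSeconds is not set, so the upper bound cannot be checked")
    elif response is not None and response >= total:
        problems.append("responseTimeoutSeconds must be less than timeoutSeconds")

    return problems
```

For a task definition that sets only responseTimeoutSeconds to 10, this reports the single note about the missing timeoutSeconds. Whether that omission is related to the behavior described here is not established by this check.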

For your convenience, here are the exact curl commands to reproduce the bug:

  1. Create the task definition:
curl -X POST \
  http://localhost:8080/api/metadata/taskdefs \
  -H 'Content-Type: application/json' \
  -d '[
  {
    "name": "retry-task",
    "retryCount": 1,
    "responseTimeoutSeconds": 10,
    "inputKeys": ["date"],
    "outputKeys": ["year"],
    "timeoutPolicy": "RETRY",
    "retryLogic": "FIXED",
    "retryDelaySeconds": 0
  }
]'
  2. Create the workflow definition:
curl -X POST \
  http://localhost:8080/api/metadata/workflow \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "retry-task",
  "description": "Gets date and returns year from it",
  "version": 1,
  "tasks": [
    {
      "name": "retry-task",
      "taskReferenceName": "retry-task",
      "type": "SIMPLE",
      "inputParameters": {
        "date": "${workflow.input.date}"
      }
    }
  ],
  "outputParameters": {
  },
  "schemaVersion": 2
}'
  3. Start a new workflow instance:
curl -X POST \
  http://localhost:8080/api/workflow/retry-task \
  -H 'Content-Type: application/json' \
  -d '{"date": "1800-01-01"}'
  4. Poll the pending task as a worker (this moves it to IN_PROGRESS):
curl -X GET \
  'http://localhost:8080/api/tasks/poll/batch/retry-task?count=1&timeout=1000&workerid=postman' \
  -H 'Content-Type: application/json'
  5. Send an ack as the worker (take the task ID from the previous response):
curl -X POST \
  http://localhost:8080/api/tasks/e134b6c8-7221-42a4-ab2f-9c3c382ab14a/ack \
  -H 'Content-Type: application/json'
  6. No matter how long I wait, the task stays in status IN_PROGRESS, and if I repeat step 4 to poll for pending tasks I get an empty response.
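
The waiting in the last step can also be scripted rather than checked by hand. The following is only a sketch: it assumes the server exposes `GET /api/tasks/{taskId}` returning JSON with a "status" field, and the names `fetch_status` and `watch_status` are made up for illustration. The fetch and sleep functions are injectable so the loop can be exercised without a running server.

```python
import json
import time
import urllib.request
from typing import Callable, List

BASE_URL = "http://localhost:8080/api"  # same server as the curl commands


def fetch_status(task_id: str) -> str:
    """Read a task's status (assumes GET /tasks/{taskId} returns a "status" field)."""
    with urllib.request.urlopen(f"{BASE_URL}/tasks/{task_id}") as resp:
        return json.load(resp)["status"]


def watch_status(
    task_id: str,
    fetch: Callable[[str], str] = fetch_status,
    checks: int = 6,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Sample the status up to `checks` times; stop once it leaves IN_PROGRESS."""
    seen: List[str] = []
    for i in range(checks):
        status = fetch(task_id)
        seen.append(status)
        if status != "IN_PROGRESS":
            break  # the task was timed out, rescheduled, or finished
        if i < checks - 1:
            sleep(interval)
    return seen
```

With the defaults, `watch_status(task_id)` samples the status for roughly 25 seconds, well past the 10-second responseTimeoutSeconds in the task definition. A result consisting only of IN_PROGRESS entries would match the behavior reported here.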

If I can do anything else to help investigate this, please let me know.

My required flow is as follows:

  1. The task is polled by a worker.
  2. The worker sends an ack that it received the task.
  3. The worker machine crashes.
  4. The worker comes back to life.
  5. After 10 seconds without a result status of ‘COMPLETED’, the task is retried.
  6. The task is polled (again) by the worker.
  7. …

This process repeats up to 3 times, or until a ‘COMPLETED’ status is reported.
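
The flow I expect can be sketched as a small simulation. This is a toy model of the behavior I am asking for, not Conductor's implementation; `Task`, `poll`, `sweep`, and `simulate_crashing_worker` are all hypothetical names, and time is a plain counter rather than a real clock.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """One scheduled attempt of a task (toy model, not Conductor's class)."""
    attempt: int
    status: str = "SCHEDULED"
    polled_at: Optional[float] = None


def poll(tasks: List[Task], now: float) -> Optional[Task]:
    """Hand the first SCHEDULED attempt to a worker and mark it IN_PROGRESS."""
    for task in tasks:
        if task.status == "SCHEDULED":
            task.status = "IN_PROGRESS"
            task.polled_at = now
            return task
    return None


def sweep(tasks: List[Task], now: float, response_timeout: float, retry_count: int) -> None:
    """Time out silent IN_PROGRESS attempts and schedule a retry if any remain."""
    for task in list(tasks):  # iterate over a copy because retries are appended
        if task.status != "IN_PROGRESS":
            continue
        if now - task.polled_at >= response_timeout:
            task.status = "TIMED_OUT"
            if task.attempt < retry_count:
                tasks.append(Task(attempt=task.attempt + 1))


def simulate_crashing_worker(response_timeout: float, retry_count: int) -> List[str]:
    """A worker polls every attempt and then crashes without reporting a result."""
    tasks = [Task(attempt=0)]
    now = 0.0
    while poll(tasks, now) is not None:
        now += response_timeout  # the worker stays silent for the whole window
        sweep(tasks, now, response_timeout, retry_count)
    return [task.status for task in tasks]
```

In this model, `simulate_crashing_worker(10, 3)` produces four attempts (the original plus three retries), each ending as `TIMED_OUT`. The point is only that every silent attempt should lead to a new pollable task while retries remain.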

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

77yang commented, Sep 27, 2020 (1 reaction)

@kishorebanala I restarted it, issue gone 😓

kishorebanala commented, Sep 22, 2020 (0 reactions)

@77yang Can you provide more details to help reproduce this please? cc: @apanicker-nflx
