
Mistral: 2 bugs in one: workflow timeout/cancellation and wait-before


Hi,

I discovered two bugs while trying to reproduce just one 😃

I'm running the latest st2, v2.2.0.

  1. Something is off with cancellation and timeouts when the nested tasks are Mistral workflows. Cancelling the parent workflow does not affect its children. Sometimes cancelling them also causes unexpected results, such as chatops notifications being triggered indefinitely. In my example the parent task times out, but the child workflow keeps running, over and over. If you remove the wait-before parameter, the child finishes once all retries are exhausted. That still doesn't make it a valid workaround.

  2. Adding wait-before to a task causes it to re-initialize previously published variables (at least that is how it looks).

Here are simple workflows and an alias to reproduce it (sorry about the names I gave them):

wf_cancelation_issue.meta.yaml:

---
name: wf_cancelation_issue
parameters:
  skip_notify:
    default:
      - task
      - error
      - success
    type: array
    description: List of tasks to skip notifications for.
  task:
    type: string
    description: The name of the task to run for reverse workflow.
  workflow:
    type: string
    description: The name of the workflow to run if the entry_point is a workbook
      of many workflows. The name should be in the format "<pack_name>.<action_name>.<workflow_name>".
      If entry point is a workflow or a workbook with a single workflow, the runner
      will identify the workflow automatically.
  context:
    default: {}
    type: object
    description: Additional workflow inputs.
tags: []
description: Reproducing a bug with mistral when task is cancelled or timedout
enabled: true
entry_point: workflows/wf_cancelation_issue.yaml
notify: {}
uid: action:c_int:wf_cancelation_issue
pack: c_int
ref: c_int.wf_cancelation_issue
runner_type: mistral-v2

workflows/wf_cancelation_issue.yaml:

---
version: '2.0'

c_int.wf_cancelation_issue:

  tasks:
    task:
      action: core.noop
      on-success:
        - success

    success:
      action: c_int.wf_cancelation_issue_inner
      timeout: 30

wf_cancelation_issue_inner.meta.yaml:

---
name: wf_cancelation_issue_inner
parameters:
  skip_notify:
    default:
      - task1
      - increase_attempt_number
      - task3
      - end
    type: array
    description: List of tasks to skip notifications for.
  task:
    type: string
    description: The name of the task to run for reverse workflow.
  workflow:
    type: string
    description: The name of the workflow to run if the entry_point is a workbook
      of many workflows. The name should be in the format "<pack_name>.<action_name>.<workflow_name>".
      If entry point is a workflow or a workbook with a single workflow, the runner
      will identify the workflow automatically.
  context:
    default: {}
    type: object
    description: Additional workflow inputs.
  retries:
    type: integer
    required: false
    default: 5

tags: []
description: Reproducing a bug with mistral when task is cancelled or timedout
enabled: true
entry_point: workflows/wf_cancelation_issue_inner.yaml
notify: {}
uid: action:c_int:wf_cancelation_issue_inner
pack: c_int
ref: c_int.wf_cancelation_issue_inner
runner_type: mistral-v2

workflows/wf_cancelation_issue_inner.yaml:

---
version: '2.0'

c_int.wf_cancelation_issue_inner:
  type: direct
  input:
    - retries

  tasks:
    task1:
      action: core.noop
      on-success:
        - increase_attempt_number

    increase_attempt_number:
      action: core.noop
      publish:
        attempt: <% ($.get('attempt') or 0) + 1 %>
      on-success:
        - task3

    task3:
      wait-before: 10
      action: core.local
      input:
        cmd: 'echo <% $.attempt %>; exit 1'
      on-success:
        - end
      on-error:
        - increase_attempt_number: <% $.attempt < $.retries %>

    end:
      action: core.noop
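
For reference, the control flow the inner workflow is meant to have can be modelled roughly as below. This is a plain-Python sketch, not Mistral code: the function name `simulate_inner_workflow` is made up for illustration, and the run time of each action is assumed to be zero, so only the wait-before delay is counted.

```python
def simulate_inner_workflow(retries=5, wait_before=10):
    """Model the intended loop of c_int.wf_cancelation_issue_inner.

    Returns (attempts, total_wait_seconds). Action run time is assumed
    to be zero; only the wait-before delay is accumulated.
    """
    context = {}  # stands in for the published workflow variables
    total_wait = 0
    while True:
        # increase_attempt_number: attempt = ($.get('attempt') or 0) + 1
        context["attempt"] = (context.get("attempt") or 0) + 1
        # task3: wait-before delay, then `exit 1`, which always fails
        total_wait += wait_before
        # on-error: go back to increase_attempt_number only while
        # attempt < retries; otherwise the workflow stops
        if not context["attempt"] < retries:
            break
    return context["attempt"], total_wait


attempts, total_wait = simulate_inner_workflow()
parent_timeout = 30  # `timeout: 30` on the parent's `success` task
print(attempts, total_wait, total_wait > parent_timeout)  # 5 50 True
```

With the default of 5 retries, the delays alone add up to 50 seconds, so the 30-second timeout on the parent task fires before the child can finish. If `attempt` were reset on every pass, the condition `$.attempt < $.retries` would never become false, which would be consistent with the child running over and over.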

aliases/wf_cancelation_issue.yaml:

---
name: alias_wf_cancelation_issue
enabled: true
action_ref: c_int.wf_cancelation_issue
description: Testing timeout and cancelation issue
formats:
  - display: "wf_cancel_test"
    representation:
      - "wf_cancel_test"
ack:
  enabled: true
  format: 'WF Cancelation and Timeout workflow started...'
  append_url: true
result:
  extra:
    slack:
      color: "{% if execution.result is defined and execution.result.extra is defined and execution.result.extra.state is defined and execution.result.extra.state == 'SUCCESS' %}#219939{% else %}#d80015{% endif %}"
  format: |
    WF cancelation and timeout task is complete. {~}
    ```
    {% if execution.result is defined and execution.result.extra is defined and execution.result.extra.state is defined and execution.result.extra.state == 'SUCCESS' %}
    All good.
    {% else %}
    No good.
    {% endif %}
    ```

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 18 (18 by maintainers)

Top GitHub Comments

2 reactions
m4dcoder commented, Apr 10, 2017

The source of the wait-before bug has been identified at https://bugs.launchpad.net/mistral/+bug/1681562. Please follow the link to review the comments. We will need to wait for the rest of the Mistral core team to give feedback on the use of the cache and on how to work around this issue.

1 reaction
m4dcoder commented, Apr 25, 2017

Again, please separate issues into different posts next time.
