AWS Batch - cancel job has no opportunity to catch jobs for cancellation

Based on the implementation, it should be possible to cancel an AWS Batch job before it reaches the STARTING state, i.e.

https://github.com/spulec/moto/blob/master/moto/batch/models.py#L1378-L1382

Contains:

    def cancel_job(self, job_id, reason):
        job = self.get_job_by_id(job_id)
        if job.job_state in ["SUBMITTED", "PENDING", "RUNNABLE"]:
            job.terminate(reason)
        # No-Op for jobs that have already started - user has to explicitly terminate those

In short, under moto 1.x a job could be observed with a status in that list, while under moto 2.x it cannot.

The effective behaviour of batch jobs seems to have changed between moto 1.x and 2.x: jobs in 2.x seem to enter the STARTING state immediately, or nearly immediately, after submission. This is based on experience with both versions while running the tests in https://github.com/dazza-codes/aio-aws
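To make the timing problem concrete, here is a minimal toy model of the race. It is not moto's code: `ToyJob`, `cancel_after` and the tick counts are hypothetical names and numbers invented for illustration. The only part taken from the snippet above is the rule that a cancel request takes effect in SUBMITTED, PENDING or RUNNABLE and is a no-op otherwise.

```python
# States in which a cancel request takes effect, per the cancel_job snippet above.
CANCELLABLE = ("SUBMITTED", "PENDING", "RUNNABLE")
LIFECYCLE = ("SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING", "SUCCEEDED")
TERMINAL = ("SUCCEEDED", "FAILED")


class ToyJob:
    """Hypothetical stand-in for a mocked Batch job (not moto's Job class)."""

    def __init__(self, ticks_per_state):
        # Number of simulated clock ticks the job spends in each state.
        self.ticks_per_state = ticks_per_state
        self.ticks = 0
        self.state = "SUBMITTED"
        self.status_reason = None

    def tick(self):
        """Advance the simulated clock by one step."""
        if self.state in TERMINAL:
            return
        self.ticks += 1
        if self.ticks >= self.ticks_per_state:
            self.ticks = 0
            self.state = LIFECYCLE[LIFECYCLE.index(self.state) + 1]

    def cancel(self, reason):
        """Mirror the quoted rule: cancel only before the job has started."""
        if self.state in CANCELLABLE:
            self.state = "FAILED"
            self.status_reason = reason
        # No-op once the job is STARTING or later.


def cancel_after(job, ticks, reason="test-job-cancel"):
    """Let `ticks` pass, request a cancel, then run the job to completion."""
    for _ in range(ticks):
        job.tick()
    job.cancel(reason)
    while job.state not in TERMINAL:
        job.tick()
    return job.state
```

With a slow job (`ToyJob(ticks_per_state=5)`) a cancel arriving after 3 ticks still finds the job in SUBMITTED and the job ends FAILED with a status reason. With a fast job (`ToyJob(ticks_per_state=1)`) the same cancel arrives once the job is already STARTING, so the job runs on to SUCCEEDED and never gets a status reason, which is the pattern in the moto 2.x test failures reported here.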

moto 1.x

https://github.com/dazza-codes/aio-aws/blob/main/tests/test_aio_aws_batch.py

It might be too much to ask to go clone that repo/branch and run it, but here’s the summary and some log snippets:

$ pytest -s tests/test_aio_aws_batch.py -k cancel

2021-10-16T21:23:33.290Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_submit:264 | AWS batch-submit-job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) try: 1 of 4
2021-10-16T21:23:33.296Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_cancel:487 | AWS Batch job to cancel: 0f6f1963-e5e0-428c-8b4f-303cf6b968a8, test-job-cancel
2021-10-16T21:23:33.318Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: PENDING
2021-10-16T21:23:33.690Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: PENDING
2021-10-16T21:23:34.010Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: PENDING
2021-10-16T21:23:34.344Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: RUNNABLE
2021-10-16T21:23:34.702Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: RUNNABLE
2021-10-16T21:23:35.012Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: RUNNABLE
2021-10-16T21:23:35.402Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: STARTING
2021-10-16T21:23:36.004Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:0f6f1963-e5e0-428c-8b4f-303cf6b968a8) status: FAILED

moto 2.x

versions:

$ poetry show | grep oto
aio-botocore                      1.3.3       Async client for aws services using botocore and aiohttp
boto3                             1.18.52     The AWS SDK for Python
botocore                          1.21.52     Low-level, data-driven core of boto 3.

moto                              2.2.8       A library that allows your python tests to easily mock out the boto library

test failures:

$ pytest -s tests/test_aio_aws_batch.py -k cancel

FAILED tests/test_aio_aws_batch.py::test_async_batch_job_cancel - KeyError: 'statusReason'
FAILED tests/test_aio_aws_batch.py::test_batch_jobs_cancel - assert <AWSBatchJobStates.SUCCEEDED: 6> == <AWSBatchJobStates.FAILED: 7>
FAILED tests/test_aio_aws_batch.py::test_async_batch_cancel_jobs - assert <AWSBatchJobStates.SUCCEEDED: 6> == <AWSBatchJobStates.FAILED: 7>

log snippets:

2021-10-16T21:12:00.218Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_cancel:487 | AWS Batch job to cancel: eca15aa4-d49a-4ff2-9369-4b5239511f66, test-job-cancel
2021-10-16T21:12:00.258Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: STARTING
2021-10-16T21:12:00.527Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: STARTING
2021-10-16T21:12:01.015Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:01.472Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:01.749Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:02.364Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:02.640Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:03.052Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:03.416Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:03.920Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:04.466Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:04.990Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:05.455Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:05.971Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: RUNNING
2021-10-16T21:12:06.306Z | INFO | aio_aws.aio_aws_batch:aio_batch_job_status:594 | AWS Batch job (sleep-5-job:eca15aa4-d49a-4ff2-9369-4b5239511f66) status: SUCCEEDED

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

bblommers commented, May 6, 2022

Moto 3.1.8 now contains a state manager that allows you to artificially slow down how fast Moto moves through the individual states. This makes it possible to get back to the Moto 1.x behaviour, where we had sleep statements between submitted/pending/runnable, except that the delay is now configurable.

Note that the default behaviour is still to cycle through states as quickly as possible.

A test for this exact scenario, where we want to cancel a Batch-job before it starts, can be found here: https://github.com/spulec/moto/blob/master/tests/test_moto_api/state_manager/test_batch_integration.py

The general documentation can be found here: http://docs.getmoto.org/en/latest/docs/configuration/state_transition/index.html

I believe that this solves the problem outlined, so I’ll close this. Let us know if you have any questions around this though.
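As a rough sketch of what that configuration might look like in a test suite: the helper names and the `times` value below are assumptions chosen for illustration, and the `"batch::job"` model name and the `set_transition` / `unset_transition` calls follow my reading of the state-transition documentation linked above, so check them against the Moto version you have installed (3.1.8 or later).

```python
# Assumed model name for Batch jobs in Moto's state manager; verify against the docs.
BATCH_JOB_MODEL = "batch::job"

# "manual" progression: as I understand the docs, the state only advances after
# the job has been requested a number of times. The value 2 is an arbitrary example.
SLOW_TRANSITION = {"progression": "manual", "times": 2}


def slow_down_batch_jobs(transition=SLOW_TRANSITION):
    """Hypothetical helper: ask Moto to hold Batch jobs in each state for longer."""
    # Imported lazily so this module still loads where Moto is not installed.
    from moto.moto_api import state_manager

    state_manager.set_transition(model_name=BATCH_JOB_MODEL, transition=transition)


def restore_default_batch_jobs():
    """Hypothetical helper: return to the default, as-fast-as-possible progression."""
    from moto.moto_api import state_manager

    state_manager.unset_transition(model_name=BATCH_JOB_MODEL)
```

A test would call `slow_down_batch_jobs()` before submitting the job, issue the cancel while the job is still in a pre-start state, and call `restore_default_batch_jobs()` in its teardown so that other tests keep the default behaviour.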
