Sending "Environment" while calling "create_training_job" [sagemaker] results in a 500


Boto3 version: 1.17.46
Python: 3.8.8
Botocore: 1.20.50
OS: macOS Catalina

The subject line covers pretty much all the information I have. The actual output is:

  File "sagemaker.py", line 17, in create_training_job
    response = client.create_training_job(
  File "/Users/my_user/miniconda3/envs/jinn/lib/python3.8/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/my_user/miniconda3/envs/jinn/lib/python3.8/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (500) when calling the CreateTrainingJob operation (reached max retries: 4): Internal Server Error

The test code includes:

        Environment={
            'the_environment': 'some_string'
        }

If I remove that, the job is created successfully.
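To confirm that `Environment` is really the parameter behind the 500, the two attempts described above can be scripted. This is only a sketch: `without_environment`, `http_status`, and `bisect_environment` are hypothetical helper names, `client` is assumed to be a `boto3.client("sagemaker")` instance, and `request` is assumed to be the full keyword dict passed to `create_training_job`. It relies on botocore's `ClientError` exposing the parsed response as a `response` attribute.

```python
import copy


def without_environment(request):
    """Return a copy of the request dict with the Environment key dropped."""
    trimmed = copy.deepcopy(request)
    trimmed.pop("Environment", None)
    return trimmed


def http_status(exc):
    """Read the HTTP status code from a botocore-style ClientError, if present."""
    response = getattr(exc, "response", None) or {}
    return response.get("ResponseMetadata", {}).get("HTTPStatusCode")


def bisect_environment(client, request):
    """Try the full request; on a 500, retry once without Environment.

    Note: a successful call really does start a training job.
    """
    try:
        client.create_training_job(**request)
        return "full request succeeded"
    except Exception as exc:
        # Anything other than a 500 on a request carrying Environment
        # is a different problem, so let it propagate unchanged.
        if http_status(exc) != 500 or "Environment" not in request:
            raise
    client.create_training_job(**without_environment(request))
    return "succeeded only without Environment"
```

If the second call also fails, its exception propagates, which suggests the 500 has a different cause than `Environment`.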

The container is based on a PyTorch image, in case that makes any difference.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

stobrien89 commented, May 17, 2021

Hi @flaker,

Thanks for the update! I just double-checked and it looks like the fix has been merged. Closing for now, but let us know if anything else comes up.

flaker commented, May 7, 2021

Hi @stobrien89

Thanks! OK, then I will keep my workaround for the moment (an image instead of an algorithm ARN) and come back to this later.
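The workaround mentioned here (pointing the job at a training image rather than an algorithm resource) might look roughly like the sketch below. It assumes the algorithm ARN was being passed in the `AlgorithmName` field of `AlgorithmSpecification`; `use_training_image` is a hypothetical helper, and the image URI is whatever container image the job should run, not a value taken from the issue.

```python
import copy


def use_training_image(request, image_uri):
    """Return a copy of the request that names a container image
    instead of an algorithm resource in AlgorithmSpecification."""
    swapped = copy.deepcopy(request)
    spec = swapped.setdefault("AlgorithmSpecification", {})
    # Assumption: the algorithm ARN lived in AlgorithmName.
    spec.pop("AlgorithmName", None)
    spec["TrainingImage"] = image_uri
    return swapped
```

The rest of the request, including `Environment`, is left untouched, so the same dict can then be passed to `create_training_job`.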


