[Bug] InvalidAMIID.NotFound when no AMI specified in non us-west-2 AZs

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Clusters

What happened + What you expected to happen

It looks like we're defaulting to the us-west-2 deep learning AMI (ami-0a2363a9cff180a64) even when the cluster is configured for a different region. AMI IDs are region-specific, so the launch fails with InvalidAMIID.NotFound.

It looks like we do have default AMIs for other regions here: https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/_private/aws/config.py#L39-L50, so this may be a regression.
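For illustration, here is a minimal sketch of what a region-aware default could look like. This is not the actual code in config.py; the names `DEFAULT_AMI_BY_REGION`, `resolve_default_ami`, and `fill_image_id` are hypothetical, and only the us-west-2 AMI ID comes from this report (the other entries are placeholders, not real AMIs).

```python
# Hypothetical region -> default AMI table. Only the us-west-2 entry is taken
# from this report; the other IDs are placeholders, not real AMIs.
DEFAULT_AMI_BY_REGION = {
    "us-west-2": "ami-0a2363a9cff180a64",
    "us-west-1": "ami-PLACEHOLDER-us-west-1",
    "us-east-2": "ami-PLACEHOLDER-us-east-2",
}


def resolve_default_ami(region: str) -> str:
    """Return the default AMI for `region`, or fail loudly if none is known."""
    try:
        return DEFAULT_AMI_BY_REGION[region]
    except KeyError:
        raise ValueError(
            f"No default AMI known for region {region!r}; "
            "set ImageId explicitly in node_config."
        ) from None


def fill_image_id(node_config: dict, region: str) -> dict:
    """Return a copy of node_config with ImageId filled in only if missing."""
    filled = dict(node_config)
    if "ImageId" not in filled:
        filled["ImageId"] = resolve_default_ami(region)
    return filled
```

With a lookup like this, an unknown region fails early with a clear message instead of sending a us-west-2 AMI ID to RunInstances, and an explicitly configured ImageId is never overridden.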

Versions / Dependencies

ray 1.10, python 3.8.12

Reproduction script

config.yaml:

# A unique identifier for the head node and workers of this cluster.
cluster_name: minimal

# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-1

ray up -y config.yaml

Cluster: minimal

2022-02-09 15:37:31,566	INFO util.py:282 -- setting max workers for head node type to 0
2022-02-09 15:37:31,566	INFO util.py:286 -- setting max workers for ray.worker.default to 2
Checking AWS environment settings
AWS config
  IAM Profile: ray-autoscaler-v1 [default]
  EC2 Key pair (all available node types): ray-autoscaler_us-west-1 [default]
  VPC Subnets (all available node types): subnet-8610b9e0, subnet-9435f0ce [default]
  EC2 Security groups (all available node types): sg-085ac69229ef344eb [default]
  EC2 AMI (all available node types): ami-0a2363a9cff180a64

No head node found. Launching a new cluster. Confirm [y/N]: y [automatic, due to --yes]

Acquiring an up-to-date head node
  create_instances: Attempt failed with An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a2363a9cff180a64]' does not exist, retrying.
  create_instances: Attempt failed with An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a2363a9cff180a64]' does not exist, retrying.
  create_instances: Attempt failed with An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a2363a9cff180a64]' does not exist, retrying.
  create_instances: Attempt failed with An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a2363a9cff180a64]' does not exist, retrying.
  Failed to launch instances. Max attempts exceeded.
Traceback (most recent call last):
  File "/Users/cwong/anaconda3/envs/ray110/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/scripts/scripts.py", line 1938, in main
    return cli()
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/cli_logger.py", line 808, in wrapper
    return f(*args, **kwargs)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/scripts/scripts.py", line 941, in up
    create_or_update_cluster(
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 236, in create_or_update_cluster
    get_or_create_head_node(config, config_file, no_restart, restart_only, yes,
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 635, in get_or_create_head_node
    provider.create_node(head_node_config, head_node_tags, 1)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/aws/node_provider.py", line 310, in create_node
    created_nodes_dict = self._create_node(node_config, tags, count)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/aws/node_provider.py", line 437, in _create_node
    cli_logger.abort(
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/cli_logger.py", line 605, in abort
    raise exc
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/ray/autoscaler/_private/aws/node_provider.py", line 408, in _create_node
    created = self.ec2_fail_fast.create_instances(**conf)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(*args, **params)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/cwong/anaconda3/envs/ray110/lib/python3.8/site-packages/botocore/client.py", line 719, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a2363a9cff180a64]' does not exist

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

1 reaction
worldveil commented, Feb 22, 2022

Hey, just to add onto this issue, the same problem arises for defaults in other places.

Both

  • provider.availability_zone
  • available_node_types.ray.worker.default.node_config.ImageId

also default to values in us-west-2.

For example, if you do something like this:

provider:
    type: aws
    region: us-east-2
    cache_stopped_nodes: True # If not present, the default is True.

the availability_zone will also default to us-west-2x AZs!

When possible, defaults should take context from other settings into account, especially top-level, heavily used parameters like region or cloud provider.
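As a rough sketch of that idea, a config check could flag availability zones that don't belong to the configured region before anything is launched. The helper name `zones_outside_region` is hypothetical, and it assumes that `availability_zone` is a comma-separated string and that zone names are the region name plus a single letter (e.g. us-east-2a).

```python
import re


def zones_outside_region(region: str, availability_zone: str) -> list:
    """Return the zones from a comma-separated list that are not in `region`.

    Assumes zone names are the region name followed by one lowercase letter.
    """
    zones = [z.strip() for z in availability_zone.split(",") if z.strip()]
    pattern = re.compile(re.escape(region) + r"[a-z]")
    return [z for z in zones if not pattern.fullmatch(z)]


# Hypothetical provider section mirroring the mismatch described above.
provider = {"region": "us-east-2", "availability_zone": "us-west-2a,us-west-2b"}
mismatched = zones_outside_region(provider["region"], provider["availability_zone"])
if mismatched:
    print(f"Zones not in {provider['region']}: {', '.join(mismatched)}")
```

A check like this only catches the mismatch; picking a sensible default zone would still need the list of zones actually available in the region.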

0 reactions
ckw017 commented, Mar 8, 2022

Closing this since the original bug is fixed. Feel free to open a new issue if it seems separate.
