Failing to find Subscription ID when targeting AzureUSGovernment Tenant

  • Package Name: MLClient
  • Package Version: SDK V2
  • Operating System: AML Compute Instance STANDARD_DS11_V2
  • Python Version: Python 3.10

Describe the bug

After initializing an instance of the MLClient class, calling any of its methods results in the error below.

To Reproduce

Steps to reproduce the behavior:

Prerequisites:

  1. Have an AzureUSGovernment tenant and subscription
  2. Have an AML Workspace created, along with a Compute Instance
  3. Have a Service Principal created in the above subscription, and given a “Contributor” role assignment to the AML Workspace
  4. Run the following notebook in AML on the compute instance, after replacing the placeholder environment variable values:
import os  # needed for the os.environ assignments below
import traceback

from azure.ai.ml.entities import AmlCompute
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, AzureAuthorityHosts, EnvironmentCredential

# Set ENV Variables

os.environ["AZURE_CLIENT_SECRET"] = "<value>"
os.environ["AZURE_CLIENT_ID"] = "<value>"
os.environ["AZURE_TENANT_ID"] = "<value>"
os.environ["AZURE_AUTHORITY_HOST"] = AzureAuthorityHosts.AZURE_GOVERNMENT


credentials = DefaultAzureCredential(
    interactive_browser_tenant_id=os.environ["AZURE_TENANT_ID"],
    authority=AzureAuthorityHosts.AZURE_GOVERNMENT
    )

ml_client = MLClient(
    credential=credentials,
    subscription_id="<value>",
    resource_group_name="<value>",
    workspace_name="<value>",
    cloud="AzureUSGovernment",
)

# Name assigned to the compute cluster
cpu_compute_target = "cpu-cluster-2"

try:
    # let's see if the compute target already exists
    cpu_cluster = ml_client.compute.get(cpu_compute_target)
    print(
        f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
    )

except Exception:
    print("Creating a new cpu compute target...")

    # Let's create the Azure ML compute object with the intended parameters
    cpu_cluster = AmlCompute(
        name=cpu_compute_target,
        # Azure ML Compute is the on-demand VM service
        type="amlcompute",
        # VM Family
        size="STANDARD_DS3_V2",
        # Minimum running nodes when there is no job running
        min_instances=0,
        # Nodes in cluster
        max_instances=4,
        # How many seconds the node keeps running after the job terminates
        idle_time_before_scale_down=180,
        # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
        tier="Dedicated",
    )

    # Now, we pass the object to MLClient's create_or_update method
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster)

print(
    f"AMLCompute with name {cpu_cluster.name} is created, the compute size is {cpu_cluster.size}"
)

Expected behavior

The above code should either create a new CPU cluster or print the message "You already have a cluster named {cpu_compute_target}, we'll reuse it as is."

Screenshots

The actual behavior is an error:

ResourceNotFoundError: (SubscriptionNotFound) The subscription 'xxxxxxxxxxxxxxxx' could not be found.
Code: SubscriptionNotFound
Message: The subscription 'xxxxxxxxxxxxxxxx' could not be found.

The stack trace is:

Creating a new cpu compute target...
---------------------------------------------------------------------------
ResourceNotFoundError                     Traceback (most recent call last)
Input In [8], in <cell line: 13>()
     13 try:
     14     # let's see if the compute target already exists
---> 15     cpu_cluster = ml_client_6.compute.get(cpu_compute_target)
     16     print(
     17         f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
     18     )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:169, in monitor_with_activity.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    168 with log_activity(logger, activity_name or f.__name__, activity_type, custom_dimensions):
--> 169     return f(*args, **kwargs)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:75, in ComputeOperations.get(self, name)
     67 """Get a compute resource
     68 
     69 :param name: Name of the compute
   (...)
     72 :rtype: Compute
     73 """
---> 75 response, rest_obj = self._operation.get(
     76     self._operation_scope.resource_group_name,
     77     self._workspace_name,
     78     name,
     79     cls=get_http_response_and_deserialized_from_pipeline_response,
     80 )
     81 # TODO: Remove warning logging after 05/31/2022 (Task 1776012)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:83, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     82 if span_impl_type is None:
---> 83     return func(*args, **kwargs)
     85 # Merge span is parameter is set, but only if no explicit parent are passed

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_restclient/v2022_01_01_preview/operations/_compute_operations.py:577, in ComputeOperations.get(self, resource_group_name, workspace_name, compute_name, **kwargs)
    576 if response.status_code not in [200]:
--> 577     map_error(status_code=response.status_code, response=response, error_map=error_map)
    578     error = self._deserialize.failsafe_deserialize(_models.ErrorResponse, pipeline_response)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/exceptions.py:105, in map_error(status_code, response, error_map)
    104 error = error_type(response=response)
--> 105 raise error

ResourceNotFoundError: (SubscriptionNotFound) The subscription '50ff9458-6372-4522-8227-327043deaef5' could not be found.
Code: SubscriptionNotFound
Message: The subscription '50ff9458-6372-4522-8227-327043deaef5' could not be found.

During handling of the above exception, another exception occurred:

ResourceNotFoundError                     Traceback (most recent call last)
Input In [8], in <cell line: 13>()
     24     cpu_cluster = AmlCompute(
     25         name=cpu_compute_target,
     26         # Azure ML Compute is the on-demand VM service
   (...)
     37         tier="Dedicated",
     38     )
     40     # Now, we pass the object to MLClient's create_or_update method
---> 41     cpu_cluster = ml_client_6.compute.begin_create_or_update(cpu_cluster)
     43 print(
     44     f"AMLCompute with name {cpu_cluster.name} is created, the compute size is {cpu_cluster.size}"
     45 )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:169, in monitor_with_activity.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    166 @functools.wraps(f)
    167 def wrapper(*args, **kwargs):
    168     with log_activity(logger, activity_name or f.__name__, activity_type, custom_dimensions):
--> 169         return f(*args, **kwargs)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:116, in ComputeOperations.begin_create_or_update(self, compute, **kwargs)
    107 @monitor_with_activity(logger, "Compute.BeginCreateOrUpdate", ActivityType.PUBLICAPI)
    108 def begin_create_or_update(self, compute: Compute, **kwargs: Any) -> LROPoller:
    109     """Create a compute
    110 
    111     :param compute: Compute definition.
   (...)
    114     :rtype: LROPoller
    115     """
--> 116     compute.location = self._get_workspace_location()
    117     compute._set_full_subnet_name(self._operation_scope.subscription_id, self._operation_scope.resource_group_name)
    119     compute_rest_obj = compute._to_rest_object()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:308, in ComputeOperations._get_workspace_location(self)
    307 def _get_workspace_location(self) -> str:
--> 308     workspace = self._workspace_operations.get(self._resource_group_name, self._workspace_name)
    309     return workspace.location

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:83, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     81 span_impl_type = settings.tracing_implementation()
     82 if span_impl_type is None:
---> 83     return func(*args, **kwargs)
     85 # Merge span is parameter is set, but only if no explicit parent are passed
     86 if merge_span and not passed_in_parent:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_restclient/v2022_01_01_preview/operations/_workspaces_operations.py:615, in WorkspacesOperations.get(self, resource_group_name, workspace_name, **kwargs)
    612 response = pipeline_response.http_response
    614 if response.status_code not in [200]:
--> 615     map_error(status_code=response.status_code, response=response, error_map=error_map)
    616     error = self._deserialize.failsafe_deserialize(_models.ErrorResponse, pipeline_response)
    617     raise HttpResponseError(response=response, model=error, error_format=ARMErrorFormat)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/exceptions.py:105, in map_error(status_code, response, error_map)
    103     return
    104 error = error_type(response=response)
--> 105 raise error

Additional context

I looked through the source code in the _azure_environments.py and _ml_client.py files to infer which environment variables and values I needed to pass to the MLClient constructor. However, something doesn’t appear to be working correctly.
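
One sanity check that can be done without touching the SDK internals is to confirm that the authority host in the environment matches the cloud name passed to MLClient. The sketch below is only an illustration: `CLOUD_ENDPOINTS` and `check_cloud_settings` are hypothetical names, not part of azure-ai-ml or azure-identity, and the check does not reproduce or fix the error reported here. It only compares `AZURE_AUTHORITY_HOST` against the publicly documented login host for the selected cloud.

```python
import os

# Publicly documented endpoints for two Azure clouds. The dictionary layout
# and the helper below are illustrative, not part of any Azure SDK.
CLOUD_ENDPOINTS = {
    "AzureCloud": {
        "authority_host": "login.microsoftonline.com",
        "resource_manager": "https://management.azure.com/",
    },
    "AzureUSGovernment": {
        "authority_host": "login.microsoftonline.us",
        "resource_manager": "https://management.usgovcloudapi.net/",
    },
}


def check_cloud_settings(cloud, env=None):
    """Return a list of human-readable mismatches (empty if none found)."""
    env = os.environ if env is None else env
    if cloud not in CLOUD_ENDPOINTS:
        return [f"unknown cloud {cloud!r}"]

    expected = CLOUD_ENDPOINTS[cloud]["authority_host"]
    # azure-identity falls back to the public-cloud authority when unset.
    raw = env.get("AZURE_AUTHORITY_HOST") or "login.microsoftonline.com"
    # Accept both "host" and "https://host/" spellings.
    actual = raw.split("://", 1)[-1].rstrip("/")

    problems = []
    if actual != expected:
        problems.append(
            f"AZURE_AUTHORITY_HOST is {actual!r} but {cloud} expects {expected!r}"
        )
    return problems


# Example with an explicit mapping instead of the real environment.
issues = check_cloud_settings(
    "AzureUSGovernment",
    {"AZURE_AUTHORITY_HOST": "login.microsoftonline.us"},
)
print(issues or "authority host matches the selected cloud")
print("expected ARM endpoint:", CLOUD_ENDPOINTS["AzureUSGovernment"]["resource_manager"])
```

Passing the environment as a plain mapping keeps the check easy to run in a notebook cell before the client is constructed. Calling `check_cloud_settings("AzureUSGovernment")` with no second argument reads the real `os.environ` instead.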

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 3
  • Comments: 18 (7 by maintainers)

Top GitHub Comments

harneetvirk commented, Dec 6, 2022 (1 reaction)

The new CI image has been released with the SDK v2 package installed from PyPI. Please create a new Compute Instance.

adrian-gonzalez commented, Dec 6, 2022 (0 reactions)

Thanks @harneetvirk!
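
Since the resolution depends on which SDK build the compute instance image ships, it can help to print the installed package versions from a notebook cell before and after recreating the instance. This is a minimal sketch using only the standard library. The helper name `installed_version` is made up for illustration, and the thread does not state which version numbers the old or new image carries, so none are assumed here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distributions relevant to the code in this issue.
for name in ("azure-ai-ml", "azure-identity", "azure-core"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run the cell in the same kernel as the failing notebook so that it reports the environment MLClient is actually imported from.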

