AzureVMCluster constructor just hangs after creating the scheduler.

What happened:

After creating a new cluster with the AzureVMCluster constructor, the run hangs once the scheduler has been created.

In the Azure Portal, the scheduler VM is visible after it is created. It runs for a couple of minutes and is then stopped, presumably an indication that something went wrong. The run does not fail, however; it just hangs.

What you expected to happen: The cluster to be created with the number of workers specified

Minimal Complete Verifiable Example: In a new Conda environment

pip install dask-cloudprovider[azure]
az login

from dask_cloudprovider.azure import AzureVMCluster

# Azure resources the cluster is created in
resource_group = "NGC-AML-Quick-Launch"
workspace_name = "NGC_AML_Quick_Launch_WS"
vnet = "NGC-AML-Quick-Launch-vnet"
security_group = "NGC-AML-Quick-Launch-nsg"
initial_node_count = 2
vm_size = "Standard_NC6s_v3"
location = "South Central US"

# The first assignment of each pair was overridden by the second in the original report
# base_dockerfile = "rapidsai/rapidsai-core:cuda10.2-runtime-ubuntu18.04-py3.8"
base_dockerfile = "rapidsai/rapidsai-core-dev-nightly:0.18-cuda10.2-devel-ubuntu18.04-py3.8"
# env_vars = {"EXTRA_CONDA_PACKAGES":"pywin32","EXTRA_PIP_PACKAGES": "dask-cloudprovider[azure] dask-cloudprovider[azure] --upgrade  gcsfs dask_xgboost azureml"}
env_vars = {"EXTRA_PIP_PACKAGES": "dask-cloudprovider[azure]"}

# This call creates the scheduler VM and then hangs
cluster = AzureVMCluster(
    resource_group=resource_group,
    location=location,
    vnet=vnet,
    security_group=security_group,
    n_workers=initial_node_count,
    vm_size=vm_size,
    docker_image=base_dockerfile,
    docker_args="--privileged",
    security=False,
    env_vars=env_vars,
    worker_class="dask_cuda.CUDAWorker",
)

Anything else we need to know?: Screenshot (16)

The VM dask-7984db15-scheduler is created and can be seen in the Azure Portal. It runs for a few minutes and is then stopped, but the run never crashes; it just hangs.
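Since the constructor blocks without ever raising, one way to at least get a failure instead of a hang is to run the blocking call in a background thread and give up after a deadline. The following is only a sketch: `call_with_timeout` is a hypothetical helper, not part of dask-cloudprovider, and whether `AzureVMCluster` behaves well when constructed off the main thread is an assumption that has not been verified here. It does not fix the underlying problem, and any VMs that were already created still need to be cleaned up by hand.

```python
import threading


def call_with_timeout(factory, timeout_s, *args, **kwargs):
    """Run factory(*args, **kwargs) in a daemon thread.

    Returns the factory's result, re-raises its exception, or raises
    TimeoutError if it has not returned after timeout_s seconds.
    """
    outcome = {}

    def target():
        try:
            outcome["value"] = factory(*args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # A daemon thread does not keep the interpreter alive if the call never returns.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        name = getattr(factory, "__name__", repr(factory))
        raise TimeoutError(f"{name} did not return within {timeout_s} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Hypothetical usage with the example above (the 900 second deadline is arbitrary):
# cluster = call_with_timeout(AzureVMCluster, 900, resource_group=resource_group, ...)
```

The abandoned thread keeps running in the background after the timeout, so this only turns a silent hang into a visible error that a script or notebook can react to.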

Environment:

  • Dask version: 2021.02.0
  • Python version: 3.8
  • Operating System: Windows
  • Install method (conda, pip, source): pip
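
The environment details above can also be collected programmatically, which makes it easier to compare the client machine with what is installed inside the Docker image. This is a minimal sketch using only the standard library; `environment_report` is a hypothetical helper, and the package names in the usage comment are just the ones mentioned in this report.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Return a dict with the Python version, OS name, and installed package versions."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "os": platform.system(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of failing the whole report.
            report[name] = "not installed"
    return report


# Example usage:
# for key, value in environment_report(["dask", "distributed", "dask-cloudprovider"]).items():
#     print(f"{key}: {value}")
```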

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 30 (16 by maintainers)

Top GitHub Comments

jacobtomlinson commented, Feb 23, 2021 (1 reaction)

That file will exist on the Dask nodes, not the Jupyter Lab instance.

heiqs commented, Feb 19, 2021 (1 reaction)

rapidsai/rapidsai-core-nightly:0.18-cuda10.2-runtime-ubuntu18.04-py3.8

what was your extra_pip in this case?
