prepare_data from arcgis.learn modules fails to read the data in Azure N series VM

Describe the bug

We recently acquired Azure N-series VMs, which are powered by the NVIDIA Tesla K80 card and the Intel Xeon E5-2690 v3 (Haswell) processor. We had been using the arcgis.learn module on VMs without a GPU and were able to run U-Net semantic segmentation with the arcgis.learn.UnetClassifier classifier.

However, when we moved to a GPU-enabled Azure N-series VM, prepare_data from arcgis.learn started failing.

To Reproduce

Steps to reproduce the behavior:

import arcgis
from arcgis.learn import prepare_data
import fastai
import torch
import torchvision
print(arcgis.__version__)
print(fastai.__version__)
print(torch.__version__)
print(torchvision.__version__)

1.6.2
1.0.39
1.0.0
0.2.2

data = prepare_data(path=r'Path/to/training/data',batch_size=16)

error:

Exception                                 Traceback (most recent call last)
<ipython-input-2-b64d83b2729f> in <module>
----> 1 data = prepare_data(path=r'Path/to/training/data',batch_size=16)

~\AppData\Local\ESRI\conda\envs\arcgispro-py3-deeplearningpro\lib\site-packages\arcgis\learn\_data.py in prepare_data(path, class_mapping, chip_size, val_split_pct, batch_size, transforms, collate_fn, seed, dataset_type)
    130 
    131     if not HAS_FASTAI:
--> 132         _raise_fastai_import_error()
    133 
    134     if type(path) is str:

~\AppData\Local\ESRI\conda\envs\arcgispro-py3-deeplearningpro\lib\site-packages\arcgis\learn\_data.py in _raise_fastai_import_error()
     20 
     21 def _raise_fastai_import_error():
---> 22     raise Exception('This module requires fastai, PyTorch and torchvision as its dependencies. Install it using "conda install -c pytorch -c fastai fastai=1.0.39 pytorch=1.0.0 torchvision"')
     23 
     24 def _bb_pad_collate(samples, pad_idx=0):

Exception: This module requires fastai, PyTorch and torchvision as its dependencies. Install it using "conda install -c pytorch -c fastai fastai=1.0.39 pytorch=1.0.0 torchvision"
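The traceback shows prepare_data raising a generic exception when its HAS_FASTAI flag is false, which does not say which dependency failed to import or why. A minimal diagnostic sketch, assuming the flag reflects a failed import of one of the listed packages: import each one separately and report the underlying error. The helper name check_dependencies and the inclusion of PIL in the default list are assumptions for illustration, not part of arcgis.

```python
import importlib


def check_dependencies(names=("fastai", "torch", "torchvision", "PIL")):
    """Try to import each package; return {name: None or an error string}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # Broad on purpose: a failed native-library load may surface as
            # ImportError or OSError rather than ModuleNotFoundError.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


# Usage (run inside the same conda environment as the notebook):
# for name, error in check_dependencies().items():
#     print(name, "OK" if error is None else error)
```

Running this in the affected environment should show whether a package is missing outright or is installed but fails while loading, which the combined exception message cannot distinguish.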

Expected behavior

The data should have been processed; the same data works on an Azure VM without a GPU.

Platform (please complete the following information):

OS: Windows Server 2019
VM: Standard_NC6; for details, see https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-gpu
Browser: Google Chrome
Python API Version: 1.6.2; also tested on 1.7.0, where the issue persists

Additional context

I tested in the ArcGIS Pro environment and in a new environment created with Anaconda; the issue persists in both on the GPU-enabled Azure N-series VM.

Issue Analytics

Created 4 years ago
Comments: 13 (1 by maintainers)

Top GitHub Comments

cdflint commented, Apr 27, 2020

To reproduce, run the install command given in the exception raised by arcgis.learn.prepare_data, taken from either my error message or the OP's.

original Exception:

"""Exception: This module requires fastai, PyTorch and torchvision as its dependencies. Install it using 'conda install -c pytorch -c fastai fastai=1.0.39 pytorch=1.0.0 torchvision'"""

my Exception:

"""Exception: This module requires fastai, PyTorch and torchvision and its dependencies.
Install them using 'conda install -c pytorch -c fastai fastai=1.0.54 pytorch=1.0.0 torchvision scikit-image'"""

I suggest changing installation_steps for 'win32' in _data.py to match ['linux', 'darwin'], so that it reflects the working code and future errors won't lead users down the same path.

Ref: arcgis.learn._data.py (v1.8.0), line 81.

Until ArcGIS Pro ships with an arcgis API version greater than 1.7.0, these errors could persist.
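The suggestion above amounts to keeping one install hint for every platform, so the Windows message cannot drift from the one that works elsewhere. A rough sketch of that idea follows. It is not the code in _data.py: the function names are hypothetical, and the command string is left as a placeholder because the thread does not quote the linux/darwin command.

```python
import sys

# Placeholder only: the thread does not quote the linux/darwin command,
# so it is deliberately left unfilled rather than guessed.
WORKING_INSTALL_STEPS = "<install command used for linux/darwin>"

SUPPORTED_PLATFORMS = ("win32", "linux", "darwin")


def installation_steps(platform=sys.platform):
    """Return the same install hint for every supported platform (hypothetical helper)."""
    if not platform.startswith(SUPPORTED_PLATFORMS):
        raise ValueError(f"unsupported platform: {platform}")
    return WORKING_INSTALL_STEPS


def dependency_error_message(platform=sys.platform):
    """Build the exception text from the single shared hint."""
    return (
        "This module requires fastai, PyTorch and torchvision as its dependencies. "
        f"Install them using '{installation_steps(platform)}'"
    )
```

With a single constant, a fix to the recommended command applies to all platforms at once instead of needing a separate edit per branch.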

bharanigurminder commented, Feb 17, 2020

Since torchvision is a high-level neural network API, it uses Pillow to stack the data required for training. On Windows, Pillow has issues opening image files. On further research, I found many bug reports about Pillow having trouble with TIFF files, along with recommendations to downgrade the libtiff library. Even after I downgraded libtiff, the issue persisted. After multiple failed attempts, I finally exported the training data in JPEG format. This workaround worked for me.
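Before re-exporting everything, it may help to confirm that Pillow really is failing on the exported chips. A minimal sketch, assuming the training images sit under one folder as .tif/.tiff files: walk the folder and collect the files that cannot be decoded. The helper names and the opener parameter are illustrative assumptions; only Image.open and Image.load are Pillow calls.

```python
from pathlib import Path


def _open_with_pillow(path):
    # Imported lazily so the helper can be exercised without Pillow installed.
    from PIL import Image

    with Image.open(path) as img:
        img.load()  # force a full decode, not just a header read


def find_unreadable_images(folder, extensions=(".tif", ".tiff"), opener=_open_with_pillow):
    """Return [(path, error)] for image files that `opener` cannot read."""
    failures = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            try:
                opener(path)
            except Exception as exc:
                failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures


# Usage: print every chip Pillow cannot decode.
# for path, error in find_unreadable_images(r"Path/to/training/data"):
#     print(path, error)
```

If the list is empty, the TIFF files are readable and the failure more likely lies elsewhere, for example in the dependency imports.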