azureml-core 1.44.0 fails to deploy model to webservice
- azureml-core
- 1.44.0
- conda virtualenv (compute instance):
- Azure Machine Learning
- ML
Describe the bug
The model fails to deploy when I run the deployment code in an Azure Notebook using a virtualenv with azureml-core 1.44.0.
It works fine with the older version (1.43.0) and with the default "Python 3.8 - Azure ML" kernel, which uses 1.42.0 at the moment.
The output:
Running
2022-08-25 07:03:20+00:00 Creating Container Registry if not exists.
2022-08-25 07:03:20+00:00 Registering the environment.
2022-08-25 07:03:21+00:00 Use the existing image.
2022-08-25 07:03:22+00:00 Generating deployment configuration.
2022-08-25 07:03:23+00:00 Submitting deployment to compute.
2022-08-25 07:03:30+00:00 Checking the status of deployment heart-disease-classification-env..
2022-08-25 07:05:44+00:00 Checking the status of inference endpoint heart-disease-classification-env.
Failed
Service deployment polling reached non-successful terminal state, current service state: Failed
Operation ID: 3a980ad2-890e-4e8a-91d6-c119bd0528a4
More information can be found using '.get_logs()'
Error:
{
"code": "AciDeploymentFailed",
"statusCode": 400,
"message": "Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
1. Please check the logs for your container instance: heart-disease-classification-env. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
2. You can interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
3. You can also try to run image 237a7cc8f2c84e1287a6cc08d5e54f9f.azurecr.io/azureml/azureml_09e362cde9760a4b66987389c8bbc20a locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information.",
"details": [
{
"code": "CrashLoopBackOff",
"message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.
1. Please check the logs for your container instance: heart-disease-classification-env. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
2. You can interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
3. You can also try to run image 237a7cc8f2c84e1287a6cc08d5e54f9f.azurecr.io/azureml/azureml_09e362cde9760a4b66987389c8bbc20a locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information."
},
{
"code": "AciDeploymentFailed",
"message": "Your container application crashed. Please follow the steps to debug:
1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. Please refer to https://aka.ms/debugimage#dockerlog for more information.
2. If your container application crashed. This may be caused by errors in your scoring file's init() function. You can try debugging locally first. Please refer to https://aka.ms/debugimage#debug-locally for more information.
3. You can also interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
4. View the diagnostic events to check status of container, it may help you to debug the issue.
"RestartCount": 3
"CurrentState": {"state":"Waiting","startTime":null,"exitCode":null,"finishTime":null,"detailStatus":"CrashLoopBackOff: Back-off restarting failed"}
"PreviousState": {"state":"Terminated","startTime":"2022-08-25T07:07:34.619Z","exitCode":111,"finishTime":"2022-08-25T07:07:48.858Z","detailStatus":"Error"}
"Events":
{"count":1,"firstTimestamp":"2022-08-25T07:03:36Z","lastTimestamp":"2022-08-25T07:03:36Z","name":"Pulling","message":"pulling image "237a7cc8f2c84e1287a6cc08d5e54f9f.azurecr.io/azureml/azureml_09e362cde9760a4b66987389c8bbc20a@sha256:7650a3f19eb4803881637a920dc3e9bf9837c0e9c492b7d22be840d0ba8cb1cf"","type":"Normal"}
{"count":1,"firstTimestamp":"2022-08-25T07:05:15Z","lastTimestamp":"2022-08-25T07:05:15Z","name":"Pulled","message":"Successfully pulled image "237a7cc8f2c84e1287a6cc08d5e54f9f.azurecr.io/azureml/azureml_09e362cde9760a4b66987389c8bbc20a@sha256:7650a3f19eb4803881637a920dc3e9bf9837c0e9c492b7d22be840d0ba8cb1cf"","type":"Normal"}
{"count":4,"firstTimestamp":"2022-08-25T07:05:37Z","lastTimestamp":"2022-08-25T07:07:34Z","name":"Started","message":"Started container","type":"Normal"}
{"count":4,"firstTimestamp":"2022-08-25T07:05:54Z","lastTimestamp":"2022-08-25T07:07:48Z","name":"Killing","message":"Killing container with id 54971cd5cf0e6de46f30bd592bea94752d4ad857fb32f6d85e33b3a8bd4e4c92.","type":"Normal"}
"
}
]
}
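The diagnostic part of the message above carries the container state as `"Key": value` lines whose values are valid JSON on their own. A minimal sketch, assuming only that layout, for pulling out the restart count and exit code; the helper name `parse_diagnostics` is hypothetical and not part of the Azure ML SDK, and the excerpt is copied from the output above:

```python
import json
import re

# Excerpt copied verbatim from the error message in this report.
DIAGNOSTICS = '''
"RestartCount": 3
"CurrentState": {"state":"Waiting","startTime":null,"exitCode":null,"finishTime":null,"detailStatus":"CrashLoopBackOff: Back-off restarting failed"}
"PreviousState": {"state":"Terminated","startTime":"2022-08-25T07:07:34.619Z","exitCode":111,"finishTime":"2022-08-25T07:07:48.858Z","detailStatus":"Error"}
'''


def parse_diagnostics(text):
    """Collect `"Key": <JSON value>` lines into a dict.

    Lines that do not match that shape, or whose value is not valid JSON,
    are skipped rather than raising.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r'\s*"(\w+)":\s*(.+?)\s*$', line)
        if not match:
            continue
        key, raw = match.groups()
        try:
            fields[key] = json.loads(raw)
        except json.JSONDecodeError:
            continue
    return fields


info = parse_diagnostics(DIAGNOSTICS)
print("restarts:", info["RestartCount"])
print("previous exit code:", info["PreviousState"]["exitCode"])
print("current status:", info["CurrentState"]["detailStatus"])
```

This only summarizes what the error message already reports; the container's own logs (via `service.get_logs()`, as the message suggests) are still needed to see why the scoring process exited.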
To Reproduce
Steps to reproduce the behavior:
- I use the standard heart-disease dataset, train the model and export it to model/hd_otr.pkl
- In the assets folder I store the outlierremover.py script that I use to remove outliers:
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class OutlierRemover(BaseEstimator, TransformerMixin):
    def __init__(self, factor=1.5):
        self.factor = factor

    def outlier_detector(self, X, y=None):
        X = pd.Series(X).copy()
        q1 = X.quantile(0.25)
        q3 = X.quantile(0.75)
        iqr = q3 - q1
        self.lower_bound.append(q1 - (self.factor * iqr))
        self.upper_bound.append(q3 + (self.factor * iqr))

    def fit(self, X, y=None):
        self.lower_bound = []
        self.upper_bound = []
        X.apply(self.outlier_detector)
        return self

    def transform(self, X, y=None):
        X = pd.DataFrame(X).copy()
        for i in range(X.shape[1]):
            x = X.iloc[:, i].copy()
            x[(x < self.lower_bound[i])] = self.lower_bound[i]
            x[(x > self.upper_bound[i])] = self.upper_bound[i]
            X.iloc[:, i] = x
        return X


outlier_remover = OutlierRemover()
and score.py file:
import joblib
from azureml.core.model import Model
import json
import pandas as pd
import numpy as np
from outlierremover import OutlierRemover
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType


def init():
    global model
    # Example when the model is a file
    model_path = Model.get_model_path('hd_otr')  # logistic
    print('Model Path is ', model_path)
    model = joblib.load(model_path)


data_sample = PandasParameterType(pd.DataFrame({'age': pd.Series([71], dtype='int64'),
                                                'sex': pd.Series(['0'], dtype='object'),
                                                'cp': pd.Series(['0'], dtype='object'),
                                                'trestbps': pd.Series([112], dtype='int64'),
                                                'chol': pd.Series([203], dtype='int64'),
                                                'fbs': pd.Series(['0'], dtype='object'),
                                                'restecg': pd.Series(['1'], dtype='object'),
                                                'thalach': pd.Series([185], dtype='int64'),
                                                'exang': pd.Series(['0'], dtype='object'),
                                                'oldpeak': pd.Series([0.1], dtype='float64'),
                                                'slope': pd.Series(['2'], dtype='object'),
                                                'ca': pd.Series(['0'], dtype='object'),
                                                'thal': pd.Series(['2'], dtype='object')}))
input_sample = StandardPythonParameterType({'data': data_sample})
result_sample = NumpyParameterType(np.array([0]))
output_sample = StandardPythonParameterType({'Results': result_sample})


@input_schema('Inputs', input_sample)
@output_schema(output_sample)
def run(Inputs):
    try:
        data = Inputs['data']
        #result = model.predict_proba(data)
        result = np.round(model.predict_proba(data)[0][0], 2)
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error
- In the deployment.ipynb notebook the code is as follows:
from azureml.core import Workspace
from azureml.core.webservice import AciWebservice
from azureml.core.webservice import Webservice
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
from azureml.core.model import Model
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

model = Model.register(workspace=ws,
                       model_path='model/hd_otr.pkl',
                       model_name='hd_otr',
                       tags={'version': '1'},
                       description='Heart disease classification with outliers detection',
                       )

# to install required packages
env = Environment('env')
cd = CondaDependencies.create(pip_packages=['pandas', 'azureml-defaults', 'joblib', 'inference-schema', 'imbalanced-learn'], conda_packages=['scikit-learn'])
env.python.conda_dependencies = cd

# register environment to re-use later
env.register(workspace=ws)
myenv = Environment.get(workspace=ws, name='env')
myenv.save_to_directory('./environ', overwrite=True)

aciconfig = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    tags={'data': 'heart disease classifier'},
    description='Classification of heart diseases'
)

inference_config = InferenceConfig(entry_script='score.py', environment=myenv, source_directory='./assets')

service = Model.deploy(workspace=ws,
                       name='heart-disease-classification-env',
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aciconfig,
                       overwrite=True)
service.wait_for_deployment(show_output=True)

url = service.scoring_uri
print(url)
…which gives the error shown above with 1.44.0 but works just fine with the older versions.
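Since the failure depends on which azureml-core version the notebook kernel has installed, it can help to check that before deploying. A minimal sketch using only the standard library; the helper names `installed_version` and `is_reported_failing` are hypothetical, and the set of failing versions contains only what this report states (1.44.0), not an official compatibility list:

```python
from importlib import metadata

# Taken from this report only: 1.44.0 fails, 1.42.0 and 1.43.0 work.
REPORTED_FAILING = {"1.44.0"}


def installed_version(package="azureml-core"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_reported_failing(version):
    """True only for versions this report names as failing."""
    return version in REPORTED_FAILING


version = installed_version()
if version is None:
    print("azureml-core is not installed in this environment")
elif is_reported_failing(version):
    print(f"azureml-core {version}: deployment failure reported for this version")
else:
    print(f"azureml-core {version}: not named as failing in this report")
```

Running this in the same kernel as deployment.ipynb shows which SDK version the deployment code actually uses, which may differ from the default kernel's version.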
Top GitHub Comments
After updating to azure-ml 1.45.0 it works just fine; the logs below:
Thank you!
We’ve published 0.7.6 to address this issue. Please let us know if you continue to experience this issue in the latest version. Thanks!