
Problems when running model deployment via a custom component


Thanks for stopping by to let us know something could be better!

PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

Please run down the following list and make sure you’ve tried the usual “quick fixes”:

If you are still having issues, please be sure to include as much information as possible:

Environment details

  • OS type and version: Vertex AI Notebooks
  • Python version: 3.8.2
  • pip version: 2.1.1
  • google-cloud-aiplatform version: 1.1.1

Steps to reproduce

Here’s the notebook: https://colab.research.google.com/drive/18C6nct6m3puwm-PDDAfnljsp2o1DNvq6?usp=sharing.

I am trying to deploy a Vertex AI model to an Endpoint via a custom TFX component. The component looks like so (please refer to the above-mentioned notebook for the full snippet):

# Imports needed to run this snippet (not shown in the original excerpt):
import logging

from google.cloud import aiplatform as vertex_ai
from tfx.dsl.component.experimental.annotations import Parameter
from tfx.dsl.component.experimental.decorators import component


@component
def VertexDeployer(
    project: Parameter[str],
    region: Parameter[str],
    model_display_name: Parameter[str],
    deployed_model_display_name: Parameter[str],
):
    logging.info(f"Endpoint display: {deployed_model_display_name}")
    vertex_ai.init(project=project, location=region)

    # Reuse an existing endpoint with the given display name if there is one.
    endpoints = vertex_ai.Endpoint.list(
        filter=f'display_name={deployed_model_display_name}',
        order_by="update_time",
    )
    if len(endpoints) > 0:
        logging.info(f"Endpoint {deployed_model_display_name} already exists.")
        endpoint = endpoints[-1]
    else:
        endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

    # Pick the most recently updated model with the given display name.
    model = vertex_ai.Model.list(
        filter=f'display_name={model_display_name}',
        order_by="update_time",
    )[-1]

    # The endpoint is listed again right before deployment.
    endpoint = vertex_ai.Endpoint.list(
        filter=f'display_name={deployed_model_display_name}',
        order_by="update_time",
    )[-1]

    deployed_model = endpoint.deploy(
        model=model,
        # Syntax from here: https://git.io/JBQDP
        traffic_split={"0": 100},
        machine_type="n1-standard-4",
        min_replica_count=1,
        max_replica_count=1,
    )

    logging.info(f"Model deployed to: {deployed_model}")

As per the logs, things start fine:

2021-08-02 13:30:58.622 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Creating Endpoint
2021-08-02 13:30:58.622 IST
workerpool0-0
I0802 08:00:58.622227 139871630878528 base.py:74] Creating Endpoint
2021-08-02 13:30:58.622 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Create Endpoint backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7134428861818732544
2021-08-02 13:30:58.623 IST
workerpool0-0
I0802 08:00:58.622441 139871630878528 base.py:78] Create Endpoint backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7134428861818732544
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Endpoint created. Resource name: projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682116 139871630878528 base.py:98] Endpoint created. Resource name: projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:To use this Endpoint in another session:
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682311 139871630878528 base.py:99] To use this Endpoint in another session:
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:endpoint = aiplatform.Endpoint('projects/29880397572/locations/us-central1/endpoints/3702996832675168256')
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682386 139871630878528 base.py:101] endpoint = aiplatform.Endpoint('projects/29880397572/locations/us-central1/endpoints/3702996832675168256')
2021-08-02 13:31:01.048 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Deploying Model projects/29880397572/locations/us-central1/models/4554203550527258624 to Endpoint : projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:01.048 IST
workerpool0-0
I0802 08:01:01.048401 139871630878528 base.py:139] Deploying Model projects/29880397572/locations/us-central1/models/4554203550527258624 to Endpoint : projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:01.157 IST

But then, about 40 minutes later, it abruptly fails:

INFO:google.cloud.aiplatform.models:Deploy Endpoint model backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7217745454925086720
2021-08-02 14:13:41.232 IST
workerpool0-0
I0802 08:43:41.232497 140248418977600 base.py:159] Deploy Endpoint model backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7217745454925086720
2021-08-02 14:14:01.059 IST
service
The replica workerpool0-0 exited with a non-zero status of 1. Termination reason: Error. To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=29880397572&resource=ml_job%2Fjob_id%2F8657831634038423552&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%228657831634038423552%22
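As a side note, the worker logs referenced in that message can also be pulled programmatically rather than through the console link; a sketch assuming the google-cloud-logging client library is installed and authenticated (the job ID is copied from the console URL above):

from google.cloud import logging as gcp_logging

# Fetch the failed worker's log entries for the custom job referenced above.
# Uses the default project from the active credentials.
client = gcp_logging.Client()
log_filter = (
    'resource.type="ml_job" AND resource.labels.job_id="8657831634038423552"'
)
for entry in client.list_entries(filter_=log_filter, page_size=100):
    print(entry.timestamp, entry.payload)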

I have even tried to deploy it separately (code included in the notebook):

vertex_ai.init(project=GOOGLE_CLOUD_PROJECT, 
               location=GOOGLE_CLOUD_REGION, 
               staging_bucket="gs://" + GCS_BUCKET_NAME)

model_display_name = "densenet_flowers"
deployed_model_display_name = model_display_name + "_" + TIMESTAMP

endpoints = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}', 
    order_by="update_time"
)

if len(endpoints) > 0:
    print(f"Endpoint {deployed_model_display_name} already exists.")
    endpoint = endpoints[-1]
else:
    endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

model = vertex_ai.Model.list(
    filter=f'display_name={model_display_name}',
    order_by="update_time"
)[-1]

endpoint = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}',
    order_by="update_time"
)[-1]

deployed_model = endpoint.deploy(
    model=model,
    # Syntax from here: https://git.io/JBQDP
    traffic_split={"0": 100},
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=1,
)

It then leads to:

2021-08-02 13:44:07.857 IST
2021/08/02 08:14:07 No id provided.
{
 insertId: "1vfgsqqg11n9wnl"  
 jsonPayload: {
  levelname: "ERROR"   
  message: "2021/08/02 08:14:07 No id provided.
"   
 }
 labels: {
  compute.googleapis.com/resource_id: "4165408107493202181"   
  compute.googleapis.com/resource_name: "fluentd-caip-jrshv"   
  compute.googleapis.com/zone: "us-central1-a"   
 }
 logName: "projects/fast-ai-exploration/logs/aiplatform.googleapis.com%2Fprediction_container"  
 receiveTimestamp: "2021-08-02T08:14:32.337946313Z"  
 resource: {
  labels: {
   endpoint_id: "3702996832675168256"    
   location: "us-central1"    
   resource_container: "projects/29880397572"    
  }
  type: "aiplatform.googleapis.com/Endpoint"   
 }
 severity: "ERROR"  
 timestamp: "2021-08-02T08:14:07.857787819Z"  
}

I have tried all of this from a Vertex AI Notebook as well and the issue still persists.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

1 reaction
sayakpaul commented on Aug 4, 2021

I was able to complete the deployment using the standalone APIs by changing the serving image:

import os

import tensorflow as tf
from google.cloud import aiplatform as vertex_ai

vertex_ai.init(project=GOOGLE_CLOUD_PROJECT, location=GOOGLE_CLOUD_REGION)

PIPELINE_NAME = 'two-way-vertex-pipelines'
serving_model_dir = 'gs://{}/serving_model/{}'.format(
    GCS_BUCKET_NAME, PIPELINE_NAME)
pushed_model_location = os.path.join(serving_model_dir, "densenet")
pushed_model_dir = os.path.join(
    pushed_model_location, tf.io.gfile.listdir(pushed_model_location)[-1]
)
serving_image_uri = "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-5:latest"

vertex_model = vertex_ai.Model.upload(
    display_name="densenet_debug",
    artifact_uri=pushed_model_dir,
    serving_container_image_uri=serving_image_uri,
    parameters_schema_uri=None,
    instance_schema_uri=None,
    explanation_metadata=None,
    explanation_parameters=None,
)

deployed_model_display_name = "densenet_flowers_debug"

endpoints = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}', 
    order_by="update_time")

if len(endpoints) > 0:
    print(f"Endpoint {deployed_model_display_name} already exists.")
    endpoint = endpoints[-1]
else:
    endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

model = vertex_ai.Model.list(
    filter='display_name=densenet_debug',
    order_by="update_time"
)[-1]

endpoint = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}',
    order_by="update_time"
)[-1]

deployed_model = endpoint.deploy(
    model=model,
    # Syntax from here: https://git.io/JBQDP
    traffic_split={"0": 100},
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=1
)

But now, when I try to make prediction requests, I bump into:

/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    922                                       wait_for_ready, compression)
--> 923         return _end_unary_response_blocking(state, call, False, None)
    924 

/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    825     else:
--> 826         raise _InactiveRpcError(state)
    827 

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.FAILED_PRECONDITION
	details = "The request size (1857938 bytes) exceeds 1.500MB limit."
	debug_error_string = "{"created":"@1628049004.479789726","description":"Error received from peer ipv4:108.177.126.95:443","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"The request size (1857938 bytes) exceeds 1.500MB limit.","grpc_status":9}"
>

The above exception was the direct cause of the following exception:

FailedPrecondition                        Traceback (most recent call last)
<ipython-input-16-cdc5dd401751> in <module>
      7 image = tf.ones((224, 224, 3), dtype=tf.int32)
      8 image = [image.numpy().tolist()]
----> 9 endpoint.predict(image)

/usr/local/lib/python3.7/dist-packages/google/cloud/aiplatform/models.py in predict(self, instances, parameters)
   1105 
   1106         prediction_response = self._prediction_client.predict(
-> 1107             endpoint=self.resource_name, instances=instances, parameters=parameters
   1108         )
   1109 

/usr/local/lib/python3.7/dist-packages/google/cloud/aiplatform_v1/services/prediction_service/client.py in predict(self, request, endpoint, instances, parameters, retry, timeout, metadata)
    453 
    454         # Send the request.
--> 455         response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
    456 
    457         # Done; return the response.

/usr/local/lib/python3.7/dist-packages/google/api_core/gapic_v1/method.py in __call__(self, *args, **kwargs)
    143             kwargs["metadata"] = metadata
    144 
--> 145         return wrapped_func(*args, **kwargs)
    146 
    147 

/usr/local/lib/python3.7/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     73             return callable_(*args, **kwargs)
     74         except grpc.RpcError as exc:
---> 75             six.raise_from(exceptions.from_grpc_error(exc), exc)
     76 
     77     return error_remapped_callable

/usr/local/lib/python3.7/dist-packages/six.py in raise_from(value, from_value)

FailedPrecondition: 400 The request size (1857938 bytes) exceeds 1.500MB limit.
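For what it's worth, the numbers in that error are consistent with sending every pixel as an individual value: a single 224x224x3 instance is 150,528 values, and the reported 1,857,938-byte request works out to roughly 12 bytes per value, well past the 1.5 MB online-prediction limit. A quick back-of-the-envelope check (my sketch, not from the notebook):

# Rough arithmetic behind the FailedPrecondition above: the request size is
# dominated by the per-value overhead of serializing each pixel individually.
num_values = 224 * 224 * 3                  # 150,528 pixel values per instance
observed_request_bytes = 1_857_938          # from the error message above
print(num_values)                           # 150528
print(observed_request_bytes / num_values)  # ~12.3 bytes per value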

Here’s the Colab Notebook for reproducibility. Please note that the GCS Bucket has public read access. But this question still stands:

Are you suggesting I start from a TF Serving image and rebuild it similarly to what I am doing now?

0 reactions
sayakpaul commented on Aug 5, 2021

@andrewferlitsch Sure, but I cannot host the model forever on GCS because my resources are limited. Could you help me create one?
