
Problems when running model deployment via a custom component


Thanks for stopping by to let us know something could be better!

PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

Please run down the following list and make sure you’ve tried the usual “quick fixes”:

If you are still having issues, please be sure to include as much information as possible:

Environment details

  • OS type and version: Vertex AI Notebooks
  • Python version: 3.8.2
  • pip version: 2.1.1
  • google-cloud-aiplatform version: 1.1.1

Steps to reproduce

Here’s the notebook: https://colab.research.google.com/drive/18C6nct6m3puwm-PDDAfnljsp2o1DNvq6?usp=sharing.

I am trying to deploy a Vertex AI model to an Endpoint via a custom TFX component. The component looks like so (please refer to the above-mentioned notebook for the full snippet):

# Imports needed to run this snippet (not shown in the original excerpt):
import logging

from google.cloud import aiplatform as vertex_ai
from tfx.dsl.component.experimental.annotations import Parameter
from tfx.dsl.component.experimental.decorators import component


@component
def VertexDeployer(
    project: Parameter[str],
    region: Parameter[str],
    model_display_name: Parameter[str],
    deployed_model_display_name: Parameter[str],
):
    logging.info(f"Endpoint display: {deployed_model_display_name}")
    vertex_ai.init(project=project, location=region)

    # Reuse an existing endpoint with the given display name if there is one.
    endpoints = vertex_ai.Endpoint.list(
        filter=f'display_name={deployed_model_display_name}',
        order_by="update_time",
    )
    if len(endpoints) > 0:
        logging.info(f"Endpoint {deployed_model_display_name} already exists.")
        endpoint = endpoints[-1]
    else:
        endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

    # Pick the most recently updated model with the given display name.
    model = vertex_ai.Model.list(
        filter=f'display_name={model_display_name}',
        order_by="update_time",
    )[-1]

    # The endpoint is listed again right before deployment.
    endpoint = vertex_ai.Endpoint.list(
        filter=f'display_name={deployed_model_display_name}',
        order_by="update_time",
    )[-1]

    deployed_model = endpoint.deploy(
        model=model,
        # Syntax from here: https://git.io/JBQDP
        traffic_split={"0": 100},
        machine_type="n1-standard-4",
        min_replica_count=1,
        max_replica_count=1,
    )

    logging.info(f"Model deployed to: {deployed_model}")

As per the logs, things start fine:

2021-08-02 13:30:58.622 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Creating Endpoint
2021-08-02 13:30:58.622 IST
workerpool0-0
I0802 08:00:58.622227 139871630878528 base.py:74] Creating Endpoint
2021-08-02 13:30:58.622 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Create Endpoint backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7134428861818732544
2021-08-02 13:30:58.623 IST
workerpool0-0
I0802 08:00:58.622441 139871630878528 base.py:78] Create Endpoint backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7134428861818732544
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Endpoint created. Resource name: projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682116 139871630878528 base.py:98] Endpoint created. Resource name: projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:To use this Endpoint in another session:
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682311 139871630878528 base.py:99] To use this Endpoint in another session:
2021-08-02 13:31:00.683 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:endpoint = aiplatform.Endpoint('projects/29880397572/locations/us-central1/endpoints/3702996832675168256')
2021-08-02 13:31:00.683 IST
workerpool0-0
I0802 08:01:00.682386 139871630878528 base.py:101] endpoint = aiplatform.Endpoint('projects/29880397572/locations/us-central1/endpoints/3702996832675168256')
2021-08-02 13:31:01.048 IST
workerpool0-0
INFO:google.cloud.aiplatform.models:Deploying Model projects/29880397572/locations/us-central1/models/4554203550527258624 to Endpoint : projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:01.048 IST
workerpool0-0
I0802 08:01:01.048401 139871630878528 base.py:139] Deploying Model projects/29880397572/locations/us-central1/models/4554203550527258624 to Endpoint : projects/29880397572/locations/us-central1/endpoints/3702996832675168256
2021-08-02 13:31:01.157 IST

But then, about 40 minutes later, it abruptly fails:

INFO:google.cloud.aiplatform.models:Deploy Endpoint model backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7217745454925086720
2021-08-02 14:13:41.232 IST
workerpool0-0
I0802 08:43:41.232497 140248418977600 base.py:159] Deploy Endpoint model backing LRO: projects/29880397572/locations/us-central1/endpoints/3702996832675168256/operations/7217745454925086720
2021-08-02 14:14:01.059 IST
service
The replica workerpool0-0 exited with a non-zero status of 1. Termination reason: Error. To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=29880397572&resource=ml_job%2Fjob_id%2F8657831634038423552&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%228657831634038423552%22
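As a side note, the worker logs referenced in that message can also be pulled programmatically rather than through the console link; a sketch assuming the google-cloud-logging client library is installed and authenticated (the job ID is copied from the console URL above):

from google.cloud import logging as gcp_logging

# Fetch the failed worker's log entries for the custom job referenced above.
# Uses the default project from the active credentials.
client = gcp_logging.Client()
log_filter = (
    'resource.type="ml_job" AND resource.labels.job_id="8657831634038423552"'
)
for entry in client.list_entries(filter_=log_filter, page_size=100):
    print(entry.timestamp, entry.payload)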

I have even tried to deploy it separately (code included in the notebook):

vertex_ai.init(project=GOOGLE_CLOUD_PROJECT, 
               location=GOOGLE_CLOUD_REGION, 
               staging_bucket="gs://" + GCS_BUCKET_NAME)

model_display_name = "densenet_flowers"
deployed_model_display_name = model_display_name + "_" + TIMESTAMP

endpoints = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}', 
    order_by="update_time"
)

if len(endpoints) > 0:
    print(f"Endpoint {deployed_model_display_name} already exists.")
    endpoint = endpoints[-1]
else:
    endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

model = vertex_ai.Model.list(
    filter=f'display_name={model_display_name}',
    order_by="update_time"
)[-1]

endpoint = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}',
    order_by="update_time"
)[-1]

deployed_model = endpoint.deploy(
    model=model,
    # Syntax from here: https://git.io/JBQDP
    traffic_split={"0": 100},
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=1,
)

It then leads to:

2021-08-02 13:44:07.857 IST
2021/08/02 08:14:07 No id provided.
{
 insertId: "1vfgsqqg11n9wnl"  
 jsonPayload: {
  levelname: "ERROR"   
  message: "2021/08/02 08:14:07 No id provided.
"   
 }
 labels: {
  compute.googleapis.com/resource_id: "4165408107493202181"   
  compute.googleapis.com/resource_name: "fluentd-caip-jrshv"   
  compute.googleapis.com/zone: "us-central1-a"   
 }
 logName: "projects/fast-ai-exploration/logs/aiplatform.googleapis.com%2Fprediction_container"  
 receiveTimestamp: "2021-08-02T08:14:32.337946313Z"  
 resource: {
  labels: {
   endpoint_id: "3702996832675168256"    
   location: "us-central1"    
   resource_container: "projects/29880397572"    
  }
  type: "aiplatform.googleapis.com/Endpoint"   
 }
 severity: "ERROR"  
 timestamp: "2021-08-02T08:14:07.857787819Z"  
}

I have tried all of this from a Vertex AI Notebook as well and the issue still persists.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

1 reaction
sayakpaul commented on Aug 4, 2021

I was able to complete the deployment using the standalone APIs by changing the serving image:

import os

import tensorflow as tf
from google.cloud import aiplatform as vertex_ai

vertex_ai.init(project=GOOGLE_CLOUD_PROJECT, location=GOOGLE_CLOUD_REGION)

PIPELINE_NAME = 'two-way-vertex-pipelines'
serving_model_dir = 'gs://{}/serving_model/{}'.format(
    GCS_BUCKET_NAME, PIPELINE_NAME)
pushed_model_location = os.path.join(serving_model_dir, "densenet")
pushed_model_dir = os.path.join(
    pushed_model_location, tf.io.gfile.listdir(pushed_model_location)[-1]
)
serving_image_uri = "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-5:latest"

vertex_model = vertex_ai.Model.upload(
    display_name="densenet_debug",
    artifact_uri=pushed_model_dir,
    serving_container_image_uri=serving_image_uri,
    parameters_schema_uri=None,
    instance_schema_uri=None,
    explanation_metadata=None,
    explanation_parameters=None,
)

deployed_model_display_name = "densenet_flowers_debug"

endpoints = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}', 
    order_by="update_time")

if len(endpoints) > 0:
    print(f"Endpoint {deployed_model_display_name} already exists.")
    endpoint = endpoints[-1]
else:
    endpoint = vertex_ai.Endpoint.create(deployed_model_display_name)

model = vertex_ai.Model.list(
    filter='display_name=densenet_debug',
    order_by="update_time"
)[-1]

endpoint = vertex_ai.Endpoint.list(
    filter=f'display_name={deployed_model_display_name}',
    order_by="update_time"
)[-1]

deployed_model = endpoint.deploy(
    model=model,
    # Syntax from here: https://git.io/JBQDP
    traffic_split={"0": 100},
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=1
)

But now, when I try to make prediction requests, I bump into:

/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    922                                       wait_for_ready, compression)
--> 923         return _end_unary_response_blocking(state, call, False, None)
    924 

/usr/local/lib/python3.7/dist-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    825     else:
--> 826         raise _InactiveRpcError(state)
    827 

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.FAILED_PRECONDITION
	details = "The request size (1857938 bytes) exceeds 1.500MB limit."
	debug_error_string = "{"created":"@1628049004.479789726","description":"Error received from peer ipv4:108.177.126.95:443","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"The request size (1857938 bytes) exceeds 1.500MB limit.","grpc_status":9}"
>

The above exception was the direct cause of the following exception:

FailedPrecondition                        Traceback (most recent call last)
<ipython-input-16-cdc5dd401751> in <module>
      7 image = tf.ones((224, 224, 3), dtype=tf.int32)
      8 image = [image.numpy().tolist()]
----> 9 endpoint.predict(image)

/usr/local/lib/python3.7/dist-packages/google/cloud/aiplatform/models.py in predict(self, instances, parameters)
   1105 
   1106         prediction_response = self._prediction_client.predict(
-> 1107             endpoint=self.resource_name, instances=instances, parameters=parameters
   1108         )
   1109 

/usr/local/lib/python3.7/dist-packages/google/cloud/aiplatform_v1/services/prediction_service/client.py in predict(self, request, endpoint, instances, parameters, retry, timeout, metadata)
    453 
    454         # Send the request.
--> 455         response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
    456 
    457         # Done; return the response.

/usr/local/lib/python3.7/dist-packages/google/api_core/gapic_v1/method.py in __call__(self, *args, **kwargs)
    143             kwargs["metadata"] = metadata
    144 
--> 145         return wrapped_func(*args, **kwargs)
    146 
    147 

/usr/local/lib/python3.7/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     73             return callable_(*args, **kwargs)
     74         except grpc.RpcError as exc:
---> 75             six.raise_from(exceptions.from_grpc_error(exc), exc)
     76 
     77     return error_remapped_callable

/usr/local/lib/python3.7/dist-packages/six.py in raise_from(value, from_value)

FailedPrecondition: 400 The request size (1857938 bytes) exceeds 1.500MB limit.
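For what it's worth, the numbers in that error are consistent with sending every pixel as an individual value: a single 224x224x3 instance is 150,528 values, and the reported 1,857,938-byte request works out to roughly 12 bytes per value, well past the 1.5 MB online-prediction limit. A quick back-of-the-envelope check (my sketch, not from the notebook):

# Rough arithmetic behind the FailedPrecondition above: the request size is
# dominated by the per-value overhead of serializing each pixel individually.
num_values = 224 * 224 * 3                  # 150,528 pixel values per instance
observed_request_bytes = 1_857_938          # from the error message above
print(num_values)                           # 150528
print(observed_request_bytes / num_values)  # ~12.3 bytes per value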

Here’s the Colab Notebook for reproducibility. Please note that the GCS Bucket has public read access. But this question still stands:

Are you suggesting I start from a TF Serving image and rebuild it similarly to what I am doing now?

0 reactions
sayakpaul commented on Aug 5, 2021

@andrewferlitsch Sure, but I cannot host the model forever on GCS because my resources are limited. Could you help me create one?
