GCP Vertex AI, deploying model to existing endpoint
Feature Area
/area sdk /area components
I currently have a simple pipeline that takes an existing model container image from Artifact Registry, uploads it to the Vertex AI model store, creates an endpoint, and deploys the model to that endpoint. So far so good.
Now say I have a new revision of that model and I want to deploy it to the same endpoint, with a 50/50 traffic split between the old and new versions. This is relatively simple to do in the Vertex AI GUI, but how do I do it in a Kubeflow pipeline?
How can I get either an existing (already uploaded) model or an already created endpoint to pass to ModelDeployOp?
The following code snippet does not work, because the returned endpoint type doesn't plug into the operation.
# Import aliases assumed from the names used in the snippet.
from google.cloud import aiplatform as aip
from google_cloud_pipeline_components import aiplatform as gcc_aip


def pipeline(
    project: str = PROJECT_ID,
    model_display_name: str = MODEL_DISPLAY_NAME,
    serving_container_image_uri: str = IMAGE_URI,
):
    train_task = print_op("No training to be done here!")

    model_upload_op = gcc_aip.ModelUploadOp(
        project=project,
        location=REGION,
        display_name=model_display_name,
        # artifact_uri=WORKING_DIR,
        serving_container_image_uri=serving_container_image_uri,
        serving_container_ports=[{"containerPort": 8000}],
        serving_container_predict_route="/hello_world",
        serving_container_health_route="/health",
    )

    # This call runs when the pipeline is defined, not as a pipeline step,
    # and returns an SDK Endpoint object rather than a pipeline artifact.
    endpoints = aip.Endpoint.list(
        filter=f"display_name={ENDPOINT_NAME}", order_by="create_time"
    )
    existing_endpoint = endpoints[0]

    # This is where it fails: the SDK object does not plug into ModelDeployOp.
    model_deploy_op = gcc_aip.ModelDeployOp(
        endpoint=existing_endpoint,
        model=model_upload_op.outputs["model"],
        deployed_model_display_name=model_display_name,
        dedicated_resources_machine_type="n1-standard-4",
        dedicated_resources_min_replica_count=1,
        dedicated_resources_max_replica_count=1,
        traffic_split=traffic_split,  # not defined in this snippet
    )
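The snippet passes `traffic_split` without showing how it is built. As a rough sketch only: in the Vertex AI SDK, a traffic split maps deployed-model IDs to percentages that sum to 100, and the key "0" refers to the model being deployed in the current call. The helper name `split_with_new_model` and the ID "1234" are made up for illustration, and the value types a given `ModelDeployOp` version expects should be checked against its reference.

```python
from typing import Dict


def split_with_new_model(current_split: Dict[str, int], new_percent: int) -> Dict[str, int]:
    """Give `new_percent` of traffic to the model being deployed (key "0").

    The remaining traffic is shared among the already deployed models in
    proportion to their current percentages.
    """
    if not 0 <= new_percent <= 100:
        raise ValueError("new_percent must be between 0 and 100")

    total = sum(current_split.values())
    if total == 0:
        # Nothing is serving traffic yet, so the new model takes all of it.
        return {"0": 100}

    remaining = 100 - new_percent
    scaled = {k: v * remaining // total for k, v in current_split.items()}

    # Integer division can lose a few points; hand them to the largest share
    # so that the percentages still add up to 100.
    leftover = remaining - sum(scaled.values())
    if leftover:
        biggest = max(scaled, key=scaled.get)
        scaled[biggest] += leftover

    scaled["0"] = new_percent
    return scaled


# "1234" is a hypothetical ID of the model version already on the endpoint.
print(split_with_new_model({"1234": 100}, 50))  # {'1234': 50, '0': 50}
```

The existing deployed-model IDs would have to be read from the endpoint itself before calling a helper like this.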
Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:9
Top GitHub Comments
@lc-billyfung It seems your code was close enough for me to get it working with some minor modifications. Scrolling through the other issues in the repo, I found a mention of the `.ignore_type()` method. Using that, plus a small correction to your str split, the pipeline deploys.
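This is not the commenter's code. A different route that is sometimes suggested, instead of bypassing the type check, is to import the existing endpoint as a typed artifact. The sketch below only builds the arguments from an endpoint resource name by splitting it on "/". The helper name `endpoint_importer_args` is hypothetical, and the `dsl.importer` and `VertexEndpoint` usage shown in the comments is an assumption to verify against the installed KFP and google-cloud-pipeline-components versions.

```python
from typing import Any, Dict


def endpoint_importer_args(resource_name: str) -> Dict[str, Any]:
    """Build importer arguments from an endpoint resource name.

    Expected form: projects/<project>/locations/<region>/endpoints/<id>
    """
    parts = resource_name.split("/")
    if (
        len(parts) != 6
        or parts[0] != "projects"
        or parts[2] != "locations"
        or parts[4] != "endpoints"
    ):
        raise ValueError(f"unexpected endpoint resource name: {resource_name!r}")

    region = parts[3]
    return {
        "artifact_uri": f"https://{region}-aiplatform.googleapis.com/v1/{resource_name}",
        "metadata": {"resourceName": resource_name},
    }


# Hypothetical resource name, for illustration only.
args = endpoint_importer_args(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)
print(args["artifact_uri"])

# Assumed usage inside a pipeline definition (not executed here):
#   from kfp import dsl
#   from google_cloud_pipeline_components.types import artifact_types
#   endpoint_task = dsl.importer(
#       artifact_class=artifact_types.VertexEndpoint, **args
#   )
#   ... ModelDeployOp(endpoint=endpoint_task.outputs["artifact"], ...)
```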
I’m doing this by using a custom component that checks for existing endpoints based on display_name and then deploys the model to it if found.
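The commenter's component is not shown, so the following is only a sketch of the control flow described: look up endpoints by display name and deploy when one is found. The Vertex AI calls are passed in as callables so the logic can be read and tested on its own. `EndpointRef`, `deploy_if_endpoint_exists`, and the sample values are hypothetical names. In a real component the callables would presumably wrap `aiplatform.Endpoint.list(...)` and `Endpoint.deploy(...)`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class EndpointRef:
    """Hypothetical stand-in for the fields read from an SDK Endpoint."""
    resource_name: str
    display_name: str


def deploy_if_endpoint_exists(
    display_name: str,
    list_endpoints: Callable[[], Iterable[EndpointRef]],
    deploy: Callable[[EndpointRef], None],
) -> Optional[str]:
    """Deploy to the first endpoint whose display name matches.

    Returns the endpoint resource name, or None when no endpoint matched
    and nothing was deployed.
    """
    matches = [e for e in list_endpoints() if e.display_name == display_name]
    if not matches:
        return None
    target = matches[0]
    deploy(target)
    return target.resource_name


# Usage with in-memory fakes instead of real Vertex AI calls.
fake_endpoints = [
    EndpointRef("projects/p/locations/r/endpoints/1", "other-endpoint"),
    EndpointRef("projects/p/locations/r/endpoints/2", "hello-endpoint"),
]
deployed_to = []
result = deploy_if_endpoint_exists(
    "hello-endpoint",
    list_endpoints=lambda: fake_endpoints,
    deploy=lambda endpoint: deployed_to.append(endpoint.resource_name),
)
print(result)
```

What should happen when no endpoint matches (create one, or fail the step) is left to the caller here, since the comment does not say.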