gRPC client ~4x slower than HTTP requests in Docker
I'm running containerized TorchServe to deploy a simple face recognition model. I have a 20 MB TorchScript model that takes an input image of size 3x72x53 and returns a K-dimensional feature vector as the embedding. Using the requests library in Python, my inference runtime is about 0.02 seconds (which is very good, since it's very close to running the same TorchScript model locally), but running the same model through the gRPC client as described in the docs takes > 0.07 seconds.
Out of curiosity, I ran the same client script inside the Docker container for both HTTP and gRPC, and there gRPC is just as fast as HTTP, if not faster. So I suspect the issue has something to do with Docker port forwarding.
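For context, the container is started with the default HTTP and gRPC ports published to the host, roughly like this (a sketch, not my exact command; the image tag and volume path are placeholders):
docker run --rm -it \
    -p 8080:8080 -p 8081:8081 -p 8082:8082 \
    -p 7070:7070 -p 7071:7071 \
    -v $(pwd)/model-store:/home/model-server/model-store \
    pytorch/torchserve:latest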
Here’s my config.properties:
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
load_models=all
# install per-model pip dependencies, since the handler needs cv2
install_py_dep_per_model=true
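The gRPC client below connects to port 7070, which is TorchServe's default gRPC inference port. If you want to pin the gRPC ports explicitly, TorchServe reads them from config.properties as well (the keys below follow the TorchServe configuration docs; I'm leaving them at the defaults):
grpc_inference_port=7070
grpc_management_port=7071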
This is my handler:
import torch
import base64
from ts.torch_handler.base_handler import BaseHandler
from torchvision import transforms
from PIL import Image
import cv2
import numpy as np

class ModelHandler(BaseHandler):
    def initialize(self, context):
        super().initialize(context)
        self.transform = transforms.Compose([
            transforms.Resize((72, 54)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ])

    def preprocess(self, data):
        images = []
        for instance in data:
            try:
                cv2_encoded_bytes = instance.get("data") or instance.get("body")
            except AttributeError:
                # instance is already the raw payload rather than a dict
                cv2_encoded_bytes = instance
            if isinstance(cv2_encoded_bytes, (bytearray, bytes)):
                cv2_decoded_bytes = base64.b64decode(cv2_encoded_bytes)
                cv2_decoded = np.frombuffer(cv2_decoded_bytes, dtype=np.uint8)
                cv2_image = cv2.imdecode(cv2_decoded, 1)
                # I'm sending the cv2.imencode output as the model input to reduce network overhead,
                # hence cv2.imdecode here to get back the image
                image = Image.fromarray(cv2_image)
                image = self.transform(image)
            else:
                # payload is already a numeric array, not encoded bytes
                image = torch.FloatTensor(cv2_encoded_bytes)
            images.append(image)
        return torch.stack(images).to(self.device)

    def postprocess(self, inference_output):  # same as BaseHandler
        return inference_output.tolist()
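For completeness, the handler is packaged into a .mar file with torch-model-archiver before being dropped into the model store (a rough sketch; the model file and requirements file names are placeholders):
torch-model-archiver \
    --model-name face-recognition \
    --version 1.0 \
    --serialized-file face_recognition.pt \
    --handler handler.py \
    --requirements-file requirements.txt \
    --export-path /home/model-server/model-store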
And below is my client script that calls the model over both HTTP and gRPC:
import cv2
import base64
import numpy as np
import requests
from time import time
import grpc
import inference_pb2
import inference_pb2_grpc
import management_pb2_grpc

cv2_image = np.ones((512, 512, 3), dtype=np.uint8)
h, w, c = cv2_image.shape
has_encoded, cv2_encoded = cv2.imencode('.jpg', cv2_image, [int(cv2.IMWRITE_JPEG_QUALITY), 50])
cv2_encoded_bytes = base64.b64encode(cv2_encoded)

def infer(stub, model_name, model_input):
    return stub.Predictions(
        inference_pb2.PredictionsRequest(
            model_name=model_name,
            input={'data': model_input}
        )
    )

if __name__ == '__main__':
    channel = grpc.insecure_channel('localhost:7070')
    stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)
    times = []
    for _ in range(100):
        start = time()
        response = requests.post("http://localhost:8080/predictions/face-recognition/", data=cv2_encoded_bytes)
        # response = infer(stub, 'face-recognition', cv2_encoded_bytes)
        took = time() - start
        times.append(took)
        print(f'Took \t: {took}')
    times = np.array(times)
    print(f'Mean time \t: {np.mean(times)}')
    print(f'Median time \t: {np.median(times)}')
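The inference_pb2, inference_pb2_grpc and management_pb2_grpc modules imported above are generated from TorchServe's proto files with grpcio-tools, roughly as in the TorchServe gRPC docs (a sketch, assuming the serve repo is checked out in the working directory):
python -m grpc_tools.protoc \
    --proto_path=frontend/server/src/main/resources/proto/ \
    --python_out=. \
    --grpc_python_out=. \
    frontend/server/src/main/resources/proto/inference.proto \
    frontend/server/src/main/resources/proto/management.proto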
Outputs with HTTP when running from the host:
Mean time : 0.025488979816436767
Median time : 0.023711800575256348
Outputs with gRPC when running from the host:
Mean time : 0.07445537805557251
Median time : 0.07411670684814453
Outputs with HTTP when running inside the container:
Mean time : 0.02136315107345581
Median time : 0.02019190788269043
Outputs with gRPC when running inside the container:
Mean time : 0.019478685855865478
Median time : 0.01735556125640869
Any way I can get around this latency?
Thanks for your time.
Top GitHub Comments
@braindotai Please feel free to reopen this ticket if there are any further issues.
@braindotai Thank you for the update. The Docker image is built on Ubuntu. You can try building a Docker image on Windows to see if it helps.