Custom pip package installation failed while running torchserve with custom library
2020-10-23 21:31:48,131 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: mymodel_docker_import.mar org.pytorch.serve.archive.ModelException: Custom pip package installation failed for mymodel
Context
- torchserve version: 0.2.0
- torch version: 1.6.0
- Operating System and version: ubuntu:18.04
Your Environment
- Installed using source? [yes/no]: no
- Are you planning to deploy it using docker container? [yes/no]: yes
- Is it a CPU or GPU environment?: CPU
- Using a default/custom handler? [If possible upload/share custom handler/model]: custom
- What kind of model is it e.g. vision, text, audio?: vision
- Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.? [If public url then provide link.]: local models from model-store
- Provide config.properties, logs [ts.log] and parameters used for model registration/update APIs: OK
Current Behavior
My custom handler uses the OpenCV library. I edited config.properties and passed the --requirements-file parameter to torch-model-archiver, but requests to the running torchserve fail.
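To make the dependency concrete, here is a minimal sketch of what such a handler might look like (this is an illustration, not the actual mymodel_handler.py; the class and method names follow the common TorchServe custom-handler shape). cv2 is imported lazily so the module itself can be loaded even when the per-model pip install has not run:

```python
# Sketch only: a minimal custom vision handler whose preprocessing needs
# OpenCV. Names are illustrative stand-ins for handlers/mymodel_handler.py.

class MyModelHandler:
    """Hypothetical custom handler that depends on cv2."""

    def __init__(self):
        self.initialized = False

    def initialize(self, context):
        # cv2 comes from the bundled requirements.txt; if the per-model
        # pip install failed, this import raises and the worker never starts.
        import cv2  # noqa: F401
        self.initialized = True

    def handle(self, data, context):
        import cv2
        import numpy as np
        raw = data[0].get("data") or data[0].get("body")
        # Decode the request bytes into a BGR image before inference.
        img = cv2.imdecode(np.frombuffer(raw, dtype=np.uint8), cv2.IMREAD_COLOR)
        return [{"shape": list(img.shape)}]
```

Because the import happens inside initialize, the failure surfaces as a dead worker (and hence a 503 on /predictions) rather than an archive-time error.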
Steps to Reproduce
Step. 1.
docker run --rm -it -p 8080:8080 -p 8081:8081 \
-v $(pwd)/model_store:/home/model-server/model-store \
-v $(pwd)/scripts:/home/model-server/scripts \
-v $(pwd)/handlers:/home/model-server/handlers \
--name mar torchserve:latest
Step. 2.
docker exec -it --user root mar /bin/bash
torch-model-archiver --model-name mymodel_docker_import --version 1.0 \
--serialized-file $(pwd)/scripts/model.pt --handler $(pwd)/handlers/mymodel_handler.py \
--export-path $(pwd)/model-store --requirements-file $(pwd)/handlers/requirements.txt
Step. 2.1.
requirements.txt is
opencv-python==4.4.0.42
Step. 3.
torchserve --start --model-store model-store --models mymodel=mymodel_docker_import.mar --ts-config $(pwd)/model-store/config.properties
Step. 3.1.
config.properties is
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
default_workers_per_model=2
install_py_dep_per_model=true
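With install_py_dep_per_model=true, TorchServe runs pip against the model's bundled requirements.txt when the model is registered. The exact flags vary by TorchServe version, and the extraction path below is an assumption, but a roughly equivalent command can be built and run manually inside the container to surface the real pip error:

```python
# Sketch: approximate the per-model pip install TorchServe performs when
# install_py_dep_per_model=true. Flags and paths are assumptions for this
# container layout, not TorchServe's exact internals.
import subprocess  # used only for the optional manual run below

def build_pip_cmd(python_exe, requirements_path):
    # Install the model's dependencies with the server's own interpreter.
    return [python_exe, "-m", "pip", "install", "-U", "-r", requirements_path]

cmd = build_pip_cmd(
    "/home/venv/bin/python3",
    "/home/model-server/tmp/models/<model-dir>/requirements.txt",  # placeholder path
)
print(" ".join(cmd))

# Running this by hand inside the container shows the underlying failure,
# e.g. a permissions problem writing into the virtualenv:
# subprocess.run(cmd, check=True)
```

Running the printed command as the same user the workers run under is a quick way to distinguish a broken package from a broken environment.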
Step. 4.
curl http://localhost:8080/ping responds:
{ "status": "Healthy" }
Step. 5.
curl localhost:8081/models/ responds:
{
"models": [
{
"modelName": "mymodel",
"modelUrl": "mymodel_docker_import.mar"
}
]
}
BUT when I send a request I get 503 Error
2020-10-23 22:28:18,812 [INFO ] epollEventLoopGroup-3-4 ACCESS_LOG - /172.17.0.1:43698 "GET /models/ HTTP/1.1" 200 0
2020-10-23 22:28:18,812 [INFO ] epollEventLoopGroup-3-4 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:df5072c52f11,timestamp:null
2020-10-23 22:28:25,794 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /172.17.0.1:55526 **"POST /predictions/mymodel HTTP/1.1" 503** 80
2020-10-23 22:28:25,794 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests5XX.Count:1|#Level:Host|#hostname:df5072c52f11,timestamp:null
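A 200 from /models/ alongside a 503 from /predictions usually means the model registered but no workers came up. The per-model management endpoint reports worker state; a small sketch of querying it (host and port taken from the config.properties above, run wherever the server is reachable):

```python
# Query the TorchServe management API for a model's worker status.
import json
from urllib.request import urlopen

def model_url(model_name, host="localhost", port=8081):
    # Describe-model endpoint of the management API.
    return "http://{}:{}/models/{}".format(host, port, model_name)

def describe_model(model_name):
    with urlopen(model_url(model_name)) as resp:
        return json.load(resp)

# Example (requires a running torchserve):
# describe_model("mymodel")  # the workers list is empty when the install failed
```

An empty workers list in the response confirms that the 503 comes from the failed dependency install rather than from the handler logic.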
Failure Logs [if any]
Torchserve version: 0.2.0
TS Home: /home/venv/lib/python3.6/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 2
Max heap size: 984 M
Python executable: /home/venv/bin/python3
Config file: /home/model-server/model-store/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/model-server/model-store
Initial Models: mymodel=mymodel_docker_import.mar
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 2
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
2020-10-23 22:13:45,351 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: mymodel_docker_import.mar
2020-10-23 22:13:50,305 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 9da49c85305c4c4a8081cd0be6bb5826
2020-10-23 22:13:50,328 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model mymodel
2020-10-23 22:13:50,328 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model mymodel
2020-10-23 22:13:50,329 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model mymodel loaded.
2020-10-23 22:14:08,732 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: mymodel_docker_import.mar
org.pytorch.serve.archive.ModelException: Custom pip package installation failed for mymodel
    at org.pytorch.serve.wlm.ModelManager.setupModelDependencies(ModelManager.java:190)
    at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:125)
    at org.pytorch.serve.ModelServer.initModelStore(ModelServer.java:213)
    at org.pytorch.serve.ModelServer.start(ModelServer.java:308)
    at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:104)
    at org.pytorch.serve.ModelServer.main(ModelServer.java:85)
2020-10-23 22:14:08,741 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2020-10-23 22:14:08,884 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2020-10-23 22:14:08,885 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2020-10-23 22:14:08,902 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2020-10-23 22:14:08,902 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2020-10-23 22:14:08,910 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2020-10-23 22:14:09,079 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:df5072c52f11,timestamp:1603491249
Resolution
@veronikayurchuk Try this Dockerfile (#724). The current version has a problem with the permissions of the virtual Python environment and the container user. In my case I rebuilt the container with those changes and it worked fine.
Replace the opencv-python dependency with opencv-python-headless.
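The headless package avoids the GUI-related system libraries (such as libGL) that slim container images typically lack. A small sketch of applying the swap to the bundled requirements file before re-archiving (the function name is illustrative):

```python
# Sketch: rewrite requirements.txt to use opencv-python-headless, which
# does not pull in GUI system libraries absent from slim containers.
from pathlib import Path

def use_headless_opencv(req_path):
    lines = Path(req_path).read_text().splitlines()
    fixed = [
        line.replace("opencv-python==", "opencv-python-headless==")
        if line.startswith("opencv-python==") else line
        for line in lines
    ]
    Path(req_path).write_text("\n".join(fixed) + "\n")

# Example: use_headless_opencv("handlers/requirements.txt"), then re-run
# torch-model-archiver so the .mar bundles the updated requirements file.
```

After rewriting the file, the .mar must be rebuilt with torch-model-archiver for the change to take effect, since the requirements file is baked into the archive.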