Model version is not defined
Issue
I want to serve two models with TorchServe, but it throws the following error:
org.pytorch.serve.archive.InvalidModelException: Model version is not defined.
These are the commands I used to create the model archives. I have only one .mar file per model, and the two models have different versions.
model 1
torch-model-archiver --model-name centernet_resnet18 --version 1.1 --serialized-file models/res18_model.pth --extra-files index_to_name.json --model-file model.py --handler model_handler.py --requirements-file requirements.txt -f
model 2
torch-model-archiver --model-name yolov5x --version 1 --serialized-file exp35_last.torchscript.pt --extra-files index_to_name.json --handler model_handler.py -f
config
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
default_workers_per_model=2
async_logging=true
model_store=/home/model-server/model-store
TorchServe start command
torchserve --start --ts-config /home/model-server/config.properties --models yolov5x.mar --ncs
- torchserve version: 0.2.0
- torch-model-archiver version: 0.2.0 (installed using pip install torchserve==0.2.0 torch-model-archiver==0.2.0)
- torch version: 1.6.0
- torchvision version: 0.7.0
- torchtext version: 0.6.0
- java version: 11
- Operating System and version: ubuntu 18
Expected Behavior
How do I tell TorchServe which version to use for each model? I can't find any documentation on this.
Current Behavior
TorchServe throws an error asking for a default model version.
I see this issue only in the Docker container. Locally it works fine and assigns a default version by itself.
Failure Logs
2020-09-03 06:13:12,307 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.2.0
TS Home: /usr/local/lib/python3.6/dist-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 8
Max heap size: 1962 M
Python executable: /usr/bin/python3
Config file: /home/model-server/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/model-server/model-store
Initial Models: all
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 2
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Custom python dependency for model allowed: false
Metrics report format: prometheus
Enable metrics API: true
2020-09-03 06:13:12,364 [DEBUG] main org.pytorch.serve.ModelServer - Loading models from model store: centernet_resnet18.mar
2020-09-03 06:13:12,375 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 59f22da511324ca39be78e387049df13
2020-09-03 06:13:12,377 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: /home/model-server/model-store/centernet_resnet18.mar
org.pytorch.serve.archive.InvalidModelException: Model version is not defined.
at org.pytorch.serve.archive.ModelArchive.validate(ModelArchive.java:158)
at org.pytorch.serve.wlm.ModelManager.createModelArchive(ModelManager.java:156)
at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:115)
at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:68)
at org.pytorch.serve.ModelServer.initModelStore(ModelServer.java:177)
at org.pytorch.serve.ModelServer.start(ModelServer.java:308)
at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:104)
at org.pytorch.serve.ModelServer.main(ModelServer.java:85)
2020-09-03 06:13:12,379 [DEBUG] main org.pytorch.serve.ModelServer - Loading models from model store: yolov5x.mar
2020-09-03 06:13:12,380 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 7a1253dd425d47d7b7623c91078d4e25
2020-09-03 06:13:12,380 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: /home/model-server/model-store/yolov5x.mar
org.pytorch.serve.archive.InvalidModelException: Model version is not defined.
at org.pytorch.serve.archive.ModelArchive.validate(ModelArchive.java:158)
at org.pytorch.serve.wlm.ModelManager.createModelArchive(ModelManager.java:156)
at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:115)
at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:68)
at org.pytorch.serve.ModelServer.initModelStore(ModelServer.java:177)
at org.pytorch.serve.ModelServer.start(ModelServer.java:308)
at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:104)
at org.pytorch.serve.ModelServer.main(ModelServer.java:85)
2020-09-03 06:13:12,383 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2020-09-03 06:13:12,476 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2020-09-03 06:13:12,476 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2020-09-03 06:13:12,478 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2020-09-03 06:13:12,478 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2020-09-03 06:13:12,479 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
Top GitHub Comments
Models are registered by their model_name-version combination. In your case, the two models have different names and hence will be registered as two different models, each with one version available.
To register multiple versions of the same model, keep the value of the --model-name parameter the same when creating the .mar files.
To register specific models at startup, provide the list of models using the --models parameter on the command line or the load_models parameter in the config.properties file. For more details, refer to the TorchServe configuration documentation.

SOLVED: The model archives were corrupted during download, which is why TorchServe failed to load the manifest file.
SUGGESTION: The error message could be more relevant when the manifest file cannot be found.
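
Since the root cause here was a corrupted download, a quick integrity check on each archive before starting the server may save some debugging. The following is a minimal sketch, not part of TorchServe itself. It assumes the archives were built in the default zip-based .mar format and that the manifest lives at MAR-INF/MANIFEST.json with the version stored under model.modelVersion. The function names read_model_version and check_model_store are hypothetical helpers introduced for illustration.

```python
import json
import zipfile
from pathlib import Path

# Assumption: default .mar layout, with the manifest stored at this path.
MANIFEST_PATH = "MAR-INF/MANIFEST.json"


def read_model_version(mar_path):
    """Return (model_name, model_version) from a .mar file, or raise ValueError."""
    path = Path(mar_path)
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path}: not a readable zip archive (possibly corrupted)")
    with zipfile.ZipFile(path) as archive:
        # testzip() runs a CRC check on every member and returns the first bad one.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"{path}: corrupted member {bad_member}")
        try:
            manifest = json.loads(archive.read(MANIFEST_PATH))
        except KeyError:
            raise ValueError(f"{path}: {MANIFEST_PATH} is missing") from None
    model = manifest.get("model", {})
    version = model.get("modelVersion")
    if not version:
        raise ValueError(f"{path}: modelVersion is not set in the manifest")
    return model.get("modelName"), version


def check_model_store(model_store):
    """Map each .mar file name in model_store to its (name, version) or an error string."""
    results = {}
    for mar in sorted(Path(model_store).glob("*.mar")):
        try:
            results[mar.name] = read_model_version(mar)
        except ValueError as err:
            results[mar.name] = f"FAILED: {err}"
    return results
```

Running check_model_store("/home/model-server/model-store") inside the container and comparing the result with the same call on the local machine should show whether the archives differ between the two environments.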