Torchserve_sanity script fails for MNIST models


Context

Running python torchserve_sanity.py fails for the MNIST models during the Gradle build, inside a conda environment on a fresh Ubuntu instance.

Installed versions: torchserve==0.2.0, torch-model-archiver==0.2.0

Python version: 3.8 (64-bit runtime)
Python executable: /home/ubuntu/anaconda3/envs/fbserve/bin/python3

Versions of relevant python libraries:
numpy==1.19.2
torch==1.4.0

Java Version:
openjdk 11.0.5 2019-10-15
OpenJDK Runtime Environment (build 11.0.5+10-post-Ubuntu-2ubuntu116.04)
OpenJDK 64-Bit Server VM (build 11.0.5+10-post-Ubuntu-2ubuntu116.04, mixed mode, sharing)

OS: Ubuntu 16.04.7 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
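
For anyone comparing their own setup against the versions listed above, a report like this can be gathered with a few lines of Python. This is only a minimal sketch using the standard library; the helper names (package_version, java_version, environment_report) are made up for illustration and are not part of TorchServe.

```python
import platform
import shutil
import subprocess
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def java_version():
    """Return the first line of `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    banner = (result.stderr or result.stdout).strip().splitlines()
    return banner[0] if banner else None


def environment_report(packages=("torch", "torchserve", "torch-model-archiver", "numpy")):
    """Collect the interpreter, Java, and package versions into one dict."""
    report = {"python": platform.python_version(), "java": java_version()}
    for name in packages:
        report[name] = package_version(name)
    return report


for key, value in environment_report().items():
    print(f"{key}: {value or 'not found'}")
```

Running it inside the same conda environment that runs the sanity script matters, since a different interpreter may see different package versions.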

Your Environment

  • Installed using source? [yes/no]: yes
  • Are you planning to deploy it using docker container? [yes/no]: no
  • Is it a CPU or GPU environment?: CPU
  • Using a default/custom handler? [If possible upload/share custom handler/model]: NA
  • What kind of model is it e.g. vision, text, audio?: vision
  • Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.? [If public url then provide link.]: NA
  • Provide config.properties, logs [ts.log] and parameters used for model registration/update APIs: NA
  • Link to your project [if any]: NA

Expected Behavior

torchserve_sanity.py should pass on the master branch.

Current Behavior

It fails in the following test cases:

1) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testLoadMNISTEagerModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1698
2) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testPredictionMNISTEagerModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1727
3) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testUnregistedMNISTEagerModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1650
4) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testLoadMNISTScriptedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1698

5) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testPredictionMNISTScriptedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1727
6) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testUnregistedMNISTScriptedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1650
7) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testLoadMNISTTracedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1698
8) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testPredictionMNISTTracedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1727

TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testUnregistedMNISTTracedModel STANDARD_OUT
    2020-11-25 15:03:48,323 [WARN ] epollEventLoopGroup-3-5 org.pytorch.serve.wlm.ModelManager - Model not found: mnist_traced
    2020-11-25 15:03:48,323 [INFO ] epollEventLoopGroup-3-5 ACCESS_LOG - /127.0.0.1:42678 "DELETE /models/mnist_traced HTTP/1.1" 404 0
    2020-11-25 15:03:48,323 [INFO ] epollEventLoopGroup-3-5 TS_METRICS - Requests4XX.Count:1|#Level:Host|#hostname:ip-172-31-61-118,timestamp:null

9) TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testUnregistedMNISTTracedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1650
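
The failures above hit only three assertion sites in ModelServerTest.java (lines 1698, 1727 and 1650), repeated for the eager, scripted and traced MNIST models. Grouping the Gradle output by assertion site makes that pattern easier to see in a long log. The snippet below is a rough sketch: the regular expressions are assumptions based only on the output format quoted in this issue, and group_failures is a hypothetical helper.

```python
import re
from collections import defaultdict

# Patterns assumed from the Gradle output shown in this issue.
FAILED_RE = re.compile(r"> (\w+) FAILED\s*$")
ASSERT_RE = re.compile(r"java\.lang\.AssertionError at (\w+\.java):(\d+)")


def group_failures(log_text):
    """Map 'File.java:line' to the list of test names that failed there."""
    groups = defaultdict(list)
    pending = None  # test name taken from the most recent FAILED line
    for line in log_text.splitlines():
        failed = FAILED_RE.search(line)
        if failed:
            pending = failed.group(1)
            continue
        asserted = ASSERT_RE.search(line)
        if asserted and pending:
            groups[f"{asserted.group(1)}:{asserted.group(2)}"].append(pending)
            pending = None
    return dict(groups)


sample = """\
TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testLoadMNISTEagerModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1698
TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testLoadMNISTScriptedModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1698
TorchServeSuite > TorchServe > org.pytorch.serve.ModelServerTest > testUnregistedMNISTEagerModel FAILED
    java.lang.AssertionError at ModelServerTest.java:1650
"""

for location, tests in group_failures(sample).items():
    print(location, tests)
```

In the log above, the DELETE request for mnist_traced returns 404 with "Model not found", which would be consistent with the unregister failures following from the earlier load failures, though the log alone does not confirm that.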

Steps to Reproduce

  1. Install the Java JDK
  2. Install the requirements as mentioned in the comment
  3. Run python torchserve_sanity.py
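
The steps above can be wrapped in a small driver that checks the prerequisites first and keeps the full output for a bug report. This is a sketch under assumptions: it expects to be run from the directory that contains torchserve_sanity.py, and the helper names and the sanity_run.log file name are hypothetical.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def missing_prerequisites(script="torchserve_sanity.py"):
    """Return a list of human-readable problems; empty if ready to run."""
    problems = []
    if shutil.which("java") is None:
        problems.append("java not found on PATH")
    if not Path(script).is_file():
        problems.append(f"{script} not found in {Path.cwd()}")
    return problems


def run_sanity(script="torchserve_sanity.py", log_path="sanity_run.log"):
    """Run the script with the current interpreter and save combined output."""
    with open(log_path, "w") as log:
        completed = subprocess.run(
            [sys.executable, script], stdout=log, stderr=subprocess.STDOUT
        )
    return completed.returncode


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Cannot run yet:", "; ".join(problems))
    else:
        print("exit code:", run_sanity())
```

Using sys.executable keeps the run inside the active conda environment instead of whichever python happens to be first on PATH.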

Failure Logs [if any]

ec2_explain.log master_ec2_c5_sanity_fail.log

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

jeremiahschung commented, Dec 4, 2020 (1 reaction)

The recommendation from the pytorch release eng team is to use one of the two options:

harshbafna commented, Dec 9, 2020 (0 reactions)

Issues related to install_dependency on CPU and different CUDA environments have been fixed as part of #836.

Closing the issue.
