Shared Memory client fails for batch size != 1

Requests with batch size 1 and correct input_byte_size calculation work correctly.

Increasing the batch size to two and multiplying input_byte_size by two leads to the following exception (modified version of simple_shm_client.cc):

int batch_size = 2;
options->SetBatchSize(batch_size);
size_t input_byte_size = 608 * 608 * 3 * sizeof(float) * batch_size;

failed setting shared memory input: [ 0] INVALID_ARG - The input ‘000_net’ has shared memory of size 8871936 bytes while the expected size is 4435968 bytes
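
As a quick check of the numbers in that message, here is a minimal sketch of the size arithmetic. `InputByteSize` is a hypothetical helper written for this illustration, not a function from the client library, and the sketch assumes a 4-byte `float`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the client library): bytes needed for
// `batch_size` FP32 tensors of 608 x 608 x 3 elements each.
constexpr std::size_t InputByteSize(std::size_t batch_size) {
  return 608 * 608 * 3 * sizeof(float) * batch_size;
}

// Assumption: float is 4 bytes on the target platform.
static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

// Batch size 1 gives the size the check reports as "expected".
static_assert(InputByteSize(1) == 4435968, "single-item size");

// Batch size 2 gives the size of the shared memory region that was rejected.
static_assert(InputByteSize(2) == 8871936, "full-batch size");
```

The rejected size is exactly twice the expected one, which is consistent with the check comparing against a single batch item.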

So the expected size in this check does not take the batch size into account. Setting input_byte_size to match that expected size then fails a different sanity check:

int batch_size = 2;
options->SetBatchSize(batch_size);
size_t input_byte_size = 608 * 608 * 3 * sizeof(float);

error: unable to run model: [inference:0 6] INVALID_ARG - unexpected shared-memory size 4435968 for input ‘000_net’, expecting 8871936 for model ‘yolov3’

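So this second check does take the batch size into account when computing the expected byte size. The two checks therefore disagree for any batch size other than 1, and no value of input_byte_size satisfies both.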
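
The conflict between the two checks can be modeled with a small sketch. `FirstCheck`, `SecondCheck`, and `PassesBothChecks` are hypothetical stand-ins inferred only from the two error messages above; they are not the actual validation code of the client or server.

```cpp
#include <cassert>
#include <cstddef>

// Size of one 608 x 608 x 3 FP32 input, as in the snippets above.
constexpr std::size_t kPerItemBytes = 608 * 608 * 3 * sizeof(float);

// Hypothetical model of the first check: it ignores the batch size and
// expects the region to hold exactly one item.
bool FirstCheck(std::size_t shm_bytes, std::size_t /*batch_size*/) {
  return shm_bytes == kPerItemBytes;
}

// Hypothetical model of the second check: it expects the region to hold
// the full batch.
bool SecondCheck(std::size_t shm_bytes, std::size_t batch_size) {
  return shm_bytes == kPerItemBytes * batch_size;
}

// A request succeeds only if both checks accept the same region size.
bool PassesBothChecks(std::size_t shm_bytes, std::size_t batch_size) {
  return FirstCheck(shm_bytes, batch_size) &&
         SecondCheck(shm_bytes, batch_size);
}
```

Under this model, `PassesBothChecks(kPerItemBytes, 1)` is true, while for batch size 2 neither `kPerItemBytes` nor `2 * kPerItemBytes` passes, matching the behavior reported in the issue.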

Top GitHub Comments

2 reactions

CoderHam commented, Aug 12, 2019

The fix to this is simple. I shall test it and create a new PR for it.

1 reaction

philipp-schmidt commented, Aug 13, 2019

Thanks, very much appreciated, @CoderHam!
