Java index out of bounds exception when running many requests through server
Context
Trying to load test a TorchServe model to gauge the performance of a custom handler.
- torchserve version: 0.2.0
- torch version: 1.6.0
- java version: openjdk 11.0.8
- Operating System and version: Debian via the python 3.7-buster image.
Your Environment
- Are you planning to deploy it using docker container? [yes/no]: yes
- Is it a CPU or GPU environment?: CPU
- Using a default/custom handler? custom
- What kind of model is it e.g. vision, text, audio?: feed forward for custom input.
- Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.? from model store
- Provide config.properties, logs [ts.log] and parameters used for model registration/update APIs: number_of_netty_threads=32
Expected Behavior
Expected TorchServe not to throw this error, or to learn which properties of the environment I could change to address it. It only seems to happen under medium load.
Current Behavior
With a load of ~5 rps, and with varying batch sizes, CPU counts, and memory allocations, the server throws an error on ~4%+ of requests.
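A rough Java sketch of the kind of traffic involved (assuming the default inference port and a hypothetical model name my_model; this is not the actual test harness):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Rough sketch of ~5 requests per second against the TorchServe inference API.
// Endpoint, model name, and payload are placeholders, not the real load test.
public class LoadTest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/predictions/my_model"))
                .timeout(Duration.ofSeconds(10))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"input\": [1.0, 2.0, 3.0]}"))
                .build();

        for (int i = 0; i < 300; i++) {                     // ~1 minute of traffic
            client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                  .thenAccept(r -> {
                      if (r.statusCode() != 200) {
                          System.err.println("Failed request: " + r.statusCode());
                      }
                  });
            Thread.sleep(200);                              // ~5 requests per second
        }
        Thread.sleep(5_000);                                // let in-flight requests finish
    }
}
```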
Failure Logs [if any]
2020-10-17 00:16:41,887 [INFO ] epollEventLoopGroup-5-3 org.pytorch.serve.wlm.WorkerThread - 9002 Worker disconnected. WORKER_MODEL_LOADED
2020-10-17 00:16:41,887 [ERROR] epollEventLoopGroup-5-3 org.pytorch.serve.wlm.WorkerThread - Unknown exception
io.netty.handler.codec.DecoderException: java.lang.IndexOutOfBoundsException: readerIndex(1021) + length(4) exceeds writerIndex(1024): PooledUnsafeDirectByteBuf(ridx: 1021, widx: 1024, cap: 1024)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:471)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:404)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:371)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.IndexOutOfBoundsException: readerIndex(1021) + length(4) exceeds writerIndex(1024): PooledUnsafeDirectByteBuf(ridx: 1021, widx: 1024, cap: 1024)
    at io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1477)
    at io.netty.buffer.AbstractByteBuf.readInt(AbstractByteBuf.java:810)
    at org.pytorch.serve.util.codec.ModelResponseDecoder.decode(ModelResponseDecoder.java:56)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
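The readInt() at ModelResponseDecoder.decode tries to read a 4-byte length field, but only 3 bytes (1024 - 1021) are readable, i.e. the buffer ends mid-frame. As a point of comparison only, not the actual TorchServe code, a minimal sketch of the standard ByteToMessageDecoder pattern that tolerates partial buffers looks like this:

```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

import java.util.List;

// Minimal sketch of a length-prefixed decoder that tolerates fragmentation.
// Names are illustrative; this is not the actual ModelResponseDecoder.
public class LengthPrefixedDecoder extends ByteToMessageDecoder {
    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        // Not enough bytes for the 4-byte length prefix yet: wait for the next read.
        if (in.readableBytes() < 4) {
            return;
        }
        in.markReaderIndex();
        int length = in.readInt();
        // Body not fully buffered yet: rewind and wait instead of reading past writerIndex.
        if (in.readableBytes() < length) {
            in.resetReaderIndex();
            return;
        }
        out.add(in.readRetainedSlice(length));
    }
}
```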
Thank you in advance for any help you can provide!

Comments
@harshbafna I was debugging this further and observed the following: the Python backend sends the complete response for all the batched requests, but when the frontend server receives it, it is fragmented. For example, in the scenario below, for a total response size of 500777, the message decoder gets the fragments
I suspect the issue is caused by incorrect decoding of these fragments. What are your thoughts on this? Shouldn't the reassembly of these fragments be done at a lower level, and only then be decoded at the application level?
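If the goal is to reassemble at a lower level, Netty ships a frame-level handler for exactly this situation: LengthFieldBasedFrameDecoder buffers fragments into whole frames before the application decoder runs. A minimal sketch, assuming a 4-byte big-endian length prefix (which may not match TorchServe's actual backend wire protocol):

```java
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.LengthFieldBasedFrameDecoder;

// Sketch only: reassemble fragmented TCP reads into whole frames before the
// application-level decoder runs. Assumes a 4-byte big-endian length prefix,
// which may not match TorchServe's real backend protocol.
public class FrameAwareInitializer extends ChannelInitializer<SocketChannel> {
    private static final int MAX_FRAME_BYTES = 64 * 1024 * 1024;

    @Override
    protected void initChannel(SocketChannel ch) {
        ChannelPipeline p = ch.pipeline();
        // lengthFieldOffset = 0, lengthFieldLength = 4:
        // buffers bytes until a complete length-prefixed frame has arrived.
        p.addLast(new LengthFieldBasedFrameDecoder(MAX_FRAME_BYTES, 0, 4));
        // Application decoder (here the hypothetical one sketched earlier)
        // now always sees whole frames.
        p.addLast(new LengthPrefixedDecoder());
    }
}
```

With initialBytesToStrip left at its default of 0, the downstream decoder still sees the length prefix, so its own readInt() remains valid.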
@harshbafna
Input to test:
This should return the following response:
config.properties:
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store