
Event based scaling stalls with RetriesExhaustedException

See original GitHub issue

While trying event-based scaling with the samples, scaling stalls with a RetriesExhaustedException after the first 10 minutes of segment scaling. The scale configuration was tried with 50 writers, 5 readers, a minimum of 30 segments, and a scale factor of 4.
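
For reference, the scale configuration described above maps onto Pravega's stream-creation API roughly as follows. This is a minimal sketch, not the reporter's actual harness: the controller URI, the target event rate of 1000 events/s, and the scope/stream names (borrowed from the logs below) are assumptions.

import io.pravega.client.admin.StreamManager;
import io.pravega.client.stream.ScalingPolicy;
import io.pravega.client.stream.StreamConfiguration;
import java.net.URI;

public class CreateScaledStream {
    public static void main(String[] args) {
        // Controller endpoint is an assumption for this sketch.
        URI controller = URI.create("tcp://pravega-controller:9090");

        try (StreamManager streamManager = StreamManager.create(controller)) {
            streamManager.createScope("puncherScope3");

            // byEventRate(targetRate, scaleFactor, minNumSegments):
            // start with 30 segments and split a hot segment into 4 when
            // its per-segment event rate exceeds the target. The target
            // rate (1000 events/s) is an assumption; the issue does not
            // state it.
            StreamConfiguration config = StreamConfiguration.builder()
                    .scalingPolicy(ScalingPolicy.byEventRate(1000, 4, 30))
                    .build();

            streamManager.createStream("puncherScope3", "puncherStream3", config);
        }
    }
}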

Pravega details:

  • Pravega version: 0.5.0-2252.c180711
  • Zookeeper Operator: pravega/zookeeper-operator:0.2.1
  • Pravega Operator: pravega/pravega-operator:0.3.2

root@pravega-benchmark:/pravega-puncher/scripts# ./runFatEventScaler | grep ===
18:05:07.193 [pool-2-thread-1] INFO io.pravega.puncher.streams.scaling.ScalingCase - ============ segments: 30
18:17:07.166 [pool-2-thread-1] INFO io.pravega.puncher.streams.scaling.ScalingCase - ============ segments: 53
Exception in thread "Thread-51" java.util.concurrent.CompletionException: io.pravega.common.util.RetriesExhaustedException: java.util.concurrent.CompletionException: io.pravega.shared.protocol.netty.ConnectionFailedException
        at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)
        at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
        at io.pravega.client.segment.impl.SegmentInputStreamImpl.handleRequest(SegmentInputStreamImpl.java:137)
        at io.pravega.client.segment.impl.SegmentInputStreamImpl.read(SegmentInputStreamImpl.java:122)
        at io.pravega.client.segment.impl.EventSegmentReaderImpl.readEvent(EventSegmentReaderImpl.java:75)
        at io.pravega.client.segment.impl.EventSegmentReaderImpl.read(EventSegmentReaderImpl.java:62)
        at io.pravega.client.stream.impl.EventStreamReaderImpl.readNextEventInternal(EventStreamReaderImpl.java:117)
        at io.pravega.client.stream.impl.EventStreamReaderImpl.readNextEvent(EventStreamReaderImpl.java:90)
        at io.pravega.puncher.streams.scaling.ScaledEventReader.run(ScaledEventReader.java:41)
        at java.lang.Thread.run(Thread.java:748)
Caused by: io.pravega.common.util.RetriesExhaustedException: java.util.concurrent.CompletionException: io.pravega.shared.protocol.netty.ConnectionFailedException
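
The failure surfaces in the reader thread: readNextEvent() ultimately joins a future that has exhausted its connection retries. For orientation, a minimal reader loop around the same client API might look like the sketch below; the reader group name, reader id, and serializer choice are assumptions, and the reporter's ScaledEventReader itself is not public.

import io.pravega.client.ClientConfig;
import io.pravega.client.EventStreamClientFactory;
import io.pravega.client.admin.ReaderGroupManager;
import io.pravega.client.stream.EventRead;
import io.pravega.client.stream.EventStreamReader;
import io.pravega.client.stream.ReaderConfig;
import io.pravega.client.stream.ReaderGroupConfig;
import io.pravega.client.stream.ReinitializationRequiredException;
import io.pravega.client.stream.impl.UTF8StringSerializer;
import java.net.URI;

public class ScaledEventReaderSketch {
    public static void main(String[] args) {
        ClientConfig clientConfig = ClientConfig.builder()
                .controllerURI(URI.create("tcp://pravega-controller:9090"))
                .build();

        // Reader group and reader id are hypothetical names.
        try (ReaderGroupManager rgm =
                     ReaderGroupManager.withScope("puncherScope3", clientConfig)) {
            rgm.createReaderGroup("puncherReaders", ReaderGroupConfig.builder()
                    .stream("puncherScope3/puncherStream3")
                    .build());
        }

        try (EventStreamClientFactory factory =
                     EventStreamClientFactory.withScope("puncherScope3", clientConfig);
             EventStreamReader<String> reader = factory.createReader(
                     "reader-1", "puncherReaders",
                     new UTF8StringSerializer(), ReaderConfig.builder().build())) {
            while (true) {
                try {
                    // This is the call where the stack trace above fails:
                    // the segment input stream exhausts its connection
                    // retries and a RetriesExhaustedException propagates
                    // out wrapped in a CompletionException.
                    EventRead<String> event = reader.readNextEvent(2000);
                    if (event.getEvent() != null) {
                        // process event.getEvent()
                    }
                } catch (ReinitializationRequiredException e) {
                    // The reader group was reset; recreate the reader.
                    break;
                }
            }
        }
    }
}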

The segment stores repeatedly show the following exceptions on these streams (once a stream scales, its pre-scale segments are sealed, so appends still routed to them are rejected):

2019-05-24 18:39:46,006 253008127 [core-9] INFO  i.p.s.s.h.handler.AppendProcessor - Segment 'puncherScope3/puncherStream3/6.#epoch.0' is sealed and cddc038c-7921-4ba5-a853-3e73340edb33 cannot perform operation 'appending data'.
2019-05-24 18:39:46,105 253008226 [epollEventLoopGroup-11-3] ERROR i.p.s.s.h.h.ServerConnectionInboundHandler - Caught exception on connection:
io.netty.handler.codec.DecoderException: io.pravega.shared.protocol.netty.InvalidMessageException: AppendBlockEnd without AppendBlock.
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:98)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
        at io.pravega.shared.protocol.netty.ExceptionLoggingHandler.channelRead(ExceptionLoggingHandler.java:37)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:421)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:321)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)
Caused by: io.pravega.shared.protocol.netty.InvalidMessageException: AppendBlockEnd without AppendBlock.
        at io.pravega.shared.protocol.netty.AppendDecoder.processCommand(AppendDecoder.java:102)
        at io.pravega.shared.protocol.netty.AppendDecoder.decode(AppendDecoder.java:55)
        at io.pravega.shared.protocol.netty.AppendDecoder.decode(AppendDecoder.java:36)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
        ... 28 common frames omitted

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 13 (8 by maintainers)

Top GitHub Comments

1 reaction
RaulGracia commented, May 25, 2019

@sumit-bm note that the AppendBlockEnd without AppendBlock issue has been fixed in #3820 and cherry-picked to r0.5 (the version you tested does not incorporate this change). Please reproduce the experiment with the latest version of 0.5 or master to verify whether the issue is still present.

0 reactions
andreipaduroiu commented, Jun 10, 2019

I'm closing this since we were unable to reproduce it. Please create a new issue and attach new logs if this happens again.


Top Results From Across the Web

KEDA | Kubernetes Event-driven Autoscaling
KEDA is a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number...

Event Driven Autoscaling | giffgaff.io
KEDA is a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number...

KEDA: Kubernetes Event-Driven Autoscaling - YouTube
KEDA (Kubernetes Event-Driven Autoscaling) might be the solution for all (horizontal) scaling needs. Kubernetes Horizontal Pod Autoscaler...

Intro to Kubernetes-based event-driven autoscaling (KEDA)
In Kubernetes, this is equivalent to scaling a deployment to add more pods. You can do it manually, but the Horizontal Pod Autoscaler...

Horizontal Pod Autoscaling | Kubernetes
Any HPA target can be scaled based on the resource usage of the pods in the scaling target. When defining the pod specification...
