Scaling stopped on dynamic stream
Problem description
On a stream with dynamic scaling, the scaling stopped after some time:
controller.stdout.namor.stopped_scaling.log
The controller logs have repeating errors of the following kind:
2018-06-04 18:33:12,893 21334802 [controllerpool-55] WARN i.p.c.s.e.r.ScaleOperationTask - processing scale request for hulk/smallScaleYoung segments [117] failed {}
io.pravega.controller.store.stream.StoreException$OperationNotAllowedException: Stream: smallScaleYoung State: SCALING
at io.pravega.controller.store.stream.StoreException.create(StoreException.java:102)
at io.pravega.controller.store.stream.StoreException.create(StoreException.java:70)
at io.pravega.controller.store.stream.PersistentStreamBase.lambda$updateState$30(PersistentStreamBase.java:269)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at io.pravega.controller.store.stream.ZKStoreHelper.lambda$null$8(ZKStoreHelper.java:156)
at io.pravega.controller.store.stream.ZKStoreHelper.lambda$callback$24(ZKStoreHelper.java:326)
at org.apache.curator.framework.imps.Backgrounding$1$1.run(Backgrounding.java:158)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2018-06-04 18:33:12,903 21334812 [epollEventLoopGroup-2-12] INFO i.p.c.t.Stream.StreamMetadataTasks - event posted successfully
2018-06-04 18:33:12,903 21334812 [controllerpool-45] WARN i.p.c.e.i.ConcurrentEventProcessor - ConcurrentEventProcessor Processing failed java.util.concurrent.CompletionException
2018-06-04 18:33:12,903 21334812 [controllerpool-45] ERROR i.p.c.e.i.ConcurrentEventProcessor - ConcurrentEventProcessor Processing failed, exiting {}
java.util.concurrent.CompletionException: io.pravega.controller.store.stream.StoreException$OperationNotAllowedException: Stream: smallScaleYoung State: SCALING
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at io.pravega.controller.eventProcessor.impl.SerializedRequestHandler.lambda$run$1(SerializedRequestHandler.java:94)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at io.pravega.controller.server.eventProcessor.requesthandlers.StreamRequestHandler.lambda$null$4(StreamRequestHandler.java:129)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at io.pravega.controller.task.Stream.StreamMetadataTasks.lambda$writeEvent$51(StreamMetadataTasks.java:545)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at io.pravega.client.segment.impl.SegmentOutputStreamImpl$ResponseProcessor.ackUpTo(SegmentOutputStreamImpl.java:385)
at io.pravega.client.segment.impl.SegmentOutputStreamImpl$ResponseProcessor.dataAppended(SegmentOutputStreamImpl.java:345)
at io.pravega.shared.protocol.netty.WireCommands$DataAppended.process(WireCommands.java:631)
at io.pravega.shared.protocol.netty.ReplyProcessor.process(ReplyProcessor.java:20)
at io.pravega.client.netty.impl.ClientConnectionInboundHandler.channelRead(ClientConnectionInboundHandler.java:98)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.pravega.shared.protocol.netty.ExceptionLoggingHandler.channelRead(ExceptionLoggingHandler.java:37)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:797)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:404)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.pravega.controller.store.stream.StoreException$OperationNotAllowedException: Stream: smallScaleYoung State: SCALING
at io.pravega.controller.store.stream.StoreException.create(StoreException.java:102)
at io.pravega.controller.store.stream.StoreException.create(StoreException.java:70)
at io.pravega.controller.store.stream.PersistentStreamBase.lambda$updateState$30(PersistentStreamBase.java:269)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at io.pravega.controller.store.stream.ZKStoreHelper.lambda$null$8(ZKStoreHelper.java:156)
at io.pravega.controller.store.stream.ZKStoreHelper.lambda$callback$24(ZKStoreHelper.java:326)
at org.apache.curator.framework.imps.Backgrounding$1$1.run(Backgrounding.java:158)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 common frames omitted
Problem location
Controller
Suggestions for an improvement
Dynamic scaling should work
Issue Analytics
- State:
- Created 5 years ago
- Comments:9 (9 by maintainers)
Top GitHub Comments
I fixed that 2 days ago. Please update to latest master.
These are the logs from another failure. Scaling stopped on stream smallScaleSlow (the last scale event happened at about 15:50 log time, so the next one should have followed at around 16:00). This was a two-controller setup.
One controller:
- pravega-pravega_controller–917328ec-94e0-459a-bd7d-41204bbc427c.txt
Another controller:
- pravega-pravega_controller-p1-fa51f0e4-8510-4dc4-8abe-d41563f0c703.txt
- pravega-pravega_controller-p2-fa51f0e4-8510-4dc4-8abe-d41563f0c703.txt
- pravega-pravega_controller-p3-fa51f0e4-8510-4dc4-8abe-d41563f0c703.txt
- pravega-pravega_controller-p4-fa51f0e4-8510-4dc4-8abe-d41563f0c703.txt
And there were three segment stores:
- pravega-pravega_segmentstore–861f7c34-f887-4aac-b12f-e3e576db9b02.txt
- pravega-pravega_segmentstore–ba07b3ff-b20f-43f7-b6da-f6edca9e3402.txt
- pravega-pravega_segmentstore–e9ceb1fd-969f-42f8-8a3f-2588e6b6def9.txt