Upgrade from 2.0.10 to 2.0.11 puts the Cloudflow operator in a crash loop
Describe the bug
When upgrading from 2.0.10 to 2.0.11 following the official docs, the Cloudflow operator enters a crash loop, always reporting:
2020-10-20 10:34:34,771 INFO [Materializer] - [action] Element: Executed update action for CloudflowApplication/tom-application-one/tom-application-one
2020-10-20 10:34:34,771 INFO [StatusChangeEvent] - [Status changes] Detected Pod tom-application-one-archive.tom-application-one-streamlet-bar-7wrrn6: (Pod tom-application-one-streamlet-bar-7wrrn6 ADDED).
2020-10-20 10:34:34,772 INFO [StatusChangeEvent] - [Status changes] Detected StatusChangeEvent for tom-application-one-archive.tom-application-one-streamlet-bar-7wrrn6: (Pod tom-application-one-streamlet-bar-7wrrn6 ADDED).
2020-10-20 10:34:34,775 INFO [StatusChangeEvent] - [Status changes] Handling StatusChange for tom-application-one: (Pod tom-application-one-streamlet-foo-759f4zmwl ADDED).
2020-10-20 10:34:34,775 INFO [StatusChangeEvent] - [Status changes] app: tom-application-one status of streamlet streamlet-foo changed: (Pod tom-application-one-streamlet-foo-759f4zmwl ADDED)
2020-10-20 10:34:35,368 ERROR [Materializer] - [action] Upstream failed.
cloudflow.operator.action.ActionException: Action provide failed: String: 3: Expecting end of input or a comma, got '=' (if you intended '=' to be part of a key or string value, try enclosing the key or value in double quotes, or you may be able to rename the file .properties rather than .conf)
at cloudflow.operator.action.SkuberActionExecutor$$anonfun$execute$2.applyOrElse(SkuberActionExecutor.scala:50)
at cloudflow.operator.action.SkuberActionExecutor$$anonfun$execute$2.applyOrElse(SkuberActionExecutor.scala:48)
at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:417)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:56)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:93)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:93)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:48)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: com.typesafe.config.ConfigException$Parse: String: 3: Expecting end of input or a comma, got '=' (if you intended '=' to be part of a key or string value, try enclosing the key or value in double quotes, or you may be able to rename the file .properties rather than .conf)
at com.typesafe.config.impl.ConfigDocumentParser$ParseContext.parseError(ConfigDocumentParser.java:201)
at com.typesafe.config.impl.ConfigDocumentParser$ParseContext.parseError(ConfigDocumentParser.java:197)
at com.typesafe.config.impl.ConfigDocumentParser$ParseContext.parseObject(ConfigDocumentParser.java:535)
at com.typesafe.config.impl.ConfigDocumentParser$ParseContext.parse(ConfigDocumentParser.java:648)
at com.typesafe.config.impl.ConfigDocumentParser.parse(ConfigDocumentParser.java:14)
at com.typesafe.config.impl.Parseable.rawParseValue(Parseable.java:262)
at com.typesafe.config.impl.Parseable.rawParseValue(Parseable.java:250)
at com.typesafe.config.impl.Parseable.parseValue(Parseable.java:180)
at com.typesafe.config.impl.Parseable.parseValue(Parseable.java:174)
at com.typesafe.config.impl.Parseable.parse(Parseable.java:301)
at com.typesafe.config.ConfigFactory.parseString(ConfigFactory.java:1102)
at com.typesafe.config.ConfigFactory.parseString(ConfigFactory.java:1112)
at cloudflow.operator.event.ConfigInputChangeEvent$.getConfigFromSecret(ConfigInputChangeEvent.scala:86)
at cloudflow.operator.action.TopicActions$.createActionFromKafkaConfigSecret(TopicActions.scala:135)
at cloudflow.operator.action.TopicActions$.$anonfun$action$2(TopicActions.scala:102)
at cloudflow.operator.action.ProvidedAction.$anonfun$executeWithRetry$1(Action.scala:404)
at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:307)
...
[ERROR] [10/20/2020 10:34:35.969] [default-akka.actor.default-dispatcher-8] [akka.actor.ActorSystemImpl(default)] The configuration input stream failed, terminating. (akka.stream.AbruptStageTerminationException: GraphStage [akka.stream.impl.fusing.GraphStages$IgnoreSink$$anon$10@247947d7] terminated abruptly, caused by for example materializer or actor system termination.)
[ERROR] [10/20/2020 10:34:35.967] [default-akka.actor.default-dispatcher-5] [akka.actor.ActorSystemImpl(default)] The config updates stream failed, terminating. (akka.stream.AbruptStageTerminationException: GraphStage [akka.stream.impl.fusing.GraphStages$IgnoreSink$$anon$10@59a35aa7] terminated abruptly, caused by for example materializer or actor system termination.)
[ERROR] [10/20/2020 10:34:35.974] [default-akka.actor.default-dispatcher-6] [akka.actor.ActorSystemImpl(default)] The status changes stream failed, terminating. (akka.stream.AbruptStageTerminationException: GraphStage [akka.stream.impl.fusing.GraphStages$IgnoreSink$$anon$10@2d4fa106] terminated abruptly, caused by for example materializer or actor system termination.)
The streamlets have not been re-deployed and are still running from the previous 2.0.10 deployment.
To Reproduce
Follow the official upgrade instructions, copy-pasting:
helm upgrade cloudflow cloudflow-helm-charts/cloudflow \
--namespace cloudflow \
--reuse-values \
--version="2.0.11"
--set kafkaClusters.default.bootstrapServers=cloudflow-strimzi-kafka-bootstrap.cloudflow:9092
Expected behavior
The 2.0.11 operator starts up healthy.
Additional context
It appears the official upgrade docs are missing a backslash before the last line. In my opinion, it should look like this:
helm upgrade cloudflow cloudflow-helm-charts/cloudflow \
--namespace cloudflow \
--reuse-values \
--version="2.0.11" \
--set kafkaClusters.default.bootstrapServers=cloudflow-strimzi-kafka-bootstrap.cloudflow:9092
But that is probably not the core of the issue. Rolling back to 2.0.10 (which worked) and then re-deploying 2.0.11 with the backslash added did not bring any improvement.
Likewise, rolling back to 2.0.10 and then retrying the move to 2.0.11 with double quotes around the config value still leaves the operator crashing.
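For reference, the stack trace above shows Typesafe Config rejecting the contents of the Kafka config secret when the operator parses it as HOCON (ConfigInputChangeEvent.getConfigFromSecret calls ConfigFactory.parseString). The minimal Scala sketch below reproduces the same class of failure in isolation; the secret contents shown are purely hypothetical, the point is only that an unquoted HOCON value containing '=' aborts parsing with exactly this kind of ConfigException.Parse, while quoting the whole value lets it parse:

import com.typesafe.config.{ConfigException, ConfigFactory}

object ParseErrorDemo extends App {
  // Hypothetical secret payload: line 3 holds a value that itself contains
  // '=' and is not wrapped in double quotes. HOCON stops reading the value
  // at the second '=' and fails, just like the operator log above
  // ("String: 3: Expecting end of input or a comma, got '='").
  val unquoted =
    """bootstrap-servers = "cloudflow-strimzi-kafka-bootstrap.cloudflow:9092"
      |partitions = 3
      |connection-config = bootstrap.servers = cloudflow-strimzi-kafka-bootstrap.cloudflow:9092
      |""".stripMargin

  try ConfigFactory.parseString(unquoted)
  catch { case e: ConfigException.Parse => println(e.getMessage) }

  // Quoting the whole value turns it into a single HOCON string and parses fine.
  val quoted =
    """connection-config = "bootstrap.servers = cloudflow-strimzi-kafka-bootstrap.cloudflow:9092"
      |""".stripMargin
  println(ConfigFactory.parseString(quoted).getString("connection-config"))
}

That said, as noted above, quoting the value when re-deploying 2.0.11 did not stop the operator from crashing, so the actual fix presumably lies in how the 2.0.11 chart/plugin writes that secret rather than in the helm command line alone.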
Top GitHub Comments
That sounds like a bug. I’ll have a quick look.
We’ve updated the docs (available now in dev: https://cloudflow.io/docs/current/administration/upgrading-cloudflow.html), which fixes the main issue; the kubectl-cloudflow plugin is also fixed in master.