
Bookkeeper containers keep restarting after deploying Pravega with 3 ZooKeepers in a Docker Swarm setup

See original GitHub issue

Deploy the pravega:0.5.0-2260.b477eec release in a Docker Swarm setup with 3 ZooKeeper containers:

1 HDFS
3 ZooKeepers
1 SegmentStore
1 Controller
4 Bookkeepers

First, ZooKeeper was deployed with 3 replicas. After all three ZooKeeper containers came up, Pravega was deployed, but all 4 bookkeeper containers kept restarting and never came up.

The same issue was observed in both a single-node (1 master) setup and a multi-node (1 master and 2 workers) setup.

The following exception appears while the containers restart (logs from the exited bookie containers):

2019-06-04 12:10:18,670 - INFO  - [main-EventThread:ZooKeeperWatcherBase@130] - ZooKeeper client is connected now.
2019-06-04 12:10:18,686 - ERROR - [main:ZKRegistrationManager@374] - BookKeeper metadata doesn't exist in zookeeper. Has the cluster been initialized? Try running bin/bookkeeper shell metaformat
2019-06-04 12:10:18,687 - INFO  - [main:BookieNettyServer@396] - Shutting down BookieNettyServer
2019-06-04 12:10:18,728 - ERROR - [main:Main@221] - Failed to build bookie server
org.apache.bookkeeper.bookie.BookieException$MetadataStoreException: Failed to get cluster instance id
        at org.apache.bookkeeper.discover.ZKRegistrationManager.getClusterInstanceId(ZKRegistrationManager.java:387)
        at org.apache.bookkeeper.bookie.Bookie.checkEnvironmentWithStorageExpansion(Bookie.java:413)
        at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:257)
        at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:641)
        at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:131)
        at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:100)
        at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:43)
        at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:299)
        at org.apache.bookkeeper.server.Main.doMain(Main.java:219)
        at org.apache.bookkeeper.server.Main.main(Main.java:201)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for BookKeeper metadata
        at org.apache.bookkeeper.discover.ZKRegistrationManager.getClusterInstanceId(ZKRegistrationManager.java:377)
        ... 9 more

NOTE: after scaling ZooKeeper down from 3 replicas to 2, all 4 bookkeepers came up.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
Ranjan-Padhi commented, Jun 28, 2019

Thanks @shrids,

I tried the ZOO_SERVERS and ZOO_MY_ID configuration in the ZooKeeper yml file and it works properly.

After deploying ZooKeeper with the new yml file below, the Pravega bookies deployed successfully without restarting, and IO is also running successfully.

Zookeeper.yml file

version: '3.1'

services:
  zookeeper1:
    image: zookeeper:3.5.4-beta
    hostname: zookeeper1
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=zookeeper3:2888:3888;2181

  zookeeper2:
    image: zookeeper:3.5.4-beta
    hostname: zookeeper2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zookeeper1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zookeeper3:2888:3888;2181

  zookeeper3:
    image: zookeeper:3.5.4-beta
    hostname: zookeeper3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zookeeper1:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181
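The pattern in the file above is that every node lists the full ensemble in ZOO_SERVERS, but substitutes 0.0.0.0 for its own hostname so that it binds its quorum (2888) and election (3888) ports locally. As an illustrative sketch only (the helper below is hypothetical, not part of ZooKeeper or Pravega), the three per-node values can be generated like this:

```python
# Sketch: build the per-node ZOO_SERVERS value for a replicated
# ZooKeeper ensemble. The node with id my_id refers to itself as
# 0.0.0.0; all peers are addressed by their service hostnames.
def zoo_servers(hostnames, my_id, client_port=2181):
    parts = []
    for i, host in enumerate(hostnames, start=1):
        addr = "0.0.0.0" if i == my_id else host
        parts.append(f"server.{i}={addr}:2888:3888;{client_port}")
    return " ".join(parts)

hosts = ["zookeeper1", "zookeeper2", "zookeeper3"]
for my_id in range(1, len(hosts) + 1):
    print(f"ZOO_MY_ID: {my_id}")
    print(f"ZOO_SERVERS: {zoo_servers(hosts, my_id)}")
```

Running this reproduces the three environment blocks in the yml file above, one per service.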

0 reactions
shrids commented, Jun 7, 2019

Please use https://hub.docker.com/_/zookeeper and https://zookeeper.apache.org/doc/r3.5.4-beta/zookeeperStarted.html#sc_RunningReplicatedZooKeeper as references for deploying a ZooKeeper cluster in replicated mode.

(Just setting replicas: 3 causes all the ZooKeeper instances to come up as separate standalone instances, and a request to zookeeper:2181 could be serviced by any one of them.)
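Since each standalone replica behind a single swarm service name holds its own independent state, clients are normally pointed at an explicit list of ensemble members instead. As a hypothetical sketch (this parser is for illustration, not taken from any ZooKeeper client library), a ZooKeeper connection string naming all three nodes is split into (host, port) pairs like this:

```python
# Sketch: a ZooKeeper connection string lists every ensemble member
# so the client can fail over between them; a client library splits
# it into (host, port) pairs roughly like this.
def parse_connect_string(connect, default_port=2181):
    pairs = []
    for member in connect.split(","):
        host, _, port = member.partition(":")
        pairs.append((host, int(port) if port else default_port))
    return pairs

connect = "zookeeper1:2181,zookeeper2:2182,zookeeper3:2183"
print(parse_connect_string(connect))
```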


Top Results From Across the Web

Deployment in Docker Swarm - Exploring Pravega
This runs a single node HDFS container and single node ZooKeeper inside the pravega_default overlay network, and adds them to the pravega stack...

zookeeper - Official Image | Docker Hub
Since the Zookeeper "fails fast" it's better to always restart it. Connect to Zookeeper from an application in another Docker container.

Deploying multiple zookeepers in docker swarm
Solved. version: '3.2' services: zoo1: image: zookeeper restart: always hostname: zoo1 ports: - 2181:2181 environment: ZOO_MY_ID: 1 ...

Zookeeper / Exhibitor cluster nodes keep restarting - Super User
I have successfully deployed 3 Zookeeper / Exhibitor nodes in Docker containers and they form a cluster. I am starting them via
