[Bug][Server] zookeeper multi directories, tasks cannot be assigned

Describe the bug When the ZooKeeper root is configured as a multi-level directory, tasks cannot be assigned to a worker.

To Reproduce Steps to reproduce the behavior, for example:

  1. Edit the configuration file zookeeper.properties and set zookeeper.dolphinscheduler.root=/path/to/dolphinscheduler
  2. Start the master and worker as usual
  3. Create a workflow definition, add a shell node, and have it run echo “hello world”
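To see what the root change in step 1 does to registry paths, the sketch below counts the pieces that `String.split` produces for a worker node path. It assumes workers register under `<root>/nodes/worker/<group>/<host:port>`; the class name, helper names, and the address `192.168.1.10:1234` are hypothetical and used only for illustration.

```java
public class RootDepthDemo {

    // Hypothetical helper: builds a worker registration path under the given root,
    // assuming the layout <root>/nodes/worker/<group>/<host:port>.
    public static String workerPath(String root, String group, String address) {
        return root + "/nodes/worker/" + group + "/" + address;
    }

    // Number of pieces String.split returns for the path.
    // The leading "/" produces an empty first piece, which is counted too.
    public static int segmentCount(String path) {
        return path.split("/").length;
    }

    public static void main(String[] args) {
        String defaultPath = workerPath("/dolphinscheduler", "default", "192.168.1.10:1234");
        String customPath = workerPath("/path/to/dolphinscheduler", "default", "192.168.1.10:1234");

        System.out.println(defaultPath + " -> " + segmentCount(defaultPath)); // 6 pieces
        System.out.println(customPath + " -> " + segmentCount(customPath));   // 8 pieces
    }
}
```

Each extra directory level in the root adds one piece, so any code that expects a fixed number of pieces only works for a single-level root.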

Expected behavior 1. The console shows the registered master and worker as normal. 2. After the task is started, the master reports the error: “fail to execute : xxx due to no suitable worker, current task need to yyy worker group execute”

Which version of Dolphin Scheduler: -[1.3.1.release]

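Additional context

  1. ExecutorDispatcher#dispatch calls hostManager.select, which looks through the registered workers and uses the group to find a worker that is allowed to execute the task.
  2. ZookeeperNodeManager$WorkerGroupNodeListener#dataChanged listens for worker changes.
  3. The faulty code: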

    if (event.getType() == TreeCacheEvent.Type.NODE_ADDED) {
        logger.info("worker group node : {} added.", path);
        String group = parseGroup(path); // parsing the group fails here, so syncWorkerGroupNodes is never reached
        Set<String> workerNodes = workerGroupNodes.getOrDefault(group, new HashSet<>());
        Set<String> previousNodes = new HashSet<>(workerNodes);
        Set<String> currentNodes = registryCenter.getWorkerGroupNodesDirectly(group);
        logger.info("currentNodes : {}", currentNodes);
        syncWorkerGroupNodes(group, currentNodes);
    }

    private String parseGroup(String path) {
        String[] parts = path.split("\\/");
        if (parts.length != 6) {
            throw new IllegalArgumentException(String.format("worker group path : %s is not valid, ignore", path));
        }
        String group = parts[4]; // only paths shaped like /dolphinscheduler/nodes/worker/default match this code
        return group;
    }

  4. Temporary workaround: String group = parts[parts.length-2];

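  5. Root cause: workerGroupNodes stores the worker information and is refreshed through dataChanged. Because parseGroup fails, the variable is never updated, so ExecutorDispatcher#dispatch cannot obtain a worker and the task cannot proceed.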
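The difference between the fixed-depth check and the temporary workaround can be sketched as a standalone class. This is a minimal illustration, not the patch that was merged upstream: the class name, method names, and sample address are hypothetical, and the extra check on the `nodes` and `worker` pieces is an assumption about the path layout rather than something the issue states.

```java
public class WorkerGroupPathParser {

    // The 1.3.1 logic as quoted in the issue: accepts only paths that split into exactly 6 pieces.
    public static String parseGroupFixedDepth(String path) {
        String[] parts = path.split("\\/");
        if (parts.length != 6) {
            throw new IllegalArgumentException(
                String.format("worker group path : %s is not valid, ignore", path));
        }
        return parts[4];
    }

    // Workaround from the issue: take the group relative to the END of the path,
    // so the depth of the configured root no longer matters.
    // Assumed layout: <root>/nodes/worker/<group>/<host:port>.
    public static String parseGroupFromTail(String path) {
        String[] parts = path.split("/");
        boolean layoutOk = parts.length >= 6
            && "nodes".equals(parts[parts.length - 4])
            && "worker".equals(parts[parts.length - 3]);
        if (!layoutOk) {
            throw new IllegalArgumentException(
                String.format("worker group path : %s is not valid, ignore", path));
        }
        return parts[parts.length - 2];
    }

    public static void main(String[] args) {
        String multiLevel = "/path/to/dolphinscheduler/nodes/worker/default/192.168.1.10:1234";

        // Tail-relative parsing resolves the group for a multi-level root.
        System.out.println(parseGroupFromTail(multiLevel));

        // Fixed-depth parsing rejects the same path.
        try {
            parseGroupFixedDepth(multiLevel);
        } catch (IllegalArgumentException e) {
            System.out.println("fixed-depth parse failed: " + e.getMessage());
        }
    }
}
```

Indexing from the tail keeps the behavior unchanged for the default single-level root while tolerating deeper roots. A stricter alternative would be to strip the configured root prefix before splitting.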

Issue Analytics

  • State:closed
  • Created 3 years ago
  • Comments:6 (6 by maintainers)

Top GitHub Comments

2 reactions
XiaotaoYi commented, Aug 13, 2020

I will take it.

0 reactions
XiaotaoYi commented, Aug 14, 2020

I will take it.

This counts as a bug. We are in a hurry to release 1.3.2; do you have time to submit the code as soon as possible? If not, we will fix this problem in 1.3.2 ourselves. If you do have time, you can submit the change directly against 1.3.2. Thanks.

I sent a PR, but the e2e test fails. Could you help take a look?
