[Bug][Server] zookeeper multi directories, tasks cannot be assigned
Describe the bug When the ZooKeeper root is a multi-level directory, tasks cannot be assigned to a worker.
To Reproduce Steps to reproduce the behavior, for example:
- Edit the configuration file zookeeper.properties: zookeeper.dolphinscheduler.root=/path/to/dolphinscheduler
- Start the master and worker as usual
- Create a new workflow definition with a new shell node: echo "hello world"
Expected behavior 1. The console shows the registered master and worker normally. 2. After the task is started, the master reports an error: "fail to execute : xxx due to no suitable worker, current task need to yyy worker group execute"
Which version of Dolphin Scheduler: -[1.3.1.release]
Additional context 1. ExecutorDispatcher#dispatch: hostManager.select takes the registered workers and uses the group to find a worker that is allowed to execute the task. 2. ZookeeperNodeManager$WorkerGroupNodeListener#dataChanged listens for worker changes. 3. The faulty code:
`
if (event.getType() == TreeCacheEvent.Type.NODE_ADDED) {
    logger.info("worker group node : {} added.", path);
    String group = parseGroup(path); // parsing the group fails here, so syncWorkerGroupNodes never refreshes
    Set<String> workerNodes = workerGroupNodes.getOrDefault(group, new HashSet<>());
    Set<String> previousNodes = new HashSet<>(workerNodes);
    Set<String> currentNodes = registryCenter.getWorkerGroupNodesDirectly(group);
    logger.info("currentNodes : {}", currentNodes);
    syncWorkerGroupNodes(group, currentNodes);
}
`
`
private String parseGroup(String path){
    String[] parts = path.split("\\/");
    if(parts.length != 6){
        throw new IllegalArgumentException(String.format("worker group path : %s is not valid, ignore", path));
    }
    String group = parts[4]; // this code only fits paths like /dolphinscheduler/nodes/worker/default
    return group;
} `
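- Temporary fix
String group = parts[parts.length-2]; // temporary fix
5. Root cause: workerGroupNodes stores the worker information and is refreshed through dataChanged. Because parseGroup fails, the variable is never updated, so ExecutorDispatcher#dispatch cannot obtain a worker and the task cannot proceed.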
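The difference between the fixed index and the index counted from the end can be reproduced outside DolphinScheduler. The following is a minimal sketch, not the project's actual patch: the class name `WorkerGroupPathDemo`, both method names, and the trailing `192.168.1.10:1234` node segment are assumptions made for illustration (the extra segment is inferred from the `parts.length != 6` check). It also assumes the length check is relaxed to a lower bound, since the temporary fix alone would still hit the `!= 6` check on a multi-level root.

```java
// Hypothetical demo class; not part of DolphinScheduler.
class WorkerGroupPathDemo {

    // Mirrors the logic quoted in the issue: the path must split into exactly
    // 6 parts, which only holds when the root is a single directory level.
    static String parseGroupFixedIndex(String path) {
        String[] parts = path.split("/");
        if (parts.length != 6) {
            throw new IllegalArgumentException(
                String.format("worker group path : %s is not valid, ignore", path));
        }
        return parts[4];
    }

    // Variant based on the temporary fix: take the second-to-last segment.
    // Assumption: the length check becomes a lower bound so deeper roots pass.
    static String parseGroupFromEnd(String path) {
        String[] parts = path.split("/");
        if (parts.length < 6) {
            throw new IllegalArgumentException(
                String.format("worker group path : %s is not valid, ignore", path));
        }
        return parts[parts.length - 2];
    }

    public static void main(String[] args) {
        // Assumed path shape: <root>/nodes/worker/<group>/<host:port>
        String singleLevel = "/dolphinscheduler/nodes/worker/default/192.168.1.10:1234";
        String multiLevel = "/path/to/dolphinscheduler/nodes/worker/default/192.168.1.10:1234";

        System.out.println(parseGroupFixedIndex(singleLevel));
        System.out.println(parseGroupFromEnd(singleLevel));
        System.out.println(parseGroupFromEnd(multiLevel));

        try {
            parseGroupFixedIndex(multiLevel); // splits into 8 parts, so this throws
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the multi-level root the path splits into 8 parts instead of 6, so the original check rejects every worker node and the group map is never refreshed. Counting from the end keeps the group lookup independent of how deep the configured root is.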
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (6 by maintainers)
I will take it.
I sent a PR, but the e2e test fails. Could you help take a look?