[Optimization] Split GcsNodeInfo into Basic/Full GcsNodeInfo

Search before asking

  • I searched the issues and found no similar feature request.

Description

We recently found that GetAllNodeInfo is a bottleneck in GCS. When an actor or raylet fails over (FO), it subscribes to node changes, and that subscription sends a GetAllNodeInfo request.

Our analysis showed that CoreWorker and Raylet use only a few of the fields, so we don’t need to return or publish every field to them.

So we split GcsNodeInfo into BasicGcsNodeInfo and FullGcsNodeInfo. In our tests this gave a 25% performance boost.

I would like to contribute this optimization, but since it requires a large-scale code change (about 800 lines), I want to confirm with you first in this issue.

Main change

1. protobuf

before:

message GcsNodeInfo {
  enum GcsNodeState {
    ALIVE = 0;
    DEAD = 1;
  }
  bytes node_id = 1;
  string node_manager_address = 2;
  string raylet_socket_name = 3;
  string object_store_socket_name = 4;
  int32 node_manager_port = 5;
  int32 object_manager_port = 6;
  GcsNodeState state = 7;
  string node_manager_hostname = 8;
  int32 metrics_export_port = 9;
  double start_time = 10;
  double terminate_time = 11;
  int32 pid = 12;
  int32 brpc_port = 13;
  int64 timestamp = 14;
  string shape_group = 16;
  string pod_name = 17;
  map<string, double> resources_total = 21;
}

after:

message BasicGcsNodeInfo {
  enum GcsNodeState {
    ALIVE = 0;
    DEAD = 1;
  }
  bytes node_id = 1;
  string node_manager_address = 2;
  int32 node_manager_port = 5;
  int32 object_manager_port = 6;
  GcsNodeState state = 7;
  string node_manager_hostname = 8;
  int64 timestamp = 14;
}

message FullGcsNodeInfo {
  BasicGcsNodeInfo basic_gcs_node_info = 22;

  string raylet_socket_name = 3;
  string object_store_socket_name = 4;
  int32 metrics_export_port = 9;
  int32 pid = 12;
  int32 brpc_port = 13;
  double start_time = 10;
  double terminate_time = 11;
  string shape_group = 16;
  string pod_name = 17;
  map<string, double> resources_total = 21;
}
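The nesting above can be sketched in plain C++ to show why a "basic" reply needs no extra copy once the basic fields live in their own sub-message. This is only an illustration under stated assumptions: `BasicNodeInfo`, `FullNodeInfo`, and `CollectBasicViews` are hypothetical stand-ins, not the classes protoc would generate, and only a few fields of the full message are shown.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Plain-struct stand-ins for the proposed messages (hypothetical names).
enum class NodeState { ALIVE = 0, DEAD = 1 };

struct BasicNodeInfo {
  std::string node_id;
  std::string node_manager_address;
  int32_t node_manager_port = 0;
  int32_t object_manager_port = 0;
  NodeState state = NodeState::ALIVE;
  std::string node_manager_hostname;
  int64_t timestamp = 0;
};

struct FullNodeInfo {
  BasicNodeInfo basic;  // mirrors `basic_gcs_node_info = 22`
  std::string raylet_socket_name;
  std::string pod_name;
  std::map<std::string, double> resources_total;
};

// Hypothetical helper: build the "basic" list by pointing at the nested
// member of each stored record. No field is copied.
std::vector<const BasicNodeInfo *> CollectBasicViews(
    const std::vector<std::unique_ptr<FullNodeInfo>> &nodes) {
  std::vector<const BasicNodeInfo *> views;
  views.reserve(nodes.size());
  for (const auto &node : nodes) {
    assert(node != nullptr);
    views.push_back(&node->basic);
  }
  return views;
}
```

Each returned pointer refers to the member embedded in the stored record, so the views stay valid only as long as the owning records do.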

2. The GCS RPC handler and accessor will each have two versions of GetAllNodeInfo

// handlers:
  void HandleGetAllBasicNodeInfo(const rpc::GetAllBasicNodeInfoRequest &request,
                                 rpc::GetAllBasicNodeInfoReply *reply,
                                 rpc::SendReplyCallback send_reply_callback) override;

  void HandleGetAllFullNodeInfo(const rpc::GetAllFullNodeInfoRequest &request,
                                rpc::GetAllFullNodeInfoReply *reply,
                                rpc::SendReplyCallback send_reply_callback) override;

// accessors:
class BasicNodeInfoAccessor {
 ....
}

class FullNodeInfoAccessor {
 ...
}

Plus all the related code, for a total of 800+ modified lines.

Some Other Topics

Why didn’t we simply keep GcsNodeInfo as it is and mask some fields?

First, it makes the protocol harder to understand. Second, it introduces an additional memory copy.

In the current implementation, we use an arena to avoid copying. To mask some fields of the reply, we would first need to create a new reply and copy a bunch of int values and string pointers into it, which would defeat this optimization.

Actually this is a limitation of the Arena (a protobuf feature that we use through gRPC): it has no way to mask some fields of a message that is already in the arena.
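To make the cost of masking concrete, here is a rough sketch of what a masking step would have to do with a single unsplit message. It is not Ray code: `NodeInfo`, `MaskedCopy`, and `BuildMaskedReply` are hypothetical, only a handful of fields are shown, and `std::string` members are copied by value here, whereas the arena case described above would copy string pointers.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a single unsplit node record (hypothetical, few fields shown).
struct NodeInfo {
  std::string node_id;
  std::string node_manager_address;
  int32_t node_manager_port = 0;
  std::string raylet_socket_name;  // would be masked out
  std::string pod_name;            // would be masked out
};

// Masking means allocating a fresh record and copying only the kept fields.
std::unique_ptr<NodeInfo> MaskedCopy(const NodeInfo &src) {
  auto out = std::make_unique<NodeInfo>();
  out->node_id = src.node_id;
  out->node_manager_address = src.node_manager_address;
  out->node_manager_port = src.node_manager_port;
  return out;
}

// One allocation plus several field copies per node, on every request.
std::vector<std::unique_ptr<NodeInfo>> BuildMaskedReply(
    const std::vector<std::unique_ptr<NodeInfo>> &nodes) {
  std::vector<std::unique_ptr<NodeInfo>> reply;
  reply.reserve(nodes.size());
  for (const auto &node : nodes) {
    assert(node != nullptr);
    reply.push_back(MaskedCopy(*node));
  }
  return reply;
}
```

The per-node allocation and copy is the work that handing stored records straight to the reply avoids.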

void GcsNodeManager::HandleGetAllNodeInfo(const rpc::GetAllNodeInfoRequest &request,
                                          rpc::GetAllNodeInfoReply *reply,
                                          rpc::SendReplyCallback send_reply_callback) {
  // UnsafeArenaAddAllocated is safe here because entry.second outlives the reply.
  // The reply is sent when send_reply_callback is called and is not used after
  // that, while entry is still valid.
  for (const auto &entry : alive_nodes_) {
    reply->mutable_node_info_list()->UnsafeArenaAddAllocated(entry.second.get());
  }
  for (const auto &entry : dead_nodes_) {
    reply->mutable_node_info_list()->UnsafeArenaAddAllocated(entry.second.get());
  }
  GCS_RPC_SEND_REPLY(send_reply_callback, reply, Status::OK());
  ++counts_[CountType::GET_ALL_NODE_INFO_REQUEST];
}
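The lifetime argument in the handler's comment can be illustrated without protobuf. In the sketch below, `Reply`, `NodeTable`, and `FillReply` are hypothetical stand-ins: the reply only borrows raw pointers to records owned by the node table, so it is valid only while the table keeps those records alive.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct NodeInfo {
  std::string node_id;
};

// Stand-in for the reply: it borrows records and never owns or frees them.
struct Reply {
  std::vector<const NodeInfo *> node_info_list;
};

// Hypothetical node table, playing the role of alive_nodes_ / dead_nodes_.
using NodeTable = std::unordered_map<std::string, std::shared_ptr<NodeInfo>>;

// Adds a borrowed pointer for every stored record; ownership stays with the
// table, so the table must outlive any use of the reply.
void FillReply(const NodeTable &nodes, Reply *reply) {
  assert(reply != nullptr);
  for (const auto &entry : nodes) {
    reply->node_info_list.push_back(entry.second.get());
  }
}
```

Because the reply holds raw pointers, the reference count of each stored record is unchanged, and destroying the reply leaves the table's records intact.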

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
iycheng commented, Dec 3, 2021

I have one question: should workers just ask the raylet for GetNodeInfo? If we did that, would it fix this problem?

The long-term goal is that workers shouldn’t talk to GCS directly. Information that can be cached in the raylet will be served by the raylet; for everything else, the raylet will redirect the request to GCS.

cc @scv119 as well.

0 reactions
lixin-wei commented, Dec 8, 2021

@iycheng @scv119 Yep, I agree with you. It seems this optimization won’t be worth doing once the raylet acts as a proxy.

Let’s go straight to the raylet proxy solution. Closing this issue. Thank you for your review!

do we know if the cpu or network is the bottleneck?

@scv119 it’s CPU.
