Load balancing in JetStream Work Queue

See original GitHub issue

I am trying to write a demo for the JetStream work queue with a push consumer, and I have some questions about the load-balancing behavior I observed.

My environment:

  • client: nats@2.6.1
  • server: nats-server 2.7.4 (using docker)

My producer code:

import { connect, JetStreamClient, JSONCodec, RetentionPolicy } from "nats";

// Subject used for both the stream and the published messages.
const SUBJECT = "test";

async function main() {
  const nc = await connect({ servers: "127.0.0.1" });

  const jsm = await nc.jetstreamManager();

  // Create a work-queue stream: each message is removed once it is acknowledged.
  await jsm.streams.add({
    name: "test",
    subjects: [SUBJECT],
    retention: RetentionPolicy.Workqueue,
  });

  // Publish one message per second.
  setInterval(dispatch.bind(nc.jetstream()), 1000);
}

async function dispatch(this: JetStreamClient) {
  const encoder = JSONCodec();
  const ack = await this.publish(
    SUBJECT,
    encoder.encode({ task: `sample task at ${new Date().toISOString()}` })
  );
  console.log(`message sent with seq number ${ack.seq}`);
}

main().then(
  () => {},
  (err) => console.error(err)
);

My consumer code:

import {
  connect,
  consumerOpts,
  createInbox,
  JsMsg,
  JSONCodec,
  NatsError,
} from "nats";
import util from "util";

async function main(pid: string) {
  console.log("start worker: ", pid);
  const nc = await connect({ servers: "127.0.0.1" });

  const js = nc.jetstream();

  const opt = consumerOpts();
  opt.manualAck();
  opt.ackExplicit();
  opt.queue("x");
  opt.callback(handler);
  opt.maxAckPending(2);
  opt.deliverAll();
  opt.deliverTo(createInbox("x"));
  opt.durable("x");

  await js.subscribe("test", opt);
}

async function handler(err: NatsError | null, msg: JsMsg | null) {
  // The subscription callback may be invoked with an error and no message.
  if (err || !msg) {
    if (err) console.error(err);
    return;
  }
  const jc = JSONCodec();
  console.log("message get:", msg.seq);
  await doWork(jc.decode(msg.data));
  msg.ack();
}

async function doWork(data: unknown): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 2000));
}

main(process.pid.toString()).then(
  () => {},
  (err) => console.error(err)
);

If I run the consumer code twice in the terminal, I get two Node.js processes, each subscribing to the work queue through the same durable consumer x.
What I expected was for the messages created by the producer to be dispatched to the two processes evenly.
But what I see in the logs is a random-looking dispatch:

Log for consumer process 1:

start worker:  1323710
message get: 43
message get: 44
message get: 45
message get: 46
message get: 48
message get: 49
message get: 50
message get: 51
message get: 55
message get: 58
message get: 60
message get: 62
message get: 65
message get: 67
message get: 69
message get: 72
message get: 73
message get: 75
message get: 76
message get: 78
message get: 81
message get: 87
message get: 88
message get: 89
message get: 91
message get: 92
message get: 94

Log for consumer process 2:

start worker:  1323822
message get: 47
message get: 52
message get: 53
message get: 54
message get: 56
message get: 57
message get: 59
message get: 61
message get: 63
message get: 64
message get: 66
message get: 68
message get: 70
message get: 71
message get: 74
message get: 77
message get: 79
message get: 80
message get: 82
message get: 83
message get: 84
message get: 85
message get: 86
message get: 90
message get: 93
message get: 95
message get: 96

I read in the documentation that in core NATS a queue group does random selection, but I did not find a corresponding description for JetStream.
Does JetStream also do random selection? Is there any configuration that would give more predictable behavior, such as:

  • round-robin
  • giving subscribers with more pending (unacknowledged) messages a lower priority for dispatch

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 20 (12 by maintainers)

Top GitHub Comments

1 reaction
ds3p2 commented on Apr 1, 2022

I see. I think a pull consumer with long polling is sufficient for my needs so far.
Thanks for the support from all of you.
I am closing this issue.
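
For anyone landing here later, here is a minimal sketch of what that pull-consumer / long-poll setup could look like with nats.js. It is not taken from the issue: the durable name "worker", the batch size, and the expiry are arbitrary choices, and it assumes the push consumer from the original code has been removed, since a work-queue stream only allows one consumer per overlapping subject.

import { connect, AckPolicy, JSONCodec } from "nats";

async function main() {
  const nc = await connect({ servers: "127.0.0.1" });
  const jsm = await nc.jetstreamManager();

  // Durable pull consumer on the "test" stream, shared by every worker process.
  await jsm.consumers.add("test", {
    durable_name: "worker",
    ack_policy: AckPolicy.Explicit,
  });

  const js = nc.jetstream();
  const jc = JSONCodec();

  // Long-poll loop: each request asks for at most one message and waits
  // up to 5 seconds before the server lets it expire.
  while (true) {
    const batch = js.fetch("test", "worker", { batch: 1, expires: 5000 });
    for await (const m of batch) {
      console.log("message get:", m.seq, jc.decode(m.data));
      await new Promise((resolve) => setTimeout(resolve, 2000)); // simulate work
      m.ack();
    }
  }
}

main().catch((err) => console.error(err));

Because each process only receives what it explicitly requests, an idle worker ends up asking for (and getting) more messages than a busy one, which is the behavior described in the reply below.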

0 reactions
derekcollison commented on Apr 1, 2022

No, at the core NATS level we have no semantics to be smarter, and while I was working at Google I saw the power of not trying to be too smart and just using random selection at scale.
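
For comparison, a plain core-NATS queue group looks roughly like this (a minimal sketch reusing the subject "test" and queue name "x" from the consumer code above; no JetStream involved): each published message is delivered to a single, effectively randomly chosen member of the group.

import { connect, JSONCodec } from "nats";

async function main() {
  const nc = await connect({ servers: "127.0.0.1" });
  const jc = JSONCodec();

  // Every subscriber sharing the queue name "x" joins one queue group;
  // the server picks a single member for each message published on "test".
  const sub = nc.subscribe("test", { queue: "x" });
  for await (const m of sub) {
    console.log("got:", jc.decode(m.data));
  }
}

main().catch((err) => console.error(err));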

With pull consumers you have a bit more control and essentially get a FIFO pattern, since each app has to send a request to get a message or a batch of messages.

