rhea.js and amqp, behind a multi-tenant REST API, with failover

Hello, everyone! We have started evaluating this library for one of our online gaming platforms and would like to hear ideas about best practices for connecting to a queue as a sender in production.

Some info

  • We are using Express.js to expose a REST API with an endpoint, for example POST /game-action.
  • Many users (100-300 concurrently) send requests to this endpoint, which then puts messages on a queue.
  • We are using Amazon MQ with the AMQP protocol.

We tried a few different strategies for connecting to the queue as a sender and observed the behaviors described below.

Strategy 1 - Basic

const connection_options = {
    'username': QueueConfig.QUEUE_USERNAME,
    'host': QueueConfig.QUEUE_URL,
    'port': QueueConfig.QUEUE_PORT,
    'password': QueueConfig.QUEUE_PASSWORD,
    'transport': 'ssl'
};

express.post('/game-action',(req,res)=>{
    const connection = rhea.connect(connection_options);
    const sender = connection.open_sender({target: QueueConfig.QUEUE_NAME,autosettle: true});
    sender.on('sendable', (context) => {
         sender.send({body: message});
         sender.detach();
         connection.close();
    });

    return res.send("Done");
})

Strategy 1 - Problems

  1. EFFICIENCY - Opening a connection on every request is clearly inefficient.
  2. CONCURRENCY - When many concurrent users were sending requests, we got errors saying the connection was already established. It seems that when requests arrived concurrently, rhea was in some cases not able to create a dedicated container, and the competing requests to the queue failed.
  3. FAILOVER - In AWS we have an HA MQ setup with 2 endpoints, and this approach does not handle failover between the active and standby instances.

Strategy 2 - Resolving Concurrency

const connection_options = {
    'username': QueueConfig.QUEUE_USERNAME,
    'host': QueueConfig.QUEUE_URL,
    'port': QueueConfig.QUEUE_PORT,
    'password': QueueConfig.QUEUE_PASSWORD,
    'transport': 'ssl'
};

express.post('/game-action',(req,res)=>{

    // Added a unique container id for every request
    connection_options['container_id'] = `${Date.now()}-${Math.random().toString(36).slice(2)}`;

    const connection = rhea.connect(connection_options);
    const sender = connection.open_sender({target: QueueConfig.QUEUE_NAME,autosettle: true});
    sender.on('sendable', (context) => {
         sender.send({body: message});
         sender.detach();
         connection.close();
    });

    return res.send("Done");
})

Strategy 2 - Problems

  1. [STILL-EFFICIENCY] Opening a connection on every request still seems inefficient.
  2. [RESOLVED-CONCURRENCY] Adding a random container id to every request made the problem from strategy 1 go away.
  3. [STILL-FAILOVER] In AWS we have an HA MQ setup with 2 endpoints, and this approach still does not handle failover between the active and standby instances.
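
One way to give each connection its own container id without mutating the shared `connection_options` object is to build a fresh options object per call. The following is only a minimal sketch, assuming Node's built-in `crypto.randomUUID()` is available (Node 14.17 or later). The helper names `uniqueContainerId` and `optionsWithUniqueContainer` and the `game-api` prefix are hypothetical and not part of rhea.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: builds an id that is unique per call.
function uniqueContainerId(prefix = 'game-api') {
    return `${prefix}-${randomUUID()}`;
}

// Hypothetical helper: returns a copy of the base options with its own
// container_id. The shared object is left untouched, so concurrent
// requests cannot overwrite each other's id.
function optionsWithUniqueContainer(baseOptions) {
    return { ...baseOptions, container_id: uniqueContainerId() };
}
```

With this, the handler would call `rhea.connect(optionsWithUniqueContainer(connection_options))` instead of assigning to `connection_options['container_id']` on every request.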

Strategy 3 - Resolving failover

const hosts = ['active','standby'];
let attempt = 0; // connection attempt counter used to rotate hosts
const connection_options = {
    'username': QueueConfig.QUEUE_USERNAME,
    'password': QueueConfig.QUEUE_PASSWORD,
    'transport': 'ssl',
    connection_details: function () {
        attempt++;
        return {
            port: 5671,
            host: hosts.length ? hosts[attempt % hosts.length] : hosts,
            transport: 'ssl'
        };
    },
};

express.post('/game-action',(req,res)=>{
    // Added a unique container id for every request
    connection_options['container_id'] = `${Date.now()}-${Math.random().toString(36).slice(2)}`;
    const connection = rhea.connect(connection_options);
    const sender = connection.open_sender({target: QueueConfig.QUEUE_NAME,autosettle: true});
    sender.on('sendable', (context) => {
         sender.send({body: message});
         sender.detach();
         connection.close();
    });

    return res.send("Done");
})

Strategy 3 - Problems

  1. [STILL-EFFICIENCY] Efficiency is even worse. On every connection we now see it try both active and standby, log an error for standby, and finally connect to active, so each request/queue send takes much longer.
  2. [RESOLVED-CONCURRENCY] Adding a random container id to every request made the problem from strategy 1 go away.
  3. [PARTIALLY-RESOLVED-FAILOVER] Every connection now tries both active and standby and will still send the message when a failover happens, but it is not efficient.
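
The host rotation used in this strategy can be isolated in a small factory so that the attempt counter lives in a closure rather than in a global. This is a sketch only, assuming rhea calls the `connection_details` function each time it needs connection settings, as its README describes. The name `makeConnectionDetails` is hypothetical, and `active` / `standby` are the placeholder hostnames from the snippet above.

```javascript
// Hypothetical factory: returns a function suitable for rhea's
// `connection_details` option. Each call returns the next host in the
// list, so successive connection attempts alternate between brokers.
function makeConnectionDetails(hosts, port = 5671) {
    let attempt = 0; // kept in the closure, not in a global variable
    return function connectionDetails() {
        const host = hosts[attempt % hosts.length];
        attempt += 1;
        return { host, port, transport: 'ssl' };
    };
}

// Placeholder hostnames, as in the strategy above.
const nextDetails = makeConnectionDetails(['active', 'standby']);
console.log(nextDetails().host); // active
console.log(nextDetails().host); // standby
console.log(nextDetails().host); // active (wraps around)
```

The result would be passed as `connection_details: makeConnectionDetails(hosts)`. When a new connection is created for every request, the rotation runs on every request. With a single long-lived connection, it would only come into play when that connection has to be re-established.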

Strategy 4 - Moving connection outside the REST endpoint

const hosts = ['active','standby'];
let attempt = 0; // connection attempt counter used to rotate hosts
const connection_options = {
    'username': QueueConfig.QUEUE_USERNAME,
    'password': QueueConfig.QUEUE_PASSWORD,
    'transport': 'ssl',
    connection_details: function () {
        attempt++;
        return {
            port: 5671,
            host: hosts.length ? hosts[attempt % hosts.length] : hosts,
            transport: 'ssl'
        };
    },
};

const connection = rhea.connect(connection_options);

express.post('/game-action',(req,res)=>{

    // Commented out: no longer possible now that the connection is created outside the handler
    // Added a unique container id for every request
    //connection_options['container_id'] = `${new Date().getTime()}-${random}`

    const sender = connection.open_sender({target: QueueConfig.QUEUE_NAME,autosettle: true});
    sender.on('sendable', (context) => {
         sender.send({body: message});
         sender.detach();
         // connection.close() is commented out; only the sender is detached
         //connection.close();
    });
    return res.send("Done");
})

Strategy 4 - Problems

  1. EFFICIENCY-RESOLVED - With the connection outside the REST endpoint, this is better than establishing one on every request.
  2. CONCURRENCY-PROBLEM - After moving the connection outside the request, we can no longer create unique connection identifiers, and the concurrency problems with requests came back.
  3. FAILOVER-RESOLVED - With the failover hosts and the connection moved out, this issue seems to be gone.

QUESTIONS

  • Based on the scenario described, any suggestions for handling this in a production-ready way would be really appreciated!
  • We think option 4 is reasonable, but we do not know how to resolve the concurrency problem, which was previously resolved by adding a container_id to every established connection. Maybe by using a unique container in every POST /endpoint request? We have not been able to find any samples.
  • Also, what is the best practice? Should each request initialize a sender, or can we initialize the sender outside the REST endpoint as well?
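
For the last question, one pattern that could be tried is to open a single sender on the shared connection and reuse it across requests, buffering messages while the link has no credit. The sketch below is not a confirmed recommendation from the maintainers. It assumes the sender offers `sendable()`, `send()` and a `sendable` event, as rhea senders do. `createPublisher` is a hypothetical name, and the in-memory buffer is unbounded and lost on restart.

```javascript
// Hypothetical wrapper around a single long-lived sender.
// `sender` is expected to behave like a rhea sender: `sendable()` reports
// whether the link has credit, `send()` transmits one message, and a
// 'sendable' event fires when credit becomes available.
function createPublisher(sender) {
    const pending = []; // in-memory only: unbounded and lost on restart

    // Send buffered messages for as long as the link has credit.
    function drain() {
        while (pending.length > 0 && sender.sendable()) {
            sender.send({ body: pending.shift() });
        }
    }

    sender.on('sendable', drain);

    return {
        // Queue a message and try to send it immediately.
        publish(body) {
            pending.push(body);
            drain();
        },
        // Number of messages still waiting for credit.
        pendingCount() {
            return pending.length;
        },
    };
}
```

In the Strategy 4 setup, the publisher would be created once at startup with `createPublisher(connection.open_sender({target: QueueConfig.QUEUE_NAME}))`, and each request handler would call `publisher.publish(...)` instead of opening and detaching its own sender.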

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
grs commented, Apr 14, 2021

No problem, glad it was resolved!

1 reaction
grs commented, Apr 14, 2021

What version of rhea are you using? The log indicates that heartbeats are indeed not being sent out after reconnecting. I can’t reproduce this with the latest library (there was an issue like this some time ago, which was fixed).
