Experiment with using a direct DB connection instead of an API to retrieve data for SSR

For most of our pages, the explorer server code makes one or more (up to 6) API calls to https://stacks-node-api.mainnet.stacks.co/ server-side, which dramatically slows down our first server response. We should avoid this pattern because it puts too much load on the explorer web server, and users would experience degraded performance on the explorer as traffic increases (even if the API endpoint can handle it).

Example: 1000 users each request the /txid page once. The explorer server itself makes 6 calls for each user and waits for all 6 requests to complete before returning a response to that user. That’s 6000 API calls that can’t really leverage user caching, as the cache would invalidate more often with more users and consume far more memory than necessary. Had those API calls been made client-side, those 6000 calls could be cached per user.

Reducing the load on the explorer server means it will be able to handle more users with the same config, or the config can be reduced to handle the same number of users. (Note that nothing changes for the API, because the API endpoint will still have to handle those 1000 * 6 requests.)

My PR reduces the number of server-side api calls to a maximum of 1. This solution is a quick one and will give us good performance without much change in the code.
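
To make the numbers concrete, here is a small sketch of the arithmetic. The function name and parameters are hypothetical; they only restate the figures from the example (1000 users, up to 6 server-side calls per page today, at most 1 after the PR) and are not taken from the explorer code.

```typescript
// Hypothetical helper: counts the upstream API calls the explorer server
// itself has to issue (and wait on) while rendering pages server-side.
function serverSideApiCalls(pageRequests: number, callsPerPage: number): number {
  return pageRequests * callsPerPage;
}

// Numbers from the example: 1000 users each request /txid once.
const users = 1000;
const current = serverSideApiCalls(users, 6); // up to 6 calls per page today
const reduced = serverSideApiCalls(users, 1); // at most 1 call per page after the PR

// The difference is the work that moves to the clients, where it can be cached per user.
console.log({ current, reduced, movedToClients: current - reduced });
```

The API endpoint still sees the same total number of requests; only the share that the explorer server has to hold open changes.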

Ideally we should fetch the data directly from the db server side with a direct read-only connection to the same db that the api server is using. This way the explorer server will get all the data really fast and will return a response quickly. Besides we can leverage the SSR of nextjs and have great SEO.

This task is to experiment with a simple Node DB connection to PostgreSQL and check the performance and the amount of code that needs to change. Note that those changes could be made progressively (say, page by page) without changing the rest of the explorer code.
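
As a starting point for the experiment, a per-page data loader could look roughly like the sketch below. Everything here is an assumption for illustration: the `Queryable` interface is modeled on the `query(text, values)` call of node-postgres, the table and column names (`txs`, `tx_id`, `block_height`, `status`) are hypothetical and would need to be checked against the real API database schema, and `EXPLORER_DB_URL` is a made-up environment variable. An in-memory stand-in replaces the database so the snippet runs on its own.

```typescript
// Minimal shape of what the loader needs from a DB client. It is modeled on
// node-postgres, whose Pool.query(text, values) resolves to an object with `rows`.
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: Record<string, unknown>[] }>;
}

// Hypothetical table and column names; the real schema is not described in this issue.
const TX_BY_ID_SQL =
  "SELECT tx_id, block_height, status FROM txs WHERE tx_id = $1 LIMIT 1";

async function fetchTxRow(
  db: Queryable,
  txId: string
): Promise<Record<string, unknown> | null> {
  // Parameterized query: the tx id is passed as a value, never concatenated into SQL.
  const result = await db.query(TX_BY_ID_SQL, [txId]);
  return result.rows[0] ?? null;
}

// In a Next.js page this could be called from getServerSideProps, for example:
//   const pool = new Pool({ connectionString: process.env.EXPLORER_DB_URL }); // read-only role
//   const row = await fetchTxRow(pool, txId);

// In-memory stand-in so the sketch runs without a database.
const fakeDb: Queryable = {
  async query(_text, values) {
    const known = values[0] === "0xabc";
    return { rows: known ? [{ tx_id: "0xabc", block_height: 1, status: "success" }] : [] };
  },
};

fetchTxRow(fakeDb, "0xabc").then((row) => console.log(row));
```

Because the loader only depends on the small `Queryable` interface, a single page can be switched over and measured without touching the rest of the explorer code, which matches the page-by-page approach described above.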

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

3 reactions
CharlieC3 commented, Jul 13, 2022

Ideally we should fetch the data directly from the db server side with a direct read-only connection to the same db that the api server is using. This way the explorer server will get all the data really fast and will return a response quickly. Besides we can leverage the SSR of nextjs and have great SEO.

Are there any reports showing the API is a cause of performance issues in the Explorer? The API is specifically designed to be as fast as possible at retrieving data from the DB, with the Explorer being a prime use case. What makes us believe the Explorer can do it better? Not trying to poke holes in this, just trying to understand the thought process around this 😄

Additionally during times of high traffic (e.g. during a mint), it’s been shown that the root cause/bottleneck is queries piling up in the database, rather than the API service itself. If this happens, the Explorer is going to experience the same slowdown whether it routes requests through the API or directly to the backend database.

The only way around this is to create a separate private API + DB deployment for Explorer SSRs, and at that point it would be easier to simply route SSR traffic through the private API anyway.

I think reducing the number of SSR calls from 6 down to 1 should in itself yield massive performance improvements, by an order of magnitude under heavy load, but it’s not clear to me that there would be any benefit to accessing the API database directly.

Maybe @zone117x or @rafaelcr can weigh in here if they have additional perspective I’m missing.

2 reactions
CharlieC3 commented, Jul 15, 2022

If you have many users requesting a page from the explorer at the same time, the explorer server is currently caching all the data for all the users in the server. I suspect all the api caching would be useless in time of high traffic or when users refresh the page multiple times…

Just adding more context for the current setup. Generally at all times there will be multiple load-balanced instances of the Explorer running to serve requests, especially when traffic is high. So even if they do have support for a server-side cache, it’s not being shared across all Explorer instances. If someone reloads their address page, they’d likely hit a different Explorer instance, which may not have cached data to serve them.
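
The effect of per-instance caches can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the load balancer is modeled as strict round-robin (the real balancing strategy is not described here), the instance count and page key are made up, and cache expiry is ignored.

```typescript
// Count upstream fetches when each instance keeps its own cache versus
// when all instances share one cache (e.g. Redis or memcached).
function countUpstreamFetches(
  requestKeys: string[],
  instanceCount: number,
  shared: boolean
): number {
  const sharedCache = new Set<string>();
  const localCaches = Array.from({ length: instanceCount }, () => new Set<string>());
  let fetches = 0;
  requestKeys.forEach((key, i) => {
    // Assumed round-robin balancing: request i lands on instance i % instanceCount.
    const cache = shared ? sharedCache : localCaches[i % instanceCount];
    if (!cache.has(key)) {
      fetches += 1; // cache miss: this instance has to call the API
      cache.add(key);
    }
  });
  return fetches;
}

// One user reloads the same (hypothetical) address page 6 times across 3 instances.
const reloads = Array.from({ length: 6 }, () => "/address/example");

console.log(countUpstreamFetches(reloads, 3, false)); // prints 3: each instance misses once
console.log(countUpstreamFetches(reloads, 3, true));  // prints 1: only the first request misses
```

With more instances, the per-instance variant misses once per instance for every distinct page, while the shared variant misses once per page overall.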

Moving the caching efforts client-side should open the door to resolve this since we’d then be able to leverage their browser’s cache and perhaps Cloudflare’s cache. But if there’s still any server-side caching going on, we should look into leveraging a middle caching layer (e.g. Cloudflare), or implementing support for a shared cache view for the Explorer with something like Redis or memcached.
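
One way to let the browser and a CDN layer cache a response is through the standard `Cache-Control` response header. The helper below is a hypothetical sketch, not existing Explorer code, and the durations are placeholder values chosen for illustration. It only assembles the header string from the standard `max-age`, `s-maxage`, and `stale-while-revalidate` directives.

```typescript
interface CachePolicy {
  browserSeconds: number; // max-age: how long the browser may reuse the response
  cdnSeconds: number;     // s-maxage: how long a shared cache (CDN) may reuse it
  staleSeconds?: number;  // optional stale-while-revalidate window
}

// Build a Cache-Control header value from the policy.
function cacheControlHeader(policy: CachePolicy): string {
  const parts = [
    "public",
    `max-age=${policy.browserSeconds}`,
    `s-maxage=${policy.cdnSeconds}`,
  ];
  if (policy.staleSeconds !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleSeconds}`);
  }
  return parts.join(", ");
}

// Placeholder durations; real values depend on how fresh each page must be.
const header = cacheControlHeader({ browserSeconds: 10, cdnSeconds: 30, staleSeconds: 60 });
console.log(header); // prints "public, max-age=10, s-maxage=30, stale-while-revalidate=60"
```

In a Next.js page, a value like this could be applied inside `getServerSideProps` with `res.setHeader("Cache-Control", header)`. Whether Cloudflare then caches the response depends on how the zone is configured, so that part would need to be verified separately.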
