Experiment with using a direct DB connection instead of an API to retrieve data for SSR
For most of our pages, the explorer server code makes one or more (up to 6) API calls to https://stacks-node-api.mainnet.stacks.co/ server-side, which dramatically slows down our first server response.
We should avoid this pattern because it puts too much load on the explorer web server and users would experience degraded performance on the explorer as traffic increases (even if the api endpoint can handle it).
Example: 1000 users each request the /txid page once. The explorer server makes 6 API calls for each user and waits for all 6 to complete before returning a response to that user. That’s 6000 API calls that can’t really leverage user caching, as the cache would invalidate more often with more users, it consumes way more memory than necessary, etc.
Had those API calls been made client-side, those 6000 API calls could be cached per user.
Reducing the load on the explorer server means it will be able to handle more with the same config or the config can be reduced to handle the same number of users.
(Note that nothing changes for the api because the api endpoint will still have to handle those 1000 * 6 requests)
My PR reduces the number of server-side api calls to a maximum of 1. This solution is a quick one and will give us good performance without much change in the code.
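To make the difference concrete, here is a minimal sketch of the two patterns. Every name in it (`Fetcher`, `countingFetcher`, `renderAllOnServer`, `renderOneOnServer`) is hypothetical and is not taken from the explorer code or from the PR; it only illustrates a handler that awaits every call versus one that awaits a single call and leaves the rest to the browser.

```typescript
// Hypothetical sketch: none of these names come from the explorer codebase.
type Fetcher = (path: string) => Promise<unknown>;

interface PageResult {
  serverData: unknown[]; // fetched before the HTML response is sent
  clientPaths: string[]; // left for the browser to fetch (and cache) itself
}

// Wraps a fetcher and counts how many upstream calls the server issues.
function countingFetcher(inner: Fetcher): { fetch: Fetcher; calls: () => number } {
  let count = 0;
  return {
    fetch: (path: string) => {
      count += 1;
      return inner(path);
    },
    calls: () => count,
  };
}

// Current pattern: the response waits for every API call to finish.
async function renderAllOnServer(api: Fetcher, paths: string[]): Promise<PageResult> {
  const serverData = await Promise.all(paths.map((p) => api(p)));
  return { serverData, clientPaths: [] };
}

// Reduced pattern: the response waits for at most one API call;
// the remaining paths are handed to the client.
async function renderOneOnServer(api: Fetcher, paths: string[]): Promise<PageResult> {
  const [first, ...rest] = paths;
  const serverData = first === undefined ? [] : [await api(first)];
  return { serverData, clientPaths: rest };
}
```

For 1000 page requests with 6 paths each, the first handler issues 6000 upstream calls from the explorer server and the second issues 1000. The API still receives the remaining calls, but they come from browsers, where they can be cached per user.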
Ideally, we should fetch the data server-side directly from the DB, over a read-only connection to the same DB that the API server is using. This way the explorer server would get all the data very quickly and return a response quickly. Besides, we could still leverage the SSR of Next.js and have great SEO.
This task is to experiment with a simple Node DB connection to PostgreSQL and check the performance and the amount of code that needs to change. Note that those changes could be made progressively (say, page by page) without changing the rest of the explorer code.
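As a starting point for that experiment, a per-page query helper might look like the sketch below. It is only an illustration under stated assumptions: the `Queryable` interface, the `getTxForPage` helper, and the table and column names (`txs`, `tx_id`, `block_height`, `status`) are all hypothetical and would have to be replaced with the API database's real schema. The client is injected so the helper can be tried without a database; a node-postgres `Pool` connected with a read-only role is assumed to fit the same `query(text, values)` shape.

```typescript
// Hypothetical sketch: the schema and all names here are assumptions.
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: Record<string, unknown>[] }>;
}

interface TxSummary {
  txId: string;
  blockHeight: number;
  status: string;
}

// Fetches the data one page needs with a single parameterized query.
// The connection itself should use a read-only database role; that is
// configured on the database side, not in this function.
async function getTxForPage(db: Queryable, txId: string): Promise<TxSummary | null> {
  const { rows } = await db.query(
    "SELECT tx_id, block_height, status FROM txs WHERE tx_id = $1 LIMIT 1",
    [txId], // passed as a parameter, never interpolated into the SQL string
  );
  const row = rows[0];
  if (row === undefined) return null; // no such transaction
  return {
    txId: String(row["tx_id"]),
    blockHeight: Number(row["block_height"]),
    status: String(row["status"]),
  };
}
```

Because the helper only depends on the `Queryable` shape, a single page could be switched to it while the other pages keep calling the API, which matches the page-by-page approach described above.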
Issue Analytics
- Created a year ago
- Comments: 11 (6 by maintainers)
Top GitHub Comments
Are there any reports showing the API is a cause of performance issues in the Explorer? The API is specifically designed to be as fast as possible at retrieving data from the DB, with the Explorer being a prime use case. What makes us believe the Explorer can do it better? Not trying to poke holes in this, just trying to understand the thought process around this 😄
Additionally during times of high traffic (e.g. during a mint), it’s been shown that the root cause/bottleneck is queries piling up in the database, rather than the API service itself. If this happens, the Explorer is going to experience the same slowdown whether it routes requests through the API or directly to the backend database.
The only way around this is to create a separate private API + DB deployment for the Explorer’s SSR requests, and at that point it would be easier to simply route SSR traffic through the private API anyway.
I think reducing the number of server-side requests from 6 down to 1 should in itself improve performance by an order of magnitude under heavy load, but it’s not clear to me that there would be any benefit to accessing the API database directly.
Maybe @zone117x or @rafaelcr can weigh in here if they have additional perspective I’m missing.
Just adding more context for the current setup. Generally at all times there will be multiple load-balanced instances of the Explorer running to serve requests, especially when traffic is high. So even if they do have support for a server-side cache, it’s not being shared across all Explorer instances. If someone reloads their address page, they’d likely hit a different Explorer instance, which may not have cached data to serve them.
Moving the caching efforts client-side should open the door to resolving this, since we’d then be able to leverage the user’s browser cache and perhaps Cloudflare’s cache. But if there’s still any server-side caching going on, we should look into leveraging a middle caching layer (e.g. Cloudflare), or implementing support for a shared cache for the Explorer with something like Redis or memcached.
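A rough sketch of what a shared cache would change, assuming a simple cache-aside pattern: the `SharedCache` interface, `InMemoryCache`, and `cachedFetch` are hypothetical names, and the in-memory class is only a stand-in so the example runs on its own. In a real deployment a Redis or memcached client would sit behind the same interface so that all load-balanced Explorer instances see the same entries.

```typescript
// Hypothetical sketch: a Redis or memcached client would replace InMemoryCache.
interface SharedCache {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, ttlMs: number): Promise<void>;
}

// In-memory stand-in with TTL expiry, used only to keep the sketch runnable.
class InMemoryCache implements SharedCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async get(key: string): Promise<string | undefined> {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlMs: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Cache-aside: return the cached value if present, otherwise load and store it.
async function cachedFetch(
  cache: SharedCache,
  key: string,
  ttlMs: number,
  load: () => Promise<string>,
): Promise<string> {
  const hit = await cache.get(key);
  if (hit !== undefined) return hit;
  const fresh = await load();
  await cache.set(key, fresh, ttlMs);
  return fresh;
}
```

If two Explorer instances each hold their own cache, a reload that lands on the other instance triggers a second upstream load; if both are given the same `SharedCache`, the second request is served from the cache.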