Occasional empty / zero byte streams at high load
Hi!
First of all, thank you for this great library! I’ve been using it for a couple of years now and it really makes the GraphQL developer experience better 🙂.
Unfortunately, we're seeing occasional errors on our production environment, which runs on GKE with very little load, where createReadStream() yields zero bytes even though the whole multipart request was received (I've added a PassThrough stream to print out the whole request whenever the stream length is zero).
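For reference, the debugging tap is roughly the following (a simplified sketch rather than our exact code; `tapRawBody` is just an illustrative name):

```js
const { PassThrough } = require("stream");

// Debugging sketch: tee the raw request into a buffer so the full multipart
// body can be printed whenever the resulting file stream turns out empty.
function tapRawBody(request) {
  const chunks = [];
  const tap = new PassThrough();
  tap.on("data", (chunk) => chunks.push(chunk));
  request.pipe(tap);
  // Call this later to get everything that was received on the socket.
  return () => Buffer.concat(chunks);
}
```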
I have been able to replicate this on a local K8s cluster when load testing the service at 40-60 rps, with a failure rate of roughly 1 in 5000 requests. The service in question is a GraphQL gateway: we're using graphql-upload v13.0.0's processRequest as a middleware for fastify to process the body, then passing the result to apollo-server-fastify and @apollo/gateway, with apollo-federation-file-upload as the datasource to replay the file upload.
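The wiring looks roughly like this (a simplified sketch of our setup, not the exact code; the flag name and hook choice are illustrative):

```js
const Fastify = require("fastify");
const { processRequest } = require("graphql-upload"); // v13.0.0

const app = Fastify();

// Let the raw multipart stream through untouched; just flag the request so the
// hook below knows to hand it to graphql-upload.
app.addContentTypeParser("multipart", (request, payload, done) => {
  request.isMultipart = true;
  done();
});

// Parse the multipart body before validation so request.body holds the GraphQL
// operation with Upload scalar promises.
app.addHook("preValidation", async (request, reply) => {
  if (!request.isMultipart) return;
  request.body = await processRequest(request.raw, reply.raw);
});
```

apollo-server-fastify and @apollo/gateway then treat request.body as a normal GraphQL request, and apollo-federation-file-upload replays the resolved upload streams to the downstream service.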
My initial assumption was that it had something to do with the networking/parsing layer, but I ran a separate test feeding the received request body directly to dicer in a loop of more than 100,000 iterations to see if that was the cause, and it parsed the multipart body without any errors.
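The replay test was essentially the following (sketched from memory; `capturedBody` and `boundary` come from the logged request, and the names are illustrative):

```js
const Dicer = require("dicer");

// Feed the captured request body to dicer over and over, and fail if parsing
// ever errors or a part comes through empty.
async function replay(capturedBody, boundary, iterations = 100000) {
  for (let i = 0; i < iterations; i++) {
    await new Promise((resolve, reject) => {
      const dicer = new Dicer({ boundary });
      dicer.on("part", (part) => {
        let bytes = 0;
        part.on("data", (chunk) => (bytes += chunk.length));
        part.on("end", () => {
          if (bytes === 0) reject(new Error(`Empty part on iteration ${i}`));
        });
      });
      dicer.on("error", reject);
      dicer.on("finish", resolve);
      dicer.end(capturedBody);
    });
  }
}
```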
I have a couple of theories as to why this happens.
- After diving into fs-capacitor, I learned that every file upload creates a temporary file, so maybe during high load it fails to create that temporary file, possibly due to too many files being open? I'm currently wondering whether this is related to this issue.
- During debugging, whenever I encounter the zero byte stream, onData and onEnd were never called. I'm not super experienced when it comes to streams, but would it be possible for createReadStream() to execute before the data is written to the FileStream? (See the sketch after this list.)
- It may also be a combination of the two: during high load, disk writes can cause high latency, which affects reading of the file when calling createReadStream.
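To make the second theory concrete, here is a stripped-down version of the pattern graphql-upload follows with fs-capacitor (illustrative only, not the library's actual code; `bufferUpload` is a made-up name):

```js
const { WriteStream } = require("fs-capacitor"); // v6.2.0

// Illustrative only: buffer an incoming multipart part into a temp file via
// fs-capacitor, then hand out independent read streams on demand.
function bufferUpload(partStream) {
  const capacitor = new WriteStream();
  partStream.pipe(capacitor);

  return {
    // The suspicion: under heavy disk load, could a consumer create and drain
    // this read stream before any bytes have reached the temporary file,
    // ending up with zero bytes?
    createReadStream: () => capacitor.createReadStream(),
  };
}
```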
Our solution right now is to migrate to graphql-upload-minimal since we’re only using the stream to pass through to the receiving backend service. I haven’t encountered the issue so far with this setup even at 120 rps load.
Issue Analytics
- State:
- Created a year ago
- Reactions: 1
- Comments: 9 (4 by maintainers)
I understand, our solution with graphql-upload-minimal seems to be working fine and might be the better route for a gateway service. In the meantime, I've attached a patch file to be used with patch-package in case other people need an immediate fix:

patches/fs-capacitor+6.2.0.patch
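Applying it is just the usual patch-package setup (a sketch; it assumes the patch file sits in a patches/ directory at the project root and that patch-package runs on postinstall):

```json
{
  "scripts": {
    "postinstall": "patch-package"
  },
  "devDependencies": {
    "patch-package": "^6.4.7"
  }
}
```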
It seems the idea to dynamically import pure ESM fs-capacitor is not going to work due to a TypeScript issue: https://github.com/microsoft/TypeScript/issues/49055#issuecomment-1151747145

Side note about the dynamic import approach: I had an idea about doing the dynamic import on the first function call and storing the result in a let outside of the function scope, so it can be used from then on instead of awaiting a promise for the dynamic import again and again on each function call. But I couldn't find any information about whether Node.js has optimisations for multiple dynamic import calls of the same thing (are calls after the first faster?), or whether awaiting a promise really saves that much time or system resources.
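Something like this untested sketch (the function name is just illustrative):

```js
// Untested sketch of the idea: do the dynamic import of pure ESM fs-capacitor
// on the first call only, and reuse the cached class for every later call
// instead of awaiting a fresh import() each time.
let WriteStream;

async function getWriteStream() {
  if (!WriteStream) ({ WriteStream } = await import("fs-capacitor"));
  return WriteStream;
}
```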
Thanks for offering to add a CJS entry point to fs-capacitor, but maybe I should just move graphql-upload to pure ESM and be done with it.