CORS pre-flight breaks socket.io behind load balancer
I ran into an issue on our servers. We are running socket.io v1.0.6 on multiple server instances behind a load balancer. For polling, the requests go through the ELB with sticky sessions turned on. Our real-time service is on a subdomain, and thanks to CORS pre-flight requests, socket.io fucks up. Here is what happens on the client when the polling transport is used:
- A socket.io handshake POST request occurs. The response comes back valid with an `sid`, and the headers include the AWS ELB cookie.
- Next, a pre-flight `OPTIONS` request is made by the browser. The ELB cookie is not included by the browser here. As a result, the `OPTIONS` request is routed to a potentially different server which will not recognize the `sid` in the query string.
- When the request is routed to the wrong server, socket.io responds with a 400 HTTP status code and a `Session ID unknown` error.
- Since the pre-flight request fails, the browser also fails the actual GET polling request, and tries to re-do the handshake from the beginning.
- Possibly due to the headers being sent, the browser sends the `OPTIONS` pre-flight request fairly regularly as opposed to doing it only once, so this cycle repeats over and over.
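The sequence above can be reproduced in a toy model. This is only a sketch: `createBackend` and `createBalancer` are made-up helpers, not part of socket.io or AWS, and the balancer is assumed to fall back to round-robin when no sticky cookie is present.

```javascript
// Hypothetical model of a sticky load balancer in front of two socket.io nodes.
// None of these names come from socket.io or AWS; they only illustrate the flow.

function createBackend(name) {
  const sessions = new Set();
  return {
    name,
    // Handshake: create a session on this node and return its id.
    handshake() {
      const sid = `${name}-${sessions.size + 1}`;
      sessions.add(sid);
      return { status: 200, sid };
    },
    // Any later request (including OPTIONS) is rejected if the sid is unknown here.
    request(sid) {
      return sessions.has(sid)
        ? { status: 200 }
        : { status: 400, error: "Session ID unknown" };
    },
  };
}

function createBalancer(backends) {
  let next = 0;
  return {
    // Route by sticky cookie when present, otherwise round-robin (an assumption).
    route(cookie) {
      if (cookie !== undefined) return backends.find((b) => b.name === cookie);
      const backend = backends[next % backends.length];
      next += 1;
      return backend;
    },
  };
}

const balancer = createBalancer([createBackend("a"), createBackend("b")]);

// 1. Handshake: no cookie yet, so the request lands on "a"; the balancer would
//    then set a sticky cookie naming that node.
const first = balancer.route(undefined);
const { sid } = first.handshake();
const stickyCookie = first.name;

// 2. Pre-flight OPTIONS: the browser omits the cookie, so round-robin picks "b".
const preflight = balancer.route(undefined).request(sid);

// 3. The same sid sent with the cookie reaches "a" and is accepted.
const withCookie = balancer.route(stickyCookie).request(sid);

console.log(preflight, withCookie);
```

In this model the cookie-less pre-flight is the only request that ever reaches a node that has never seen the `sid`, which matches the 400 described in the list.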
The fix on our end currently is to respond to all `OPTIONS` requests with a 200 and all the usual `Access-Control-Allow-…` headers the browser knows and loves. We do this before they even get to socket.io in our nginx config.
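For setups without nginx in front, the same idea might look roughly like this in Node. This is a minimal sketch, not the configuration used above: `handlePreflight` and `ALLOWED_ORIGIN` are hypothetical names, and the header values are assumptions that would need to match the real client. Only `res.writeHead` and `res.end` are actual Node APIs.

```javascript
// Sketch only: ALLOWED_ORIGIN and handlePreflight are made-up names for illustration.
const ALLOWED_ORIGIN = "https://app.example.com"; // assumed page origin

// Answers a CORS pre-flight directly. Returns true if the request was handled,
// false if it should continue on to socket.io.
function handlePreflight(req, res) {
  if (req.method !== "OPTIONS") return false;

  res.writeHead(200, {
    // With credentials (cookies) the origin must be explicit, not "*".
    "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    // Echo the requested headers if the browser listed any.
    "Access-Control-Allow-Headers":
      req.headers["access-control-request-headers"] || "Content-Type",
    // Let the browser cache the pre-flight result for ten minutes.
    "Access-Control-Max-Age": "600",
  });
  res.end();
  return true;
}
```

A handler like this only helps if it sees the request before socket.io does. How to arrange that depends on how the server is wired up.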
Now, engine.io appears to already handle this case here: https://github.com/Automattic/engine.io/blob/master/lib/transports/polling-xhr.js#L40
However, that check is only reached if the `sid` is valid here: https://github.com/Automattic/engine.io/blob/master/lib/server.js#L180
which it isn’t, of course. I can submit a PR but I’d like to know how you guys think it’d be best to handle this. AFAIK, if a request method is `OPTIONS`, we can make the assumption that we are polling. But, since we don’t have a valid `sid` to look up a client by, this might mean moving fairly transport-specific logic into `server.js`, which sounds less than ideal.
Thoughts?
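To make the trade-off concrete, here is a rough sketch of the ordering change being proposed. It is not engine.io's actual code: `verify` is a hypothetical stand-in for a server-level request check, and `clients` is assumed to be a map keyed by `sid`.

```javascript
// Hypothetical stand-in for a server-level request check; not engine.io's real code.
function verify(method, sid, clients) {
  // Proposed change: treat OPTIONS as a polling pre-flight and skip the sid lookup,
  // since the pre-flight may have been routed to a node that never saw this sid.
  if (method === "OPTIONS") {
    return { ok: true, preflight: true };
  }

  // Existing behaviour as described in the issue: an unknown sid is rejected.
  if (sid !== undefined && !clients.has(sid)) {
    return { ok: false, status: 400, error: "Session ID unknown" };
  }

  return { ok: true, preflight: false };
}

const clients = new Map([["abc", {}]]);
console.log(verify("OPTIONS", "zzz", clients)); // pre-flight passes despite unknown sid
console.log(verify("GET", "zzz", clients));     // still rejected
console.log(verify("GET", "abc", clients));     // known session passes
```

The first branch is the part that leaks polling-specific knowledge into the generic check, which is the design concern raised above.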
Issue Analytics
- Created 9 years ago
- Reactions:6
- Comments:43 (25 by maintainers)
Top GitHub Comments
I’m trying to reproduce the issue, but so far the connection seems actually stable.
I’m using https://github.com/socketio/engine.io/tree/a63c7b787c54b3a47da7f355826bf2770139c62b.
For anyone else with this issue, the following code will fix it:
Applied after socket.io:
e.g: