Header transfer-encoding makes Splash API return 504 Gateway Timeout
I was developing a crawler using Splash when I suddenly started receiving a lot of gateway timeouts. While troubleshooting the problem, I discovered that the cause is the header transfer-encoding: chunked. I made a PoC (the URL httpbin.org/headers echoes back the headers sent in the request):
import json

import requests

ENDPOINT_SPLASH = 'http://localhost:8050/execute'


def test_with_custom_headers():
    # Control case: a harmless custom header, which httpbin echoes back.
    lua_script = """
    function main(splash, args)
        splash:set_custom_headers({
            ["x-custom-header"] = "splash"
        })
        assert(splash:go(args.url))
        assert(splash:wait(0.5))
        return {
            html = splash:html()
        }
    end
    """
    payload = {
        'lua_source': lua_script,
        'url': 'https://httpbin.org/headers',
        'timeout': 15,
    }
    r = requests.post(url=ENDPOINT_SPLASH, json=payload)
    result = json.loads(r.text)
    # On success return the rendered HTML, otherwise the whole error object.
    return result.get('html', result)


def test_with_content_encoding():
    # Failing case: identical script, but the header is transfer-encoding: chunked.
    lua_script = """
    function main(splash, args)
        splash:set_custom_headers({
            ["transfer-encoding"] = "chunked"
        })
        assert(splash:go(args.url))
        assert(splash:wait(0.5))
        return {
            html = splash:html()
        }
    end
    """
    payload = {
        'lua_source': lua_script,
        'url': 'https://httpbin.org/headers',
        'timeout': 15,
    }
    r = requests.post(url=ENDPOINT_SPLASH, json=payload)
    result = json.loads(r.text)
    return result.get('html', result)


print("test_with_custom_headers: \n{}\n".format(test_with_custom_headers()))
print("test_with_content_encoding: \n{}".format(test_with_content_encoding()))
Results:
test_with_custom_headers:
<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "en,*",
"Host": "httpbin.org",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1",
"X-Custom-Header": "splash"
}
}
</pre></body></html>
test_with_content_encoding:
{'info': {'timeout': 15.0}, 'type': 'GlobalTimeoutError', 'error': 504, 'description': 'Timeout exceeded rendering page'}
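A possible workaround, sketched under assumptions: Transfer-Encoding is a hop-by-hop header that describes how a message body is framed on one connection, so forcing it onto a body-less GET may leave the server waiting for chunks that never arrive. The thread does not confirm this as the root cause. If the headers come from a captured request, one option is to drop the hop-by-hop entries before handing the rest to splash:set_custom_headers. The helper name strip_hop_by_hop and the sample dictionary are hypothetical, not part of Splash.

```python
# Hop-by-hop headers as listed in RFC 7230 section 6.1, plus the older
# "Trailers" spelling from RFC 2616. They apply to a single connection and
# are not meant to be copied from one request to another.
HOP_BY_HOP = frozenset({
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "trailers",
    "transfer-encoding",
    "upgrade",
})


def strip_hop_by_hop(headers):
    """Return a copy of `headers` without hop-by-hop entries.

    Header names are compared case-insensitively; the original dict is
    left untouched. (Hypothetical helper, not a Splash API.)
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in HOP_BY_HOP
    }


# Sample input mirroring the two headers used in the PoC above.
captured = {
    "Transfer-Encoding": "chunked",
    "X-Custom-Header": "splash",
}
safe_headers = strip_hop_by_hop(captured)
print(safe_headers)
```

Only the entries left in safe_headers would then be written into the Lua table passed to splash:set_custom_headers, which keeps the request equivalent to the first (working) test.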
Issue Analytics
- Created 4 years ago
- Comments: 5
Top GitHub Comments
I’m having the same issue, but oddly enough only when using proxies via splash:on_request. My Splash is patched with the decompression patch described in this issue: https://github.com/scrapinghub/splash/issues/324. If you aren’t using proxies, this might solve the issue for you.

I’m having the same issue. Is there a solution for this problem yet? Thank you.