requestHandlerTimeoutSecs works ambiguously

See original GitHub issue

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/puppeteer (PuppeteerCrawler)

Issue description

If I set a value for requestHandlerTimeoutSecs in the crawler config, it seems to be added on top of a base value of 70, even though the docs say the default is 60, and I would expect requestHandlerTimeoutSecs to override the default rather than add to it.

Setting it to 0 times out at 70. Setting it to 10 times out at 80. Setting it to 61 times out at 131.

I initially tested this because I wanted to run my handler without a timeout and thought setting it to 0 might do that.

I tested it with both Playwright and Puppeteer and got the same result.
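
For reference, the numbers above fit a simple additive pattern. This is a rough model inferred from the observed behavior, not taken from the library source, and the helper name is made up for illustration:

const observedTimeoutSecs = (requestHandlerTimeoutSecs: number) =>
    requestHandlerTimeoutSecs + 70; // 0 -> 70, 10 -> 80, 61 -> 131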

Code sample

const crawler = new PuppeteerCrawler({
    requestHandler: router,
    maxConcurrency: 1,
    requestHandlerTimeoutSecs: 10,
    headless: false,
    requestList,
});

Package version

^3.0.0

Node.js version

v19.0.0

Operating system

Ubuntu

Apify platform

  • Tick me if you encountered this issue on the Apify platform

Priority this issue should have

Low (slightly annoying)

I have tested this on the next release

No response

Other context

No response

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
B4nan commented, Oct 31, 2022

Yeah, that’s the same as OP - the final timeout is a sum of a few things, including an additional 10s as a safety check. That makes some sense for the default, but not so much when a user sets the value explicitly.

https://github.com/apify/crawlee/blob/master/packages/http-crawler/src/internals/http-crawler.ts#L333
https://github.com/apify/crawlee/blob/master/packages/browser-crawler/src/internals/browser-crawler.ts#L357
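
For readers who don’t want to follow the links, here is a simplified sketch of how those lines appear to combine the values. The identifier names and the 60s navigation default are assumptions inferred from the numbers reported above, not the actual Crawlee internals:

// The "safety check" mentioned above, added on top of everything else
// (constant name assumed for illustration).
const BASIC_CRAWLER_TIMEOUT_BUFFER_SECS = 10;
// Assumed default navigation timeout for the browser crawlers.
const DEFAULT_NAVIGATION_TIMEOUT_SECS = 60;

function overallHandlerTimeoutSecs(requestHandlerTimeoutSecs: number): number {
    // The browser crawler appears to hand the underlying basic crawler a
    // timeout that is the sum of the handler timeout, the navigation timeout
    // and the buffer, so a user-supplied value is added on top of ~70s
    // instead of replacing it.
    return requestHandlerTimeoutSecs
        + DEFAULT_NAVIGATION_TIMEOUT_SECS
        + BASIC_CRAWLER_TIMEOUT_BUFFER_SECS;
}

// overallHandlerTimeoutSecs(10) === 80, matching the report above.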

0 reactions
mnmkng commented, Nov 1, 2022

Oops.

Read more comments on GitHub
