Investigate worker overhead

Motivated by a desire for lower latency on the workers for Actors (we found that operations that should take 1ms were taking 5ms), we added a thread that statistically profiles the event loop. This showed overhead from a couple of surprising sources:

  1. psutil and the SystemMonitor
  2. Tornado’s write_to_fd, which apparently isn’t entirely non-blocking (see this Stack Overflow question)
  3. Tornado’s add_callback overhead (see this Stack Overflow question)
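
As a rough illustration of what statistically profiling a thread can look like, here is a minimal sketch using only the standard library. It is not the profiler that was added to the worker: the names `sample_thread` and `busy_work` are hypothetical, and `busy_work` merely stands in for the event loop thread. The sampler periodically reads the target thread's current frame and counts which function is executing.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, interval=0.001, duration=0.2):
    """Sample one thread's innermost frame and count the function names seen.

    Hypothetical helper for illustration, not part of any library.
    """
    counts = collections.Counter()
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        # sys._current_frames() maps thread ids to their current frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)
    return counts


def busy_work(stop):
    # Stand-in for the event loop thread being profiled (assumption).
    while not stop.is_set():
        sum(i * i for i in range(1000))


stop = threading.Event()
worker = threading.Thread(target=busy_work, args=(stop,), daemon=True)
worker.start()

counts = sample_thread(worker.ident)

stop.set()
worker.join()

# Functions that appear in more samples account for more wall-clock time.
for name, n in counts.most_common(5):
    print(name, n)
```

Because the sampler only looks at the thread every `interval` seconds, the counts are an estimate: short calls can be missed, and the sampling thread itself competes for the GIL.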

I’m not sure how best to address these. There are probably a few approaches:

  1. Check that we’re using psutil appropriately, and whether there is a better way to poll system usage at a fairly high frequency (we currently poll every 500ms)
  2. Quantify the cause of the add_callback overhead, and see whether there are occasions where we can reduce our use of Tornado
  3. Investigate other concurrency frameworks, like asyncio + uvloop. This sounds neat, but is likely expensive for many reasons. I did try using uvloop + asyncio + tornado, but it wasn’t very effective. The overhead appears to be higher in this stack, so uvloop doesn’t seem to do much good.
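
One way to start quantifying per-callback scheduling cost is a micro-benchmark like the sketch below. It uses asyncio's `call_soon_threadsafe` as a stand-in and does not measure Tornado's `add_callback` itself, so treat the numbers as a baseline for the underlying loop rather than a measurement of the overhead described above. The function name `measure_callback_overhead` is hypothetical.

```python
import asyncio
import time


async def measure_callback_overhead(n=10_000):
    """Return mean seconds per no-op callback scheduled on the running loop."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    remaining = n

    def callback():
        # No-op work; resolve the future once every callback has run.
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            done.set_result(None)

    start = time.perf_counter()
    for _ in range(n):
        loop.call_soon_threadsafe(callback)
    await done
    return (time.perf_counter() - start) / n


per_callback = asyncio.run(measure_callback_overhead())
print(f"{per_callback * 1e6:.2f} us per callback")
```

Running the same loop under a different event loop implementation, or swapping in the framework's own scheduling call, would give a like-for-like comparison of where the time goes.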

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

5 reactions
bybyte commented, Jun 19, 2019

With Python 2.7 we also saw each idle worker process using about 10% CPU. For us, changing the following in distributed/distributed.yaml brought the usage down to about 0.5% (we chose these settings after debugging the worker):

distributed.admin.tick.interval: 1000ms 
distributed.worker.profile.interval: 1000ms
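
The two settings above are written as dotted paths. In a YAML file they would normally appear as nested mappings, and the sketch below shows that nesting using plain Python dictionaries. The helper `expand_dotted` is hypothetical and not part of Dask; it only illustrates how the dotted keys map onto the nested structure.

```python
def expand_dotted(settings):
    """Turn {"a.b.c": value} style keys into nested dictionaries."""
    nested = {}
    for dotted_key, value in settings.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Walk down the tree, creating intermediate levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


settings = {
    "distributed.admin.tick.interval": "1000ms",
    "distributed.worker.profile.interval": "1000ms",
}

config = expand_dotted(settings)
print(config)
```

The result has a single top-level `distributed` key, with `admin.tick.interval` and `worker.profile.interval` nested beneath it, which is the shape the values would take in the YAML file.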

1 reaction
NotSqrt commented, Nov 6, 2018

Yes, I’m working on it, while also tracking memory and CPU of worker children, when tasks create subprocesses.
