Add automatic monitoring/instrumentation of underlying job

There have been many requests over time to provide data on CPU, memory, and disk usage.

In an ideal world this would be a rich time series of data; in a less ideal world, the max plus a couple of periodic data points; and in a much more practical world, just max usage. The last one is the definition of done here, but if one wants to get ambitious …
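As a rough illustration of the "just max usage" option, periodic samples can be reduced to a single peak value. The sketch below assumes the samples arrive as one number per line; `max_of` is a hypothetical helper name, not something Cromwell provides.

```shell
# Hypothetical helper (not part of Cromwell): reduce a stream of numeric
# samples, one per line, to their maximum.
max_of() {
    awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { if (NR > 0) print max }'
}

# Example with made-up percentage samples.
printf '%s\n' 12 87 45 | max_of   # prints 87
```

Empty input produces no output, so a caller can tell "no samples" apart from a real maximum.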

The PAPI backend already allows a custom script to be attached. We should provide one of our own for any backend that runs things via Unix command lines (i.e., Spark jobs and the like probably don’t make sense).

At least one group is using the following in production; it is likely a good start, if not completely AOK:

```
echo "=================================="
echo "=========== MONITORING ==========="
echo "=================================="
echo "--- General Information ---"
echo "#CPU: $(nproc)"
echo "Total Memory: $(free -h | grep Mem | awk '{ print $2 }')"
echo "Total Disk space: $(df -h | grep cromwell_root | awk '{ print $2 }')"
echo
echo "--- Runtime Information ---"

# Print one timestamped snapshot of CPU, memory, and disk usage.
function runtimeInfo() {
        echo "[$(date)]"
        echo "* CPU usage: $(top -bn 2 -d 0.01 | grep '^%Cpu' | tail -n 1 | awk '{print $2}')%"
        echo "* Memory usage: $(free -m | grep Mem | awk '{ OFMT="%.0f"; print ($3/$2)*100; }')%"
        echo "* Disk usage: $(df | grep cromwell_root | awk '{ print $5 }')"
}

# Report every 5 minutes for as long as the job runs.
while true; do runtimeInfo; sleep 300; done
```

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

2 reactions
Horneth commented, Jul 31, 2017

ToL: Instead of pretty-printing the values as this script does, it shouldn’t be too hard to output them in some kind of TSV format with timestamps, which would hopefully be easy to parse into a time series.
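A minimal sketch of what that TSV variant could look like, reusing the same `top`, `free`, and `df` pipelines as the script in the issue. The column names, the `MONITOR_MOUNT` variable, the `NA` placeholder, and the function names are all assumptions made for illustration; the original script matches the `cromwell_root` mount by name instead.

```shell
# Hypothetical TSV variant of the monitoring script (names are illustrative).
# MONITOR_MOUNT is an assumed variable; the original greps df for cromwell_root.
MONITOR_MOUNT="${MONITOR_MOUNT:-/}"

# Print the column header once.
tsv_header() {
    printf 'timestamp\tcpu_pct\tmem_pct\tdisk_pct\n'
}

# Print one sample row; a field falls back to NA if its tool is unavailable.
tsv_sample() (
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    cpu=$(top -bn 2 -d 0.01 2>/dev/null | grep '^%Cpu' | tail -n 1 | awk '{print $2}')
    mem=$(free -m 2>/dev/null | grep Mem | awk '$2 > 0 { printf "%.0f", ($3/$2)*100 }')
    disk=$(df -P "$MONITOR_MOUNT" 2>/dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    printf '%s\t%s\t%s\t%s\n' "$ts" "${cpu:-NA}" "${mem:-NA}" "${disk:-NA}"
)

# Emit the header and a single sample.
tsv_header
tsv_sample
```

For continuous monitoring, the last line would be replaced by the same kind of loop the original uses, e.g. `while true; do tsv_sample; sleep 300; done`. UTC timestamps in a fixed format keep the rows sortable and easy to load into most time-series tooling.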

0 reactions
Horneth commented, Nov 5, 2018

@TedBrookings this seems like a really good idea. docker stats would of course only work for tasks running in a Docker container, but that’s hopefully the majority of them. It would not have been possible in PAPIv1, but PAPIv2 should be flexible enough to allow for something like that.
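For reference, a rough sketch of sampling with `docker stats`, assuming the docker CLI is on the PATH and the daemon is reachable from wherever the monitor runs. How this would be wired into a PAPIv2 job is not covered here, and `docker_stats_tsv` is a hypothetical name.

```shell
# Hypothetical sketch: one timestamped TSV row per running container.
# Assumes the docker CLI is installed and can reach the daemon.
docker_stats_tsv() {
    command -v docker >/dev/null 2>&1 || { echo "docker not found" >&2; return 1; }
    dstats_ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # Build the Go-template format with literal tab characters between fields.
    dstats_fmt=$(printf '{{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}')
    # --no-stream takes a single snapshot instead of refreshing continuously.
    docker stats --no-stream --format "$dstats_fmt" |
        while IFS= read -r dstats_line; do
            printf '%s\t%s\n' "$dstats_ts" "$dstats_line"
        done
}
```

Calling `docker_stats_tsv` periodically (for example every 300 seconds, as in the original script) would give rows of timestamp, container name, CPU percentage, and memory percentage.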
