
resource (CPU, RAM, I/O) usage tracking

See original GitHub issue

The resource library allows Python processes to track memory usage and similar metrics.

Forking may be necessary to properly monitor each test individually in this case. A separate data structure should also be built in parallel to results to track the other resources.

There seems to be a way with getrusage(2) to get information about threads as well, but that doesn’t seem like a good idea considering Python’s limited support for threads and the fact that the extension is Linux-specific.
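For reference, a minimal sketch of what the resource module exposes (units and availability vary by platform; ru_maxrss, for instance, is reported in KiB on Linux but bytes on macOS):

```python
import resource

# getrusage returns a struct_rusage snapshot for the calling process;
# RUSAGE_CHILDREN would report on reaped child processes instead
usage = resource.getrusage(resource.RUSAGE_SELF)

print(usage.ru_utime)   # time in user mode, seconds (float)
print(usage.ru_stime)   # time in system mode, seconds (float)
print(usage.ru_maxrss)  # maximum resident set size (KiB on Linux)
```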

I think the following data points could be collected:

0   ru_utime    time in user mode (float)
1   ru_stime    time in system mode (float)
2   ru_maxrss   maximum resident set size
9   ru_inblock  block input operations
10  ru_oublock  block output operations

These could also be interesting, but may just add too much noise:

3   ru_ixrss    shared memory size
4   ru_idrss    unshared memory size
5   ru_isrss    unshared stack size
6   ru_minflt   page faults not requiring I/O
7   ru_majflt   page faults requiring I/O
8   ru_nswap    number of swap outs
11  ru_msgsnd   messages sent
12  ru_msgrcv   messages received
13  ru_nsignals signals received
14  ru_nvcsw    voluntary context switches
15  ru_nivcsw   involuntary context switches

basically, this would extend the time metric to be an array of metrics with different units and so on…
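A rough sketch of what collecting such an array of metrics around a call could look like, using an illustrative subset of the fields above (the `measure` helper is hypothetical, not an actual pytest-benchmark API; ru_maxrss is deliberately left out because it is a high-water mark, so a before/after delta isn’t meaningful for it):

```python
import resource

# illustrative subset of the proposed fields
FIELDS = ("ru_utime", "ru_stime", "ru_inblock", "ru_oublock")

def measure(func, *args, **kwargs):
    """Run func and return (result, per-field getrusage deltas)."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    result = func(*args, **kwargs)
    after = resource.getrusage(resource.RUSAGE_SELF)
    deltas = {name: getattr(after, name) - getattr(before, name)
              for name in FIELDS}
    return result, deltas

result, deltas = measure(sum, range(1_000_000))
```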

would that be useful to others as well?

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Reactions: 8
  • Comments: 17 (9 by maintainers)

Top GitHub Comments

1 reaction
varac commented, Mar 27, 2017

Hej, I’m also interested in including CPU, RAM, and I/O usage. Just curious what’s the state of this issue - any progress since Oct 2015?

0 reactions
huonw commented, Feb 16, 2021

Yeah, you’re right that tracemalloc works really well for this: both solutions are using it. The only difference between them is the return value of the “timer” and how it computes the “duration” via a subtraction between the start “time” and the end “time”.

(I suspect your * 4 factor may be compensating for this, because I don’t think it should be necessary: tracemalloc returns the exact byte size of the current/peak memory allocations it knows about, so multiplying by some arbitrary factor is… confusing.)
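As a quick sanity check of that claim, tracemalloc’s peak is already in exact bytes (the 10 MB allocation here is just an arbitrary illustration):

```python
import tracemalloc

tracemalloc.start()
buf = bytearray(10_000_000)  # allocate roughly 10 MB
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# peak counts exact bytes; no scaling factor is needed
assert peak >= 10_000_000
```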

To show this in practice, I wrote a benchmark, which uses the basic one (without the * 4), and the one from StellarGraph above (that is, with a class with a custom __sub__): I’ve used a setup function with a very high peak (80MB, on a 64-bit computer) followed by a benchmark function with a lower peak (8MB), as an extreme example.

Benchmark code:
# issue_28.py
import pytest
import tracemalloc

tracemalloc.start()

class MallocPeak:
    def __init__(self, peak):
        self.peak = peak

    def __sub__(self, other):
        # ignore other
        return self.peak

def peak_default_sub():
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.clear_traces()
    return peak

def peak_fancy_sub():
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.clear_traces()
    return MallocPeak(peak)

def _setup():
    # setup code with a comically high peak, to make the issue clear
    x = [None] * 10_000_000 # 80MB
    del x

def _func():
    # real code has a lower peak than the setup
    y = [None] * 1_000_000 # 8MB
    del y

@pytest.mark.benchmark(timer=peak_default_sub)
def test_default_sub(benchmark):
    benchmark.pedantic(_func, iterations=1, rounds=1, warmup_rounds=0, setup=_setup)

@pytest.mark.benchmark(timer=peak_fancy_sub)
def test_fancy_sub(benchmark):
    benchmark.pedantic(_func, iterations=1, rounds=1, warmup_rounds=0, setup=_setup)

With that setup, the results of pytest issue_28.py are:

-------------------------------------------------------------------------------------------------------------------------- benchmark: 2 tests --------------------------------------------------------------------------------------------------------------------------
Name (time in ns)                              Min                                     Max                                    Mean            StdDev                                  Median               IQR            Outliers     OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_default_sub      -72,167,433,000,000,000.0000 (1.0)      -72,167,433,000,000,000.0000 (1.0)      -72,167,433,000,000,000.0000 (1.0)      0.0000 (1.0)      -72,167,433,000,000,000.0000 (1.0)      0.0000 (1.0)           0;0 -0.0000 (0.11)          1           1
test_fancy_sub          8,000,512,000,000,000.0000 (0.11)       8,000,512,000,000,000.0000 (0.11)       8,000,512,000,000,000.0000 (0.11)     0.0000 (1.0)        8,000,512,000,000,000.0000 (0.11)     0.0000 (1.0)           0;0  0.0000 (1.0)           1           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

(Note that it’s reporting in “nanoseconds” because of the negative values: all the byte sizes are multiplied by 1_000_000_000.)

In summary:

  • test_default_sub is very incorrect (a negative peak!): it gives a value close to -72MB = 8MB - 80MB, because the timer is subtracting the peak of the setup function from the peak of the real benchmark
  • test_fancy_sub is correct: it gives almost exactly 8MB, which is the right answer, given the test function just allocates an 8MB list

This is exaggerated for clarity, but more realistic settings show the error too: for instance, change the setup function to have a peak of 8MB too (test_default_sub: -168KB; test_fancy_sub: 8MB, as above) or don’t use a setup function at all (test_default_sub: 7.8MB, test_fancy_sub: 8MB, as above).

Why is this happening? If we unroll the loop for the test_default_sub benchmark and inline the functions, we effectively get:

# do setup
_setup()

# snapshot 'time'/size before
_, peak_before = tracemalloc.get_traced_memory() # peak_before = 80MB
tracemalloc.clear_traces() # reset the peak to 0, but peak_before = 80MB still

# run benchmarked function
_func()

# snapshot 'time'/size after
_, peak_after = tracemalloc.get_traced_memory() # peak_after = 8MB
tracemalloc.clear_traces()

# compute result 'duration'
result = peak_after - peak_before # 8MB - 80MB = -72MB

In particular, the ‘before’ value needs to be 0, because the clear_traces means that’s where we’re starting. AFAIK, the best way to achieve this is to use a custom subtraction on the result line, because it’s hard to know inside a timer call if it’s the ‘before’ snapshot (to return 0) or the ‘after’ one (to return the real peak).
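Stripped to its essence, that custom subtraction looks like this (mirroring the MallocPeak class in the benchmark above, with the same illustrative 8MB/80MB peaks):

```python
class MallocPeak:
    """Wrap a peak so that 'after - before' ignores 'before' entirely."""
    def __init__(self, peak):
        self.peak = peak

    def __sub__(self, other):
        # clear_traces() already reset the peak to 0 at the 'before'
        # snapshot, so the 'after' peak alone is the right answer
        return self.peak

after = MallocPeak(8_000_000)    # peak seen during the benchmarked function
before = MallocPeak(80_000_000)  # stale peak left over from setup
assert after - before == 8_000_000
```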
