
Test execution time increases over time

See original GitHub issue

Consider the following settings file:

site_configuration = {
    'systems': [
        {
            'name': 'generic',
            'descr': 'Generic example system',
            'hostnames': ['.*'],
            'partitions': [
                {
                    'name': 'default',
                    'scheduler': 'local',
                    'launcher': 'local',
                    'environs': ['builtin'],
                    'max_jobs': 1
                }
            ]
        },
    ],
[...]

And the following RunOnly test:

$ cat sleep.py 
import reframe as rfm
import reframe.utility.sanity as sn

@rfm.simple_test
class Zzzzz(rfm.RunOnlyRegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']

    duration = parameter(range(25))

    num_tasks_per_node = 1

    executable = '/usr/bin/sleep'

    @run_after('init')
    def set_args(self):
        self.executable_opts = [f'{self.duration}']

    @run_before('run')
    def verbose(self):
        print(f'{self.executable} {" ".join(self.executable_opts)}', flush=True)

    @sanity_function
    def check_exit(self):
        return sn.assert_eq(self.job.exitcode, 0, msg="Exited with exit code {0}")

Tests are executed in the reverse order of the parameter list, but the last tests (sleep 0 … sleep 10) all take 10 seconds to execute, despite the serial execution policy:

$ time reframe --system generic -c sleep.py --exec-policy=serial -v --run --timestamp
[ReFrame Setup]
  version:           3.12.0-dev.1
  command:           '/home/fabecassis/.local/bin/reframe --system generic -c sleep.py --exec-policy=serial -v --run --timestamp'
  launched by:       fabecassis@ioctl
  working directory: '/tmp/reframe'
  settings file:     '/tmp/reframe/settings2.py'
  check search path: '/tmp/reframe/sleep.py'
  stage directory:   '/tmp/reframe/stage/2022-06-02T14:29:50'
  output directory:  '/tmp/reframe/output/2022-06-02T14:29:50'

Loaded 25 test(s)
Generated 25 test case(s)
Filtering test cases(s) by name: 25 remaining
Filtering test cases(s) by tags: 25 remaining
Filtering test cases(s) by other attributes: 25 remaining
Final number of test cases: 25
[==========] Running 25 check(s)
[==========] Started on Thu Jun  2 14:29:50 2022 

[----------] start processing checks
[ RUN      ] Zzzzz_24 @generic:default+builtin
/usr/bin/sleep 24
[       OK ] ( 1/25) Zzzzz %duration=24 @generic:default+builtin
==> setup: 0.002s compile: 0.002s run: 24.655s sanity: 0.000s performance: 0.000s total: 24.661s
[ RUN      ] Zzzzz_23 @generic:default+builtin
/usr/bin/sleep 23
[       OK ] ( 2/25) Zzzzz %duration=23 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 24.325s sanity: 0.000s performance: 0.000s total: 24.330s
[ RUN      ] Zzzzz_22 @generic:default+builtin
/usr/bin/sleep 22
[       OK ] ( 3/25) Zzzzz %duration=22 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 23.144s sanity: 0.000s performance: 0.000s total: 23.150s
[ RUN      ] Zzzzz_21 @generic:default+builtin
/usr/bin/sleep 21
[       OK ] ( 4/25) Zzzzz %duration=21 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 24.165s sanity: 0.000s performance: 0.000s total: 24.170s
[ RUN      ] Zzzzz_20 @generic:default+builtin
/usr/bin/sleep 20
[       OK ] ( 5/25) Zzzzz %duration=20 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 29.745s sanity: 0.000s performance: 0.000s total: 29.750s
[ RUN      ] Zzzzz_19 @generic:default+builtin
/usr/bin/sleep 19
[       OK ] ( 6/25) Zzzzz %duration=19 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.030s sanity: 0.000s performance: 0.000s total: 20.035s
[ RUN      ] Zzzzz_18 @generic:default+builtin
/usr/bin/sleep 18
[       OK ] ( 7/25) Zzzzz %duration=18 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.033s sanity: 0.000s performance: 0.000s total: 20.038s
[ RUN      ] Zzzzz_17 @generic:default+builtin
/usr/bin/sleep 17
[       OK ] ( 8/25) Zzzzz %duration=17 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.022s sanity: 0.000s performance: 0.000s total: 20.027s
[ RUN      ] Zzzzz_16 @generic:default+builtin
/usr/bin/sleep 16
[       OK ] ( 9/25) Zzzzz %duration=16 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.024s sanity: 0.000s performance: 0.000s total: 20.029s
[ RUN      ] Zzzzz_15 @generic:default+builtin
/usr/bin/sleep 15
[       OK ] (10/25) Zzzzz %duration=15 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.032s sanity: 0.000s performance: 0.000s total: 20.037s
[ RUN      ] Zzzzz_14 @generic:default+builtin
/usr/bin/sleep 14
[       OK ] (11/25) Zzzzz %duration=14 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.031s sanity: 0.000s performance: 0.000s total: 20.036s
[ RUN      ] Zzzzz_13 @generic:default+builtin
/usr/bin/sleep 13
[       OK ] (12/25) Zzzzz %duration=13 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.032s sanity: 0.000s performance: 0.000s total: 20.036s
[ RUN      ] Zzzzz_12 @generic:default+builtin
/usr/bin/sleep 12
[       OK ] (13/25) Zzzzz %duration=12 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.031s sanity: 0.000s performance: 0.000s total: 20.036s
[ RUN      ] Zzzzz_11 @generic:default+builtin
/usr/bin/sleep 11
[       OK ] (14/25) Zzzzz %duration=11 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 20.031s sanity: 0.000s performance: 0.000s total: 20.036s
[ RUN      ] Zzzzz_10 @generic:default+builtin
/usr/bin/sleep 10
[       OK ] (15/25) Zzzzz %duration=10 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.048s sanity: 0.000s performance: 0.000s total: 10.053s
[ RUN      ] Zzzzz_9 @generic:default+builtin
/usr/bin/sleep 9
[       OK ] (16/25) Zzzzz %duration=9 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.060s sanity: 0.000s performance: 0.000s total: 10.066s
[ RUN      ] Zzzzz_8 @generic:default+builtin
/usr/bin/sleep 8
[       OK ] (17/25) Zzzzz %duration=8 @generic:default+builtin
==> setup: 0.002s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.025s
[ RUN      ] Zzzzz_7 @generic:default+builtin
/usr/bin/sleep 7
[       OK ] (18/25) Zzzzz %duration=7 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_6 @generic:default+builtin
/usr/bin/sleep 6
[       OK ] (19/25) Zzzzz %duration=6 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_5 @generic:default+builtin
/usr/bin/sleep 5
[       OK ] (20/25) Zzzzz %duration=5 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_4 @generic:default+builtin
/usr/bin/sleep 4
[       OK ] (21/25) Zzzzz %duration=4 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_3 @generic:default+builtin
/usr/bin/sleep 3
[       OK ] (22/25) Zzzzz %duration=3 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_2 @generic:default+builtin
/usr/bin/sleep 2
[       OK ] (23/25) Zzzzz %duration=2 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.001s performance: 0.000s total: 10.024s
[ RUN      ] Zzzzz_1 @generic:default+builtin
/usr/bin/sleep 1
[       OK ] (24/25) Zzzzz %duration=1 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.017s sanity: 0.000s performance: 0.000s total: 10.022s
[ RUN      ] Zzzzz_0 @generic:default+builtin
/usr/bin/sleep 0
[       OK ] (25/25) Zzzzz %duration=0 @generic:default+builtin
==> setup: 0.001s compile: 0.002s run: 10.019s sanity: 0.000s performance: 0.000s total: 10.024s
[----------] all spawned checks have finished

[  PASSED  ] Ran 25/25 test case(s) from 25 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Thu Jun  2 14:36:47 2022 
Run report saved in '/home/fabecassis/.reframe/reports/run-report.json'
Log file(s) saved in '/tmp/rfm-eylidseo.log'

real    6m56.945s
user    0m0.697s
sys     0m0.075s

For example, the sleep 0 test took 10 seconds to execute, and the sleep 11 test took 20 seconds to execute.
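
As a quick lower bound on what the serial run should cost: the sleeps alone account for 300 seconds, so the observed 6m57s implies roughly two minutes of overhead, consistent with the short tests each being padded to a 10- or 20-second tick:

# Lower bound on the serial wall time: the sleep payloads alone.
expected = sum(range(25))   # 0 + 1 + ... + 24
print(expected)             # 300 seconds = 5 minutes
# Observed: 6m57s real, i.e. ~117s of extra time on top of the sleeps.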

If you reverse the order of the tests with duration = parameter(range(24, -1, -1)), then the sleep 0 test takes < 1 second to execute, as expected:

[----------] start processing checks
[ RUN      ] Zzzzz_0 @generic:default+builtin
/usr/bin/sleep 0
[       OK ] ( 1/25) Zzzzz %duration=0 @generic:default+builtin
==> setup: 0.002s compile: 0.002s run: 0.109s sanity: 0.000s performance: 0.000s total: 0.115s

I’m aware that https://reframe-hpc.readthedocs.io/en/stable/pipeline.html#timing-the-test-pipeline mentions:

the time spent in the pipeline’s “run” phase should not be interpreted as the actual runtime of the test

But this is with the serial execution policy, so test pipelines should not overlap. And the wall-clock time of the sleep 0 test is indeed 10 seconds, which seems excessively long.
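
The reported times also cluster at exact multiples of 10 seconds. My guess, and it is only a guess since nothing in this thread confirms it, is that local job completion is checked on a polling interval that backs off to roughly 10 seconds, so every short job is rounded up to the next poll tick. A self-contained Python sketch of how a fixed 10-second poll period would reproduce exactly these numbers:

import subprocess
import time

POLL_INTERVAL = 10.0  # hypothetical poll period, for illustration only

def timed_run(cmd):
    """Launch cmd, but only check for completion every POLL_INTERVAL
    seconds; the measured runtime rounds up to the next poll tick."""
    start = time.time()
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:
        time.sleep(POLL_INTERVAL)
    return time.time() - start

print(timed_run(['/usr/bin/sleep', '0']))   # ~10s, not ~0s
print(timed_run(['/usr/bin/sleep', '11']))  # ~20s, not ~11s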

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
vkarak commented, Jun 9, 2022

But in the generated JUnit result file, if the total run took 300s, each test would be reported as taking 300s with the async policy. And when showing the summary of the whole run, GitLab CI will sum the duration of each test and report the total in the dashboard, assuming that the tests executed sequentially. Hence, if I had 20 tests, GitLab CI would report that the test suite took 20 * 300s to execute.

Good point, I had not thought about that! I will raise the priority of #1527, because these timings are in fact not meaningful and they affect the CI integration.

0 reactions
flx42 commented, Jun 8, 2022

I guess that you are running locally and not through a job scheduler in this scenario, right? Because otherwise you could technically run your tests with -S exclusive_access=true and, if you use Slurm, your jobs will run exclusively on the node.

Interesting, I was not familiar with this setting. This test is indeed running with Slurm under a salloc, so we use the srunalloc launcher. I will test that, thanks!
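
For anyone else trying this, exclusive_access can also be set directly in the test body rather than on the command line; a minimal sketch, assuming a Slurm-backed partition (the class name here is made up):

import reframe as rfm
import reframe.utility.sanity as sn

@rfm.simple_test
class ZzzzzExclusive(rfm.RunOnlyRegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    executable = '/usr/bin/sleep'
    executable_opts = ['1']

    # Request exclusive node access from the scheduler; with Slurm
    # this should translate to --exclusive in the generated job script.
    exclusive_access = True

    @sanity_function
    def check_exit(self):
        return sn.assert_eq(self.job.exitcode, 0)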

I guess you are using those times, but what are you using them for?

We are using ReFrame’s ability to export the results to the JUnit format, and the JUnit file is sent to GitLab CI so that test status can be seen from GitLab CI (https://docs.gitlab.com/ee/ci/testing/unit_test_reports.html). But in the generated JUnit result file, if the total run took 300s, each test would be reported as taking 300s with the async policy. And when showing the summary of the whole run, GitLab CI will sum the duration of each test and report the total in the dashboard, assuming that the tests executed sequentially. Hence, if I had 20 tests, GitLab CI would report that the test suite took 20 * 300s to execute.
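
To make the distortion concrete, here is a small sketch with a made-up report showing the summation such a dashboard performs: if every testcase in an async run is stamped with the whole run's 300s duration, summing them over-reports the suite by a factor of the test count:

import xml.etree.ElementTree as ET

# Hypothetical JUnit report from a 300s *async* run in which every
# testcase was stamped with the whole run's duration.
report = ET.fromstring("""
<testsuite name="rfm" tests="3">
  <testcase name="Zzzzz_0" time="300.0"/>
  <testcase name="Zzzzz_1" time="300.0"/>
  <testcase name="Zzzzz_2" time="300.0"/>
</testsuite>
""")

total = sum(float(tc.get('time')) for tc in report.iter('testcase'))
print(total)  # 900.0 -- three times the actual 300s wall time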

Read more comments on GitHub >

