Distribution of user classes is not respected and some user classes are just never spawned

Describe the bug

When using LoadTestShape in distributed mode and with multiple user classes having different weights, the distribution of users does not respect the weights (even within some tolerance). Furthermore, users with the smaller weights are often never picked up. The problem appears especially when the LoadTestShape specifies stages with small increments (e.g. 5 users at a time).

Expected behavior

The distribution of users should take into account the overall distribution of users across all workers.

Actual behavior

Some user classes with lower weights are never picked up and the distribution of users does not respect the weights (even within a certain tolerance).

Steps to reproduce

I think the problem is that each worker is responsible for spawning its own users. Consider the following setup:

5 workers
Load test shape that increases the users by 5 at a rate of 1/s each minute until 100 users
3 user classes with weights [35, 55, 10]

Once the test starts, the master will instruct each worker to spawn 1 user every minute. However, the weight_users function will always return the user with the weight of 55.
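
The rounding effect described above can be reproduced with a small stand-alone model. This is a simplified sketch, not Locust's actual source: `per_worker_bucket`, `simulate` and the class names are hypothetical, and it assumes each worker independently rounds its share of every user class and then corrects any shortfall or excess using the rounding residuals.

```python
from collections import Counter

WEIGHTS = {"Test1": 35, "Test2": 55, "Test3": 10}


def per_worker_bucket(weights, amount):
    """Hypothetical model of one worker choosing `amount` users by weight."""
    total = sum(weights.values())
    counts = {}
    residuals = {}
    for name, weight in weights.items():
        share = amount * weight / total
        counts[name] = round(share)
        residuals[name] = share - counts[name]
    # Correct rounding drift: add to the largest residual, remove from the smallest.
    while sum(counts.values()) < amount:
        name = max(residuals, key=residuals.get)
        counts[name] += 1
        residuals[name] -= 1
    while sum(counts.values()) > amount:
        name = min((n for n in counts if counts[n] > 0), key=residuals.get)
        counts[name] -= 1
        residuals[name] += 1
    return counts


def simulate(workers, increments, users_per_worker):
    """Each worker picks its users independently at every increment."""
    running = Counter()
    for _ in range(increments):
        for _ in range(workers):
            running.update(per_worker_bucket(WEIGHTS, users_per_worker))
    return running


totals = simulate(workers=5, increments=20, users_per_worker=1)
print(dict(totals))
```

In this model, 20 increments of one user per worker on 5 workers yield 100 `Test2` users and none of the other two classes, because a request for a single user always rounds the shares 0.35, 0.55 and 0.10 to 0, 1 and 0.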

Possible solutions

I see two aspects that need to be implemented:

I think that when running in distributed mode, the master runner should tell the workers how many users of each class to spawn, instead of only the total number of users. The worker runner would thus spawn the specified users as-is instead of computing the buckets.
The master runner should keep a state of all the running users and their class so that it can spawn the appropriate classes in order to preserve the distribution as much as possible. This state could also serve to solve https://github.com/locustio/locust/issues/896.

I’m not super familiar with the codebase, but would that make sense? Is there some technical limitation I’m not aware of?

Environment

OS: Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1031-azure x86_64)
Python version: Python 3.7.9
Locust version: locust==1.3.1
Locust command line that you ran:

Master:

  "${python_exec}" -m locust \
    --locustfile load_tests/locustfile_generated.py \
    --master \
    --master-bind-host "${master_host}" \
    --master-bind-port "${master_port}" \
    --expect-workers "${number_of_workers}" \
    --stop-timeout 900 \
    --csv="${results_path}/results" \
    --logfile "${results_path}/logs.log" \
    --loglevel "DEBUG"

Workers:

  "${python_exec}" -m locust \
    --locustfile load_tests/locustfile_generated.py \
    --worker \
    --master-host "${master_host}" \
    --master-port "${master_port}" \
    --csv="${results_path}/results" \
    --logfile "${results_path}/logs.log" \
    --loglevel "DEBUG"

Locust file contents (anonymized if necessary): The content of each test has been omitted. Also, this file is rendered from a template, so that is why the classes and tasks have generic names.

import json
import os
import random
import uuid
from pathlib import Path

import locust.stats
from essential_generators import DocumentGenerator
from locust import (
    HttpUser,
    LoadTestShape,
    between,
    task,
)

from load_tests.api.common import (
    create_user,
    delete_user,
    get_password,
)

locust.stats.CSV_STATS_INTERVAL_SEC = 2

current_path = Path(os.path.split(__file__)[0])

host = os.environ['API_HOST']

cached_data_file_path = os.environ['CACHED_DATA_FILE_PATH']

gen = DocumentGenerator()

random.seed()


class Test1(HttpUser):

    wait_time = between(5, 10)
    weight = 35
    host = host

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.user_id = None
        self.username = None
        self.password = None
        with open(cached_data_file_path, "rt") as file:
            self.cached_data = json.load(file)

    def on_start(self):
        self.username = f"test-user-{uuid.uuid4()}"
        self.password = get_password()
        self.user_id, user = create_user(self.client, self.username, self.password)

    def on_stop(self):
        if self.user_id is not None:
            delete_user(self.client, self.user_id)
        self.user_id = None
        self.username = None
        self.password = None

    @task(8)
    def test1(self):
        # omitted
        pass

    @task(8)
    def test2(self):
        # omitted
        pass

    @task(8)
    def test3(self):
        # omitted
        pass

    @task(8)
    def test4(self):
        # omitted
        pass

    @task(8)
    def test5(self):
        # omitted
        pass

    @task(8)
    def test6(self):
        # omitted
        pass

    @task(8)
    def test7(self):
        # omitted
        pass

    @task(8)
    def test8(self):
        # omitted
        pass

    @task(8)
    def test9(self):
        # omitted
        pass

    @task(8)
    def test10(self):
        # omitted
        pass

    @task(8)
    def test11(self):
        # omitted
        pass

    @task(12)
    def test12(self):
        # omitted
        pass


class Test2(HttpUser):

    wait_time = between(0, 0.5)
    weight = 55
    host = host

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.user_id = None
        self.username = None
        self.password = None
        with open(cached_data_file_path, "rt") as file:
            self.cached_data = json.load(file)

    def on_start(self):
        self.username = f"test-user-{uuid.uuid4()}"
        self.password = get_password()
        self.user_id, user = create_user(self.client, self.username, self.password)

    def on_stop(self):
        if self.user_id is not None:
            delete_user(self.client, self.user_id)
        self.user_id = None
        self.username = None
        self.password = None

    @task(50)
    def test1(self):
        # omitted
        pass

    @task(50)
    def test2(self):
        # omitted
        pass


class Test3(HttpUser):

    wait_time = between(5, 10)
    weight = 10
    host = host

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.user_id = None
        self.username = None
        self.password = None
        with open(cached_data_file_path, "rt") as file:
            self.cached_data = json.load(file)

    def on_start(self):
        self.username = f"test-user-{uuid.uuid4()}"
        self.password = get_password()
        self.user_id, user = create_user(self.client, self.username, self.password)

    def on_stop(self):
        if self.user_id is not None:
            delete_user(self.client, self.user_id)
        self.user_id = None
        self.username = None
        self.password = None

    @task(100)
    def test4(self):
        # omitted
        pass


class StagesShape(LoadTestShape):
    """
    A simple load test shape class that has different user and spawn_rate at
    different stages.
    Keyword arguments:
        stages -- A list of dicts, each representing a stage with the following keys:
            duration -- When this many seconds pass the test is advanced to the next stage
            users -- Total user count
            spawn_rate -- Number of users to start/stop per second
            stop -- A boolean that can stop that test at a specific stage
        stop_at_end -- Can be set to stop once all stages have run.
    """
    stages = [
        {"duration": 300, "users": 5, "spawn_rate": 1},
        {"duration": 600, "users": 25, "spawn_rate": 1},
        {"duration": 900, "users": 50, "spawn_rate": 1},
        {"duration": 4500, "users": 100, "spawn_rate": 1},
        {"duration": 5400, "users": 1, "spawn_rate": 1},
        {"duration": 6300, "users": 50, "spawn_rate": 1},
    ]

    for previous_stage, stage in zip(stages[:-1], stages[1:]):
        assert stage["duration"] > previous_stage["duration"]

    def tick(self):
        run_time = self.get_run_time()

        for stage in self.stages:
            if run_time < stage["duration"]:
                tick_data = (stage["users"], stage["spawn_rate"])
                return tick_data

        return None

Issue Analytics

State:
Created 3 years ago
Reactions: 1
Comments: 19 (8 by maintainers)

Top GitHub Comments

mboutet commented, Nov 17, 2020

@cyberw, I went with your approach, so everything is deterministic. I still have some work to do on my PR, but once it is ready for review, I will remove the “Draft” status.

mboutet commented, Jun 21, 2021

/remove-lifecycle stale
