CSV results do not match calculated response times
In my load test I have a task sequence which reports data in the aggregate view. The results look like this:
Name # reqs # fails Avg Min Max | Median req/s failures/s
-------------------------------------------------------------------------------------
GET Task 1: 4355 0(0.00%) 69 41 837 | 60 8.70 0.00
POST Task 2: 4331 0(0.00%) 91 41 1312 | 94 8.90 0.00
POST Task 3: 4302 0(0.00%) 62 18 653 | 70 8.40 0.00
POST Task 4: 4278 0(0.00%) 51 14 563 | 66 8.90 0.00
POST Task 5: 4253 0(0.00%) 60 15 687 | 74 9.70 0.00
POST Task 6: 4226 0(0.00%) 63 24 723 | 76 9.80 0.00
POST Task 7: 4198 0(0.00%) 55 17 553 | 69 9.60 0.00
POST Task 8: 4171 0(0.00%) 59 21 714 | 72 9.10 0.00
GET Task 9: 4136 0(0.00%) 347 106 1566 | 290 9.10 0.00
-------------------------------------------------------------------------------------
Aggregated 38250 0(0.00%) 94 14 1566 | 70 82.20 0.00
Task 9 downloads a file.
In my code I execute:
# time.time() brackets the whole call, including the time spent receiving the body
stime = time.time()
get_episode_mp3(Connection=self, endpoint=self.config['cdnUrl'],
                auth=f"Bearer {self.auth_token}",
                token=self.auth_token,
                episode=self.selected_episode_object,
                episode_id=self.selected_episode_id,
                TaskName="Task 9: Get Episode Audio File")
end_time = time.time()
iteration_time = end_time - stime
self.iteration_pass.append(iteration_time * 1000)  # record in milliseconds
which calls
with connection_object.client.get(target, headers=headers, name=task_name, stream=True, catch_response=True) as response:
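For context, here is a condensed, self-contained sketch of that pattern with the helper inlined. The class name, host, and /episodes/123.mp3 path are illustrative, not from the original code, and the helper is assumed to read the streamed body before returning, as the follow-up comment below confirms:

import time
from locust import HttpUser, task

class EpisodeUser(HttpUser):
    host = "https://cdn.example.com"  # illustrative CDN host

    def on_start(self):
        self.iteration_pass = []  # per-iteration wall-clock times, in ms

    @task
    def get_episode_audio(self):
        stime = time.time()
        with self.client.get("/episodes/123.mp3",
                             name="Task 9: Get Episode Audio File",
                             stream=True,
                             catch_response=True) as response:
            _ = response.content  # the helper reads the streamed body before returning
            response.success()
        end_time = time.time()
        self.iteration_pass.append((end_time - stime) * 1000)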
The problem is that the response times Locust reports are not consistent with the elapsed time calculated from the two calls to time.time().
When I look at the data written to my log file (iteration_time), the values are very different: the logged values for the run above yield a max of 174135.23 ms, an average of 10143.89 ms, a median of 594.50 ms, and a min of 411.09 ms.
I should point out that this data comes from two worker nodes, and the aggregate statistics are reported on the master.
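For reference, a minimal sketch of how those summary figures can be derived from the logged values, assuming self.iteration_pass holds the millisecond floats described above:

import statistics

def summarize(iteration_pass_ms):
    # iteration_pass_ms: list of per-iteration wall-clock times in milliseconds
    return {
        "min": min(iteration_pass_ms),
        "avg": statistics.mean(iteration_pass_ms),
        "median": statistics.median(iteration_pass_ms),
        "max": max(iteration_pass_ms),
    }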
Expected behavior
The aggregated statistics should reflect the actual response times from the workers.
Actual behavior
As described above, the reported response times are much lower than the iteration times I measure around the call.
Environment
- OS: Linux
- Python version: 3.7.6
- Locust version: 1.0.2
- Locust command line that you ran:
  - Master: locust --loglevel=DEBUG --headless -u 250 --run-time=10m --stop-timeout=600 --host platform --logfile=debug.log -r 1 --csv test-csv --master --expect-workers=2
  - Worker: locust --worker --master-host=locust-master --master-port=5557 --loglevel=DEBUG --logfile=debug.log
- Locust file contents (anonymized if necessary):
Top GitHub Comments
Well, I think we can close this. I found the issue.
In the client.get call shown above, the stream=True parameter is the problem. With it, the response time Locust records is the elapsed time to receive the initial response, not the full body. The reason for the difference between my time and the Locust time is that I was waiting until the entire object was available. Removing that parameter corrects the problem with this measurement. It raised the question of whether, for our use case, we might want to know both values, but that is a different question and not relevant to Locust. Sorry for the confusion - but good lesson learned.
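For anyone who does want both numbers, here is a minimal sketch of one way to report the full-download time as its own statistics entry, assuming Locust 1.x; the class name, path, and entry names are illustrative, and in Locust 2.x the event would be events.request rather than request_success:

import time
from locust import HttpUser, task

class EpisodeUser(HttpUser):
    host = "https://cdn.example.com"  # illustrative CDN host

    @task
    def get_episode_audio(self):
        start = time.time()
        # With stream=True, Locust records response_time as soon as the headers arrive
        with self.client.get("/episodes/123.mp3",
                             name="Task 9: headers only",
                             stream=True,
                             catch_response=True) as response:
            body = response.content  # forces the full body download
            response.success()
        full_ms = (time.time() - start) * 1000
        # Report the full-download time under a separate name (Locust 1.x event API)
        self.environment.events.request_success.fire(
            request_type="GET",
            name="Task 9: full download",
            response_time=full_ms,
            response_length=len(body),
        )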
The server is AWS CloudFront. The bandwidth to the Locust server is capped at 20 Mbps to simulate a specific use case. There is no way for a file of this size to be downloaded in 312 milliseconds. But I will try to figure out how to simplify it.
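For a sense of scale (the exact file size is not stated in the thread): at a 20 Mbps cap, 312 ms can move at most about 20,000,000 bit/s × 0.312 s ≈ 6.24 Mbit ≈ 0.78 MB, so any audio file larger than that cannot finish downloading in that window - which is consistent with the 312 ms figure measuring only the time to the initial response.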