
CPU resources, htop shows all 12 cores being used for example_1.py

Thanks for the great library. I am running some tests to benchmark performance. Since I hope to be running these using many CPUs, I want to understand how many CPUs a script will consume. I installed the repository as of today, and ran:

python examples/example_1.py

I’m running this on a machine with a single GPU and an i7-8700K CPU with 12 logical cores. I assume the command above does not use the GPU.

In a separate tab, my htop is regularly showing something like this:

[Image: Screenshot from 2019-11-11 10-21-15]

All 12 CPUs on my machine are being used.

When I run:

python examples/example_1.py --cuda_idx 0

I don’t generally see all 12 CPUs being used, but I may get something like 6 CPUs used:

[Image: Screenshot from 2019-11-11 10-25-52]

Just wondering, the documentation of example_1.py says this:

Runs one instance of the Atari environment and optimizes using DQN algorithm.
Can use a GPU for the agent (applies to both sample and train). No parallelism
employed, so everything happens in one python process; can be easier to debug.

However, I assumed “one python process = one core”. Perhaps this is not the right way to think about it. Is there a way to roughly estimate how many CPUs (or “cores”; I use the terms interchangeably) will be used for a given training run?

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 11 (8 by maintainers)

Top GitHub Comments

3 reactions
DanielTakeshi commented, Nov 19, 2019

Thanks @astooke

I did another test where I took example_5 and did one change, which is to replace MinibatchRlEval with MinibatchRl so that there’s no evaluation environment. I then ran it with 8 parallel envs as you can see below. htop is regularly showing at least 24 CPUs being used (on a 48 CPU machine) and nothing but my script is running on it.

[Image: Screenshot from 2019-11-18 19-22-46]

Let me now do some testing with your suggestion of changing the num threads …
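The thread-count suggestion itself is not quoted in this excerpt, so the following is only a sketch of one common way to try it, not the maintainer's exact advice. It assumes the extra CPU usage comes from the OpenMP/MKL thread pools that numeric libraries such as PyTorch start inside a single process. The helper name `limit_math_threads` is made up for illustration; `torch.set_num_threads` and `torch.get_num_threads` are real PyTorch calls, and the function falls back gracefully if PyTorch is not installed.

```python
import os


def limit_math_threads(n):
    """Cap the thread pools used by OpenMP/MKL-backed libraries.

    Hypothetical helper for illustration. The environment variables only
    take effect if they are set before the numeric libraries are imported.
    """
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)
    try:
        import torch  # optional: only needed for the explicit cap below
    except ImportError:
        return None
    torch.set_num_threads(n)  # cap intra-op parallelism on CPU
    return torch.get_num_threads()


if __name__ == "__main__":
    # Call this at the very top of the training script, then watch htop.
    print(limit_math_threads(1))
```

With the cap set to 1, a single-process run should stay much closer to the “one python process = one core” picture, although other libraries may still start threads of their own.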

Update 1: was going to do some experimentation but saw @codelast put some new stuff below.

Update 2: actually, I might have gotten confused. In example_5.py, batch_B is hard-coded to 32, so n_parallel does not change it. I assumed n_parallel was the number of parallel environments, but I guess it is better viewed as the amount of resources provided to a given program? I will continue investigating, but do you have any quick suggestions or comments on the distinction between n_parallel and batch_B?

Update 3: figured out my prior question. batch_B is actually the number of parallel environments, and n_parallel is the number of workers that need to run those environments. If batch_B=32 and n_parallel=2 then we have [16,16] envs allocated to the two workers. If n_parallel=3 then it’s [11,11,10] envs allocated to the three workers, and we get a warning saying that performance may suffer due to unequal environment distribution.
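The allocation described in Update 3 can be reproduced with a few lines of arithmetic. This is a minimal sketch that matches the two cases above, not the library's actual code; the function name `split_envs` is made up for illustration.

```python
def split_envs(batch_B, n_parallel):
    """Spread batch_B environments over n_parallel workers as evenly as possible.

    Illustrative only: the first (batch_B % n_parallel) workers each get
    one extra environment.
    """
    base, extra = divmod(batch_B, n_parallel)
    return [base + 1 if i < extra else base for i in range(n_parallel)]


if __name__ == "__main__":
    print(split_envs(32, 2))  # [16, 16]
    print(split_envs(32, 3))  # [11, 11, 10]
```

Choosing n_parallel so that it divides batch_B evenly avoids the unequal-distribution warning mentioned above.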

1 reaction
DanielTakeshi commented, Feb 20, 2020

I just ran some tests today, and indeed using taskset helped to limit my CPU usage, according to htop.
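For anyone who prefers to set the limit from inside the script rather than from the shell, here is a rough Python equivalent of what taskset does. This is a sketch under assumptions: `os.sched_setaffinity` only exists on Linux (and some other Unix systems), and the helper name `pin_to_cpus` is made up for illustration. The exact taskset command used in the tests above is not shown in this thread.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict process `pid` (0 means the caller) to the given CPU indices.

    Hypothetical helper. Returns the resulting affinity set, or None on
    platforms that do not provide os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0)) if hasattr(os, "sched_getaffinity") else []
    # Keep at most the first four CPUs this process is already allowed to use.
    print(pin_to_cpus(allowed[:4] or [0]))
```

Call it early in the script, before worker processes or threads are started, since they inherit the affinity that is in effect when they are created.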
