Crashes in version 1.0: on psutil.Process.cpu_percent, when process is gone
Describe the bug
I upgraded from version 0.6.0, which had worked fine for a long time, to 1.0, and the program now crashes regularly with an exit code of 1 and no error messages. The crashes come at random intervals, anywhere from a few minutes to almost an hour after startup. Initializing a new virtual env did not help, while switching back to 0.6.0 stops the crashes.
Screenshots or Program Output
$ gpustat --debug
> An error while retrieving `fan_speed`: Not Supported
Traceback (most recent call last):
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/gpustat/core.py", line 425, in get_gpu_info
fan_speed = N.nvmlDeviceGetFanSpeed(handle)
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/pynvml.py", line 1942, in nvmlDeviceGetFanSpeed
_nvmlCheckReturn(ret)
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/pynvml.py", line 765, in _nvmlCheckReturn
raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported
> An error while retrieving `power_limit`: Not Supported
Traceback (most recent call last):
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/gpustat/core.py", line 461, in get_gpu_info
power_limit = N.nvmlDeviceGetEnforcedPowerLimit(handle)
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/pynvml.py", line 2025, in nvmlDeviceGetEnforcedPowerLimit
_nvmlCheckReturn(ret)
File "path/to/home/dir/.venv/lib64/python3.10/site-packages/pynvml.py", line 765, in _nvmlCheckReturn
raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported
> An error while retrieving `fan_speed`: Not Supported -> Total 1 occurrences.
> An error while retrieving `power_limit`: Not Supported -> Total 1 occurrences.
$ nvidia-smi
Fri Dec 2 13:31:01 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.56.06 Driver Version: 520.56.06 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| N/A 44C P5 20W / N/A | 1000MiB / 6144MiB | 21% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
Environment information:
- OS: Fedora 36
- NVIDIA Driver version: 520.56.06
- The name(s) of GPU card: NVIDIA GeForce RTX 3060 Laptop GPU
- gpustat version: 1.0.0
- pynvml version:
nvidia-ml-py 11.495.46
nvidia-ml-py3 7.352.0
Command used:
gpustat -cp --watch

So I went ahead and inserted the print statements, but there were still no error messages, just an exit code of 1. I realized that the --watch option was probably clearing the terminal, so I dumped everything to a log file and finally have the error messages.
This is the log from after the last successful run until the process crashed.
On three different runs the crash had exactly the same cause. Also note that the v0.6 process has been running fine the entire time (2 days and some change).
@wookayin There have been no crashes since the fix, the issue is now resolved.
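For anyone hitting the same thing in their own code: the title points at psutil.Process.cpu_percent being called on a process that has already exited. Below is a minimal sketch of guarding against that race. It is not necessarily how gpustat fixed it; the helper names (query_cpu_percent, collect_cpu_percent, GONE_ERRORS) are made up for illustration, and the fallback exception types are only there so the sketch still loads on a machine without psutil.

```python
try:
    import psutil
    # Raised by psutil when a PID has exited or cannot be inspected.
    GONE_ERRORS = (psutil.NoSuchProcess, psutil.AccessDenied)
except ImportError:
    # Assumption: stand-in exception types so the sketch loads without psutil.
    # The default query below still needs psutil to be installed.
    psutil = None
    GONE_ERRORS = (ProcessLookupError, PermissionError)


def query_cpu_percent(pid):
    """Ask psutil for one PID's CPU usage; raises if the process is gone."""
    return psutil.Process(pid).cpu_percent()


def collect_cpu_percent(pids, query=query_cpu_percent):
    """Map each PID to its CPU percent, or None if the process vanished."""
    usage = {}
    for pid in pids:
        try:
            usage[pid] = query(pid)
        except GONE_ERRORS:
            # The process exited between listing and querying: skip it
            # instead of letting the exception end the whole watch loop.
            usage[pid] = None
    return usage
```

With psutil installed, something like collect_cpu_percent([os.getpid()]) can be called on every refresh; a PID that disappears between two refreshes then shows up as None rather than crashing the program.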