Salt job hangs/runs forever when printing large number of lines to stdout
See original GitHub issueDescription of Issue
Salt job runs forever when executing shell or python script on Windows Minion, but the same script runs just fine on Ubuntu/Mac minion and has complete under few seconds. After checking the job status after 12 hours, it was still running.
If I run the script directly on the Windows VM, it runs fine.
Setup
Python script: “CreateP4Tickets.py”
#!/bin/python3
import shlex
import subprocess
import os
import sys
import argparse
import re
import time
def login2P4(name, p4Port, p4User, p4Passwd):
print(f"\nINFO: ----- Creating P4 ticket for {name} -----")
cmd = f'p4 -p {p4Port} trust -y -f'
p0 = subprocess.Popen(shlex.split(cmd), encoding='utf-8', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if p0.returncode:
for line in p0.stderr.split('\n'):
if line != '':
print(f"ERROR: {line}")
return
p0.stdout.close()
# Read the password for input for next command.
cmd = f'echo {p4Passwd}'
p1 = subprocess.Popen(shlex.split(cmd), encoding='utf-8', stdout=subprocess.PIPE)
cmd = f'p4 -u walf -p {p4Port} login'
p2 = subprocess.Popen(shlex.split(cmd), encoding='utf-8', stdin=p1.stdout, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p1.stdout.close()
output, err = p2.communicate()
if p2.returncode:
for line in err.split('\n'):
if line != '':
print(f"ERROR: {line}")
sys.exit(1)
print('INFO: P4 Ticket successfully created.')
def main():
parser = argparse.ArgumentParser()
parser.add_argument('--p4User', default=os.getenv('P4USER'),
help='Perforce Username. Default is environment variable "P4USER"')
parser.add_argument('--p4Passwd', default=os.getenv('P4PASSWD'),
help='Perforce Password. Default is environment variable "P4PASSWD"')
parser.add_argument('--p4ServerList',
help='File with P4 server names and ports separated by "~". One P4 server per line. Default file is "P4ServerList.txt"')
args = parser.parse_args()
if args.p4Passwd is None or args.p4Passwd == "Dummy":
sys.exit('ERROR: Please provide a valid password or make sure environment variable "P4PASSWD" is set correctly')
if args.p4ServerList is None:
args.p4ServerList = os.path.join(os.path.dirname(os.path.realpath(__file__)), "P4ServerList.txt")
if not os.path.exists(args.p4ServerList):
sys.exit(f'ERROR: File "{args.p4ServerList}" doesn\'t exist')
with open(args.p4ServerList, encoding='utf-8') as f:
# login2P4('calf', 'ssl:basil:20460', args.p4User, args.p4Passwd)
for line in f:
if re.match(r"\s*(#|$)", line):
continue
line = line.strip()
name, p4Port = line.split('~')
login2P4(name, p4Port, args.p4User, args.p4Passwd)
if __name__ == "__main__":
start = time.perf_counter()
print('\nRunning...{}'.format(os.path.basename(sys.argv[0])))
main()
end = time.perf_counter()
minutes, seconds = divmod(end - start, 60)
if minutes > 0:
print('\nTime elapsed = {} min {} secs'.format(minutes, seconds))
else:
print('\nTotal time elapsed - {0:.2f} secs'.format(seconds))
print('Done.')
State File: “test-create-p4ticket.sls”
{% if grains['os'] == "Windows" %}
{% set tempFolder = "c:/tmp" %}
Create Temp Directory:
file.directory:
- name: {{ tempFolder }}
{% else %}
{% set tempFolder = '/tmp' %}
{% endif %}
{% set p4TicketScript = 'CreateP4Tickets.py' %}
{% set p4ServerList = 'P4ServerList.txt' %}
Copy script:
file.managed:
- name: {{ tempFolder }}/{{ p4TicketScript }}
- source: "salt://{{ p4TicketScript }}"
- skip_verify: True
- require:
- file: Copy serverlist
Copy serverlist:
file.managed:
- name: {{ tempFolder }}/{{ p4ServerList }}
- source: "salt://{{ p4ServerList }}"
- skip_verify: True
Create P4 Ticket:
cmd.run:
- name: python3 {{ p4TicketScript }} --p4User {{ pillar['l10nuser'] }} --p4Passwd {{ pillar['P4PASSWD'] }}
- cwd: {{ tempFolder }}
- runas: {{ pillar['l10nuser'] }}
- require:
- file: Copy script
Steps to Reproduce Issue
Here is direct command
salt -G 'os:windows' cmd.run 'python3 CreatePerforceTickets.py --p4User user --p4Passwd password' runas=localuser cwd="C:\\tmp"
The job runs forever.
I debugged the command using the below command
salt-call -l debug state.apply test-create-p4ticket
Here is the error I get
Unable to OpenProcess pid=4 name=System
salt -G 'os:windows' cmd.run 'cmd.exe /c python3 CreatePerforceTickets.py --p4User user --p4Passwd password' runas=localuser cwd="C:\\tmp" -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/file_roots.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/file_roots.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/master.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/master.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/salt-api.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/salt-api.conf
[DEBUG ] Changed git to gitfs in master opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: myhostname
[DEBUG ] Missing configuration file: /root/.saltrc
[DEBUG ] Configuration file path: /etc/salt/master
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Including configuration from '/etc/salt/master.d/file_roots.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/file_roots.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/master.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/master.conf
[DEBUG ] Including configuration from '/etc/salt/master.d/salt-api.conf'
[DEBUG ] Reading configuration from /etc/salt/master.d/salt-api.conf
[DEBUG ] Changed git to gitfs in master opts' fileserver_backend list
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: myhostname
[DEBUG ] Missing configuration file: /root/.saltrc
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Popen(['git', 'version'], cwd=/root, universal_newlines=False, shell=None)
[DEBUG ] Popen(['git', 'version'], cwd=/root, universal_newlines=False, shell=None)
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'myhostname_master', 'tcp://xxx.xxx.xxx.xxx:4506', 'clear')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://xxx.xxx.xxx.xxx:4506
[DEBUG ] Trying to connect to: tcp://xxx.xxx.xxx.xxx:4506
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] LazyLoaded local_cache.get_load
[DEBUG ] Reading minion list from /var/cache/salt/master/jobs/f1/a0c99915c21419729d48151385b97705c9be807ef3f234faaeef8d32b8e9ac/.minions.p
[DEBUG ] get_iter_returns for jid 20200715210816677523 sent to {'minion-1'} will timeout at 14:11:36.692135
[DEBUG ] Checking whether jid 20200715210816677523 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'myhostname_master', 'tcp://xxx.xxx.xxx.xxx:4506', 'clear')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://xxx.xxx.xxx.xxx:4506
[DEBUG ] Trying to connect to: tcp://xxx.xxx.xxx.xxx:4506
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] retcode missing from client return
[DEBUG ] Checking whether jid 20200715210816677523 is still running
Versions Report
Salt Master
$ salt --versions-report
Salt Version:
Salt: 3001
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: 2.6.1
docker-py: Not Installed
gitdb: 2.0.3
gitpython: 2.1.8
Jinja2: 2.10
libgit2: 0.26.0
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: 3.4.7
pygit2: 0.26.2
Python: 3.6.9 (default, Apr 18 2020, 01:56:04)
python-gnupg: 0.4.1
PyYAML: 3.12
PyZMQ: 17.1.2
smmap: 2.0.3
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.2.5
System Versions:
dist: ubuntu 18.04 Bionic Beaver
locale: ISO-8859-1
machine: x86_64
release: 4.15.0-101-generic
system: Linux
version: Ubuntu 18.04 Bionic Beaver
Salt-minion
$ salt-minion --versions-report
Salt Version:
Salt: 3001
Dependency Versions:
cffi: 1.12.2
cherrypy: 17.4.1
dateutil: 2.8.0
docker-py: Not Installed
gitdb: 2.0.5
gitpython: 2.1.10
Jinja2: 2.10.1
libgit2: Not Installed
M2Crypto: Not Installed
Mako: 1.0.7
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: 2.19
pycrypto: Not Installed
pycryptodome: 3.9.7
pygit2: Not Installed
Python: 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
python-gnupg: 0.4.4
PyYAML: 5.1.2
PyZMQ: 18.0.1
smmap: 2.0.5
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.1
System Versions:
dist: msys_nt-10.0-19041 3.0.7
locale: cp1252
machine: AMD64
release: 10
system: Windows
version: 10 10.0.19041 SP0
Salt master OS: Ubuntu 18.04 Salt-minion version: 3001 Salt-minion OS: Windows 10 version 2004
Issue Analytics
- State:
- Created 3 years ago
- Comments:16 (8 by maintainers)
I’ll have to get with @twangboy on this as I’ve been unable to reproduce that specific behavior when the minion just doesn’t respond anymore. But that makes sense as to what the actual issue. I’ll update the ticket accordingly.
Apologize for the delay, I will take a look in next couple of days and get back to you.
On Mon, Jan 25, 2021 at 3:54 PM Joe Eacott notifications@github.com wrote: