[BUG] Fatal error when using reload command
See original GitHub issueDescribe the bug
Attempting to use the @reload
command from in-game often produces a traceback of the webserver internal port being in-use that prevents the server from restarting, leaving the game dead.
20-01-18 17:21:03-05 [..] Server disconnected from the portal.
20-01-18 17:21:03-05 Main loop terminated.
20-01-18 17:21:03-05 Server Shut Down.
20-01-18 17:21:03-05 Loading /home/tehom/arx/evennia/evennia/server/server.py...
20-01-18 17:21:05-05 Loaded.
20-01-18 17:21:05-05 twistd 19.10.0 (/home/tehom/arx/arxvenv/bin/python 3.8.1) starting up.
20-01-18 17:21:05-05 reactor class: twisted.internet.epollreactor.EPollReactor.
20-01-18 17:21:05-05 Traceback (most recent call last):
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/tcp.py", line 1361, in startListening
20-01-18 17:21:05-05 skt.bind(addr)
20-01-18 17:21:05-05 OSError: [Errno 98] Address already in use
20-01-18 17:21:05-05
20-01-18 17:21:05-05 During handling of the above exception, another exception occurred:
20-01-18 17:21:05-05
20-01-18 17:21:05-05 Traceback (most recent call last):
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/bin/twistd", line 11, in <module>
20-01-18 17:21:05-05 load_entry_point('Twisted==19.10.0', 'console_scripts', 'twistd')()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/twistd.py", line 31, in run
20-01-18 17:21:05-05 app.run(runApp, ServerOptions)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/app.py", line 674, in run
20-01-18 17:21:05-05 runApp(config)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/twistd.py", line 25, in runApp
20-01-18 17:21:05-05 runner.run()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/app.py", line 385, in run
20-01-18 17:21:05-05 self.postApplication()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/_twistd_unix.py", line 254, in postApplication
20-01-18 17:21:05-05 self.startApplication(self.application)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/_twistd_unix.py", line 442, in startApplication
20-01-18 17:21:05-05 service.IService(application).privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/service.py", line 283, in privilegedStartService
20-01-18 17:21:05-05 service.privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/service.py", line 283, in privilegedStartService
20-01-18 17:21:05-05 service.privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/internet.py", line 114, in privilegedStartService
20-01-18 17:21:05-05 self._port = self._getPort()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/internet.py", line 141, in _getPort
20-01-18 17:21:05-05 return getattr(_maybeGlobalReactor(self.reactor),
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/posixbase.py", line 495, in listenTCP
20-01-18 17:21:05-05 p.startListening()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/tcp.py", line 1363, in startListening
20-01-18 17:21:05-05 raise CannotListenError(self.interface, self.port, le)
20-01-18 17:21:05-05 twisted.internet.error.CannotListenError: Couldn't listen on 127.0.0.1:5001: [Errno 98] Address already in use.
To Reproduce
Steps to reproduce the behavior:
- Use
@reload
- Sometimes the above happens randomly
- See error
Expected behavior
The game to restart normally. Using evennia reload
from shell works just fine.
Environment, Evennia version, OS etc
Evennia version: 0.9.0 (rev 9649295) OS: posix Python: 3.8.1 Twisted: 19.10.0 Django: 2.2.9
Additional context
WEBSERVER_PORTAL_INTERNAL is set to 5001. So far that’s the only ‘port in use’ issue we’ve come across. Tentatively I want to say this is happening about half the time during our peak hours, but I’m not hammering reloads to test it.
Issue Analytics
- State:
- Created 4 years ago
- Comments:18 (18 by maintainers)
Top Results From Across the Web
reload command crashes irc.php · Issue #1218 - GitHub
Hi, Fatal error disappeared but the process just stops. Shouldn't it be reloaded? [Tue, 09 Jun 2015 14:51:51 +0200] Setting up IRC-Bot.
Read more >CSCse66611 - 'sysdb' detected the 'fatal' condition ... - Cisco Bug
Symptom: Under certain conditions, console may be inaccessible / may get error message : Command authorization failed - 'sysdb' detected the ...
Read more >Bug #1738412 “Init script fails test on reload/restart because o...”
There is a mistake in the expression that is looked for, though: instead of "FATAL: ", the initscript is looking for "FATAL "...
Read more >Error Message: FATAL error: timer array exceeded - AskF5
To work around this issue, you can restart the tmrouted process on the standby unit to reload the correct configuration. To do so,...
Read more >4144 – slurmctld dies with slurm_reconfigure error after ...
You have to restart slurmctld. > [2017-09-11T08:59:54.448] fatal: read_slurm_conf: hostnames inconsistency > detected > > The slurmctld daemon ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I suggest altering the reload process so that it uses os.exec() on the server side to re-launch itself after shutting down the reactor loop. This would ensure the Python interpreter reloads all modules, and, as long as all file handles are closed before doing so, is perfectly fine and will not leak resources. The portal can just keep trying to connect to the server.
That might be a good way to go about it. I did have it fail to restart on the first re-try once, when I did a reload, then attempted to restart it from the console 6 seconds later. So apparently sometimes the port can stay occupied after shutdown is initiated for at least 6 seconds.