Service not found error preventing spawn
I installed JupyterHub about one year ago, and it was working well. The hub runs in a Docker container, as does the nginx proxy that provides HTTPS access. The other nodes are managed in a Docker swarm. The users are authenticated using CAS. The users' home directories are stored on an NFS-mounted volume.
Randomly, users can’t log in, and the log shows a 403 error:
[W 2019-10-18 11:30:53.464 JupyterHub log:174] 403 POST /hub/api/users/some-user/activity
I should say that the hub was started hours ago from an empty jupyterhub.sqlite file, has never been restarted since, and the sqlite file has been left untouched. The error seems to appear more frequently when a user stops and restarts their server. The configuration is the same for all users, but the error is frequent for some users and rare for others. Moreover, the error seems to disappear when I run the hub without any worker nodes in the swarm (all spawned servers running on the hub’s host). My own account (admin) does not seem to be affected by the error.
Note: I submitted this issue to the jupyterhub/jupyterhub GitHub project. minrk answered:
The 403 errors here are caused by the API request made by single-user servers to register activity. Where there’s definitely a bug (in SwarmSpawner, it looks like), is in the fact we see logs of the Spawner claiming that the service was not started, but it clearly was because it is making API requests to the Hub.
The sequence of events:
1. SwarmSpawner launches the service (bug somewhere).
2. SwarmSpawner believes this has failed, but it has not.
3. Hub begins cleanup of server, including revoking credentials for the API token allocated to the server.
4. The service finishes starting, and starts making API requests, but its token has been revoked in step 3, resulting in 403.
So the 403 is a symptom, but the real error is the SwarmSpawner is starting servers, but it thinks it is failing somehow.
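The race described above suggests that a lookup made right after service creation can report "not found" even though the service is coming up. As a minimal sketch of one way to tolerate that window (this is not the SwarmSpawner code and not the patch from any linked issue), the helper below retries a lookup for a grace period before giving up. `ServiceNotFound`, `wait_for_task`, `grace_period`, and `interval` are hypothetical names introduced only for illustration.

```python
import asyncio
import time


class ServiceNotFound(Exception):
    """Hypothetical stand-in for a 'service not found' lookup failure."""


async def wait_for_task(get_task, grace_period=10.0, interval=0.5):
    """Poll get_task() until it returns a task.

    A ServiceNotFound error or a None result is treated as "not visible
    yet" until grace_period seconds have elapsed; only then is the
    lookup reported as a real failure.
    """
    deadline = time.monotonic() + grace_period
    while True:
        try:
            task = await get_task()
        except ServiceNotFound:
            task = None  # assume the service is not visible yet
        if task is not None:
            return task
        if time.monotonic() >= deadline:
            raise RuntimeError(
                "service still not found after %.1f seconds" % grace_period
            )
        await asyncio.sleep(interval)
```

The design choice here is to treat "not found" as a transient state for a bounded time rather than as an immediate failure, so that the Hub would not start cleanup (and revoke the token) while the service is still being scheduled.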
| [I 2019-10-22 12:40:53.176 JupyterHub base:812] User userc-testc took 53.770 seconds to start
The error seems to be related to the following sequence of events:
[D 2019-10-18 11:29:59.979 JupyterHub user:542] Calling Spawner.start for kaniav-kamary
[D 2019-10-18 11:29:59.988 JupyterHub dockerspawner:813] Getting container 'jupyter-kaniav-kamary' for dockerspawner::start before remove id:
[I 2019-10-18 11:29:59.994 JupyterHub dockerspawner:820] Service 'jupyter-kaniav-kamary' is gone
[I 2019-10-18 11:30:00.016 JupyterHub dockerspawner:1030] Created service jupyter-kaniav-kamary (id: ge4p07y) from image 160.228.22.168:5000/hdlbq/cs-notebook-r
[I 2019-10-18 11:30:00.017 JupyterHub dockerspawner:1053] Starting service jupyter-kaniav-kamary (id: ge4p07y)
[D 2019-10-18 11:30:00.017 JupyterHub swarmspawner:144] Getting task of service 'jupyter-kaniav-kamary'
| [D 2019-10-18 11:30:00.017 JupyterHub dockerspawner:813] Getting container 'jupyter-kaniav-kamary' for swarmspawner::get_task id:ge4p07y
| [D 2019-10-18 11:30:00.026 JupyterHub swarmspawner:256] Service ge4p07y state: pending
| [I 2019-10-18 11:30:00.920 JupyterHub log:174] 302 GET /hub/spawn -> /hub/spawn-pending/kaniav-kamary (kaniav-kamary@10.255.0.2) 1019.67ms
| [I 2019-10-18 11:30:00.960 JupyterHub pages:303] kaniav-kamary is pending spawn
| [I 2019-10-18 11:30:00.962 JupyterHub log:174] 200 GET /hub/spawn-pending/kaniav-kamary (kaniav-kamary@10.255.0.2) 18.15ms
| [D 2019-10-18 11:30:01.027 JupyterHub swarmspawner:144] Getting task of service 'jupyter-kaniav-kamary'
| [D 2019-10-18 11:30:01.028 JupyterHub dockerspawner:813] Getting container 'jupyter-kaniav-kamary' for swarmspawner::get_task id:ge4p07y
| [E 2019-10-18 11:30:01.036 JupyterHub user:626] Unhandled error starting kaniav-kamary's server: Service jupyter-kaniav-kamary not found
| [D 2019-10-18 11:30:01.036 JupyterHub user:724] Stopping kaniav-kamary
| [D 2019-10-18 11:30:01.036 JupyterHub swarmspawner:144] Getting task of service 'jupyter-kaniav-kamary'
| [D 2019-10-18 11:30:01.037 JupyterHub dockerspawner:813] Getting container 'jupyter-kaniav-kamary' for swarmspawner::get_task id:ge4p07y
| [W 2019-10-18 11:30:01.044 JupyterHub swarmspawner:128] Service jupyter-kaniav-kamary not found
| [D 2019-10-18 11:30:01.057 JupyterHub user:752] Deleting oauth client jupyterhub-user-kaniav-kamary
| [D 2019-10-18 11:30:01.069 JupyterHub user:755] Finished stopping kaniav-kamary
| ERROR:asyncio:Task exception was never retrieved
| future: <Task finished coro=<BaseHandler.spawn_single_user() done, defined at /opt/conda/lib/python3.6/site-packages/jupyterhub/handlers/base.py:697> exception=RuntimeError('Service jupyter-kaniav-kamary not found',)>
| Traceback (most recent call last):
| File "/opt/conda/lib/python3.6/site-packages/jupyterhub/handlers/base.py", line 889, in spawn_single_user
| timedelta(seconds=self.slow_spawn_timeout), finish_spawn_future
| File "/opt/conda/lib/python3.6/site-packages/jupyterhub/handlers/base.py", line 807, in finish_user_spawn
| await spawn_future
| File "/opt/conda/lib/python3.6/site-packages/jupyterhub/user.py", line 642, in spawn
| raise e
| File "/opt/conda/lib/python3.6/site-packages/jupyterhub/user.py", line 546, in spawn
| url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
| File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 736, in run
| yielded = self.gen.throw(*exc_info) # type: ignore
| File "/opt/conda/lib/python3.6/site-packages/dockerspawner/dockerspawner.py", line 1057, in start
| yield self.start_object()
| File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 729, in run
| File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
| yielded = self.gen.send(value)
| File "/opt/conda/lib/python3.6/site-packages/dockerspawner/swarmspawner.py", line 252, in start_object
| raise RuntimeError("Service %s not found" % self.service_name)
| RuntimeError: Service jupyter-kaniav-kamary not found
| [I 2019-10-18 11:30:01.107 JupyterHub log:174] 200 GET /hub/api/users/kaniav-kamary/server/progress (kaniav-kamary@10.255.0.2) 16.13ms
| [W 2019-10-18 11:30:53.464 JupyterHub log:174] 403 POST /hub/api/users/kaniav-kamary/activity (@10.0.0.48) 3.26ms
| [W 2019-10-18 11:30:54.423 JupyterHub log:174] 403 POST /hub/api/users/kaniav-kamary/activity (@10.0.0.48) 3.25ms
| [W 2019-10-18 11:30:55.849 JupyterHub log:174] 403 POST /hub/api/users/kaniav-kamary/activity (@10.0.0.48) 3.77ms
| [W 2019-10-18 11:30:59.051 JupyterHub log:174] 403 POST /hub/api/users/kaniav-kamary/activity (@10.0.0.48) 3.36ms
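To check whether other users hit the same pattern, the log can be scanned for a failed spawn followed by 403 responses to that user's activity endpoint. The sketch below assumes the log line layout shown above (`[LEVEL date time JupyterHub source:line] message`); the function name `orphaned_activity` and the regular expressions are illustrative, not part of JupyterHub.

```python
import re
from collections import Counter

# Matches lines such as:
# [W 2019-10-18 11:30:53.464 JupyterHub log:174] 403 POST /hub/api/...
LINE_RE = re.compile(
    r"\[(?P<level>[DIWEC]) (?P<ts>\S+ \S+) JupyterHub (?P<src>[\w:]+)\] (?P<msg>.*)"
)
FAILED_RE = re.compile(r"Unhandled error starting (?P<user>\S+)'s server")
FORBIDDEN_RE = re.compile(r"403 POST /hub/api/users/(?P<user>[^/]+)/activity")


def orphaned_activity(lines):
    """Count 403 activity POSTs seen after a failed spawn, per user."""
    failed = set()
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # traceback lines and other non-log output
        msg = match.group("msg")
        failure = FAILED_RE.search(msg)
        if failure:
            failed.add(failure.group("user"))
            continue
        denied = FORBIDDEN_RE.search(msg)
        if denied and denied.group("user") in failed:
            hits[denied.group("user")] += 1
    return dict(hits)
```

Called as `orphaned_activity(open("hub.log"))`, where `hub.log` is a placeholder path, it returns a mapping from user name to the number of rejected activity requests that followed a spawn the Hub considered failed.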
Debian 9.11
Docker version 19.03.3, build a872fc2f86
jupyterhub docker version: latest
nginx docker version: latest
Top GitHub Comments
Hi, Yes, you’re right. It seems that the error occurs:
Hello, this issue seems to be linked to issue #330, “Timeout potentially due to docker inspect_container prior to creation”. I think I solved it by applying the patch suggested by gdbassett on 27 Aug.