Timeout potentially due to docker inspect_container prior to creation
I have a JupyterHub environment using SwarmSpawner. It works fine in my local testing, but when deployed to my rather old and slow production environment, spawning fails with the logs attached below. However, when I look at the services afterwards, the service is up and running successfully. Tracing it back, the failure seems to go from https://github.com/jupyterhub/dockerspawner/blob/9d4a35995d2c2dd992e070cc7ad260123308b606/dockerspawner/swarmspawner.py#L252 through get_task() to https://github.com/jupyterhub/dockerspawner/blob/9d4a35995d2c2dd992e070cc7ad260123308b606/dockerspawner/dockerspawner.py#L781, which calls inspect_service() from https://docker-py.readthedocs.io/en/stable/api.html#module-docker.api.service. If I manually run inspect_service() and tasks(), I get back a running service.
However, inspecting during startup, I get this:
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
rejected
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
rejected
rejected
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
rejected
rejected
rejected
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
rejected
ready
rejected
>>> for task in client.tasks(filters={"service": "jupyter-gdb"}):
...     print(task['Status']['State'])
...
rejected
rejected
rejected
running
rejected
For some reason, the task appears to be rejected for a while before it runs, and swarmspawner picks this up as a failure. I wonder if it has to do with https://github.com/jupyterhub/dockerspawner/blob/9d4a35995d2c2dd992e070cc7ad260123308b606/dockerspawner/swarmspawner.py#L257 checking ‘State’ instead of ‘Message’, since Message is ‘preparing’ while State is ‘rejected’:
'Status': {'ContainerStatus': {},
           'Err': 'No such image: '
                  '<image name and hash>',
           'Message': 'preparing',
           'PortStatus': {},
           'State': 'rejected',
           'Timestamp': '2019-08-27T01:56:24.483764876Z'}
vs
'Status': {'ContainerStatus': {'ContainerID': '<container id>',
                               'PID': 3954},
           'Message': 'started',
           'PortStatus': {},
           'State': 'running',
           'Timestamp': '2019-08-27T01:56:34.956884935Z'},
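Since tasks() returns every task the service has ever had, including the earlier rejected attempts, one way to look at only the current attempt is to pick the task with the newest Status timestamp. The following is a minimal sketch of that idea, not what dockerspawner does; parse_docker_timestamp and latest_task are hypothetical helper names, and it assumes the timestamps are UTC with a trailing Z, as in the output above.

```python
from datetime import datetime, timezone


def parse_docker_timestamp(ts):
    """Parse a timestamp such as '2019-08-27T01:56:24.483764876Z'.

    Assumption: the value is UTC with a trailing 'Z'. The fractional part
    is cut (or padded) to microseconds, since datetime cannot hold more.
    """
    base, _, frac = ts.rstrip("Z").partition(".")
    micros = (frac + "000000")[:6]
    parsed = datetime.strptime(base + "." + micros, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def latest_task(tasks):
    """Return the task dict with the newest Status timestamp, or None."""
    if not tasks:
        return None
    return max(
        tasks,
        key=lambda task: parse_docker_timestamp(task["Status"]["Timestamp"]),
    )


# Trimmed-down versions of the two Status dicts shown above.
sample_tasks = [
    {"Status": {"State": "rejected", "Message": "preparing",
                "Timestamp": "2019-08-27T01:56:24.483764876Z"}},
    {"Status": {"State": "running", "Message": "started",
                "Timestamp": "2019-08-27T01:56:34.956884935Z"}},
]
newest = latest_task(sample_tasks)  # the newer of the two is the running task
```

With a real client, the input would be the list returned by client.tasks(filters={"service": "jupyter-gdb"}), as in the session above.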
Logs:
[D 2019-08-26 22:54:17.683 JupyterHub pages:165] Triggering spawn with default options for gdb
[D 2019-08-26 22:54:17.683 JupyterHub base:780] Initiating spawn for gdb
[D 2019-08-26 22:54:17.683 JupyterHub base:787] 0/100 concurrent spawns
[D 2019-08-26 22:54:17.683 JupyterHub base:792] 0/100 active servers
[D 2019-08-26 22:54:17.709 JupyterHub user:542] Calling Spawner.start for gdb
[W 2019-08-26 22:54:17.711 JupyterHub base:900] User gdb is slow to start (timeout=0)
[I 2019-08-26 22:54:17.712 JupyterHub log:174] 302 GET /hub/spawn -> /hub/spawn-pending/gdb (gdb@10.255.0.3) 34.88ms
[D 2019-08-26 22:54:17.726 JupyterHub dockerspawner:777] Getting container 'jupyter-gdb'
[I 2019-08-26 22:54:17.729 JupyterHub dockerspawner:784] Service 'jupyter-gdb' is gone
[I 2019-08-26 22:54:17.756 JupyterHub dockerspawner:990] Created service jupyter-gdb (id: q70k420) from image <image_name>
[I 2019-08-26 22:54:17.756 JupyterHub dockerspawner:1013] Starting service jupyter-gdb (id: q70k420)
[D 2019-08-26 22:54:17.756 JupyterHub swarmspawner:144] Getting task of service 'jupyter-gdb'
[D 2019-08-26 22:54:17.756 JupyterHub dockerspawner:777] Getting container 'jupyter-gdb'
[D 2019-08-26 22:54:17.764 JupyterHub swarmspawner:256] Service q70k420 state: pending
[I 2019-08-26 22:54:17.814 JupyterHub pages:303] gdb is pending spawn
[I 2019-08-26 22:54:17.818 JupyterHub log:174] 200 GET /hub/spawn-pending/gdb (gdb@<IP>) 10.49ms
[D 2019-08-26 22:54:18.765 JupyterHub swarmspawner:144] Getting task of service 'jupyter-gdb'
[D 2019-08-26 22:54:18.765 JupyterHub dockerspawner:777] Getting container 'jupyter-gdb'
[E 2019-08-26 22:54:18.771 JupyterHub user:626] Unhandled error starting gdb's server: Service jupyter-gdb not found
[D 2019-08-26 22:54:18.771 JupyterHub user:724] Stopping gdb
[D 2019-08-26 22:54:18.771 JupyterHub swarmspawner:144] Getting task of service 'jupyter-gdb'
[D 2019-08-26 22:54:18.772 JupyterHub dockerspawner:777] Getting container 'jupyter-gdb'
[W 2019-08-26 22:54:18.778 JupyterHub swarmspawner:128] Service jupyter-gdb not found
[D 2019-08-26 22:54:18.785 JupyterHub user:752] Deleting oauth client jupyterhub-user-gdb
[D 2019-08-26 22:54:18.793 JupyterHub user:755] Finished stopping gdb
[E 2019-08-26 22:54:18.797 JupyterHub gen:593] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /opt/conda/lib/python3.6/site-packages/jupyterhub/handlers/base.py:800> exception=RuntimeError('Service jupyter-gdb not found',)> after timeout
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 589, in error_callback
    future.result()
  File "/opt/conda/lib/python3.6/site-packages/jupyterhub/handlers/base.py", line 807, in finish_user_spawn
    await spawn_future
  File "/opt/conda/lib/python3.6/site-packages/jupyterhub/user.py", line 642, in spawn
    raise e
  File "/opt/conda/lib/python3.6/site-packages/jupyterhub/user.py", line 546, in spawn
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
  File "/opt/conda/lib/python3.6/site-packages/dockerspawner/dockerspawner.py", line 1017, in start
    yield self.start_object()
  File "/opt/conda/lib/python3.6/site-packages/dockerspawner/swarmspawner.py", line 252, in start_object
    raise RuntimeError("Service %s not found" % self.service_name)
RuntimeError: Service jupyter-gdb not found
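In the log above, the spawn is abandoned about one second after the service is created, while the manual checks show the tasks going through several rejected attempts before one reaches running. A more tolerant approach would keep polling until a deadline instead of failing on the first bad poll. The sketch below illustrates that idea only; wait_for_running and its parameters are hypothetical names, not part of dockerspawner or docker-py, and the sleep and clock arguments exist just to make the loop easy to test.

```python
import time


def wait_for_running(fetch_states, timeout=60.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until any task reports 'running' or the timeout expires.

    fetch_states is a callable returning the list of task state strings
    for the service. Transient states such as 'rejected', 'pending' or
    'ready' are not treated as errors; they only mean "poll again".
    Returns True once a task is running, False if the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        if "running" in fetch_states():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Replay the six manual polls from the report with a fake fetcher.
polls = iter([
    ["rejected"] * 2,
    ["rejected"] * 3,
    ["rejected"] * 4,
    ["rejected"] * 5,
    ["rejected", "rejected", "rejected", "ready", "rejected"],
    ["rejected", "rejected", "rejected", "running", "rejected"],
])
started = wait_for_running(lambda: next(polls), sleep=lambda seconds: None)
```

Against a real swarm, fetch_states could be a small function that returns [task['Status']['State'] for task in client.tasks(filters={"service": "jupyter-gdb"})], matching the manual checks earlier in this report.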
Top GitHub Comments
@Wildcarde Here’s what I did. Pardon all the extra stuff to make it easy to load. I’m sure there’s a better way, but this is what got the job done:
swarmspawnergdb.zip
Here’s a patch I generated. I subclassed swarmspawner with these edits to get it to work in my environment.