404 errors when combined with wrapspawner
Apologies if this is the wrong place. I'm struggling through the documentation and have hit a brick wall; this was my best guess for where to ask for help.
- Slurm HPC cluster
- jupyterhub server is a VM that is also a registered Slurm submission node (slurm jobs can be submitted directly from the jupyterhub node)
- conda, jupyterhub, etc are located on a shared drive for easy access to all the compute nodes
- modules are used to preserve multiple versions of software packages
- jupyterhub has been given its own private anaconda module
Problem: I got batchspawner working on its own first. I later added lines for wrapspawner. After adding the wrapspawner options, I now get the errors below:
Slurm output:
[I 2019-02-01 13:23:06.093 BatchSingleUserNotebookApp manager:46] [nb_conda_kernels] enabled, 0 kernels found
[I 2019-02-01 13:23:06.919 BatchSingleUserNotebookApp extension:168] JupyterLab extension loaded from /packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterlab
[I 2019-02-01 13:23:06.919 BatchSingleUserNotebookApp extension:169] JupyterLab application directory is /packages/7x/anaconda3/2018.12-jh/share/jupyter/lab
[W 2019-02-01 13:23:06.931 BatchSingleUserNotebookApp auth:303] Failed to check authorization: [404] Not Found
[W 2019-02-01 13:23:06.931 BatchSingleUserNotebookApp auth:304] {"status": 404, "message": "Not Found"}
Traceback (most recent call last):
File "/packages/7x/anaconda3/2018.12-jh/bin/batchspawner-singleuser", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/packages/7x/build/batchspawner/0.9.0dev0/batchspawner/scripts/batchspawner-singleuser", line 6, in <module>
File "/packages/7x/build/batchspawner/0.9.0dev0/batchspawner/batchspawner/singleuser.py", line 18, in main
return BatchSingleUserNotebookApp.launch_instance(argv)
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyter_core/application.py", line 266, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
File "/packages/7x/build/batchspawner/0.9.0dev0/batchspawner/batchspawner/singleuser.py", line 14, in start
json={'port' : self.port})
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/services/auth.py", line 305, in _api_request
raise HTTPError(500, "Failed to check authorization")
tornado.web.HTTPError: HTTP 500: Internal Server Error (Failed to check authorization)
From JupyterHub side:
[I 2019-02-01 13:23:01.411 JupyterHub batchspawner:243] Spawner submitted script:
#SBATCH -q debug
#SBATCH -p debug
#SBATCH -t 0-12:00:00
#SBATCH -n 1
#SBATCH -o /home/USER/jupyterhub.%j.out
#SBATCH -e /home/USER/jupyterhub.%j.err
#SBATCH --export ALL
###SBATCH -w cg1-6
source /etc/profile
module load anaconda3/.2018.12-jh
batchspawner-singleuser --ip="" --notebook-dir="~"
[I 2019-02-01 13:23:01.483 JupyterHub batchspawner:246] Job submitted. cmd: sudo -E -u USER sbatch --parsable output: 859661
[D 2019-02-01 13:23:01.484 JupyterHub batchspawner:269] Spawner querying job: sudo -E -u USER squeue -h -j 859661 -o '%T %B'
[D 2019-02-01 13:23:01.512 JupyterHub batchspawner:369] Job 859661 still pending
[D 2019-02-01 13:23:02.013 JupyterHub batchspawner:269] Spawner querying job: sudo -E -u USER squeue -h -j 859661 -o '%T %B'
[D 2019-02-01 13:23:02.044 JupyterHub batchspawner:369] Job 859661 still pending
[D 2019-02-01 13:23:02.547 JupyterHub batchspawner:269] Spawner querying job: sudo -E -u USER squeue -h -j 859661 -o '%T %B'
[W 2019-02-01 13:23:07.008 JupyterHub log:158] 404 POST /hub/api/batchspawner (USER@ 1.09ms
[W 2019-02-01 13:23:11.100 JupyterHub base:714] User USER is slow to start (timeout=10)
[I 2019-02-01 13:23:11.178 JupyterHub log:158] 302 POST /hub/spawn?next=%2Fhub%2Fuser%2FUSER%2F -> /hub/user/USER/ (USER@ 10146.75ms
[D 2019-02-01 13:23:11.280 JupyterHub base:1008] Waiting for USER pending spawn
[I 2019-02-01 13:23:21.281 JupyterHub base:1012] Pending spawn for USER didn't finish in 10.0 seconds
[I 2019-02-01 13:23:21.281 JupyterHub base:1018] USER is pending spawn
[I 2019-02-01 13:23:21.289 JupyterHub log:158] 200 GET /hub/user/USER/ (USER@ 10079.46ms
[D 2019-02-01 13:23:21.344 JupyterHub log:158] 200 GET /hub/static/css/style.min.css?v=dd1df30ccc6c4d3e9705d78012d25b57 (@ 2.31ms
[W 2019-02-01 13:24:01.483 JupyterHub user:471] USER's server failed to start in 60 seconds, giving up
[D 2019-02-01 13:24:01.484 JupyterHub batchspawner:269] Spawner querying job: sudo -E -u USER squeue -h -j 859661 -o '%T %B'
[D 2019-02-01 13:24:01.552 JupyterHub user:578] Deleting oauth client jupyterhub-user-USER
[E 2019-02-01 13:24:01.685 JupyterHub gen:974] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/handlers/base.py:619> exception=TimeoutError('Timeout')> after timeout
Traceback (most recent call last):
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/tornado/gen.py", line 970, in error_callback
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/handlers/base.py", line 626, in finish_user_spawn
await spawn_future
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/user.py", line 489, in spawn
raise e
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/user.py", line 409, in spawn
url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
[E 2019-02-01 13:24:01.699 JupyterHub gen:974] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/handlers/base.py:619> exception=TimeoutError('Timeout')> after timeout
Traceback (most recent call last):
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/tornado/gen.py", line 970, in error_callback
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/handlers/base.py", line 626, in finish_user_spawn
await spawn_future
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/user.py", line 489, in spawn
raise e
File "/packages/7x/anaconda3/2018.12-jh/lib/python3.7/site-packages/jupyterhub/user.py", line 409, in spawn
url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
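For anyone reading the hub log above, the line that matters is the 404 on POST /hub/api/batchspawner at 13:23:07, right after the single-user server logged "Failed to check authorization: [404] Not Found". The single-user job tried to report its port back to the hub, the hub had no handler at that URL, and the spawn then ran into the 60 second start timeout. A minimal sketch for pulling such lines out of a hub log follows; the function name and the regex are my own (not part of JupyterHub or batchspawner) and assume only the log format shown above.

```python
import re

# Matches JupyterHub access-log lines in the format shown above, e.g.
# [W 2019-02-01 13:23:07.008 JupyterHub log:158] 404 POST /hub/api/batchspawner (USER@ 1.09ms
ACCESS_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<timestamp>\S+ \S+) JupyterHub log:\d+\] "
    r"(?P<status>\d{3}) (?P<method>[A-Z]+) (?P<path>\S+)"
)


def find_failed_batchspawner_callbacks(log_text):
    """Return (timestamp, method, path) for each 404 on the batchspawner API URL.

    Hypothetical helper for illustration; it only inspects access-log lines.
    """
    hits = []
    for line in log_text.splitlines():
        match = ACCESS_LINE.match(line.strip())
        if match is None:
            continue  # not an access-log line (debug output, traceback, ...)
        if match["status"] == "404" and match["path"].endswith("/api/batchspawner"):
            hits.append((match["timestamp"], match["method"], match["path"]))
    return hits
```

If this returns any hits, the batch job started fine and the problem is on the hub side, not in Slurm.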
Config file:
## Load Batchspawner which enables integration with SLURM
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 120
# BatchSpawnerBase configuration
# These are simply setting parameters used in the job script template below
#c.BatchSpawnerBase.req_nprocs = '4'
#c.BatchSpawnerBase.req_queue = 'debug'
#c.BatchSpawnerBase.req_runtime = '0-8:00:00'
#c.BatchSpawnerBase.req_memory = '4gb'
c.Spawner.notebook_dir = '~'
# SlurmSpawner configuration
c.SlurmSpawner.batch_script = '''#!/bin/bash
#SBATCH -q {queue}
#SBATCH -p debug
#SBATCH -t {runtime}
#SBATCH -n {nprocs}
#SBATCH -o {homedir}/jupyterhub.%j.out
#SBATCH -e {homedir}/jupyterhub.%j.err
#SBATCH --export ALL
###SBATCH -w cg1-6
source /etc/profile
module load anaconda3/.2018.12-jh
{cmd}
'''
## SSL Certificate locations
c.JupyterHub.ssl_cert = '/etc/pki/CA/certs/jupyter.crt'
c.JupyterHub.ssl_key = '/etc/pki/CA/private/jupyter.key'
## URL for Jupyterhub to bind to
#c.JupyterHub.bind_url = 'https://jupyterhub.localdomain.com:443'
c.JupyterHub.ip = 'jupyterhub.localdomain.com'
c.JupyterHub.port = 443
c.JupyterHub.hub_ip = 'jupyterhub.localdomain.com'
## Set authentication options
# prevents JupyterHub from creating local users
c.LocalAuthenticator.create_system_users = False
# Set admin users (admin users can run jobs and/or manage other users notebook servers)
c.Authenticator.admin_users = {'USER1', 'USER2'}
# Set the Jupyterhub log file location
c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'
# Set the log level by value or name.
c.JupyterHub.log_level = 'DEBUG'
# Spawner Profiles
c.ProfilesSpawner.profiles = [
    ("Local Server", 'local', 'jupyterhub.spawner.LocalProcessSpawner', {'ip':''}),
    ('clustername - 1 core, 4.5GB, 12 hours', 'clustername1c12h', 'batchspawner.SlurmSpawner',
        dict(req_nprocs='1', req_queue='debug', req_runtime='0-12:00:00')),
    ('clustername - 4 cores, 18GB, 8 Hours', 'clustername4c8h', 'batchspawner.SlurmSpawner',
        dict(req_nprocs='4', req_queue='debug', req_runtime='0-08:00:00')),
    ('clustername - 14 cores, 63GB, 4 hours', 'clustername14c4h', 'batchspawner.SlurmSpawner',
        dict(req_nprocs='14', req_queue='debug', req_runtime='0-04:00:00')),
    ('clustername - 28 cores, 128GB 1 hour', 'clustername28c1h', 'batchspawner.SlurmSpawner',
        dict(req_nprocs='28', req_queue='debug', req_runtime='0-01:00:00')),
]
TYIA, help is greatly appreciated
Top GitHub Comments
More information:
Same issue on both JupyterHub 0.9.4 and 0.8.1 when using the master branch of batchspawner (0.9dev).
I got my system working using JupyterHub 0.8.1 and batchspawner tag 0.8.1.
I think this is updated in the current README now, with a solution of adding
import batchspawner
to the config file, which is a bit more generic and works even if the API handling gets changed. Please let us know if more is needed.
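For reference, here is roughly what that looks like when applied to the config from the question. This is a sketch, not a tested configuration: the profile entry is copied from the original post, and the explanation in the comments is my understanding of why the import helps (with a wrapper spawner as spawner_class, the hub does not import batchspawner by itself at startup, so the /hub/api/batchspawner endpoint is missing when the job calls back).

```python
# jupyterhub_config.py (fragment; `c` is the config object JupyterHub provides)

# Import batchspawner explicitly at the top of the config file.
# Assumption: importing the package is what makes the hub register the
# /hub/api/batchspawner callback endpoint that returned 404 in the log.
import batchspawner  # noqa: F401  (imported for its side effect only)

# The wrapper spawner stays the top-level spawner class.
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'

# Profiles still refer to batchspawner.SlurmSpawner by name, as before.
c.ProfilesSpawner.profiles = [
    ('Local Server', 'local', 'jupyterhub.spawner.LocalProcessSpawner', {'ip': ''}),
    ('clustername - 1 core, 4.5GB, 12 hours', 'clustername1c12h', 'batchspawner.SlurmSpawner',
     dict(req_nprocs='1', req_queue='debug', req_runtime='0-12:00:00')),
]
```

After restarting the hub, the POST to /hub/api/batchspawner should no longer return 404 if this was the cause.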