a binder deployment with authentication and persistent storage


We are currently working on a binder deployment with authentication and persistent storage enabled, and with a user interface on the JupyterHub home page where users can manage their repositories/projects.

For this purpose we now have a deployment running at https://notebooks-test.gesis.org/jupyter/. When you first log in, you will see the JupyterHub home page (https://notebooks-test.gesis.org/jupyter/hub/home) with 2 parts: the “Your Projects” table and the classic binder form with some parts hidden:

[Screenshot: JupyterHub home page with the “Your Projects” table and the binder form]

Binder is running at https://notebooks-test.gesis.org/jupyter/services/binder/ and you can also use it, but the idea in this deployment is that you don’t need to use it directly.

How it works

First, some preliminary information:

  • Each user can start 1 server at a time (named servers are not activated)
  • Each user gets 1 persistent volume and it is mounted on /home/jovyan

Binder form

It is the classic binder form with the ‘share url’ and ‘badge url’ parts hidden. It has 1 limitation: the branch/tag/commit field is read-only and always “master”. When a user launches a repo via the form:

  • the latest version of the repo is always built (last commit on the master branch) and the server is started with this image
  • nbgitpuller is used to pull the code into a sub directory /home/jovyan/{repo_dir}. repo_dir is generated from the provider name, user/org name and repo name. The server is started in that sub directory (you can open a new terminal and list all project directories there). nbgitpuller is not executed for the default repo (gesiscss/data_science_image).
  • each newly launched repo is added to the “Your Projects” table. This list is saved in the state field of the Spawners table, and only the last 10 launched repos are saved.
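
The naming and bookkeeping described above could look roughly like the following. This is only a sketch, not the code of the deployment: the function names, the `_` separator, the character clean-up, and the rule that a relaunched repo replaces its older entry are all assumptions made for illustration. Only the three inputs to repo_dir and the limit of 10 come from the description.

```python
import re

# The description says only the last 10 launched repos are kept.
MAX_PROJECTS = 10


def repo_dir_name(provider, owner, repo):
    """Build a sub directory name from provider, user/org and repo name.

    Hypothetical scheme: the real deployment may join or escape differently.
    """
    parts = [provider, owner, repo]
    # Replace every run of characters that is not alphanumeric, ".", "_" or "-".
    cleaned = [re.sub(r"[^A-Za-z0-9._-]+", "-", part).strip("-") for part in parts]
    return "_".join(cleaned).lower()


def remember_project(projects, entry):
    """Return a new list with `entry` as the most recent project.

    Assumption: relaunching a repo replaces its older entry, so each
    project shows up once in the table.
    """
    kept = [p for p in projects if p["repo_dir"] != entry["repo_dir"]]
    kept.append(entry)
    # Keep only the most recent MAX_PROJECTS entries.
    return kept[-MAX_PROJECTS:]


# repo_dir_name("gh", "gesiscss", "data_science_image")
# gives "gh_gesiscss_data_science_image"
```

Returning a new list instead of mutating the old one makes it simple to write the result back into the spawner state in one step.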

In short, the binder form is used to create a new project and to update it from remote.

Your Projects

On first login, the user has only the default repo (gesiscss/data_science_image) there. Each repo that is built and launched via the binder form is added to this table, and the user can restart that repository with the start button on its row. When the user clicks a start button in the table:

  • A server is started using the image (commit) that the user last worked with
  • Right now it is not working, but we want to skip the nbgitpuller command on server start when the server is started from the projects table, so that users can continue working where they left off. We can do this by passing an option to the spawner (I think this is closely related to https://github.com/jupyterhub/binderhub/issues/712)
  • We are also thinking about adding a delete button to the actions of the table, which removes the repository from the table and deletes the repo’s folder in the user’s persistent volume. Right now the button is in the actions column but it doesn’t do anything.
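
The intended difference between the two start paths can be sketched as a small decision function. Everything here is hypothetical: `plan_server_start`, the `from_table` flag and the project dictionary keys are illustrative names, not part of JupyterHub, BinderHub or the spawner. Only the rules themselves (no pull when starting from the table, no pull for the default repo) come from the description.

```python
# Default repo named in the description; nbgitpuller is never run for it.
DEFAULT_REPO = "gesiscss/data_science_image"


def plan_server_start(project, from_table):
    """Decide which image to start and whether to pull code from remote.

    `project` is assumed to be a dict with "repo" and "image" keys.
    `from_table` is True when the start button in the projects table
    was used, False when the repo was launched via the binder form.
    """
    is_default = project["repo"] == DEFAULT_REPO
    return {
        # From the table this is the image (commit) the user last worked with.
        "image": project["image"],
        # Launch via form: sync the code. Start from table: skip the sync
        # so the user continues where they left off.
        "run_nbgitpuller": not from_table and not is_default,
    }
```

In a real deployment the resulting flag would have to reach the spawner somehow, for example as a user option, which is the part that is not working yet.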

In short, the “Your Projects” table is used to continue working on a repo (when you don’t want to update the image or code base from remote).

Limitations and missing parts summary

  • nbgitpuller must be installed in user images; right now we use appendix to ensure its installation (maybe it can be added to the repo2docker defaults)
  • Users can start a new project only from master (by using the binder form); they can’t start working on a repo from a previous version/commit
  • Starting a server from the table also executes nbgitpuller
  • The delete button doesn’t do anything
  • Name generation for the sub directory of each repo/project can be done better
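
For the missing delete action, one possible shape is sketched below. It assumes the project list is a list of dicts with a "repo_dir" key and that the folder sits directly under the user's home directory. `delete_project` and its parameters are hypothetical names, and the path check is an added safety measure rather than something described in the issue.

```python
import shutil
from pathlib import Path


def delete_project(projects, repo_dir, home="/home/jovyan"):
    """Remove a project folder and return the project list without it.

    The folder must resolve to a location inside `home`; anything else
    (the home itself, "..", absolute paths elsewhere) is rejected.
    """
    home_path = Path(home).resolve()
    target = (home_path / repo_dir).resolve()
    if target == home_path or home_path not in target.parents:
        raise ValueError(f"refusing to delete outside of {home_path}: {repo_dir!r}")
    if target.is_dir():
        # Deletes the repo folder on the user's persistent volume.
        shutil.rmtree(target)
    # Drop the matching row from the "Your Projects" list.
    return [p for p in projects if p["repo_dir"] != repo_dir]
```

Since the volume is mounted in the user's server, such a function would have to run somewhere that can see the volume, which the issue leaves open.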

Where to find helm config and custom templates

https://notebooks-test.gesis.org/jupyter/ uses the GitHub authenticator and everybody is welcome to log in and try it out (it is just a test instance and will be deleted again). We would really like to get your feedback on what we have done so far. Probably the most important question is whether we are on the right track to accomplish what we want. Finally, we are aware that there is a lot to improve in the user interface.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 10
  • Comments: 12 (3 by maintainers)

Top GitHub Comments

bitnik commented, Dec 11, 2019 (1 reaction)

I am closing this issue. We can continue discussing this on https://discourse.jupyter.org/t/a-persistent-binderhub-deployment/2865.

ltetrel commented, Nov 18, 2019 (0 reactions)

Thanks @arnim. But in our case we want persistent storage. We got it working using the ideas here: https://discourse.jupyter.org/t/mounting-server-data-on-each-users-pod/641/4. We have NFS storage mounted on each node to centralize data administration and avoid duplication: https://github.com/neurolibre/neurolibre-binderhub/issues/18. We were also thinking of using an initContainer instead of putting repo2data into the config file. This has the advantage of making the data download (if needed) more independent, since it runs in a separate container.
