JupyterHub is the best way to serve Jupyter notebooks to multiple users. It can be used by a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.
To make life easier, JupyterHub has distributions. Be sure to take a look at them before continuing with the configuration of the broad original system of JupyterHub. Today, there are two main use cases:
a small number of users (0-100) on a single server, covered by The Littlest JupyterHub
a larger, dynamic number of users and machines on a cloud, covered by Zero to JupyterHub with Kubernetes
Four subsystems make up JupyterHub:
a Hub (tornado process), which is the heart of JupyterHub
a configurable http proxy (node-http-proxy), which receives the requests from the client's browser
multiple single-user Jupyter notebook servers (Python/IPython/tornado), which are monitored by Spawners
an authentication class, which manages how users can access the system
Besides these central pieces, you can add optional configurations through a config.py file and manage users' kernels from an admin panel. A simplification of the whole system can be seen in the figure below:
JupyterHub performs the following functions:
The Hub launches a proxy
The proxy forwards all requests to the Hub by default
The Hub handles user login and spawns single-user servers on demand
The Hub configures the proxy to forward URL prefixes to the single-user notebook servers
For convenient administration of the Hub, its users, and services, JupyterHub also provides a REST API.
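For example, listing all users through the REST API can be sketched as follows. The URL and token below are placeholder values (the API token is one you would generate yourself, e.g. with jupyterhub token <username>); the request is constructed but not sent:

```python
import urllib.request

# Placeholder values: the Hub API's default local address and a token you
# would generate yourself (e.g. with `jupyterhub token <username>`).
api_url = "http://127.0.0.1:8081/hub/api"
token = "abc123"

# Build an authenticated request for the list of users; calling
# urllib.request.urlopen(req) against a running Hub would return a JSON
# list of user models.
req = urllib.request.Request(
    f"{api_url}/users",
    headers={"Authorization": f"token {token}"},
)
print(req.full_url)
```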
The JupyterHub team and Project Jupyter value our community, and JupyterHub follows the Jupyter Community Guides.
A JupyterHub distribution is tailored towards a particular set of use cases. Distributions are generally easier to set up than JupyterHub from scratch, assuming they fit your use case.
The two popular ones are:
Zero to JupyterHub on Kubernetes, for scalable deployments on cloud resources
The Littlest JupyterHub, for 1-100 users on a single server
These sections cover how to get up-and-running with JupyterHub. They cover some basics of the tools needed to deploy JupyterHub as well as how to get it running on your own infrastructure.
Before installing JupyterHub, you will need:
a Linux/Unix based system
Python 3.5 or greater. An understanding of using pip or conda for installing Python packages is helpful.
nodejs/npm. Install nodejs/npm, using your operating system’s package manager.
If you are using conda, the nodejs and npm dependencies will be installed for you by conda.
If you are using pip, install a recent version of nodejs/npm. For example, install it on Linux (Debian/Ubuntu) using:
sudo apt-get install npm nodejs-legacy
The nodejs-legacy package installs the node executable and is currently required for npm to work on Debian/Ubuntu.
A pluggable authentication module (PAM) to use the default Authenticator. PAM is available by default on most distributions; if it is not, it can be installed using the operating system's package manager.
TLS certificate and key for HTTPS communication
Domain name
Before running the single-user notebook servers (which may be on the same system as the Hub or not), you will need Jupyter Notebook version 4 or greater.
JupyterHub can be installed with pip (and the proxy with npm) or conda:
pip, npm:
python3 -m pip install jupyterhub
npm install -g configurable-http-proxy
python3 -m pip install notebook  # needed if running the notebook servers locally
conda (one command installs jupyterhub and proxy):
conda install -c conda-forge jupyterhub  # installs jupyterhub and proxy
conda install notebook                   # needed if running the notebook servers locally
Test your installation. If installed, these commands should return the packages’ help contents:
jupyterhub -h
configurable-http-proxy -h
To start the Hub server, run the command:
jupyterhub
Visit https://localhost:8000 in your browser, and sign in with your unix credentials.
To allow multiple users to sign in to the Hub server, you must start jupyterhub as a privileged user, such as root:
sudo jupyterhub
The wiki describes how to run the server as a less privileged user. This requires additional configuration of the system.
Important
We highly recommend following the Zero to JupyterHub tutorial for installing JupyterHub.
A ready-to-go docker image gives a straightforward deployment of JupyterHub.
Note
This jupyterhub/jupyterhub docker image is only an image for running the Hub service itself. It does not provide the other Jupyter components, such as Notebook installation, which are needed by the single-user servers. To run the single-user servers, which may be on the same system as the Hub or not, Jupyter Notebook version 4 or greater must be installed.
The JupyterHub docker image can be started with the following command:
docker run -d -p 8000:8000 --name jupyterhub jupyterhub/jupyterhub jupyterhub
This command will create a container named jupyterhub that you can stop and resume with docker stop/start.
The Hub service will be listening on all interfaces at port 8000, which makes this a good choice for testing JupyterHub on your desktop or laptop.
If you want to run docker on a computer that has a public IP, then you must secure it with SSL, either by adding SSL options to your docker configuration or by using an SSL-enabled proxy.
Mounting volumes will allow you to store data outside the docker container, on the host system, so it persists even when you start a new container.
The command docker exec -it jupyterhub bash will spawn a root shell in your docker container. You can use the root shell to create system users in the container. These accounts will be used for authentication in JupyterHub’s default configuration.
JupyterHub is supported on Linux/Unix based systems. To use JupyterHub, you need a Unix server (typically Linux) running somewhere that is accessible to your team on the network. The JupyterHub server can be on an internal network at your organization, or it can run on the public internet (in which case, take care with the Hub’s security).
JupyterHub officially does not support Windows. You may be able to use JupyterHub on Windows if you use a Spawner and Authenticator that work on Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not be accepted, and the test suite will not run on Windows. Small patches that fix minor Windows compatibility issues (such as basic installation) may be accepted, however. For Windows-based systems, we would recommend running JupyterHub in a docker container or Linux VM.
Additional Reference: Tornado’s documentation on Windows platform support
Prior to beginning installation, it’s helpful to consider some of the following:
It is recommended to put all of the files used by JupyterHub into standard UNIX filesystem locations:
/srv/jupyterhub for all security and runtime files
/etc/jupyterhub for all configuration files
/var/log for log files
The combination of JupyterHub and JupyterLab is a great way to make shared computing resources available to a group.
These instructions are a guide for a manual, ‘bare metal’ install of JupyterHub and JupyterLab. This is ideal for running on a single server: build a beast of a machine and share it within your lab, or use a virtual machine from any VPS or cloud provider.
This guide has similar goals to The Littlest JupyterHub setup script. However, instead of bundling all these steps for you into one installer, we will perform every step manually. This makes it easy to customize any part (e.g. if you want to run other services on the same system and need to make them work together), as well as giving you full control and understanding of your setup.
Your own server with administrator (root) access. This could be a local machine, a remotely hosted one, or a cloud instance or VPS. Each user who will access JupyterHub should have a standard user account on the machine. The install will be done through the command line - useful if you log into your machine remotely using SSH.
This tutorial was tested on Ubuntu 18.04. No other Linux distributions have been tested, but the instructions should be reasonably straightforward to adapt.
JupyterLab enables access to multiple 'kernels', each one being a given environment for a given language. The most common is a Python environment; for scientific computing, it is usually one managed by the conda package manager.
This guide will set up JupyterHub and JupyterLab separately from the Python environments. In other words, we treat JupyterHub+JupyterLab as an 'app' or web service, which will connect to the kernels available on the system. Specifically, everything will be installed under /opt.
The default JupyterHub Authenticator uses PAM to authenticate system users with their username and password. One can choose the authenticator that best suits their needs. In this guide we will use the default Authenticator because it makes it easy for everyone to manage data in their home folder and to mix and match different services and access methods (e.g. SSH) which all work using the Linux system user accounts. Therefore, each user of JupyterHub will need a standard system user account.
Another goal of this guide is to use system provided packages wherever possible. This has the advantage that these packages get automatic patches and security updates (be sure to turn on automatic updates in Ubuntu). This means less maintenance work and a more reliable system.
First we create a virtual environment under ‘/opt/jupyterhub’. The ‘/opt’ folder is where apps not belonging to the operating system are commonly installed. Both jupyterlab and jupyterhub will be installed into this virtualenv. Create it with the command:
sudo python3 -m venv /opt/jupyterhub/
Now we use pip to install the required Python packages into the new virtual environment. Be sure to install wheel first. Since we are separating the user interface from the computing kernels, we don’t install any Python scientific packages here. The only exception is ipywidgets because this is needed to allow connection between interactive tools running in the kernel and the user interface.
Note that we use /opt/jupyterhub/bin/python3 -m pip install each time - this makes sure that the packages are installed to the correct virtual environment.
Perform the install using the following commands:
sudo /opt/jupyterhub/bin/python3 -m pip install wheel
sudo /opt/jupyterhub/bin/python3 -m pip install jupyterhub jupyterlab
sudo /opt/jupyterhub/bin/python3 -m pip install ipywidgets
JupyterHub also currently defaults to requiring configurable-http-proxy, which needs nodejs and npm. These therefore need to be installed first; the versions available in Ubuntu are a bit old, but this is ok for our needs:
sudo apt install nodejs npm
Then install configurable-http-proxy:
sudo npm install -g configurable-http-proxy
Now we start creating configuration files. To keep everything together, we put all the configuration into the folder created for the virtualenv, under /opt/jupyterhub/etc/. For each thing needing configuration, we will create a further subfolder and necessary files.
First create the folder for the JupyterHub configuration and navigate to it:
sudo mkdir -p /opt/jupyterhub/etc/jupyterhub/
cd /opt/jupyterhub/etc/jupyterhub/
Then generate the default configuration file
sudo /opt/jupyterhub/bin/jupyterhub --generate-config
This will produce the default configuration file /opt/jupyterhub/etc/jupyterhub/jupyterhub_config.py
You will need to edit the configuration file to make JupyterLab the default interface. Set the following configuration option in your jupyterhub_config.py file:
c.Spawner.default_url = '/lab'
Further configuration options may be found in the documentation.
We will setup JupyterHub to run as a system service using Systemd (which is responsible for managing all services and servers that run on startup in Ubuntu). We will create a service file in a suitable location in the virtualenv folder and then link it to the system services. First create the folder for the service file:
sudo mkdir -p /opt/jupyterhub/etc/systemd
Then create the following text file using your favourite editor at
/opt/jupyterhub/etc/systemd/jupyterhub.service
Paste the following service unit definition into the file:
[Unit]
Description=JupyterHub
After=syslog.target network.target

[Service]
User=root
Environment="PATH=/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/jupyterhub/bin"
ExecStart=/opt/jupyterhub/bin/jupyterhub -f /opt/jupyterhub/etc/jupyterhub/jupyterhub_config.py

[Install]
WantedBy=multi-user.target
This sets up the environment to use the virtual environment we created, tells Systemd how to start jupyterhub using the configuration file we created, specifies that jupyterhub will be started as the root user (needed so that it can start jupyter on behalf of other logged in users), and specifies that jupyterhub should start on boot after the network is enabled.
Finally, we need to make systemd aware of our service file. First we symlink our file into systemd’s directory:
sudo ln -s /opt/jupyterhub/etc/systemd/jupyterhub.service /etc/systemd/system/jupyterhub.service
Then tell systemd to reload its configuration files
sudo systemctl daemon-reload
And finally enable the service
sudo systemctl enable jupyterhub.service
The service will start on reboot, but we can start it straight away using:
sudo systemctl start jupyterhub.service
…and check that it’s running using:
sudo systemctl status jupyterhub.service
You should now be able to access jupyterhub using <your servers ip>:8000 (assuming you haven't already set up a firewall or something). However, when you log in, the jupyter notebooks will try to use the Python virtualenv that was created to install JupyterHub, which is not what we want. So on to part 2.
We will use conda to manage Python environments. We will install the officially maintained conda packages for Ubuntu; this means they will get automatic updates with the rest of the system. Set up the repo for the official Conda debian packages (instructions are copied from here):
Install the Anaconda public gpg key to the trusted store
curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg
sudo install -o root -g root -m 644 conda.gpg /etc/apt/trusted.gpg.d/
Add Debian repo
echo "deb [arch=amd64] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | sudo tee /etc/apt/sources.list.d/conda.list
Install conda
sudo apt update
sudo apt install conda
This will install conda into the folder /opt/conda/, with the conda command available at /opt/conda/bin/conda.
Finally, we can make conda more easily available to users by symlinking the conda shell setup script to the profile ‘drop in’ folder so that it gets run on login
sudo ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
First create a folder for conda envs (might exist already):
sudo mkdir /opt/conda/envs/
Then create a conda environment to your liking within that folder. Here we have called it ‘python’ because it will be the obvious default - call it whatever you like. You can install whatever you like into this environment, but you MUST at least install ipykernel.
sudo /opt/conda/bin/conda create --prefix /opt/conda/envs/python python=3.7 ipykernel
Once your env is set up as desired, make it visible to Jupyter by installing the kernel spec. There are two options here:
1 ) Install into the JupyterHub virtualenv - this ensures it overrides the default python version. It will only be visible to the JupyterHub installation we have just created. This is useful to avoid conda environments appearing where they are not expected.
sudo /opt/conda/envs/python/bin/python -m ipykernel install --prefix=/opt/jupyterhub/ --name 'python' --display-name "Python (default)"
2 ) Install it system-wide by putting it into /usr/local. It will be visible to any parallel install of JupyterHub or JupyterLab, and will persist even if you later delete or modify the JupyterHub installation. This is useful if the kernels might be used by other services, or if you want to modify the JupyterHub installation independently from the conda environments.
sudo /opt/conda/envs/python/bin/python -m ipykernel install --prefix /usr/local/ --name 'python' --display-name "Python (default)"
There is relatively little for the administrator to do here, as users will have to set up their own environments using the shell. On login they should run conda init or /opt/conda/bin/conda. They can then use conda to set up their environment, although they must also install ipykernel. Once done, they can enable their kernel using:
/path/to/kernel/env/bin/python -m ipykernel install --name 'python-my-env' --display-name "Python My Env"
This will place the kernel spec into their home folder, where Jupyter will look for it on startup.
The guide so far results in JupyterHub running on port 8000. It is not generally advisable to run open web services in this way - instead, use a reverse proxy running on standard HTTP/HTTPS ports.
Important: Be aware of the security implications, especially if you are running a server that is accessible from the open internet, i.e. not protected within an institutional intranet or private home/office network. You should set up a firewall and HTTPS encryption, which is outside the scope of this guide. For HTTPS, consider using LetsEncrypt or setting up a self-signed certificate. Firewalls may be set up using ufw or firewalld and combined with fail2ban.
Nginx is a mature and established web server and reverse proxy and is easy to install using sudo apt install nginx. Details on using Nginx as a reverse proxy can be found elsewhere. Here, we will only outline the additional steps needed to setup JupyterHub with Nginx and host it at a given URL e.g. <your-server-ip-or-url>/jupyter. This could be useful for example if you are running several services or web pages on the same server.
Achieving this needs a few tweaks to both the JupyterHub configuration and the Nginx config. First, edit the configuration file /opt/jupyterhub/etc/jupyterhub/jupyterhub_config.py and add the line:
c.JupyterHub.bind_url = 'http://:8000/jupyter'
where /jupyter will be the relative URL of the JupyterHub.
Now Nginx must be configured to pass all traffic from /jupyter to the local address 127.0.0.1:8000. Add the following snippet to your nginx configuration file (e.g. /etc/nginx/sites-available/default).
location /jupyter/ {
    # NOTE: important to also set the base url of jupyterhub to /jupyter in its config
    proxy_pass http://127.0.0.1:8000;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # websocket headers
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
}
Also add this snippet before the server block:
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
Nginx will not run if there are errors in the configuration. Check your configuration using:
nginx -t
If there are no errors, you can restart the Nginx service for the new configuration to take effect.
sudo systemctl restart nginx.service
Once you have set up JupyterHub and the Nginx proxy as described, you can browse to your JupyterHub IP or URL (e.g. if your server IP address is 123.456.789.1 and you decided to host JupyterHub at the /jupyter URL, browse to 123.456.789.1/jupyter). You will find a login page where you enter your Linux username and password. On login you will be presented with the JupyterLab interface, with the file browser pane showing the contents of your user's home directory on the server.
This section covers how to configure and customize JupyterHub for your needs. It contains information about authentication, networking, security, and other topics that are relevant to individuals or organizations deploying their own JupyterHub.
The section contains basic information about configuring settings for a JupyterHub deployment. The Technical Reference documentation provides additional details.
This section will help you learn how to:
On startup, JupyterHub will look by default for a configuration file, jupyterhub_config.py, in the current working directory.
To generate a default config file, jupyterhub_config.py:
jupyterhub --generate-config
This default jupyterhub_config.py file contains comments and guidance for all configuration variables and their default values. We recommend storing configuration files in the standard UNIX filesystem location, i.e. /etc/jupyterhub.
You can load a specific config file and start JupyterHub using:
jupyterhub -f /path/to/jupyterhub_config.py
If you have stored your configuration file in the recommended UNIX filesystem location, /etc/jupyterhub, the following command will start JupyterHub using the configuration file:
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
The IPython documentation provides additional information on the config system that Jupyter uses.
To display all command line options that are available for configuration:
jupyterhub --help-all
Configuration using the command line options is done when launching JupyterHub. For example, to start JupyterHub on 10.0.1.2:443 with https, you would enter:
jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
All configurable options may technically be set on the command-line, though some are inconvenient to type. To set a particular configuration parameter, c.Class.trait, you would use the command line option, --Class.trait, when starting JupyterHub. For example, to configure the c.Spawner.notebook_dir trait from the command-line, use the --Spawner.notebook_dir option:
jupyterhub --Spawner.notebook_dir='~/assignments'
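The equivalent jupyterhub_config.py entry for this option is:

```python
c.Spawner.notebook_dir = '~/assignments'
```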
The default authentication and process spawning mechanisms can be replaced, and specific authenticators and spawners can be set in the configuration file. This enables JupyterHub to be used with a variety of authentication methods or process control and deployment environments. Some examples, meant as illustration, are:
Using GitHub OAuth instead of PAM with OAuthenticator
Spawning single-user servers with Docker, using the DockerSpawner
This is not strictly necessary, but useful in many cases. If you use a custom proxy (e.g. Traefik), this is also not needed.
Connections to user servers go through the proxy, and not the hub itself. If the proxy stays running when the hub restarts (for maintenance, re-configuration, etc.), then user connections are not interrupted. For simplicity, by default the hub starts the proxy automatically, so if the hub restarts, the proxy restarts, and user connections are interrupted. It is easy to run the proxy separately; for more information, see the separate proxy page.
This section will help you with basic proxy and network configuration to:
set the proxy's IP address and port
set the proxy's REST API communication URL
configure the Hub if the Proxy or Spawners are remote or isolated
set the hub_connect_ip which services will use to communicate with the hub
The Proxy’s main IP address setting determines where JupyterHub is available to users. By default, JupyterHub is configured to be available on all network interfaces ('') on port 8000. Note: Use of '*' is discouraged for IP configuration; instead, use of '0.0.0.0' is preferred.
Changing the Proxy’s main IP address and port can be done with the following JupyterHub command line options:
jupyterhub --ip=192.168.1.2 --port=443
Or by placing the following lines in a configuration file, jupyterhub_config.py:
c.JupyterHub.ip = '192.168.1.2'
c.JupyterHub.port = 443
Port 443 is used in the examples since 443 is the default port for SSL/HTTPS.
Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub. However, more customized scenarios may need additional networking details to be configured.
Note that c.JupyterHub.ip and c.JupyterHub.port are single values, not tuples or lists – JupyterHub listens to only a single IP address and port.
The Hub service talks to the proxy via a REST API on a secondary port. By default, this REST API listens on port 8001 of localhost only. The API URL can be configured separately to override the default settings.
The URL to access the API, c.ConfigurableHTTPProxy.api_url, is configurable. An example entry to set the proxy's API URL in jupyterhub_config.py is:
c.ConfigurableHTTPProxy.api_url = 'http://10.0.1.4:5432'
If running the Proxy separate from the Hub, configure the REST API communication IP address and port by adding this to the jupyterhub_config.py file:
# ideally a private network address
c.JupyterHub.proxy_api_ip = '10.0.1.4'
c.JupyterHub.proxy_api_port = 5432
We recommend using the proxy’s api_url setting instead of the deprecated settings, proxy_api_ip and proxy_api_port.
The Hub service listens only on localhost (port 8081) by default. The Hub needs to be accessible from both the proxy and all Spawners. When spawning local servers, an IP address setting of localhost is fine.
If either the Proxy or (more likely) the Spawners will be remote or isolated in containers, the Hub must listen on an IP that is accessible.
c.JupyterHub.hub_ip = '10.0.1.4'
c.JupyterHub.hub_port = 54321
Added in 0.8: The c.JupyterHub.hub_connect_ip setting is the ip address or hostname that other services should use to connect to the Hub. A common configuration for, e.g. docker, is:
c.JupyterHub.hub_ip = '0.0.0.0'  # listen on all interfaces
c.JupyterHub.hub_connect_ip = '10.0.1.4'  # ip as seen on the docker network. Can also be a hostname.
The hub will most commonly be running on a hostname of its own. If it is not – for example, if the hub is being reverse-proxied and being exposed at a URL such as https://proxy.example.org/jupyter/ – then you will need to tell JupyterHub the base URL of the service. In such a case, it is both necessary and sufficient to set c.JupyterHub.base_url = '/jupyter/' in the configuration.
You should not run JupyterHub without SSL encryption on a public network.
Security is the most important aspect of configuring Jupyter. Three settings are the main aspects of security configuration:
SSL encryption (to enable HTTPS)
Cookie secret (a key for encrypting browser cookies)
Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
The Hub hashes all secrets (e.g., auth tokens) before storing them in its database. A loss of control over read-access to the database should have minimal impact on your deployment; if your database has been compromised, it is still a good idea to revoke existing tokens.
Since JupyterHub includes authentication and allows arbitrary code execution, you should not run it without SSL (HTTPS).
This will require you to obtain an official, trusted SSL certificate or create a self-signed certificate. Once you have obtained and installed a key and certificate you need to specify their locations in the jupyterhub_config.py configuration file as follows:
c.JupyterHub.ssl_key = '/path/to/my.key'
c.JupyterHub.ssl_cert = '/path/to/my.cert'
Some cert files also contain the key, in which case only the cert is needed. It is important that these files be put in a secure location on your server, where they are not readable by regular users.
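A quick way to sanity-check the permissions is a short Python sketch (the key path and contents here are placeholders; a temporary directory stands in for your real key location):

```python
import os
import stat
import tempfile

# Illustrative sketch: the key path and contents are placeholders.
# Store the key with owner-only permissions so group/other cannot read it.
keyfile = os.path.join(tempfile.mkdtemp(), "my.key")
with open(keyfile, "w") as f:
    f.write("placeholder private key material\n")
os.chmod(keyfile, 0o600)  # owner read/write only

mode = stat.S_IMODE(os.stat(keyfile).st_mode)
# Neither group nor other may read the key.
assert mode & (stat.S_IRGRP | stat.S_IROTH) == 0
```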
If you are using a chain certificate, see also chained certificate for SSL in the JupyterHub Troubleshooting FAQ.
It is also possible to use letsencrypt to obtain a free, trusted SSL certificate. If you run letsencrypt using the default options, the needed configuration is (replace mydomain.tld by your fully qualified domain name):
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
If the fully qualified domain name (FQDN) is example.com, the following would be the needed configuration:
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
In certain cases, for example if the hub is running behind a reverse proxy, and SSL termination is being provided by NGINX, it is reasonable to run the hub without SSL.
To achieve this, simply omit the configuration settings c.JupyterHub.ssl_key and c.JupyterHub.ssl_cert (setting them to None does not have the same effect, and is an error).
The Hub authenticates its requests to the Proxy using a secret token that the Hub and Proxy agree upon. Note that this applies to the default ConfigurableHTTPProxy implementation. Not all proxy implementations use an auth token.
The value of this token should be a random string (for example, generated by openssl rand -hex 32). You can store it in the configuration file or in an environment variable.
You can set the value in the configuration file, jupyterhub_config.py:
c.ConfigurableHTTPProxy.api_token = 'abc123...' # any random string
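Equivalently, the random string can be generated in Python with the standard-library secrets module:

```python
import secrets

# Equivalent of `openssl rand -hex 32`: 32 random bytes as 64 hex characters.
api_token = secrets.token_hex(32)
print(api_token)
```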
You can pass the proxy authentication token to the Hub and Proxy using the CONFIGPROXY_AUTH_TOKEN environment variable:
export CONFIGPROXY_AUTH_TOKEN=$(openssl rand -hex 32)
This environment variable needs to be visible to the Hub and Proxy.
If you don’t set the Proxy authentication token, the Hub will generate a random key itself, which means that any time you restart the Hub you must also restart the Proxy. If the proxy is a subprocess of the Hub, this should happen automatically (this is the default configuration).
The cookie secret is an encryption key, used to encrypt the browser cookies which are used for authentication. Three common methods are described for generating and configuring the cookie secret.
The cookie secret should be 32 random bytes, encoded as hex, and is typically stored in a jupyterhub_cookie_secret file. An example command to generate the jupyterhub_cookie_secret file is:
openssl rand -hex 32 > /srv/jupyterhub/jupyterhub_cookie_secret
In most deployments of JupyterHub, you should point this to a secure location on the file system, such as /srv/jupyterhub/jupyterhub_cookie_secret.
The location of the jupyterhub_cookie_secret file can be specified in the jupyterhub_config.py file as follows:
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/jupyterhub_cookie_secret'
If the cookie secret file doesn’t exist when the Hub starts, a new cookie secret is generated and stored in the file. The file must not be readable by group or other or the server won’t start. The recommended permissions for the cookie secret file are 600 (owner-only rw).
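The steps above can be sketched in Python (a temporary directory stands in for /srv/jupyterhub):

```python
import os
import secrets
import tempfile

# Sketch of the steps above, using a temporary directory as a stand-in
# for /srv/jupyterhub. Writes a 32-byte hex secret and restricts
# permissions to owner-only (600), as the Hub requires.
secret_file = os.path.join(tempfile.mkdtemp(), "jupyterhub_cookie_secret")
with open(secret_file, "w") as f:
    f.write(secrets.token_hex(32) + "\n")
os.chmod(secret_file, 0o600)  # must not be readable by group/other
```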
If you would like to avoid the need for files, the value can be loaded in the Hub process from the JPY_COOKIE_SECRET environment variable, which is a hex-encoded string. You can set it this way:
export JPY_COOKIE_SECRET=$(openssl rand -hex 32)
For security reasons, this environment variable should only be visible to the Hub. If you set it dynamically as above, all users will be logged out each time the Hub starts.
You can also set the cookie secret in the configuration file itself, jupyterhub_config.py, as a binary string:
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
If the cookie secret value changes for the Hub, all single-user notebook servers must also be restarted.
The following cookies are used by the Hub for handling user authentication.
This section was created based on this post from Discourse.
The jupyterhub-hub-login cookie is the login token used when visiting Hub-served pages that are protected by authentication, such as the main home and the spawn form. If this cookie is set, then the user is logged in.
Resetting the Hub cookie secret effectively revokes this cookie.
This cookie is restricted to the path /hub/.
The jupyterhub-user-<username> cookie is used for authenticating with a single-user server. It is set by the single-user server after OAuth with the Hub.
Effectively the same as jupyterhub-hub-login, but for the single-user server instead of the Hub. It contains an OAuth access token, which is checked with the Hub to authenticate the browser.
Each OAuth access token is associated with a session id (see jupyterhub-session-id section below).
To avoid hitting the Hub on every request, the authentication response is cached. To avoid a stale cache, the cache key comprises both the token and the session id.
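As an illustrative sketch (not JupyterHub's actual implementation), such a composite cache key could be derived as:

```python
import hashlib

def cache_key(access_token: str, session_id: str) -> str:
    """Illustrative only: combine the session id and token so that a new
    session invalidates any cached authentication response."""
    return hashlib.sha256(f"{session_id}:{access_token}".encode()).hexdigest()

# Same token, different sessions -> different cache entries.
assert cache_key("token-abc", "session-1") != cache_key("token-abc", "session-2")
```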
This cookie is restricted to the path /user/<username>, so that only the user’s server receives it.
This is a random string, meaningless in itself, and the only cookie shared by the Hub and single-user servers.
Its sole purpose is to coordinate logout of the multiple OAuth cookies.
This cookie is set to / so all endpoints can receive it, or clear it, etc.
A short-lived cookie, used solely to store and validate OAuth state. It is only set while OAuth between the single-user server and the Hub is processing.
If you use your browser's development tools, you should see this cookie for a very brief moment before you are logged in, with an expiration date shorter than jupyterhub-hub-login or jupyterhub-user-<username>.
This cookie should not exist after you have successfully logged in.
The default Authenticator uses PAM to authenticate system users with their username and password. With the default Authenticator, any user with an account and password on the system will be allowed to login.
You can restrict which users are allowed to login with a set, Authenticator.allowed_users:
c.Authenticator.allowed_users = {'mal', 'zoe', 'inara', 'kaylee'}
Users in the allowed_users set are added to the Hub database when the Hub is started.
admin_users
Admin users of JupyterHub, admin_users, can add and remove users from the allowed_users set. admin_users can take actions on other users' behalf, such as stopping and restarting their servers.
A set of initial admin users, admin_users, can be configured as follows:
c.Authenticator.admin_users = {'mal', 'zoe'}
Users in the admin set are automatically added to the allowed_users set, if they are not already present.
Each authenticator may have different ways of determining whether a user is an administrator. By default, JupyterHub uses the PAMAuthenticator, which provides the admin_groups option and can determine administrator status based on a user's group membership. For example, we can let any user in the wheel group be an admin:
c.PAMAuthenticator.admin_groups = {'wheel'}
admin_access
Since the default JupyterHub.admin_access setting is False, the admins do not have permission to log in to the single user notebook servers owned by other users. If JupyterHub.admin_access is set to True, then admins have permission to log in as other users on their respective machines, for debugging. As a courtesy, you should make sure your users know if admin_access is enabled.
Users can be added to and removed from the Hub via either the admin panel or the REST API. When a user is added, the user will be automatically added to the allowed users set and database. Restarting the Hub will not require manually updating the allowed users set in your config file, as the users will be loaded from the database.
After starting the Hub once, it is not sufficient to remove a user from the allowed users set in your config file. You must also remove the user from the Hub’s database, either by deleting the user from JupyterHub’s admin page, or you can clear the jupyterhub.sqlite database and start fresh.
The LocalAuthenticator is a special kind of authenticator that has the ability to manage users on the local system. When you try to add a new user to the Hub, a LocalAuthenticator will check if the user already exists. If you set the configuration value, create_system_users, to True in the configuration file, the LocalAuthenticator has the privileges to add users to the system. The setting in the config file is:
c.LocalAuthenticator.create_system_users = True
Adding a user to the Hub that doesn’t already exist on the system will result in the Hub creating that user via the system adduser command line tool. This option is typically used on hosted deployments of JupyterHub, to avoid the need to manually create all your users before launching the service. This approach is not recommended when running JupyterHub in situations where JupyterHub users map directly onto the system’s UNIX users.
JupyterHub’s OAuthenticator currently supports the following popular services:
A generic implementation, which you can use for OAuth authentication with any provider, is also available.
The DummyAuthenticator is a simple authenticator that allows any username/password combination unless a global password has been set. If a global password is set, any username will be accepted as long as the correct password is provided. To set a global password, add this to the config file:
c.DummyAuthenticator.password = "some_password"
Since the single-user server is an instance of jupyter notebook, an entire separate multi-process application, there are many aspects of that server you can configure, and many ways to express that configuration.
At the JupyterHub level, you can set some values on the Spawner. The simplest of these is Spawner.notebook_dir, which lets you set the root directory for a user’s server. This root notebook directory is the highest level directory users will be able to access in the notebook dashboard. In this example, the root notebook directory is set to ~/notebooks, where ~ is expanded to the user’s home directory.
c.Spawner.notebook_dir = '~/notebooks'
You can also specify extra command-line arguments to the notebook server with:
c.Spawner.args = ['--debug', '--profile=PHYS131']
This could be used to set the user's default page for the single-user server:
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
Since the single-user server extends the notebook server application, it still loads configuration from the jupyter_notebook_config.py config file. Each user may have one of these files in $HOME/.jupyter/. Jupyter also supports loading system-wide config files from /etc/jupyter/, which is the place to put configuration that you want to affect all of your users.
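As a sketch, a system-wide file might look like the following (the landing-page URL is an example value, not a required setting):

```python
# /etc/jupyter/jupyter_notebook_config.py -- loaded by every user's server
# The default_url below is purely illustrative.
c.NotebookApp.default_url = '/notebooks/Welcome.ipynb'
```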
When working with JupyterHub, a Service is defined as a process that interacts with the Hub's REST API. A Service may perform a specific action or task. For example, shutting down individual single-user notebook servers that have been idle for some time is a task that could be automated by a Service. Let's look at how the jupyterhub_idle_culler script can be used as a Service.
JupyterHub has a REST API that can be used by external services.
Both examples for jupyterhub_idle_culler will communicate tasks to the Hub via the REST API.
To run such an external service, an API token must be created and provided to the service.
As of version 0.6.0, the preferred way of doing this is to first generate an API token:
In version 0.8.0, a token request page for generating an API token is available from the JupyterHub user interface:
In the case of cull_idle_servers, it is passed as the environment variable called JUPYTERHUB_API_TOKEN.
While API tokens are often associated with a specific user, API tokens can be used by services that require external access for activities that may not correspond to a specific human, e.g. adding users during setup for a tutorial or workshop. Add a service and its API token to the JupyterHub configuration file, jupyterhub_config.py:
c.JupyterHub.services = [
    {'name': 'adding-users', 'api_token': 'super-secret-token'},
]
Upon restarting JupyterHub, you should see a message like below in the logs:
Adding API token for <username>
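As a sketch of what such a service might do, the helper below builds an authenticated request against the Hub's REST API. The API URL is the Hub's default; the token value is an assumption carried over from the example configuration above:

```python
import json
from urllib.request import Request, urlopen

def hub_request(path, token, api_url="http://127.0.0.1:8081/hub/api"):
    """Build a request to the Hub's REST API, authenticated
    with an API token via the Authorization header."""
    return Request(f"{api_url}{path}",
                   headers={"Authorization": f"token {token}"})

# With a running Hub, listing users would look like:
# users = json.loads(urlopen(hub_request("/users", "super-secret-token")).read())
```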
In JupyterHub 0.7, there is no mechanism for token authentication to single-user servers, and only cookies can be used for authentication. 0.8 supports using JupyterHub API tokens to authenticate to single-user servers.
Install the idle culler:
pip install jupyterhub-idle-culler
In jupyterhub_config.py, add the following dictionary for the idle-culler Service to the c.JupyterHub.services list:
import sys

c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'admin': True,
        'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600'],
    }
]
where:
'admin': True grants the service permission to act on other users' servers, which the culler needs in order to stop them, and
'command' tells the Hub to launch the service as a subprocess that it manages.
Now you can run your script by providing it the API token, and it will authenticate through the REST API to interact with the Hub.
This will run the idle culler service manually. It can be run as a standalone script anywhere with access to the Hub, and will periodically check for idle servers and shut them down via the Hub’s REST API. In order to shutdown the servers, the token given to cull-idle must have admin privileges.
Generate an API token and store it in the JUPYTERHUB_API_TOKEN environment variable. Run jupyterhub_idle_culler manually.
export JUPYTERHUB_API_TOKEN='token'
python -m jupyterhub_idle_culler [--timeout=900] [--url=http://127.0.0.1:8081/hub/api]
In short, where you see /user/name/notebooks/foo.ipynb use /hub/user-redirect/notebooks/foo.ipynb (replace /user/name with /hub/user-redirect).
Sharing links to notebooks is a common activity, and can look different based on what you mean. Your first instinct might be to copy the URL you see in the browser, e.g. hub.jupyter.org/user/yourname/notebooks/coolthing.ipynb. However, let’s break down what this URL means:
hub.jupyter.org/user/yourname/ is the URL prefix handled by your server, which means that sharing this URL is asking the person you share the link with to come to your server and look at the exact same file. In most circumstances, this is forbidden by permissions because the person you share with does not have access to your server. What actually happens when someone visits this URL will depend on whether your server is running and other factors.
But what is our actual goal? A typical situation is that you have some shared or common filesystem, such that the same path corresponds to the same document (either the exact same document or another copy of it). Typically, what folks want when they do sharing like this is for each visitor to open the same file on their own server, so Breq would open /user/breq/notebooks/foo.ipynb and Seivarden would open /user/seivarden/notebooks/foo.ipynb, etc.
JupyterHub has a special URL that does exactly this! It's called /hub/user-redirect/.... If you replace /user/yourname in your URL bar with /hub/user-redirect, then after logging in each visitor will be taken to the same path on their own server, rather than visiting yours.
In JupyterLab 2.0, this should also be the result of the “Copy Shareable Link” action in the file browser.
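The substitution can also be automated with a small helper. This is a hypothetical function, not part of JupyterHub:

```python
import re

def shareable_link(url):
    """Replace the /user/<name>/ prefix with /hub/user-redirect/
    so the link opens the same path on each visitor's own server."""
    return re.sub(r"/user/[^/]+/", "/hub/user-redirect/", url, count=1)

# shareable_link("https://hub.jupyter.org/user/yourname/notebooks/coolthing.ipynb")
# -> "https://hub.jupyter.org/hub/user-redirect/notebooks/coolthing.ipynb"
```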
This page contains common questions from users of JupyterHub, broken down by their roles within organizations.
Yes! JupyterHub has been used at-scale for large pools of users, as well as complex and high-performance computing. For example, UC Berkeley uses JupyterHub for its Data Science Education Program courses (serving over 3,000 students). The Pangeo project uses JupyterHub to provide access to scalable cloud computing with Dask. JupyterHub is stable and customizable to the use-cases of large organizations.
Here is a quick breakdown of these three tools:
JupyterHub provides a shared platform for data science and collaboration. It allows users to utilize familiar data science workflows (such as the scientific python stack, the R tidyverse, and Jupyter Notebooks) on institutional infrastructure. It also allows administrators some control over access to resources, security, environments, and authentication.
Yes - the core JupyterHub application recently reached 1.0 status, and is considered stable and performant for most institutions. JupyterHub has also been deployed (along with other tools) to work on scalable infrastructure, large datasets, and high-performance computing.
JupyterHub is used at a variety of institutions in academia, industry, and government research labs. It is most-commonly used by two kinds of groups:
Here is a sample of organizations that use JupyterHub:
See the Gallery of JupyterHub deployments for a more complete list of JupyterHub deployments at institutions.
JupyterHub puts you in control of your data, infrastructure, and coding environment. In addition, it is vendor neutral, which reduces lock-in to a particular vendor or service. JupyterHub provides access to interactive computing environments in the cloud (similar to each of these services). Compared with the tools above, it is more flexible, more customizable, free, and gives administrators more control over their setup and hardware.
Because JupyterHub is an open-source, community-driven tool, it can be extended and modified to fit an institution’s needs. It plays nicely with the open source data science stack, and can serve a variety of computing environments, user interfaces, and computational hardware. It can also be deployed anywhere - on enterprise cloud infrastructure, on High-Performance-Computing machines, on local hardware, or even on a single laptop, which is not possible with most other tools for shared interactive computing.
That depends on what kind of hardware you’ve got. JupyterHub is flexible enough to be deployed on a variety of hardware, including in-room hardware, on-prem clusters, cloud infrastructure, etc.
The most common way to set up a JupyterHub is to use a JupyterHub distribution; these are pre-configured and opinionated ways to set up a JupyterHub on particular kinds of infrastructure. The two distributions that we currently suggest are:
Yes - most deployments of JupyterHub are run via cloud infrastructure and on a variety of cloud providers. Depending on the distribution of JupyterHub that you’d like to use, you can also connect your JupyterHub deployment with a number of other cloud-native services so that users have access to other resources from their interactive computing sessions.
For example, if you use the Zero to JupyterHub for Kubernetes distribution, you’ll be able to utilize container-based workflows of other technologies such as the dask-kubernetes project for distributed computing.
The Z2JH Helm Chart also has some functionality built in for auto-scaling your cluster up and down as more resources are needed - allowing you to utilize the benefits of a flexible cloud-based deployment.
The short answer: yes. JupyterHub as a standalone application has been battle-tested at an institutional level for several years, and makes a number of “default” security decisions that are reasonable for most users.
The longer answer: it depends on your deployment. Because JupyterHub is very flexible, it can be used in a variety of deployment setups. This often entails connecting your JupyterHub to other infrastructure (such as a Dask Gateway service). There are many security decisions to be made in these cases, and the security of your JupyterHub deployment will often depend on these decisions.
If you are worried about security, don’t hesitate to reach out to the JupyterHub community in the Jupyter Community Forum. This community of practice has many individuals with experience running secure JupyterHub deployments.
No - JupyterHub manages user sessions and can control computing infrastructure, but it does not provide these things itself. You are expected to run JupyterHub on your own infrastructure (local or in the cloud). Moreover, JupyterHub has no internal concept of “data”, but is designed to be able to communicate with data repositories (again, either locally or remotely) for use within interactive computing sessions.
JupyterHub offers a few options for managing your users. Upon setting up a JupyterHub, you can choose what kind of authentication you’d like to use. For example, you can have users sign up with an institutional email address, or choose a username / password when they first log-in, or offload authentication onto another service such as an organization’s OAuth.
The users of a JupyterHub are stored locally, and can be modified manually by an administrator of the JupyterHub. Moreover, the active users on a JupyterHub can be found on the administrator’s page. This page gives you the ability to stop or restart kernels, inspect user filesystems, and even take over user sessions to assist them with debugging.
A key benefit of JupyterHub is the ability for an administrator to define the environment(s) that users have access to. There are many ways to do this, depending on what kind of infrastructure you’re using for your JupyterHub.
For example, The Littlest JupyterHub runs on a single VM. In this case, the administrator defines an environment by installing packages to a shared folder that exists on the path of all users. The JupyterHub for Kubernetes deployment uses Docker images to define environments. You can create your own list of Docker images that users can select from, and can also control things like the amount of RAM available to users, or the types of machines that their sessions will use in the cloud.
For interactive computing sessions, JupyterHub controls computational resources via a spawner. Spawners define how a new user session is created, and are customized for particular kinds of infrastructure. For example, the KubeSpawner knows how to control a Kubernetes deployment to create new pods when users log in.
For more sophisticated computational resources (like distributed computing), JupyterHub can connect with other infrastructure tools (like Dask or Spark). This allows users to control scalable or high-performance resources from within their JupyterHub sessions. The logic of how those resources are controlled is taken care of by the non-JupyterHub application.
Yes - JupyterHub can provide access to many kinds of computing infrastructure. Especially when combined with other open-source schedulers such as Dask, you can manage fairly complex computing infrastructure from the interactive sessions of a JupyterHub. For example see the Dask HPC page.
This is highly configurable by the administrator. If you wish for your users to have simple data analytics environments for prototyping and light data exploring, you can restrict their memory and CPU based on the resources that you have available. If you’d like your JupyterHub to serve as a gateway to high-performance compute or data resources, you may increase the resources available on user machines, or connect them with computing infrastructure elsewhere.
JupyterHub provides some customization of the graphics displayed to users. The most common modification is to add custom branding to the JupyterHub login page, loading pages, and various elements that persist across all pages (such as headers).
Depending on the complexity of your setup, you’ll have different experiences with “out of the box” distributions of JupyterHub. If all of the resources you need will fit on a single VM, then The Littlest JupyterHub should get you up-and-running within a half day or so. For more complex setups, such as scalable Kubernetes clusters or access to high-performance computing and data, it will require more time and expertise with the technologies your JupyterHub will use (e.g., dev-ops knowledge with cloud computing).
In general, the base JupyterHub deployment is not the bottleneck for setup, it is connecting your JupyterHub with the various services and tools that you wish to provide to your users.
JupyterHub works well at both a small scale (e.g., a single VM or machine) and a large scale (e.g., a scalable Kubernetes cluster). It can be used for teams as small as 2, and for user bases as large as 10,000. The scalability of JupyterHub largely depends on the infrastructure on which it is deployed. JupyterHub has been designed to be lightweight and flexible, so you can tailor your JupyterHub deployment to your needs.
For JupyterHubs that are deployed in a containerized environment (e.g., Kubernetes), it is possible to configure the JupyterHub to be fairly resistant to failures in the system. For example, if JupyterHub fails, then user sessions will not be affected (though new users will not be able to log in). When a JupyterHub process is restarted, it should seamlessly connect with the user database and the system will return to normal. Again, the details of your JupyterHub deployment (e.g., whether it’s deployed on a scalable cluster) will affect the resiliency of the deployment.
Out of the box, JupyterHub supports a variety of popular data science interfaces for user sessions, such as JupyterLab, Jupyter Notebooks, and RStudio. Any interface that can be served via a web address can be served with a JupyterHub (with the right setup).
JupyterHub provides a standardized environment and access to shared resources for your teams. This greatly reduces the cost associated with sharing analyses and content with other team members, and makes it easier to collaborate and build off of one another’s ideas. Combined with access to high-performance computing and data, JupyterHub provides a common resource to amplify your team’s ability to prototype their analyses, scale them to larger data, and then share their results with one another.
JupyterHub also provides a computational framework to share computational narratives between different levels of an organization. For example, data scientists can share Jupyter Notebooks rendered as voila dashboards with those who are not familiar with programming, or create publicly-available interactive analyses to allow others to interact with your work.
Yes, Jupyter is a polyglot project, and there are over 40 community-provided kernels for a variety of languages (the most common being Python, Julia, and R). You can also use a JupyterHub to provide access to other interfaces, such as RStudio, that provide their own access to a language kernel.
This section covers more of the details of the JupyterHub architecture, as well as what happens under-the-hood when you deploy and configure your JupyterHub.
The Technical Overview section gives you a high-level view of:
The goal of this section is to share a deeper technical understanding of JupyterHub and how it works.
JupyterHub is a set of processes that together provide a single user Jupyter Notebook server for each person in a group. Three major subsystems are started by the jupyterhub command line program:
Users access JupyterHub through a web browser, by going to the IP address or the domain name of the server.
The basic principles of operation are:
The proxy is the only process that listens on a public interface. The Hub sits behind the proxy at /hub. Single-user servers sit behind the proxy at /user/[username].
Different authenticators control access to JupyterHub. The default one (PAM) uses the user accounts on the server where JupyterHub is running. If you use this, you will need to create a user account on the system for each user on your team. Using other authenticators, you can allow users to sign in with e.g. a GitHub account, or with any single-sign-on system your organization has.
Next, spawners control how JupyterHub starts the individual notebook server for each user. The default spawner will start a notebook server on the same machine running under their system username. The other main option is to start each server in a separate container, often using Docker.
When a user accesses JupyterHub, they log in at the Hub, the Hub spawns their single-user server, and the proxy begins forwarding requests for /user/[username]/* to that server. The single-user server identifies the user with the Hub via OAuth, sending the browser through /hub/login when authentication is required.
By default, the Proxy listens on all public interfaces on port 8000. Thus you can reach JupyterHub through either:
http://localhost:8000
or any other public IP or domain pointing to your system, e.g. http://10.0.1.2:8000
In their default configuration, the other services, the Hub and Single-User Notebook Servers, all communicate with each other on localhost only.
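The proxy's public interface can be changed in jupyterhub_config.py; the address and port below are illustrative values, not defaults you must use:

```python
# Bind the public proxy to a specific interface and port (example values)
c.JupyterHub.ip = '10.0.1.2'
c.JupyterHub.port = 443
```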
By default, starting JupyterHub will write two files to disk in the current working directory:
jupyterhub.sqlite, the JupyterHub database, and
jupyterhub_cookie_secret, the cookie secret file.
The location of these files can be specified via configuration settings. It is recommended that these files be stored in standard UNIX filesystem locations, such as /etc/jupyterhub for all configuration files and /srv/jupyterhub for all security and runtime files.
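A sketch of pinning these locations in jupyterhub_config.py, with paths following the suggestion above (adjust to your system):

```python
# Example locations under the recommended UNIX directories
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/jupyterhub_cookie_secret'
c.JupyterHub.db_url = 'sqlite:////srv/jupyterhub/jupyterhub.sqlite'
```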
There are two basic extension points for JupyterHub:
Each is governed by a customizable class, and JupyterHub ships with basic defaults for each.
To enable custom authentication and/or spawning, subclass Authenticator or Spawner, and override the relevant methods.
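To illustrate the Authenticator side, here is a standalone sketch of the check a custom authenticate() method performs. The credential table is invented for illustration; a real subclass would inherit from jupyterhub.auth.Authenticator and override authenticate():

```python
# Not JupyterHub's code: a standalone sketch of the
# Authenticator.authenticate() contract.
PASSWORDS = {'mal': 'serenity'}  # hypothetical credential store

def check_credentials(data):
    """Return the username on success, None on failure,
    as Authenticator.authenticate() is expected to do."""
    if PASSWORDS.get(data['username']) == data['password']:
        return data['username']
    return None
```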
This document describes how JupyterHub routes requests.
This does not include the REST API urls.
In general, all URLs can be prefixed with c.JupyterHub.base_url to run the whole JupyterHub application on a prefix.
All authenticated handlers redirect to /hub/login to login users prior to being redirected back to the originating page. The returned request should preserve all query parameters.
The top-level request is always a simple redirect to /hub/, to be handled by the default JupyterHub handler.
In general, all requests to /anything that do not start with /hub/ but are routed to the Hub, will be redirected to /hub/anything before being handled by the Hub.
This is an authenticated URL.
This handler redirects users to the default URL of the application, which defaults to the user’s default server. That is, it redirects to /hub/spawn if the user’s server is not running, or the server itself (/user/:name) if the server is running.
This default url behavior can be customized in two ways:
To redirect users to the JupyterHub home page (/hub/home) instead of spawning their server, set redirect_to_server to False:
c.JupyterHub.redirect_to_server = False
This might be useful if you have a Hub where you expect users to be managing multiple server configurations and automatic spawning is not desirable.
Second, you can customise the landing page to any page you like, such as a custom service you have deployed e.g. with course information:
c.JupyterHub.default_url = '/services/my-landing-service'
By default, the Hub home page has just one or two buttons for starting and stopping the user’s server.
If named servers are enabled, there will be some additional tools for management of named servers.
Version added: 1.0 (the named-server UI is new in 1.0).
This is the JupyterHub login page. If you have a form-based username+password login, such as the default PAMAuthenticator, this page will render the login form.
If login is handled by an external service, e.g. with OAuth, this page will have a button, declaring “Login with …” which users can click to login with the chosen service.
If you want to skip the user-interaction to initiate logging in via the button, you can set
c.Authenticator.auto_login = True
This can be useful when the user is “already logged in” via some mechanism, but a handshake via redirects is necessary to complete the authentication with JupyterHub.
/hub/logout
Visiting /hub/logout clears cookies from the current browser. Note that logging out does not stop a user’s server(s) by default.
If you would like to shutdown user servers on logout, you can enable this behavior with:
c.JupyterHub.shutdown_on_logout = True
Be careful with this setting because logging out one browser does not mean the user is no longer actively using their server from another machine.
/user/:username[/:servername]
If a user’s server is running, this URL is handled by the user’s given server, not the Hub. The username is the first part and, if using named servers, the server name is the second part.
If the user’s server is not running, this will be redirected to /hub/user/:username/...
/hub/user/:username[/:servername]
This URL indicates a request for a user server that is not running (because /user/... would have been handled by the notebook server if the specified server were running).
Handling this URL is the most complicated condition in JupyterHub, because there can be many states:
If the server is pending spawn, the browser will be redirected to /hub/spawn-pending/:username/:servername to see a progress page while waiting for the server to be ready.
If the server is not active at all, a page will be served with a link to /hub/spawn/:username/:servername. Following that link will launch the requested server. The HTTP status will be 503 in this case because a request has been made for a server that is not running.
If the server is ready, it is assumed that the proxy has not yet registered the route. Some checks are performed and a delay is added before redirecting back to /user/:username/:servername/.... If something is really wrong, this can result in a redirect loop.
Visiting this page will never result in triggering the spawn of servers without additional user action (i.e. clicking the link on the page)
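The states above can be summarized as a small decision function. This is a hypothetical sketch, not JupyterHub's actual handler:

```python
def route_hub_user(pending_spawn, active):
    """Sketch of how /hub/user/:username/... is handled,
    depending on the server's state."""
    if pending_spawn:
        # show the progress page while the spawn completes
        return ('redirect', '/hub/spawn-pending/:username/:servername')
    if not active:
        # 503 page with a link that triggers the spawn
        return (503, '/hub/spawn/:username/:servername')
    # ready, but the proxy route isn't registered yet: retry after a delay
    return ('redirect', '/user/:username/:servername/')
```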
Version changed: 1.0
Prior to 1.0, this URL itself was responsible for spawning servers: it served the progress page if a spawn was pending and redirected to running servers. This was useful because it made sure that requested servers were restarted after they stopped, but it could also be harmful, because unused servers would be continuously restarted if, e.g., an idle JupyterLab frontend were left open pointed at one, constantly making polling requests.
Requests to /user/:username[/:servername]/api/... are assumed to be from applications connected to stopped servers. These are failed with 503 and an informative JSON error message indicating how to spawn the server. This is meant to help applications such as JupyterLab that are connected to a server that has stopped.
JupyterHub 0.9 failed these API requests with status 404, but 1.0 uses 503.
/user-redirect/...
This URL is for sharing a URL that will redirect a user to a path on their own default server. This is useful when users have the same file at the same URL on their servers, and you want a single link to give to any user that will open that file on their server.
e.g. a link to /user-redirect/notebooks/Index.ipynb will send user hortense to /user/hortense/notebooks/Index.ipynb
DO NOT share links to your own server with other users. This will not work in general, unless you grant those users access to your server.
Contributions welcome: The JupyterLab “shareable link” should share this link when run with JupyterHub, but it does not. See jupyterlab-hub where this should probably be done and this issue in JupyterLab that is intended to make it possible.
/hub/spawn[/:username[/:servername]]
Requesting /hub/spawn will spawn the default server for the current user. If username and optionally servername are specified, then the specified server for the specified user will be spawned. Once spawn has been requested, the browser is redirected to /hub/spawn-pending/....
If Spawner.options_form is used, this will render a form, and a POST request will trigger the actual spawn and redirect.
Version added: 1.0
1.0 adds the ability to specify username and servername. Prior to 1.0, only /hub/spawn was recognized for the default server.
Prior to 1.0, this page redirected back to /hub/user/:username, which was responsible for triggering spawn and rendering progress, etc.
/hub/spawn-pending[/:username[/:servername]]
Version added: 1.0 (this URL is new in JupyterHub 1.0).
This page renders the progress view for the given spawn request. Once the server is ready, the browser is redirected to the running server at /user/:username/:servername/....
If this page is requested at any time after the specified server is ready, the browser will be redirected to the running server.
Requesting this page will never trigger any side effects. If the server is not running (e.g. because the spawn has failed), the spawn failure message (if applicable) will be displayed, and the page will show a link back to /hub/spawn/....
/hub/token
On this page, users can manage their JupyterHub API tokens. They can revoke access and request new tokens for writing scripts against the JupyterHub REST API.
/hub/admin
Administrators can take various administrative actions from this page:
The Security Overview section helps you learn about:
This overview also helps you obtain a deeper understanding of how JupyterHub works.
JupyterHub is designed to be a simple multi-user server for modestly sized groups of semi-trusted users. While the design reflects serving semi-trusted users, JupyterHub can also be used to serve untrusted users, with additional precautions.
Using JupyterHub with untrusted users means more work for the administrator: much care is required to secure the Hub, with extra caution on protecting users from each other.
One aspect of JupyterHub’s design simplicity for semi-trusted users is that the Hub and single-user servers are placed in a single domain, behind a proxy. If the Hub is serving untrusted users, many of the web’s cross-site protections are not applied between single-user servers and the Hub, or between single-user servers and each other, since browsers see the whole thing (proxy, Hub, and single user servers) as a single website (i.e. single domain).
To protect users from each other, a user must never be able to write arbitrary HTML and serve it to another user on the Hub’s domain. JupyterHub’s authentication setup prevents this because only the owner of a given single-user notebook server is allowed to view user-authored pages served by that server.
To protect all users from each other, JupyterHub administrators must ensure that:
A user may not install new packages into, or otherwise modify, the environment that runs their single-user server, including the PATH used to locate the jupyterhub-singleuser executable.
A user may not modify the configuration of their notebook server (the ~/.jupyter or JUPYTER_CONFIG_DIR directory).
If any additional services are run on the same domain as the Hub, the services must never display user-authored HTML that is neither sanitized nor sandboxed (e.g. IFramed) to any user that lacks authentication as the author of a file.
JupyterHub provides configuration options that can help mitigate these issues:
JupyterHub provides the ability to run single-user servers on their own subdomains. This means the cross-origin protections between servers have the desired effect, and user servers and the Hub are protected from each other. A user’s single-user server will be at username.jupyter.mydomain.com. This requires all user subdomains to point to the same address, which is most easily accomplished with wildcard DNS. Since this spreads the service across multiple domains, you will need wildcard SSL as well. Unfortunately, for many institutional domains, wildcard DNS and SSL are not available. If you do plan to serve untrusted users, enabling subdomains is highly encouraged, as it resolves the cross-site issues.
If subdomains are not available or not desirable, JupyterHub provides a configuration option Spawner.disable_user_config, which can be set to prevent user-owned configuration files from being loaded. With this option set, PATH and package installation are the remaining things the admin must enforce.
For most Spawners, PATH is not something users can influence, but care should be taken to ensure that the Spawner does not evaluate shell configuration files prior to launching the server.
Package isolation is most easily handled by running the single-user server in a virtualenv with disabled system-site-packages. The user should not have permission to install packages into this environment.
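As an illustrative sketch (the target path here is a throwaway temporary directory), the standard-library venv module creates exactly this kind of isolated environment, with system site-packages excluded by default:

```python
# Sketch: venv-created environments exclude system site-packages by default.
# The target path is a temporary directory purely for demonstration.
import pathlib
import tempfile
import venv

target = pathlib.Path(tempfile.mkdtemp()) / "jhub-env"
venv.EnvBuilder(with_pip=False).create(target)

# pyvenv.cfg records the isolation setting for the environment
cfg = (target / "pyvenv.cfg").read_text()
print("include-system-site-packages = false" in cfg)  # True
```

In a real deployment the environment directory would be owned by root (or another privileged account) so that users cannot install packages into it.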
It is important to note that the control over the environment only affects the single-user server, and not the environment(s) in which the user’s kernel(s) may run. Installing additional packages in the kernel environment does not pose additional risk to the web application’s security.
By default, all communication on the server, between the proxy, Hub, and single-user notebooks, is performed unencrypted. Setting the internal_ssl flag in jupyterhub_config.py secures these routes. Turning this feature on requires that the enabled Spawner can use the certificates generated by the Hub (the default LocalProcessSpawner can, for instance).
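As a sketch, enabling this in jupyterhub_config.py is a one-line setting (Spawner support for the generated certificates is still required):

```python
# jupyterhub_config.py (fragment)
c.JupyterHub.internal_ssl = True
```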
It is also important to note that this encryption does not (yet) cover the zmq tcp sockets between the Notebook client and kernel. While users cannot submit arbitrary commands to another user’s kernel, they can bind to these sockets and listen. When serving untrusted users, this eavesdropping can be mitigated by setting KernelManager.transport to ipc. This applies standard Unix permissions to the communication sockets thereby restricting communication to the socket owner. The internal_ssl option will eventually extend to securing the tcp sockets as well.
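A minimal config sketch of the mitigation described above:

```python
# jupyter config fragment: use Unix-domain (ipc) sockets for kernel
# communication, so standard file permissions restrict access to the
# socket owner.
c.KernelManager.transport = 'ipc'
```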
We recommend that you do periodic reviews of your deployment’s security. It’s good practice to keep JupyterHub, configurable-http-proxy, and nodejs versions up to date.
A handy website for testing your deployment is Qualys’ SSL analyzer tool.
If you believe you’ve found a security vulnerability in JupyterHub, or any Jupyter project, please report it to security@ipython.org. If you prefer to encrypt your security reports, you can use this PGP public key.
The Authenticator is the mechanism for authorizing users to use the Hub and single user notebook servers.
JupyterHub ships with the default PAM-based Authenticator, for logging in with local user accounts via a username and password.
Some login mechanisms, such as OAuth, don’t map onto username and password authentication, and instead use tokens. When using these mechanisms, you can override the login handlers.
You can see an example implementation of an Authenticator that uses GitHub OAuth at OAuthenticator.
When testing, it may be helpful to use the DummyAuthenticator (jupyterhub.auth.DummyAuthenticator). This allows any username and password, unless a global password has been set. Once set, any username will still be accepted, but the correct password must be provided.
A partial list of other authenticators is available on the JupyterHub wiki.
The base authenticator uses simple username and password authentication.
The base Authenticator has one central method:
Authenticator.authenticate(handler, data)
This method is passed the Tornado RequestHandler and the POST data from JupyterHub’s login form. Unless the login form has been customized, data will have two keys:
username
password
The authenticate method’s job is simple:
Writing an Authenticator that looks up passwords in a dictionary requires only overriding this one method:
from traitlets import Dict
from jupyterhub.auth import Authenticator

class DictionaryAuthenticator(Authenticator):

    passwords = Dict(
        config=True,
        help="""dict of username:password for authentication""",
    )

    async def authenticate(self, handler, data):
        if self.passwords.get(data['username']) == data['password']:
            return data['username']
Since the Authenticator and Spawner both use the same username, sometimes you want to transform the name coming from the authentication service (e.g. turning email addresses into local system usernames) before it reaches the Hub. Authenticators can define normalize_username, which takes a username. The default normalization is to cast names to lowercase.
For simple mappings, a configurable dict Authenticator.username_map is used to turn one name into another:
c.Authenticator.username_map = { 'service-name': 'localname' }
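A runnable sketch of this two-step normalization (lowercasing, then an explicit map); this is an illustration, not JupyterHub’s actual implementation:

```python
# Hypothetical standalone sketch of username normalization:
# the default normalization lowercases, then username_map rewrites known names.
def normalize_username(username, username_map=None):
    username = username.lower()
    return (username_map or {}).get(username, username)

print(normalize_username('Service-Name', {'service-name': 'localname'}))  # localname
print(normalize_username('Hortense'))  # hortense
```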
When using PAMAuthenticator, you can set c.PAMAuthenticator.pam_normalize_username = True, which will normalize usernames using PAM (essentially round-tripping them: username to uid to username). This is useful if you use an external service that allows multiple usernames to map to the same user (such as ActiveDirectory; yes, this really happens). When pam_normalize_username is on, usernames are not normalized to lowercase.
In most cases, there is a very limited set of acceptable usernames. Authenticators can define validate_username(username), which should return True for a valid username and False for an invalid one. The primary effect this has is improving error messages during user creation.
The default behavior is to use the configurable Authenticator.username_pattern, which is a regular expression string for validation.
To only allow usernames that start with ‘w’:
c.Authenticator.username_pattern = r'w.*'
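A sketch of pattern-based validation, assuming the configured pattern must match the entire username (so r'w.*' accepts names starting with ‘w’):

```python
import re

# Hypothetical sketch of username_pattern validation; assumes a full match
# of the configured pattern against the whole name.
pattern = re.compile(r'w.*')

def validate_username(username):
    return bool(pattern.fullmatch(username))

print(validate_username('wash'))  # True
print(validate_username('zoe'))   # False
```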
You can use custom Authenticator subclasses to enable authentication via other mechanisms. One such example is using GitHub OAuth.
Because the username is passed from the Authenticator to the Spawner, a custom Authenticator and Spawner are often used together. For example, the Authenticator methods, pre_spawn_start(user, spawner) and post_spawn_stop(user, spawner), are hooks that can be used to do auth-related startup (e.g. opening PAM sessions) and cleanup (e.g. closing PAM sessions).
See a list of custom Authenticators on the wiki.
If you are interested in writing a custom authenticator, you can read this tutorial.
As of JupyterHub 1.0, custom authenticators can register themselves via the jupyterhub.authenticators entry point metadata. To do this, in your setup.py add:
setup(
    ...
    entry_points={
        'jupyterhub.authenticators': [
            'myservice = mypackage:MyAuthenticator',
        ],
    },
)
If you have added this metadata to your package, users can select your authenticator with the configuration:
c.JupyterHub.authenticator_class = 'myservice'
instead of the full
c.JupyterHub.authenticator_class = 'mypackage:MyAuthenticator'
previously required. Additionally, configurable attributes for your authenticator will appear in jupyterhub help output and auto-generated configuration files via jupyterhub --generate-config.
JupyterHub 0.8 adds the ability to persist state related to authentication, such as auth-related tokens. If such state should be persisted, .authenticate() should return a dictionary of the form:
{
    'name': username,
    'auth_state': {
        'key': 'value',
    }
}
where username is the username that has been authenticated, and auth_state is any JSON-serializable dictionary.
Because auth_state may contain sensitive information, it is encrypted before being stored in the database. To store auth_state, two conditions must be met:
persisting auth state must be enabled explicitly via configuration
c.Authenticator.enable_auth_state = True
encryption must be enabled by the presence of JUPYTERHUB_CRYPT_KEY environment variable, which should be a hex-encoded 32-byte key. For example:
export JUPYTERHUB_CRYPT_KEY=$(openssl rand -hex 32)
JupyterHub uses Fernet to encrypt auth_state. To facilitate key-rotation, JUPYTERHUB_CRYPT_KEY may be a semicolon-separated list of encryption keys. If there are multiple keys present, the first key is always used to persist any new auth_state.
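The key can also be generated without openssl; a Python equivalent of the openssl command above:

```python
import secrets

# Generate a hex-encoded 32-byte key, equivalent to `openssl rand -hex 32`.
key = secrets.token_hex(32)
print(len(key))  # 64 hex characters encode 32 bytes
```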
Typically, if auth_state is persisted, it is desirable to affect the Spawner environment in some way. This may mean defining environment variables, placing certificates in the user’s home directory, etc. The Authenticator.pre_spawn_start method can be used to pass information from authenticator state to the Spawner environment:
class MyAuthenticator(Authenticator):
    async def authenticate(self, handler, data=None):
        username = await identify_user(handler, data)
        upstream_token = await token_for_user(username)
        return {
            'name': username,
            'auth_state': {
                'upstream_token': upstream_token,
            },
        }

    async def pre_spawn_start(self, user, spawner):
        """Pass upstream_token to spawner via environment variable"""
        auth_state = await user.get_auth_state()
        if not auth_state:
            # auth_state not enabled
            return
        spawner.environment['UPSTREAM_TOKEN'] = auth_state['upstream_token']
Authenticators use two hooks, pre_spawn_start(user, spawner) and post_spawn_stop(user, spawner), to pass additional state information between the authenticator and a spawner. These hooks are typically used for auth-related startup (e.g. opening a PAM session) and auth-related cleanup (e.g. closing a PAM session).
Beginning with version 0.8, JupyterHub is an OAuth provider.
A Spawner starts each single-user notebook server. The Spawner represents an abstract interface to a process, and a custom Spawner needs to be able to take three actions:
Custom Spawners for JupyterHub can be found on the JupyterHub wiki. Some examples include:
dockerspawner.DockerSpawner, for spawning identical Docker containers for each user
dockerspawner.SystemUserSpawner, for spawning Docker containers with an environment and home directory for each user
SudoSpawner, which enables JupyterHub to run without root privileges by spawning an intermediate process via sudo
Spawner.start should start the single-user server for a single user. Information about the user can be retrieved from self.user, an object encapsulating the user’s name, authentication, and server info.
The return value of Spawner.start should be the (ip, port) of the running server.
NOTE: When writing coroutines, never yield in between a database change and a commit.
Most Spawner.start functions will look similar to this example:
async def start(self):
    self.ip = '127.0.0.1'
    self.port = random_port()
    # get environment variables,
    # several of which are required for configuring the single-user server
    env = self.get_env()
    cmd = []
    # get jupyterhub command to run,
    # typically ['jupyterhub-singleuser']
    cmd.extend(self.cmd)
    cmd.extend(self.get_args())
    await self._actually_start_server_somehow(cmd, env)
    return (self.ip, self.port)
When Spawner.start returns, the single-user server process should actually be running, not just requested. JupyterHub can handle Spawner.start being very slow (such as PBS-style batch queues, or instantiating whole AWS instances) if you increase the Spawner.start_timeout config value.
Spawner.poll should check if the spawner is still running. It should return None if it is still running, and an integer exit status otherwise.
For the local process case, Spawner.poll uses os.kill(PID, 0) to check if the local process is still running. On Windows, it uses psutil.pid_exists.
Spawner.stop should stop the process. It must be a tornado coroutine, which should return when the process has finished exiting.
JupyterHub should be able to stop and restart without tearing down single-user notebook servers. To do this task, a Spawner may need to persist some information that can be restored later. A JSON-able dictionary of state can be used to store persisted information.
Unlike start, stop, and poll methods, the state methods must not be coroutines.
For the single-process case, the Spawner state is only the process ID of the server:
def get_state(self):
    """get the current state"""
    state = super().get_state()
    if self.pid:
        state['pid'] = self.pid
    return state

def load_state(self, state):
    """load state from the database"""
    super().load_state(state)
    if 'pid' in state:
        self.pid = state['pid']

def clear_state(self):
    """clear any state (called after shutdown)"""
    super().clear_state()
    self.pid = 0
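These state methods can be exercised with a small round trip; MiniSpawner below is a hypothetical stand-in, not a real Spawner subclass:

```python
import json

class MiniSpawner:
    """Hypothetical stand-in showing the JSON round trip of spawner state."""
    def __init__(self):
        self.pid = 0

    def get_state(self):
        state = {}
        if self.pid:
            state['pid'] = self.pid
        return state

    def load_state(self, state):
        if 'pid' in state:
            self.pid = state['pid']

    def clear_state(self):
        self.pid = 0

s = MiniSpawner()
s.pid = 4242
blob = json.dumps(s.get_state())       # what the Hub would persist
restored = MiniSpawner()
restored.load_state(json.loads(blob))  # what happens after a Hub restart
print(restored.pid)  # 4242
```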
(new in 0.4)
Some deployments may want to offer options to users to influence how their servers are started. This may include cluster-based deployments, where users specify what resources should be available, or docker-based deployments where users can select from a list of base images.
This feature is enabled by setting Spawner.options_form, which is an HTML form snippet inserted unmodified into the spawn form. If the Spawner.options_form is defined, when a user tries to start their server, they will be directed to a form page, like this:
If Spawner.options_form is undefined, the user’s server is spawned directly, and no spawn page is rendered.
See this example for a form that allows custom CLI args for the local spawner.
Spawner.options_from_form
Options from this form will always be a dictionary of lists of strings, e.g.:
{
    'integer': ['5'],
    'text': ['some text'],
    'select': ['a', 'b'],
}
When formdata arrives, it is passed through Spawner.options_from_form(formdata), which is a method to turn the form data into the correct structure. This method must return a dictionary, and is meant to interpret the lists-of-strings into the correct types. For example, the options_from_form for the above form would look like:
def options_from_form(self, formdata):
    options = {}
    options['integer'] = int(formdata['integer'][0])  # single integer value
    options['text'] = formdata['text'][0]             # single string value
    options['select'] = formdata['select']            # list already correct
    options['notinform'] = 'extra info'               # not in the form at all
    return options
which would return:
{
    'integer': 5,
    'text': 'some text',
    'select': ['a', 'b'],
    'notinform': 'extra info',
}
When Spawner.start is called, this dictionary is accessible as self.user_options.
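Pulling the pieces together, here is a runnable version of that transformation, written as a plain function rather than a Spawner method:

```python
def options_from_form(formdata):
    # lists-of-strings in, typed values out (matches the example above)
    options = {}
    options['integer'] = int(formdata['integer'][0])
    options['text'] = formdata['text'][0]
    options['select'] = formdata['select']
    options['notinform'] = 'extra info'
    return options

formdata = {
    'integer': ['5'],
    'text': ['some text'],
    'select': ['a', 'b'],
}
print(options_from_form(formdata)['integer'])  # 5
```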
If you are interested in building a custom spawner, you can read this tutorial.
As of JupyterHub 1.0, custom Spawners can register themselves via the jupyterhub.spawners entry point metadata. To do this, in your setup.py add:
setup(
    ...
    entry_points={
        'jupyterhub.spawners': [
            'myservice = mypackage:MySpawner',
        ],
    },
)
If you have added this metadata to your package, users can select your spawner with the configuration:
c.JupyterHub.spawner_class = 'myservice'
instead of the full

c.JupyterHub.spawner_class = 'mypackage:MySpawner'
previously required. Additionally, configurable attributes for your spawner will appear in jupyterhub help output and auto-generated configuration files via jupyterhub --generate-config.
Some spawners of the single-user notebook servers allow setting limits or guarantees on resources, such as CPU and memory. To provide a consistent experience for sysadmins and users, we provide a standard way to set and discover these resource limits and guarantees. For the limits and guarantees to be useful, the spawner must implement support for them. For example, LocalProcessSpawner, the default spawner, does not support limits and guarantees. One spawner that does is the systemdspawner.
c.Spawner.mem_limit: A limit specifies the maximum amount of memory that may be allocated, though there is no promise that the maximum amount will be available. In supported spawners, you can set c.Spawner.mem_limit to limit the total amount of memory that a single-user notebook server can allocate. Attempting to use more memory than this limit will cause errors. The single-user notebook server can discover its own memory limit by looking at the environment variable MEM_LIMIT, which is specified in absolute bytes.
c.Spawner.mem_guarantee: Sometimes, a guarantee of a minimum amount of memory is desirable. In this case, you can set c.Spawner.mem_guarantee to provide a guarantee that at minimum this much memory will always be available for the single-user notebook server to use. The environment variable MEM_GUARANTEE will also be set in the single-user notebook server.
The spawner’s underlying system or cluster is responsible for enforcing these limits and providing these guarantees. If these values are set to None, no limits or guarantees are provided, and no environment values are set.
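A sketch of how a single-user server process might read these variables; the value set here is purely for demonstration:

```python
import os

def discover_mem_limit():
    """Return the spawner-imposed memory limit in bytes, or None if unset."""
    raw = os.environ.get('MEM_LIMIT')
    return int(raw) if raw else None

os.environ['MEM_LIMIT'] = str(1024 ** 3)  # simulate a 1 GiB limit for this demo
print(discover_mem_limit())  # 1073741824
```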
c.Spawner.cpu_limit: In supported spawners, you can set c.Spawner.cpu_limit to limit the total number of CPU cores that a single-user notebook server can use. These can be fractional: 0.5 means 50% of one CPU core, 4.0 is 4 CPU cores, etc. This value is also set in the single-user notebook server’s environment variable CPU_LIMIT. The limit does not guarantee that you will be able to use all the CPU up to the limit, as other higher-priority applications might be taking up CPU.
c.Spawner.cpu_guarantee: You can set c.Spawner.cpu_guarantee to provide a guarantee for CPU usage. The environment variable CPU_GUARANTEE will be set in the single-user notebook server when a guarantee is being provided.
Communication between the Proxy, Hub, and Notebook can be secured by turning on internal_ssl in jupyterhub_config.py. For a custom spawner to utilize these certs, there are two methods of interest on the base Spawner class: .create_certs and .move_certs.
The first method, .create_certs, will sign a key-cert pair using an internally trusted authority for notebooks. During this process, .create_certs can apply IP and DNS name information to the cert via an alt_names keyword argument. This is used for certificate authentication (verification). Without proper verification, the Notebook will be unable to communicate with the Hub and vice versa when internal_ssl is enabled. For example, given a deployment using the DockerSpawner, which starts containers with IPs from the Docker subnet pool, the DockerSpawner would need to choose a container IP prior to starting and pass that to .create_certs.
In general though, this method will not need to be changed and the default ip/dns (localhost) info will suffice.
When .create_certs is run, it will create the certificates in a default, central location specified by c.JupyterHub.internal_certs_location. For Spawners that need access to these certs elsewhere (i.e. on another host altogether), the .move_certs method can be overridden to move the certs appropriately. Again, using DockerSpawner as an example, this would entail moving certs to a directory that will get mounted into the container this spawner starts.
With version 0.7, JupyterHub adds support for Services.
This section provides the following information about Services:
When working with JupyterHub, a Service is defined as a process that interacts with the Hub’s REST API. A Service may perform a specific action or task. For example, the following tasks can each be a unique Service:
Two key features help define a Service:
Currently, these characteristics distinguish two types of Services:
A Service may have the following properties:
name: str
admin: bool (default - false)
url: str (default - None)
/services/:name
api_token: str (default - None)
If a service is also to be managed by the Hub, it has a few extra options:
command: (str/Popen list)
environment: dict
user: str
A Hub-Managed Service is started by the Hub, and the Hub is responsible for the Service’s actions. A Hub-Managed Service can only be a local subprocess of the Hub. The Hub will take care of starting the process and restarts it if it stops.
While Hub-Managed Services share some similarities with notebook Spawners, there are no plans for Hub-Managed Services to support the same spawning abstractions as a notebook Spawner.
If you wish to run a Service in a Docker container or other deployment environments, the Service can be registered as an Externally-Managed Service, as described below.
A Hub-Managed Service is characterized by its specified command for launching the Service. For example, a ‘cull idle’ notebook server task configured as a Hub-Managed Service would include:
This example would be configured as follows in jupyterhub_config.py:
c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'admin': True,
        'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600']
    }
]
A Hub-Managed Service may also be configured with additional optional parameters, which describe the environment needed to start the Service process:
cwd: path
The Hub will pass the following environment variables to launch the Service:
JUPYTERHUB_SERVICE_NAME: The name of the service
JUPYTERHUB_API_TOKEN: API token assigned to the service
JUPYTERHUB_API_URL: URL for the JupyterHub API (default, http://127.0.0.1:8080/hub/api)
JUPYTERHUB_BASE_URL: Base URL of the Hub (https://mydomain[:port]/)
JUPYTERHUB_SERVICE_PREFIX: URL path prefix of this service (/services/:service-name/)
JUPYTERHUB_SERVICE_URL: Local URL where the service is expected to be listening. Only for proxied web services.
For the previous ‘cull idle’ Service example, these environment variables would be passed to the Service when the Hub starts the ‘cull idle’ Service:
JUPYTERHUB_SERVICE_NAME: 'idle-culler'
JUPYTERHUB_API_TOKEN: API token assigned to the service
JUPYTERHUB_API_URL: http://127.0.0.1:8080/hub/api
JUPYTERHUB_BASE_URL: https://mydomain[:port]
JUPYTERHUB_SERVICE_PREFIX: /services/idle-culler/
See the GitHub repo for additional information about the jupyterhub_idle_culler.
You may prefer to use your own service management tools, such as Docker or systemd, to manage a JupyterHub Service. These Externally-Managed Services, unlike Hub-Managed Services, are not subprocesses of the Hub. You must tell JupyterHub which API token the Externally-Managed Service is using to perform its API requests. Each Externally-Managed Service will need a unique API token, because the Hub authenticates each API request and the API token is used to identify the originating Service or user.
A configuration example of an Externally-Managed Service with admin access and running its own web server is:
c.JupyterHub.services = [
    {
        'name': 'my-web-service',
        'url': 'https://10.0.1.1:1984',
        # any secret >8 characters, you'll use api_token to
        # authenticate api requests to the hub from your service
        'api_token': 'super-secret',
    }
]
In this case, the url field will be passed along to the Service as JUPYTERHUB_SERVICE_URL.
When writing your own services, you have a few decisions to make (in addition to what your service does!):
When a Service is managed by JupyterHub, the Hub will pass the necessary information to the Service via the environment variables described above. A flexible Service, whether managed by the Hub or not, can make use of these same environment variables.
When you run a service that has a url, it will be accessible under a /services/ prefix, such as https://myhub.horse/services/my-service/. For your service to route proxied requests properly, it must take JUPYTERHUB_SERVICE_PREFIX into account when routing requests. For example, a web service would normally service its root handler at '/', but the proxied service would need to serve JUPYTERHUB_SERVICE_PREFIX.
Note that JUPYTERHUB_SERVICE_PREFIX will contain a trailing slash. This must be taken into consideration when creating the service routes. If you include an extra slash you might get unexpected behavior. For example if your service has a /foo endpoint, the route would be JUPYTERHUB_SERVICE_PREFIX + foo, and /foo/bar would be JUPYTERHUB_SERVICE_PREFIX + foo/bar.
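A small sketch of safe route construction with a trailing-slash prefix; the prefix value here is an example, where a real service would read it from JUPYTERHUB_SERVICE_PREFIX:

```python
prefix = '/services/my-service/'  # example; real value comes from the environment

def service_route(endpoint):
    # strip the endpoint's leading slash so the join never doubles '/'
    return prefix + endpoint.lstrip('/')

print(service_route('/foo'))      # /services/my-service/foo
print(service_route('/foo/bar'))  # /services/my-service/foo/bar
```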
JupyterHub 0.7 introduces some utilities for using the Hub’s authentication mechanism to govern access to your service. When a user logs into JupyterHub, the Hub sets a cookie (jupyterhub-services). The service can use this cookie to authenticate requests.
JupyterHub ships with a reference implementation of Hub authentication that can be used by services. You may go beyond this reference implementation and create custom hub-authenticating clients and services. We describe the process below.
The reference, or base, implementation is the HubAuth class, which implements the requests to the Hub.
To use HubAuth, you must set the .api_token, either programmatically when constructing the class, or via the JUPYTERHUB_API_TOKEN environment variable.
Most of the logic of the authentication implementation is found in the HubAuth.user_for_cookie and HubAuth.user_for_token methods, which make a request to the Hub and return:
None, if no user could be identified, or
a dict of the following form:
{
    "name": "username",
    "groups": ["list", "of", "groups"],
    "admin": False,  # or True
}
You are then free to use the returned user information to take appropriate action.
HubAuth also caches the Hub’s response for a number of seconds, configurable by the cookie_cache_max_age setting (default: five minutes).
For example, suppose you have a Flask service that returns information about a user. JupyterHub’s HubAuth class can be used to authenticate requests to the Flask service. See the service-whoami-flask example in the JupyterHub GitHub repo for more details.
from functools import wraps
import json
import os
from urllib.parse import quote

from flask import Flask, redirect, request, Response

from jupyterhub.services.auth import HubAuth

prefix = os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '/')

auth = HubAuth(
    api_token=os.environ['JUPYTERHUB_API_TOKEN'],
    cache_max_age=60,
)

app = Flask(__name__)


def authenticated(f):
    """Decorator for authenticating with the Hub"""
    @wraps(f)
    def decorated(*args, **kwargs):
        cookie = request.cookies.get(auth.cookie_name)
        token = request.headers.get(auth.auth_header_name)
        if cookie:
            user = auth.user_for_cookie(cookie)
        elif token:
            user = auth.user_for_token(token)
        else:
            user = None
        if user:
            return f(user, *args, **kwargs)
        else:
            # redirect to login url on failed auth
            return redirect(auth.login_url + '?next=%s' % quote(request.path))
    return decorated


@app.route(prefix)
@authenticated
def whoami(user):
    return Response(
        json.dumps(user, indent=1, sort_keys=True),
        mimetype='application/json',
    )
Since most Jupyter services are written with tornado, we include a mixin class, HubAuthenticated, for quickly authenticating your own tornado services with JupyterHub.
Tornado’s @web.authenticated method calls a Handler’s .get_current_user method to identify the user. Mixing in HubAuthenticated defines get_current_user to use HubAuth. If you want to configure the HubAuth instance beyond the default, you’ll want to define an initialize method, such as:
class MyHandler(HubAuthenticated, web.RequestHandler):
    hub_users = {'inara', 'mal'}

    def initialize(self, hub_auth):
        self.hub_auth = hub_auth

    @web.authenticated
    def get(self):
        ...
The HubAuth will automatically load the desired configuration from the Service environment variables.
If you want to limit user access, you can specify allowed users through either the .hub_users attribute or .hub_groups. These are sets that check against the username and user group list, respectively. If a user matches neither the user list nor the group list, they will not be allowed access. If both are left undefined, then any user will be allowed.
If you don’t want to use the reference implementation (e.g. you find the implementation a poor fit for your Flask app), you can implement authentication via the Hub yourself. We recommend looking at the HubAuth class implementation for reference, and taking note of the following process:
Retrieve the cookie jupyterhub-services from the request.
Make an API request GET /hub/api/authorizations/cookie/jupyterhub-services/cookie-value, where cookie-value is the url-encoded value of the jupyterhub-services cookie. This request must be authenticated with a Hub API token in the Authorization header, for example using the api_token from your external service’s configuration.
For example, with requests:
r = requests.get(
    '/'.join(["http://127.0.0.1:8081/hub/api",
              "authorizations/cookie/jupyterhub-services",
              quote(encrypted_cookie, safe=''),
    ]),
    headers={
        'Authorization': 'token %s' % api_token,
    },
)
r.raise_for_status()
user = r.json()
On success, the reply will be a JSON model describing the user:
{ "name": "inara", "groups": ["serenity", "guild"], }
An example of using an Externally-Managed Service with authentication is in the nbviewer README section on securing the notebook viewer, and an example of its configuration is found here. nbviewer can also be run as a Hub-Managed Service, as described in the same README section.
JupyterHub 0.8 introduced the ability to write a custom implementation of the proxy. This enables deployments with different needs than the default proxy, configurable-http-proxy (CHP). CHP is a single-process nodejs proxy that the Hub manages by default as a subprocess (it can be run externally, as well, and typically is in production deployments).
The upside to CHP, and why we use it by default, is that it’s easy to install and run (if you have nodejs, you are set!). The downsides are that it’s a single process and does not support any persistence of the routing table. So if the proxy process dies, your whole JupyterHub instance is inaccessible until the Hub notices, restarts the proxy, and restores the routing table. For deployments that want to avoid such a single point of failure, or leverage existing proxy infrastructure in their chosen deployment (such as Kubernetes ingress objects), the Proxy API provides a way to do that.
In general, for a proxy to be usable by JupyterHub, it must:
Optionally, if the JupyterHub deployment is to use host-based routing, the Proxy must additionally support routing based on the Host of the request.
To start, any Proxy implementation should subclass the base Proxy class, as is done with custom Spawners and Authenticators.
from jupyterhub.proxy import Proxy

class MyProxy(Proxy):
    """My Proxy implementation"""
    ...
If your proxy should be launched when the Hub starts, you must define how to start and stop your proxy:
class MyProxy(Proxy):
    ...

    async def start(self):
        """Start the proxy"""

    async def stop(self):
        """Stop the proxy"""
These methods may be coroutines.
c.Proxy.should_start is a configurable flag that determines whether the Hub should call these methods when the Hub itself starts and stops.
When using internal_ssl to encrypt traffic behind the proxy, at minimum, your Proxy will need client ssl certificates which the Hub must be made aware of. These can be generated with the command jupyterhub --generate-certs which will write them to the internal_certs_location in folders named proxy_api and proxy_client. Alternatively, these can be provided to the hub via the jupyterhub_config.py file by providing a dict of named paths to the external_authorities option. The hub will include all certificates provided in that dict in the trust bundle utilized by all internal components.
Probably most custom proxies will be externally managed, such as Kubernetes ingress-based implementations. In this case, you do not need to define start and stop. To disable the methods, you can define should_start = False at the class level:
class MyProxy(Proxy):
    should_start = False
At its most basic, a Proxy implementation defines a mechanism to add, remove, and retrieve routes. A proxy that implements these three methods is complete. Each of these methods may be a coroutine.
Definition: routespec
A routespec, which will appear in these methods, is a string describing a route to be proxied, such as /user/name/. A routespec will:

always end with /, e.g. /proxy/path/
always start with / if it is a path-based route, e.g. /proxy/path/
precede the leading / with a host for host-based routing, e.g. host.tld/proxy/path/
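These rules can be sketched as a small validation helper. This is illustrative only and not part of the JupyterHub API; the real handling lives in the Proxy base class:

```python
def split_routespec(routespec):
    """Classify a routespec as path-based or host-based.

    Illustrative helper, not part of the JupyterHub API.
    Returns (host, path), where host is None for path-based routes.
    """
    if not routespec.endswith('/'):
        raise ValueError("routespec must end with '/': %r" % routespec)
    if routespec.startswith('/'):
        # path-based route, e.g. '/user/name/'
        return None, routespec
    # host-based route, e.g. 'host.tld/proxy/path/'
    host, _, path = routespec.partition('/')
    return host, '/' + path

print(split_routespec('/user/name/'))           # (None, '/user/name/')
print(split_routespec('host.tld/proxy/path/'))  # ('host.tld', '/proxy/path/')
```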
When adding a route, JupyterHub may pass a JSON-serializable dict as a data argument that should be attached to the proxy route. When that route is retrieved, the data argument should be returned as well. If your proxy implementation doesn’t support storing data attached to routes, then your Python wrapper may have to handle storing the data piece itself, e.g in a simple file or database.
async def add_route(self, routespec, target, data):
    """Proxy `routespec` to `target`.

    Store `data` associated with the routespec
    for retrieval later.
    """
Adding a route for a user looks like this:
await proxy.add_route('/user/pgeorgiou/', 'http://127.0.0.1:1227', {'user': 'pgeorgiou'})
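If your proxy cannot attach data to routes, the Python wrapper can keep that data on the side, as the text above suggests. Below is a minimal sketch of a JSON-file-backed store; the class name and methods are hypothetical, not part of the JupyterHub API:

```python
import json
import os

class RouteDataStore:
    """Keep JupyterHub's per-route `data` in a JSON file.

    Hypothetical helper for proxies that cannot store route metadata.
    """
    def __init__(self, path):
        self.path = path
        self._data = {}
        if os.path.exists(path):
            with open(path) as f:
                self._data = json.load(f)

    def _save(self):
        # rewrite the whole file on every change; fine for small route tables
        with open(self.path, 'w') as f:
            json.dump(self._data, f)

    def set(self, routespec, data):
        self._data[routespec] = data
        self._save()

    def get(self, routespec):
        return self._data.get(routespec)

    def delete(self, routespec):
        self._data.pop(routespec, None)
        self._save()
```

A custom `add_route` could then call `store.set(routespec, data)` and `get_all_routes` could merge the stored data back into its results.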
delete_route() is given a routespec to delete. If there is no such route, delete_route should still succeed, but a warning may be issued.
async def delete_route(self, routespec):
    """Delete the route"""
For retrieval, you only need to implement a single method that retrieves all routes. The return value for this function should be a dictionary, keyed by routespec, of dicts whose keys are the same three arguments passed to add_route (routespec, target, data).
async def get_all_routes(self):
    """Return all routes, keyed by routespec"""
{
  '/proxy/path/': {
    'routespec': '/proxy/path/',
    'target': 'http://...',
    'data': {},
  },
}
JupyterHub can track activity of users, for use in services such as culling idle servers. As of JupyterHub 0.8, this activity tracking is the responsibility of the proxy. If your proxy implementation can track activity to endpoints, it may add a last_activity key to the data of routes retrieved in .get_all_routes(). If present, the value of last_activity should be an ISO8601 UTC date string:
{
  '/user/pgeorgiou/': {
    'routespec': '/user/pgeorgiou/',
    'target': 'http://127.0.0.1:1227',
    'data': {
      'user': 'pgeorgiou',
      'last_activity': '2017-10-03T10:33:49.570Z',
    },
  },
}
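A timestamp of this form can be produced with Python's standard library; this is just one way to get millisecond-precision ISO8601 UTC, not a JupyterHub API:

```python
from datetime import datetime, timezone

def utc_iso8601():
    """Current time as an ISO8601 UTC string, e.g. '2017-10-03T10:33:49.570Z'."""
    now = datetime.now(timezone.utc)
    # millisecond precision, 'Z' suffix as in the example above
    return now.strftime('%Y-%m-%dT%H:%M:%S.') + '%03dZ' % (now.microsecond // 1000)

print(utc_iso8601())
```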
If the proxy does not track activity, then only activity to the Hub itself is tracked, and services such as cull-idle will not work.
Now that notebook-5.0 tracks activity internally, we can retrieve activity information from the single-user servers instead, removing the need to track activity in the proxy. But this is not yet implemented in JupyterHub 0.8.0.
As of JupyterHub 1.0, custom proxy implementations can register themselves via the jupyterhub.proxies entry point metadata. To do this, in your setup.py add:
setup(
    ...
    entry_points={
        'jupyterhub.proxies': [
            'mything = mypackage:MyProxy',
        ],
    },
)
If you have added this metadata to your package, users can select your proxy with the configuration:
c.JupyterHub.proxy_class = 'mything'
rather than the full import path previously required:

c.JupyterHub.proxy_class = 'mypackage:MyProxy'

Additionally, configurable attributes for your proxy will appear in jupyterhub help output and in auto-generated configuration files via jupyterhub --generate-config.
The thing which users directly connect to is the proxy, by default configurable-http-proxy. The proxy either redirects users to the hub (for login and managing servers), or to their own single-user servers. Thus, as long as the proxy stays running, access to existing servers continues, even if the hub itself restarts or goes down.
When you first configure the hub, you may not even realize this because the proxy is automatically managed by the hub. This is great for getting started and even for most use, but every time you restart the hub, all user connections are also restarted. It is simple, however, to run the proxy as a service separate from the hub, so that you are free to reconfigure the hub while only interrupting users who are currently starting their servers.
The default JupyterHub proxy is configurable-http-proxy, and that page has some docs. If you are using a different proxy, such as Traefik, these instructions are probably not relevant to you.
c.JupyterHub.cleanup_servers = False should be set, which tells the hub to not stop servers when the hub restarts (this is useful even if you don’t run the proxy separately).
c.ConfigurableHTTPProxy.should_start = False should be set, which tells the hub that the proxy should not be started (because you start it yourself).
c.ConfigurableHTTPProxy.auth_token = "CONFIGPROXY_AUTH_TOKEN" should be set to a token for authenticating communication with the proxy.
c.ConfigurableHTTPProxy.api_url = 'http://localhost:8001' should be set to the URL which the hub uses to connect to the proxy’s API.
You need to configure a service to start the proxy. An example command line for this is configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error. (Details of how to do this are out of scope for this tutorial; for example, it might be a systemd service or run within another Docker container.) The proxy has no configuration files; all configuration is via the command line and environment variables.
configurable-http-proxy --ip=127.0.0.1 --port=8000 \
  --api-ip=127.0.0.1 --api-port=8001 \
  --default-target=http://localhost:8081 \
  --error-target=http://localhost:8081/hub/error
--api-ip and --api-port (which tells the proxy where to listen) should match the hub’s ConfigurableHTTPProxy.api_url.
--ip, --port, and other options configure the user connections to the proxy.
--default-target and --error-target should point to the hub, and are used when users navigate to the proxy originally.
You must define the environment variable CONFIGPROXY_AUTH_TOKEN to match the token given to c.ConfigurableHTTPProxy.auth_token.
You should check the configurable-http-proxy options to see what other options are needed, for example SSL options. Note that these options are configured in the hub if the hub is starting the proxy; you will need to move them to the proxy's own invocation.
You can use the jupyterhub/configurable-http-proxy Docker image to run the proxy.
This section will give you information on:
Using the JupyterHub REST API, you can perform actions on the Hub, such as:
A REST API provides a standard way for users to get and send information to the Hub.
To send requests using the JupyterHub API, you must pass an API token with the request.
As of version 0.6.0, the preferred way of generating an API token is:
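Namely, generating a long random hex string, for example with openssl:

```shell
openssl rand -hex 32
```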
This openssl command generates a potential token that can then be added to JupyterHub using the .api_tokens configuration setting in jupyterhub_config.py.
Alternatively, use the jupyterhub token command to generate a token for a specific hub user by passing the ‘username’:
jupyterhub token <username>
This command generates a random string to use as a token and registers it for the given user with the Hub’s database.
This is deprecated. We are in no rush to remove this feature, but please consider if service tokens are right for you.
You may also add a dictionary of API tokens and usernames to the hub’s configuration file, jupyterhub_config.py (note that the key is the ‘secret-token’ while the value is the ‘username’):
c.JupyterHub.api_tokens = {
    'secret-token': 'username',
}
The api_tokens configuration has been softly deprecated since the introduction of services. We have no plans to remove it, but users are encouraged to use service configuration instead.
If you have been using api_tokens to create an admin user and a token for that user to perform some automations, the services mechanism may be a better fit. If you have the following configuration:
c.JupyterHub.admin_users = {"service-admin",}
c.JupyterHub.api_tokens = {
    "secret-token": "service-admin",
}
This can be updated to create an admin service, with the following configuration:
c.JupyterHub.services = [
    {
        "name": "service-token",
        "admin": True,
        "api_token": "secret-token",
    },
]
The token will have the same admin permissions, but there will no longer be a user account created to house it. The main noticeable difference is that there will be no notebook server associated with the account and the service will not show up in the various user list pages and APIs.
To authenticate your requests, pass the API token in the request’s Authorization header.
Using the popular Python requests library, here’s example code to make an API request for the users of a JupyterHub deployment. An API GET request is made, and the request sends an API token for authorization. The response contains information about the users:
import requests

api_url = 'http://127.0.0.1:8081/hub/api'

r = requests.get(api_url + '/users',
    headers={
        'Authorization': 'token %s' % token,
    }
)
r.raise_for_status()
users = r.json()
This example provides a slightly more complicated request, yet the process is very similar:
import requests

api_url = 'http://127.0.0.1:8081/hub/api'

data = {'name': 'mygroup', 'users': ['user1', 'user2']}

r = requests.post(api_url + '/groups/formgrade-data301/users',
    headers={
        'Authorization': 'token %s' % token,
    },
    json=data
)
r.raise_for_status()
r.json()
The same API token can also authorize access to the Jupyter Notebook REST API provided by notebook servers managed by JupyterHub if one of the following is true:

The token is for the same user as the owner of the notebook server.
The token is tied to an admin user or service, and c.JupyterHub.admin_access is set to True.
With JupyterHub version 0.8, support for multiple servers per user has landed. Prior to that, each user could only launch a single default server via the API like this:
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/server"
With the named-server functionality, it's now possible to launch more than one specifically named server for a given user. This could be used, for instance, to launch each server based on a different image.
First you must enable named-servers by including the following setting in the jupyterhub_config.py file.
c.JupyterHub.allow_named_servers = True
If using the zero-to-jupyterhub-k8s set-up to run JupyterHub, then instead of editing the jupyterhub_config.py file directly, you could pass the following as part of the config.yaml file, as per the tutorial:
hub:
  extraConfig: |
    c.JupyterHub.allow_named_servers = True
With that setting in place, a new named-server is activated like this:
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverA>"
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverB>"
The same servers can be stopped by substituting DELETE for POST above.
For named-servers via the API to work, the spawner used to spawn these servers will need to be able to handle the case of multiple servers per user and ensure uniqueness of names, particularly if servers are spawned via docker containers or kubernetes pods.
You can see the full JupyterHub REST API for details. This REST API spec can be viewed in a more interactive style on swagger's petstore. Both resources contain the same information and differ only in their display. Note: the Swagger specification is being renamed the OpenAPI Initiative.
This section covers details on monitoring the state of your JupyterHub installation.
JupyterHub exposes the /metrics endpoint, which returns text describing its current operational state in a format Prometheus understands.
Prometheus is a separate open source tool that can be configured to repeatedly poll JupyterHub’s /metrics endpoint to parse and save its current state.
By doing so, Prometheus can describe JupyterHub's evolving state over time. This evolving state can then be accessed through Prometheus, which exposes its underlying storage to those allowed to access it, and presented in dashboards by a tool like Grafana.
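The /metrics text format is line-oriented. As a rough illustration (not a substitute for a real Prometheus client, and the metric name in the sample is hypothetical), a payload can be parsed like this:

```python
def parse_metrics(text):
    """Parse simple Prometheus exposition-format lines into a dict.

    Illustrative only: ignores labels, timestamps, and histograms.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(' ')
        metrics[name] = float(value)
    return metrics

sample = """\
# HELP jupyterhub_active_users Number of active users
# TYPE jupyterhub_active_users gauge
jupyterhub_active_users 3
"""
print(parse_metrics(sample))  # {'jupyterhub_active_users': 3.0}
```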
JupyterHub uses a database to store information about users, services, and other data needed for operating the Hub.
The default database for JupyterHub is a SQLite database. We have chosen SQLite as JupyterHub’s default for its lightweight simplicity in certain uses such as testing, small deployments and workshops.
For production systems, SQLite has some disadvantages when used with JupyterHub:
upgrade-db may not work, and you may need to start with a fresh database
downgrade-db will not work if you want to rollback to an earlier version of JupyterHub
The sqlite documentation provides a helpful page about when to use SQLite and where traditional RDBMS may be a better choice.
When running a long term deployment or a production system, we recommend using a traditional RDBMS database, such as PostgreSQL or MySQL, that supports the SQL ALTER TABLE statement.
The SQLite database should not be used on NFS. SQLite uses reader/writer locks to control access to the database, and this locking mechanism might not work correctly if the database file is kept on an NFS filesystem, because fcntl() file locking is broken on many NFS implementations. Therefore, you should avoid putting SQLite database files on NFS, since NFS does not handle multiple processes accessing the file at the same time well.
We recommend using PostgreSQL for production if you are unsure whether to use MySQL or PostgreSQL or if you do not have a strong preference. There is additional configuration required for MySQL that is not needed for PostgreSQL.
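For example, pointing the Hub at a PostgreSQL database is a single line in jupyterhub_config.py (the host, database name, and credentials below are placeholders):

```python
# placeholder credentials -- substitute your own deployment's values
c.JupyterHub.db_url = 'postgresql://jupyterhub:PASSWORD@db.example.com:5432/jupyterhub'
```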
The pages of the JupyterHub application are generated from Jinja templates. These allow the header, for example, to be defined once and incorporated into all pages. By providing your own templates, you can have complete control over JupyterHub’s appearance.
JupyterHub will look for custom templates in all of the paths in the JupyterHub.template_paths configuration option, falling back on the default templates if no custom template with that name is found. This fallback behavior is new in version 0.9; previous versions searched only those paths explicitly included in template_paths. You may override as many or as few templates as you desire.
Jinja provides a mechanism to extend templates. A base template can define a block, and child templates can replace or supplement the material in the block. The JupyterHub templates make extensive use of blocks, which allows you to customize parts of the interface easily.
In general, a child template can extend a base template, page.html, by beginning with:
{% extends "page.html" %}
This works, unless you are trying to extend the default template for the same file name. Starting in version 0.9, you may refer to the base file with a templates/ prefix. Thus, if you are writing a custom page.html, start the file with this block:
{% extends "templates/page.html" %}
By defining blocks with the same name as in the base template, child templates can replace those sections with custom content. The content from the base template can be included with the {{ super() }} directive.
To add an additional message to the spawn-pending page, below the existing text about the server starting up, place this content in a file named spawn_pending.html in a directory included in the JupyterHub.template_paths configuration option.
{% extends "templates/spawn_pending.html" %} {% block message %} {{ super() }} <p>Patience is a virtue.</p> {% endblock %}
To add announcements to be displayed on a page, you have two options:
If you set the configuration variable JupyterHub.template_vars = {'announcement': 'some_text'}, the given some_text will be placed on the top of all pages. The variables announcement_login, announcement_spawn, announcement_home, and announcement_logout are more specific and only show on their respective pages (overriding the global announcement variable). Note that changing these variables requires a restart, unlike direct template extension.
You can get the same effect by extending templates, which allows you to update the messages without restarting. Set c.JupyterHub.template_paths as mentioned above, and then create a template (for example, login.html) with:
{% extends "templates/login.html" %} {% set announcement = 'some message' %}
Extending page.html puts the message on all pages, but note that extending page.html takes precedence over an extension of a specific page (unlike the variable-based approach above).
JupyterHub can be configured to record structured events from a running server using Jupyter's Telemetry System. The types of events that JupyterHub emits are emitted as JSON data, defined and validated by the JSON schemas listed below.
Event logging is handled by its EventLog object. This leverages Python's standard logging library to emit, filter, and collect event data.
To begin recording events, you’ll need to set two configurations:
handlers: tells the EventLog where to route your events. This trait is a list of Python logging handlers that route events to their destination.
allowed_schemas: tells the EventLog which events should be recorded. No events are emitted by default; all recorded events must be listed here.
Here’s a basic example:
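A sketch of such a configuration in jupyterhub_config.py (the handler and schema name here are illustrative; use one of the schemas listed below):

```python
import logging

# route recorded events to a file
c.EventLog.handlers = [
    logging.FileHandler('event.log'),
]

# only schemas listed here are recorded; nothing is emitted by default
c.EventLog.allowed_schemas = [
    'hub.jupyter.org/server-action',
]
```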
The output is a file, "event.log", with events recorded as JSON data.
"event.log"
Record actions on user servers made via JupyterHub.
JupyterHub can perform various actions on user servers via direct interaction from users, or via the API. This event is recorded whenever either of those happen.
Limitations:
Action performed by JupyterHub.
This is a required field.
Possible values:
Name of the user whose server this action was performed on.
This is the normalized name used by JupyterHub itself, which is derived from the name provided by the authentication provider but might not be identical to it.
Name of the server this action was performed on.
JupyterHub supports each user having multiple servers with arbitrary names, and this field specifies the name of the server.
The ‘default’ server is denoted by the empty string.
Deploying JupyterHub means you are providing Jupyter notebook environments for multiple users. Often, this includes a desire to configure the user environment in some way.
Since the jupyterhub-singleuser server extends the standard Jupyter notebook server, most configuration and documentation that applies to Jupyter Notebook applies to the single-user environments. Configuration of user environments typically does not occur through JupyterHub itself, but rather through system-wide configuration of Jupyter, which is inherited by jupyterhub-singleuser.
Tip: When searching for configuration tips for JupyterHub user environments, try removing JupyterHub from your search because there are a lot more people out there configuring Jupyter than JupyterHub and the configuration is the same.
This section will focus on user environments, including:
To make packages available to users, you generally will install packages system-wide or in a shared environment.
This installation location should always be in the same environment that jupyterhub-singleuser itself is installed in, and must be readable and executable by your users. If you want users to be able to install additional packages, it must also be writable by your users.
If you are using a standard system Python install, you would use:
sudo python3 -m pip install numpy
to install the numpy package in the default system Python 3 environment (typically /usr/local).
You may also use conda to install packages. If you do, you should make sure that the conda environment has appropriate permissions for users to be able to run Python code in the env.
Jupyter and IPython have their own configuration systems.
As a JupyterHub administrator, you will typically want to install and configure environments for all JupyterHub users. For example, you may wish for each student in a class to have the same user environment configuration.
Jupyter and IPython support “system-wide” locations for configuration, which is the logical place to put global configuration that you want to affect all users. It’s generally more efficient to configure user environments “system-wide”, and it’s a good idea to avoid creating files in users’ home directories.
The typical locations for these config files are:
/etc/{jupyter|ipython}
{sys.prefix}/etc/{jupyter|ipython}
For example, to enable the cython IPython extension for all of your users, create the file /etc/ipython/ipython_config.py:
cython
/etc/ipython/ipython_config.py
c.InteractiveShellApp.extensions.append("cython")
To enable Jupyter notebook’s internal idle-shutdown behavior (requires notebook ≥ 5.4), set the following in the /etc/jupyter/jupyter_notebook_config.py file:
/etc/jupyter/jupyter_notebook_config.py
# shutdown the server after no activity for an hour
c.NotebookApp.shutdown_no_activity_timeout = 60 * 60
# shutdown kernels after no activity for 20 minutes
c.MappingKernelManager.cull_idle_timeout = 20 * 60
# check for idle kernels every two minutes
c.MappingKernelManager.cull_interval = 2 * 60
You may have multiple Jupyter kernels installed and want to make sure that they are available to all of your users. This means installing kernelspecs either system-wide (e.g. in /usr/local/) or in the sys.prefix of JupyterHub itself.
Jupyter kernelspec installation is system wide by default, but some kernels may default to installing kernelspecs in your home directory. These will need to be moved system-wide to ensure that they are accessible.
You can see where your kernelspecs are with:
jupyter kernelspec list
Assuming I have a Python 2 and Python 3 environment that I want to make sure are available, I can install their specs system-wide (in /usr/local) with:
/path/to/python3 -m IPython kernel install --prefix=/usr/local
/path/to/python2 -m IPython kernel install --prefix=/usr/local
There are two broad categories of user environments that depend on what Spawner you choose:
How you configure user environments for each category can differ a bit depending on what Spawner you are using.
The first category is a shared system (multi-user host) where each user has a JupyterHub account and a home directory as well as being a real system user. In this example, shared configuration and installation must be in a ‘system-wide’ location, such as /etc/ or /usr/local or a custom prefix such as /opt/conda.
When JupyterHub uses container-based Spawners (e.g. KubeSpawner or DockerSpawner), the ‘system-wide’ environment is really the container image which you are using for users.
In both cases, you want to avoid putting configuration in user home directories because users can change those configuration settings. Also, home directories typically persist once they are created, so they are difficult for admins to update later.
By default, in a JupyterHub deployment each user has exactly one server.
JupyterHub can, however, have multiple servers per user. This is most useful in deployments where users can configure the environment in which their server will start (e.g. resource requests on an HPC cluster), so that a given user can have multiple configurations running at the same time, without having to stop and restart their one server.
To allow named servers:
Named servers were implemented in the REST API in JupyterHub 0.8, and JupyterHub 1.0 introduces UI for managing named servers via the user home page:
as well as the admin page:
Named servers can be accessed, created, started, stopped, and deleted from these pages. Activity tracking is now per-server as well.
The number of named servers per user can be limited by setting
c.JupyterHub.named_server_limit_per_user = 5
The following sections provide examples, including configuration files and tips, for the following:
In this example, we show a configuration file for a fairly standard JupyterHub deployment with the following assumptions:
The jupyterhub_config.py file would have these settings:
# jupyterhub_config.py file
c = get_config()

import os
pjoin = os.path.join

runtime_dir = os.path.join('/srv/jupyterhub')
ssl_dir = pjoin(runtime_dir, 'ssl')
if not os.path.exists(ssl_dir):
    os.makedirs(ssl_dir)

# Allows multiple single-server per user
c.JupyterHub.allow_named_servers = True

# https on :443
c.JupyterHub.port = 443
c.JupyterHub.ssl_key = pjoin(ssl_dir, 'ssl.key')
c.JupyterHub.ssl_cert = pjoin(ssl_dir, 'ssl.cert')

# put the JupyterHub cookie secret and state db
# in /var/run/jupyterhub
c.JupyterHub.cookie_secret_file = pjoin(runtime_dir, 'cookie_secret')
c.JupyterHub.db_url = pjoin(runtime_dir, 'jupyterhub.sqlite')
# or `--db=/path/to/jupyterhub.sqlite` on the command-line

# use GitHub OAuthenticator for local users
c.JupyterHub.authenticator_class = 'oauthenticator.LocalGitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']

# create system users that don't exist yet
c.LocalAuthenticator.create_system_users = True

# specify users and admin
c.Authenticator.allowed_users = {'rgbkrk', 'minrk', 'jhamrick'}
c.Authenticator.admin_users = {'jhamrick', 'rgbkrk'}

# uses the default spawner
# To use a different spawner, uncomment `spawner_class` and set to desired
# spawner (e.g. SudoSpawner). Follow instructions for desired spawner
# configuration.
# c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'

# start single-user notebook servers in ~/assignments,
# with ~/assignments/Welcome.ipynb as the default landing page
# this config could also be put in
# /etc/jupyter/jupyter_notebook_config.py
c.Spawner.notebook_dir = '~/assignments'
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
Using the GitHub Authenticator requires a few additional environment variables to be set prior to launching JupyterHub:
export GITHUB_CLIENT_ID=github_id
export GITHUB_CLIENT_SECRET=github_secret
export OAUTH_CALLBACK_URL=https://example.com/hub/oauth_callback
export CONFIGPROXY_AUTH_TOKEN=super-secret

# append log output to log file /var/log/jupyterhub.log
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py &>> /var/log/jupyterhub.log
In the following example, we show configuration files for a JupyterHub server running locally on port 8000 but accessible from the outside on the standard SSL port 443. This could be useful if the JupyterHub server machine is also hosting other domains or content on 443. The goal in this example is to satisfy the following:
Let’s start out with needed JupyterHub configuration in jupyterhub_config.py:
# Force the proxy to only listen to connections to 127.0.0.1 (on port 8000)
c.JupyterHub.bind_url = 'http://127.0.0.1:8000'
(For Jupyterhub < 0.9 use c.JupyterHub.ip = '127.0.0.1'.)
For high-quality SSL configuration, we also generate Diffie-Hellman parameters. This can take a few minutes:
openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096
This nginx config file is fairly standard fare except for the two location blocks within the main section for HUB.DOMAIN.TLD. To create a new site for jupyterhub in your nginx config, make a new file in sites.enabled, e.g. /etc/nginx/sites.enabled/jupyterhub.conf:
# top-level http config for websocket headers
# If Upgrade is defined, Connection = upgrade
# If Upgrade is empty, Connection = close
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

# HTTP server to redirect all 80 traffic to SSL/HTTPS
server {
    listen 80;
    server_name HUB.DOMAIN.TLD;

    # Tell all requests to port 80 to be 302 redirected to HTTPS
    return 302 https://$host$request_uri;
}

# HTTPS server to handle JupyterHub
server {
    listen 443;
    ssl on;

    server_name HUB.DOMAIN.TLD;

    ssl_certificate /etc/letsencrypt/live/HUB.DOMAIN.TLD/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/HUB.DOMAIN.TLD/privkey.pem;

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/ssl/certs/dhparam.pem;
    ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;
    add_header Strict-Transport-Security max-age=15768000;

    # Managing literal requests to the JupyterHub front end
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # websocket headers
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header X-Scheme $scheme;

        proxy_buffering off;
    }

    # Managing requests to verify letsencrypt host
    location ~ /.well-known {
        allow all;
    }
}
If nginx is not running on port 443, substitute $http_host for $host on the lines setting the Host header.
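For example, a sketch of that substitution, assuming nginx terminates TLS on a non-standard port such as 8443 (the port number here is illustrative):

```nginx
# hypothetical: nginx listening on 8443 instead of 443
server {
    listen 8443 ssl;
    server_name HUB.DOMAIN.TLD;

    location / {
        proxy_pass http://127.0.0.1:8000;
        # $http_host preserves the client-supplied Host header,
        # including the port, so JupyterHub's redirects stay correct
        proxy_set_header Host $http_host;
    }
}
```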
nginx will now be the front-facing element of JupyterHub on port 443, which means it is also free to serve other sites, like NO_HUB.DOMAIN.TLD, on the same port, machine, and network interface. In fact, one can reuse the same server blocks as above for NO_HUB and simply add a line for the root directory of the site as well as the applicable location block:
NO_HUB
server {
    listen 80;
    server_name NO_HUB.DOMAIN.TLD;

    # Tell all requests to port 80 to be 302 redirected to HTTPS
    return 302 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name NO_HUB.DOMAIN.TLD;

    # INSERT OTHER SSL PARAMETERS HERE AS ABOVE
    # SSL cert may differ

    # Set the appropriate root directory
    root /var/www/html;

    # Set URI handling
    location / {
        try_files $uri $uri/ =404;
    }

    # Managing requests to verify letsencrypt host
    location ~ /.well-known {
        allow all;
    }
}
Now restart nginx, restart JupyterHub, and enjoy accessing https://HUB.DOMAIN.TLD while serving other content securely on https://NO_HUB.DOMAIN.TLD.
On distributions with SELinux enabled (e.g. Fedora), one may encounter permission errors when the nginx service is started.
We need to allow nginx to perform network relay and connect to the jupyterhub port. The following commands do that:
semanage port -a -t http_port_t -p tcp 8000
setsebool -P httpd_can_network_relay 1
setsebool -P httpd_can_network_connect 1
Replace 8000 with the port the jupyterhub server is running on.
As with nginx above, you can use Apache as the reverse proxy. First, enable the Apache modules that we are going to need:
a2enmod ssl rewrite proxy proxy_http proxy_wstunnel
Our Apache configuration is equivalent to the nginx configuration above:
# redirect HTTP to HTTPS
Listen 80
<VirtualHost HUB.DOMAIN.TLD:80>
  ServerName HUB.DOMAIN.TLD
  Redirect / https://HUB.DOMAIN.TLD/
</VirtualHost>

Listen 443
<VirtualHost HUB.DOMAIN.TLD:443>
  ServerName HUB.DOMAIN.TLD

  # configure SSL
  SSLEngine on
  SSLCertificateFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/fullchain.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/privkey.pem
  SSLProtocol All -SSLv2 -SSLv3
  SSLOpenSSLConfCmd DHParameters /etc/ssl/certs/dhparam.pem
  SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH

  # Use RewriteEngine to handle websocket connection upgrades
  RewriteEngine On
  RewriteCond %{HTTP:Connection} Upgrade [NC]
  RewriteCond %{HTTP:Upgrade} websocket [NC]
  RewriteRule /(.*) ws://127.0.0.1:8000/$1 [P,L]

  <Location "/">
    # preserve Host header to avoid cross-origin problems
    ProxyPreserveHost on
    # proxy to JupyterHub
    ProxyPass http://127.0.0.1:8000/
    ProxyPassReverse http://127.0.0.1:8000/
  </Location>
</VirtualHost>
If you need to run JupyterHub under a different location, such as /jhub/, use the following configurations:
httpd.conf amendments:
RewriteRule /jhub/(.*) ws://127.0.0.1:8000/jhub/$1 [NE,P,L]
RewriteRule /jhub/(.*) http://127.0.0.1:8000/jhub/$1 [NE,P,L]

ProxyPass /jhub/ http://127.0.0.1:8000/jhub/
ProxyPassReverse /jhub/ http://127.0.0.1:8000/jhub/
jupyterhub_config.py amendments:
## The public facing URL of the whole JupyterHub application.
## This is the address on which the proxy will bind. Sets protocol, ip, base_url
c.JupyterHub.bind_url = 'http://127.0.0.1:8000/jhub/'
Note: Setting up sudo permissions involves many pieces of system configuration. It is quite easy to get wrong and very difficult to debug. Only do this if you are very sure you must.
There are many Authenticators and Spawners available for JupyterHub. Some, such as DockerSpawner or OAuthenticator, do not need any elevated permissions. This document describes how to get the full default behavior of JupyterHub while running notebook servers as real system users on a shared system without running the Hub itself as root.
Since JupyterHub needs to spawn processes as other users, the simplest way is to run it as root, spawning user servers with setuid. But this isn’t especially safe, because you have a process running on the public web as root.
A more prudent way to run the server while preserving functionality is to create a dedicated user with sudo access restricted to launching and monitoring single-user servers.
To do this, first create a user that will run the Hub:
sudo useradd rhea
This user shouldn’t have a login shell or password (possible with -r).
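A minimal sketch of creating such an account (the nologin path varies by distribution, e.g. /sbin/nologin on Fedora-based systems):

```shell
# create a system account (-r) for the Hub, with no login shell
sudo useradd -r -s /usr/sbin/nologin rhea
```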
Next, you will need sudospawner to enable monitoring the single-user servers with sudo:
sudo python3 -m pip install sudospawner
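The Hub is later launched with --JupyterHub.spawner_class=sudospawner.SudoSpawner on the command line; a minimal sketch of the equivalent jupyterhub_config.py setting:

```python
# jupyterhub_config.py -- minimal sketch
# `c` is the config object JupyterHub provides when loading this file
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
```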
Now we have to configure sudo to allow the Hub user (rhea) to launch the sudospawner script on behalf of our hub users (here zoe and wash). We want to confine these permissions to only what we really need.
To do this we add to /etc/sudoers (use visudo for safe editing of sudoers):
The configuration defines a Runas_Alias, JUPYTER_USERS, listing the users for whom rhea can spawn servers, and a Cmnd_Alias, JUPYTER_CMD, naming the command rhea may run on their behalf.
For example:
# comma-separated list of users that can spawn single-user servers
# this should include all of your Hub users
Runas_Alias JUPYTER_USERS = rhea, zoe, wash

# the command(s) the Hub can run on behalf of the above users without needing a password
# the exact path may differ, depending on how sudospawner was installed
Cmnd_Alias JUPYTER_CMD = /usr/local/bin/sudospawner

# actually give the Hub user permission to run the above command on behalf
# of the above users without prompting for a password
rhea ALL=(JUPYTER_USERS) NOPASSWD:JUPYTER_CMD
It might be useful to modify secure_path in /etc/sudoers so that the commands you need are on the path sudo uses.
secure_path
As an alternative to adding every user to the /etc/sudoers file, you can use a group in the last line above, instead of JUPYTER_USERS:
rhea ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD
If the jupyterhub group exists, there will be no need to edit /etc/sudoers again. A new user will gain access to the application when added to the group:
$ adduser -G jupyterhub newuser
Test that the new user doesn’t need to enter a password to run the sudospawner command.
This should prompt for your password to switch to rhea, but not prompt for any password for the second switch. It should show some help output about logging options:
$ sudo -u rhea sudo -n -u $USER /usr/local/bin/sudospawner --help
Usage: /usr/local/bin/sudospawner [OPTIONS]

Options:
  --help  show this help information
...
And this should fail:
$ sudo -u rhea sudo -n -u $USER echo 'fail'
sudo: a password is required
By default, PAM authentication is used by JupyterHub. To use PAM, the process may need to be able to read the shadow password database.
Note: On Fedora-based distributions there is no clear way to configure the PAM database to allow sufficient access for authenticating with the target user’s password from JupyterHub. As a workaround we recommend using an alternative authentication method.
$ ls -l /etc/shadow
-rw-r----- 1 root shadow 2197 Jul 21 13:41 shadow
If there’s already a shadow group, you are set. If its permissions are more like:
$ ls -l /etc/shadow
-rw------- 1 root wheel 2197 Jul 21 13:41 shadow
Then you may want to add a shadow group, and make the shadow file group-readable:
$ sudo groupadd shadow
$ sudo chgrp shadow /etc/shadow
$ sudo chmod g+r /etc/shadow
We want our new user to be able to read the shadow passwords, so add it to the shadow group:
$ sudo usermod -a -G shadow rhea
If you want jupyterhub to serve pages on a restricted port (such as port 80 for http), then you will need to give node permission to do so:
sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
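You can check that the capability took effect with getcap (provided by libcap on most distributions):

```shell
# show file capabilities on the node binary;
# the output should include cap_net_bind_service
getcap /usr/bin/node
```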
However, you may want to further understand the consequences of this.
You may also be interested in limiting the amount of CPU any process can use on your server. cpulimit is a useful tool available in many Linux distributions’ package repositories. It can be used to keep any user’s process from consuming too many CPU cycles. You can configure it according to these instructions.
cpulimit
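For example, a hypothetical invocation capping an existing process (PID 1234 here is illustrative) at half of one CPU core:

```shell
# limit process 1234 to 50% of one CPU; runs until interrupted
cpulimit --pid 1234 --limit 50
```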
NOTE: This has not been tested and may not work as expected.
$ ls -l /etc/spwd.db /etc/master.passwd
-rw------- 1 root wheel  2516 Aug 22 13:35 /etc/master.passwd
-rw------- 1 root wheel 40960 Aug 22 13:35 /etc/spwd.db
Add a shadow group if there isn’t one, and make the shadow file group-readable:
$ sudo pw group add shadow
$ sudo chgrp shadow /etc/spwd.db
$ sudo chmod g+r /etc/spwd.db
$ sudo chgrp shadow /etc/master.passwd
$ sudo chmod g+r /etc/master.passwd
$ sudo pw user mod rhea -G shadow
We can verify that PAM is working with:
$ sudo -u rhea python3 -c "import pamela, getpass; print(pamela.authenticate('$USER', getpass.getpass()))"
Password: [enter your unix password]
JupyterHub stores its state in a database, so it needs write access to a directory. The simplest way to deal with this is to make a directory owned by your Hub user, and use that as the CWD when launching the server.
$ sudo mkdir /etc/jupyterhub
$ sudo chown rhea /etc/jupyterhub
Finally, start the server as our newly configured user, rhea:
$ cd /etc/jupyterhub
$ sudo -u rhea jupyterhub --JupyterHub.spawner_class=sudospawner.SudoSpawner
And try logging in.
If you still get a generic Permission denied PermissionError, it’s possible SELinux is blocking you. Here’s how you can make a module to allow this. First, put this in a file named sudo_exec_selinux.te:
module sudo_exec_selinux 1.1;

require {
    type unconfined_t;
    type sudo_exec_t;
    class file { read entrypoint };
}

#============= unconfined_t ==============
allow unconfined_t sudo_exec_t:file entrypoint;
Then run all of these commands as root:
$ checkmodule -M -m -o sudo_exec_selinux.mod sudo_exec_selinux.te
$ semodule_package -o sudo_exec_selinux.pp -m sudo_exec_selinux.mod
$ semodule -i sudo_exec_selinux.pp
If PAM authentication doesn’t work and you see errors for login:session-auth, or similar, consider updating to a more recent version of JupyterHub and disabling the opening of PAM sessions with c.PAMAuthenticator.open_sessions=False.
Make sure the version of JupyterHub for this documentation matches your installation version, as the output of this command may change between versions.
As explained in the Configuration Basics section, the jupyterhub_config.py can be automatically generated via jupyterhub --generate-config.
The following contains the output of that command for reference.
# Configuration file for jupyterhub. #------------------------------------------------------------------------------ # Application(SingletonConfigurable) configuration #------------------------------------------------------------------------------ ## This is an application. ## The date format used by logging formatters for %(asctime)s # Default: '%Y-%m-%d %H:%M:%S' # c.Application.log_datefmt = '%Y-%m-%d %H:%M:%S' ## The Logging format template # Default: '[%(name)s]%(highlevel)s %(message)s' # c.Application.log_format = '[%(name)s]%(highlevel)s %(message)s' ## Set the log level by value or name. # Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL'] # Default: 30 # c.Application.log_level = 30 ## Instead of starting the Application, dump configuration to stdout # Default: False # c.Application.show_config = False ## Instead of starting the Application, dump configuration to stdout (as JSON) # Default: False # c.Application.show_config_json = False #------------------------------------------------------------------------------ # JupyterHub(Application) configuration #------------------------------------------------------------------------------ ## An Application for starting a Multi-User Jupyter Notebook server. ## Maximum number of concurrent servers that can be active at a time. # # Setting this can limit the total resources your users can consume. # # An active server is any server that's not fully stopped. It is considered # active from the time it has been requested until the time that it has # completely stopped. # # If this many user servers are active, users will not be able to launch new # servers until a server is shutdown. Spawn requests will be rejected with a 429 # error asking them to try again. # # If set to 0, no limit is enforced. # Default: 0 # c.JupyterHub.active_server_limit = 0 ## Duration (in seconds) to determine the number of active users. 
# Default: 1800 # c.JupyterHub.active_user_window = 1800 ## Resolution (in seconds) for updating activity # # If activity is registered that is less than activity_resolution seconds more # recent than the current value, the new value will be ignored. # # This avoids too many writes to the Hub database. # Default: 30 # c.JupyterHub.activity_resolution = 30 ## Grant admin users permission to access single-user servers. # # Users should be properly informed if this is enabled. # Default: False # c.JupyterHub.admin_access = False ## DEPRECATED since version 0.7.2, use Authenticator.admin_users instead. # Default: set() # c.JupyterHub.admin_users = set() ## Allow named single-user servers per user # Default: False # c.JupyterHub.allow_named_servers = False ## Answer yes to any questions (e.g. confirm overwrite) # Default: False # c.JupyterHub.answer_yes = False ## PENDING DEPRECATION: consider using services # # Dict of token:username to be loaded into the database. # # Allows ahead-of-time generation of API tokens for use by externally managed # services, which authenticate as JupyterHub users. # # Consider using services for general services that talk to the JupyterHub API. # Default: {} # c.JupyterHub.api_tokens = {} ## Authentication for prometheus metrics # Default: True # c.JupyterHub.authenticate_prometheus = True ## Class for authenticating users. # # This should be a subclass of :class:`jupyterhub.auth.Authenticator` # # with an :meth:`authenticate` method that: # # - is a coroutine (asyncio or tornado) # - returns username on success, None on failure # - takes two arguments: (handler, data), # where `handler` is the calling web.RequestHandler, # and `data` is the POST form data from the login page. # # .. versionchanged:: 1.0 # authenticators may be registered via entry points, # e.g. 
`c.JupyterHub.authenticator_class = 'pam'` # # Currently installed: # - default: jupyterhub.auth.PAMAuthenticator # - dummy: jupyterhub.auth.DummyAuthenticator # - pam: jupyterhub.auth.PAMAuthenticator # Default: 'jupyterhub.auth.PAMAuthenticator' # c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator' ## The base URL of the entire application. # # Add this to the beginning of all JupyterHub URLs. Use base_url to run # JupyterHub within an existing website. # # .. deprecated: 0.9 # Use JupyterHub.bind_url # Default: '/' # c.JupyterHub.base_url = '/' ## The public facing URL of the whole JupyterHub application. # # This is the address on which the proxy will bind. Sets protocol, ip, base_url # Default: 'http://:8000' # c.JupyterHub.bind_url = 'http://:8000' ## Whether to shutdown the proxy when the Hub shuts down. # # Disable if you want to be able to teardown the Hub while leaving the proxy # running. # # Only valid if the proxy was starting by the Hub process. # # If both this and cleanup_servers are False, sending SIGINT to the Hub will # only shutdown the Hub, leaving everything else running. # # The Hub should be able to resume from database state. # Default: True # c.JupyterHub.cleanup_proxy = True ## Whether to shutdown single-user servers when the Hub shuts down. # # Disable if you want to be able to teardown the Hub while leaving the single- # user servers running. # # If both this and cleanup_proxy are False, sending SIGINT to the Hub will only # shutdown the Hub, leaving everything else running. # # The Hub should be able to resume from database state. # Default: True # c.JupyterHub.cleanup_servers = True ## Maximum number of concurrent users that can be spawning at a time. # # Spawning lots of servers at the same time can cause performance problems for # the Hub or the underlying spawning system. Set this limit to prevent bursts of # logins from attempting to spawn too many servers at the same time. 
# # This does not limit the number of total running servers. See # active_server_limit for that. # # If more than this many users attempt to spawn at a time, their requests will # be rejected with a 429 error asking them to try again. Users will have to wait # for some of the spawning services to finish starting before they can start # their own. # # If set to 0, no limit is enforced. # Default: 100 # c.JupyterHub.concurrent_spawn_limit = 100 ## The config file to load # Default: 'jupyterhub_config.py' # c.JupyterHub.config_file = 'jupyterhub_config.py' ## DEPRECATED: does nothing # Default: False # c.JupyterHub.confirm_no_ssl = False ## Number of days for a login cookie to be valid. Default is two weeks. # Default: 14 # c.JupyterHub.cookie_max_age_days = 14 ## The cookie secret to use to encrypt cookies. # # Loaded from the JPY_COOKIE_SECRET env variable by default. # # Should be exactly 256 bits (32 bytes). # Default: b'' # c.JupyterHub.cookie_secret = b'' ## File in which to store the cookie secret. # Default: 'jupyterhub_cookie_secret' # c.JupyterHub.cookie_secret_file = 'jupyterhub_cookie_secret' ## The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub) # Default: '$HOME/checkouts/readthedocs.org/user_builds/jupyterhub/envs/1.3.0/share/jupyterhub' # c.JupyterHub.data_files_path = '/home/docs/checkouts/readthedocs.org/user_builds/jupyterhub/envs/1.3.0/share/jupyterhub' ## Include any kwargs to pass to the database connection. See # sqlalchemy.create_engine for details. # Default: {} # c.JupyterHub.db_kwargs = {} ## url for the database. e.g. `sqlite:///jupyterhub.sqlite` # Default: 'sqlite:///jupyterhub.sqlite' # c.JupyterHub.db_url = 'sqlite:///jupyterhub.sqlite' ## log all database transactions. 
This has A LOT of output # Default: False # c.JupyterHub.debug_db = False ## DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug # Default: False # c.JupyterHub.debug_proxy = False ## If named servers are enabled, default name of server to spawn or open, e.g. by # user-redirect. # Default: '' # c.JupyterHub.default_server_name = '' ## The default URL for users when they arrive (e.g. when user directs to "/") # # By default, redirects users to their own server. # # Can be a Unicode string (e.g. '/hub/home') or a callable based on the handler # object: # # :: # # def default_url_fn(handler): # user = handler.current_user # if user and user.admin: # return '/hub/admin' # return '/hub/home' # # c.JupyterHub.default_url = default_url_fn # Default: traitlets.Undefined # c.JupyterHub.default_url = traitlets.Undefined ## Dict authority:dict(files). Specify the key, cert, and/or ca file for an # authority. This is useful for externally managed proxies that wish to use # internal_ssl. # # The files dict has this format (you must specify at least a cert):: # # { # 'key': '/path/to/key.key', # 'cert': '/path/to/cert.crt', # 'ca': '/path/to/ca.crt' # } # # The authorities you can override: 'hub-ca', 'notebooks-ca', 'proxy-api-ca', # 'proxy-client-ca', and 'services-ca'. # # Use with internal_ssl # Default: {} # c.JupyterHub.external_ssl_authorities = {} ## Register extra tornado Handlers for jupyterhub. # # Should be of the form ``("<regex>", Handler)`` # # The Hub prefix will be added, so `/my-page` will be served at `/hub/my-page`. # Default: [] # c.JupyterHub.extra_handlers = [] ## DEPRECATED: use output redirection instead, e.g. 
# # jupyterhub &>> /var/log/jupyterhub.log # Default: '' # c.JupyterHub.extra_log_file = '' ## Extra log handlers to set on JupyterHub logger # Default: [] # c.JupyterHub.extra_log_handlers = [] ## Generate certs used for internal ssl # Default: False # c.JupyterHub.generate_certs = False ## Generate default config file # Default: False # c.JupyterHub.generate_config = False ## The URL on which the Hub will listen. This is a private URL for internal # communication. Typically set in combination with hub_connect_url. If a unix # socket, hub_connect_url **must** also be set. # # For example: # # "http://127.0.0.1:8081" # "unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock" # # .. versionadded:: 0.9 # Default: '' # c.JupyterHub.hub_bind_url = '' ## The ip or hostname for proxies and spawners to use for connecting to the Hub. # # Use when the bind address (`hub_ip`) is 0.0.0.0, :: or otherwise different # from the connect address. # # Default: when `hub_ip` is 0.0.0.0 or ::, use `socket.gethostname()`, otherwise # use `hub_ip`. # # Note: Some spawners or proxy implementations might not support hostnames. # Check your spawner or proxy documentation to see if they have extra # requirements. # # .. versionadded:: 0.8 # Default: '' # c.JupyterHub.hub_connect_ip = '' ## DEPRECATED # # Use hub_connect_url # # .. versionadded:: 0.8 # # .. deprecated:: 0.9 # Use hub_connect_url # Default: 0 # c.JupyterHub.hub_connect_port = 0 ## The URL for connecting to the Hub. Spawners, services, and the proxy will use # this URL to talk to the Hub. # # Only needs to be specified if the default hub URL is not connectable (e.g. # using a unix+http:// bind url). # # .. seealso:: # JupyterHub.hub_connect_ip # JupyterHub.hub_bind_url # # .. versionadded:: 0.9 # Default: '' # c.JupyterHub.hub_connect_url = '' ## The ip address for the Hub process to *bind* to. # # By default, the hub listens on localhost only. This address must be accessible # from the proxy and user servers. 
You may need to set this to a public ip or '' # for all interfaces if the proxy or user servers are in containers or on a # different host. # # See `hub_connect_ip` for cases where the bind and connect address should # differ, or `hub_bind_url` for setting the full bind URL. # Default: '127.0.0.1' # c.JupyterHub.hub_ip = '127.0.0.1' ## The internal port for the Hub process. # # This is the internal port of the hub itself. It should never be accessed # directly. See JupyterHub.port for the public port to use when accessing # jupyterhub. It is rare that this port should be set except in cases of port # conflict. # # See also `hub_ip` for the ip and `hub_bind_url` for setting the full bind URL. # Default: 8081 # c.JupyterHub.hub_port = 8081 ## Trigger implicit spawns after this many seconds. # # When a user visits a URL for a server that's not running, they are shown a # page indicating that the requested server is not running with a button to # spawn the server. # # Setting this to a positive value will redirect the user after this many # seconds, effectively clicking this button automatically for the users, # automatically beginning the spawn process. # # Warning: this can result in errors and surprising behavior when sharing access # URLs to actual servers, since the wrong server is likely to be started. # Default: 0 # c.JupyterHub.implicit_spawn_seconds = 0 ## Timeout (in seconds) to wait for spawners to initialize # # Checking if spawners are healthy can take a long time if many spawners are # active at hub start time. # # If it takes longer than this timeout to check, init_spawner will be left to # complete in the background and the http server is allowed to start. # # A timeout of -1 means wait forever, which can mean a slow startup of the Hub # but ensures that the Hub is fully consistent by the time it starts responding # to requests. This matches the behavior of jupyterhub 1.0. # # .. 
versionadded: 1.1.0 # Default: 10 # c.JupyterHub.init_spawners_timeout = 10 ## The location to store certificates automatically created by JupyterHub. # # Use with internal_ssl # Default: 'internal-ssl' # c.JupyterHub.internal_certs_location = 'internal-ssl' ## Enable SSL for all internal communication # # This enables end-to-end encryption between all JupyterHub components. # JupyterHub will automatically create the necessary certificate authority and # sign notebook certificates as they're created. # Default: False # c.JupyterHub.internal_ssl = False ## The public facing ip of the whole JupyterHub application (specifically # referred to as the proxy). # # This is the address on which the proxy will listen. The default is to listen # on all interfaces. This is the only address through which JupyterHub should be # accessed by users. # # .. deprecated: 0.9 # Use JupyterHub.bind_url # Default: '' # c.JupyterHub.ip = '' ## Supply extra arguments that will be passed to Jinja environment. # Default: {} # c.JupyterHub.jinja_environment_options = {} ## Interval (in seconds) at which to update last-activity timestamps. # Default: 300 # c.JupyterHub.last_activity_interval = 300 ## Dict of 'group': ['usernames'] to load at startup. # # This strictly *adds* groups and users to groups. # # Loading one set of groups, then starting JupyterHub again with a different set # will not remove users or groups from previous launches. That must be done # through the API. # Default: {} # c.JupyterHub.load_groups = {} ## The date format used by logging formatters for %(asctime)s # See also: Application.log_datefmt # c.JupyterHub.log_datefmt = '%Y-%m-%d %H:%M:%S' ## The Logging format template # See also: Application.log_format # c.JupyterHub.log_format = '[%(name)s]%(highlevel)s %(message)s' ## Set the log level by value or name. # See also: Application.log_level # c.JupyterHub.log_level = 30 ## Specify path to a logo image to override the Jupyter logo in the banner. 
# Default: '' # c.JupyterHub.logo_file = '' ## Maximum number of concurrent named servers that can be created by a user at a # time. # # Setting this can limit the total resources a user can consume. # # If set to 0, no limit is enforced. # Default: 0 # c.JupyterHub.named_server_limit_per_user = 0 ## File to write PID Useful for daemonizing JupyterHub. # Default: '' # c.JupyterHub.pid_file = '' ## The public facing port of the proxy. # # This is the port on which the proxy will listen. This is the only port through # which JupyterHub should be accessed by users. # # .. deprecated: 0.9 # Use JupyterHub.bind_url # Default: 8000 # c.JupyterHub.port = 8000 ## DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url # Default: '' # c.JupyterHub.proxy_api_ip = '' ## DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url # Default: 0 # c.JupyterHub.proxy_api_port = 0 ## DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token # Default: '' # c.JupyterHub.proxy_auth_token = '' ## Interval (in seconds) at which to check if the proxy is running. # Default: 30 # c.JupyterHub.proxy_check_interval = 30 ## The class to use for configuring the JupyterHub proxy. # # Should be a subclass of :class:`jupyterhub.proxy.Proxy`. # # .. versionchanged:: 1.0 # proxies may be registered via entry points, # e.g. `c.JupyterHub.proxy_class = 'traefik'` # # Currently installed: # - configurable-http-proxy: jupyterhub.proxy.ConfigurableHTTPProxy # - default: jupyterhub.proxy.ConfigurableHTTPProxy # Default: 'jupyterhub.proxy.ConfigurableHTTPProxy' # c.JupyterHub.proxy_class = 'jupyterhub.proxy.ConfigurableHTTPProxy' ## DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command # Default: [] # c.JupyterHub.proxy_cmd = [] ## Recreate all certificates used within JupyterHub on restart. # # Note: enabling this feature requires restarting all notebook servers. 
# # Use with internal_ssl # Default: False # c.JupyterHub.recreate_internal_certs = False ## Redirect user to server (if running), instead of control panel. # Default: True # c.JupyterHub.redirect_to_server = True ## Purge and reset the database. # Default: False # c.JupyterHub.reset_db = False ## Interval (in seconds) at which to check connectivity of services with web # endpoints. # Default: 60 # c.JupyterHub.service_check_interval = 60 ## Dict of token:servicename to be loaded into the database. # # Allows ahead-of-time generation of API tokens for use by externally managed # services. # Default: {} # c.JupyterHub.service_tokens = {} ## List of service specification dictionaries. # # A service # # For instance:: # # services = [ # { # 'name': 'cull_idle', # 'command': ['/path/to/cull_idle_servers.py'], # }, # { # 'name': 'formgrader', # 'url': 'http://127.0.0.1:1234', # 'api_token': 'super-secret', # 'environment': # } # ] # Default: [] # c.JupyterHub.services = [] ## Instead of starting the Application, dump configuration to stdout # See also: Application.show_config # c.JupyterHub.show_config = False ## Instead of starting the Application, dump configuration to stdout (as JSON) # See also: Application.show_config_json # c.JupyterHub.show_config_json = False ## Shuts down all user servers on logout # Default: False # c.JupyterHub.shutdown_on_logout = False ## The class to use for spawning single-user servers. # # Should be a subclass of :class:`jupyterhub.spawner.Spawner`. # # .. versionchanged:: 1.0 # spawners may be registered via entry points, # e.g. 
`c.JupyterHub.spawner_class = 'localprocess'` # # Currently installed: # - default: jupyterhub.spawner.LocalProcessSpawner # - localprocess: jupyterhub.spawner.LocalProcessSpawner # - simple: jupyterhub.spawner.SimpleLocalProcessSpawner # Default: 'jupyterhub.spawner.LocalProcessSpawner' # c.JupyterHub.spawner_class = 'jupyterhub.spawner.LocalProcessSpawner' ## Path to SSL certificate file for the public facing interface of the proxy # # When setting this, you should also set ssl_key # Default: '' # c.JupyterHub.ssl_cert = '' ## Path to SSL key file for the public facing interface of the proxy # # When setting this, you should also set ssl_cert # Default: '' # c.JupyterHub.ssl_key = '' ## Host to send statsd metrics to. An empty string (the default) disables sending # metrics. # Default: '' # c.JupyterHub.statsd_host = '' ## Port on which to send statsd metrics about the hub # Default: 8125 # c.JupyterHub.statsd_port = 8125 ## Prefix to use for all metrics sent by jupyterhub to statsd # Default: 'jupyterhub' # c.JupyterHub.statsd_prefix = 'jupyterhub' ## Run single-user servers on subdomains of this host. # # This should be the full `https://hub.domain.tld[:port]`. # # Provides additional cross-site protections for javascript served by single- # user servers. # # Requires `<username>.hub.domain.tld` to resolve to the same host as # `hub.domain.tld`. # # In general, this is most easily achieved with wildcard DNS. # # When using SSL (i.e. always) this also requires a wildcard SSL certificate. # Default: '' # c.JupyterHub.subdomain_host = '' ## Paths to search for jinja templates, before using the default templates. # Default: [] # c.JupyterHub.template_paths = [] ## Extra variables to be passed into jinja templates # Default: {} # c.JupyterHub.template_vars = {} ## Extra settings overrides to pass to the tornado application. # Default: {} # c.JupyterHub.tornado_settings = {} ## Trust user-provided tokens (via JupyterHub.service_tokens) to have good # entropy. 
# # If you are not inserting additional tokens via configuration file, this flag # has no effect. # # In JupyterHub 0.8, internally generated tokens do not pass through additional # hashing because the hashing is costly and does not increase the entropy of # already-good UUIDs. # # User-provided tokens, on the other hand, are not trusted to have good entropy # by default, and are passed through many rounds of hashing to stretch the # entropy of the key (i.e. user-provided tokens are treated as passwords instead # of random keys). These keys are more costly to check. # # If your inserted tokens are generated by a good-quality mechanism, e.g. # `openssl rand -hex 32`, then you can set this flag to True to reduce the cost # of checking authentication tokens. # Default: False # c.JupyterHub.trust_user_provided_tokens = False ## Names to include in the subject alternative name. # # These names will be used for server name verification. This is useful if # JupyterHub is being run behind a reverse proxy or services using ssl are on # different hosts. # # Use with internal_ssl # Default: [] # c.JupyterHub.trusted_alt_names = [] ## Downstream proxy IP addresses to trust. # # This sets the list of IP addresses that are trusted and skipped when # processing the `X-Forwarded-For` header. For example, if an external proxy is # used for TLS termination, its IP address should be added to this list to # ensure the correct client IP addresses are recorded in the logs instead of the # proxy server's IP address. # Default: [] # c.JupyterHub.trusted_downstream_ips = [] ## Upgrade the database automatically on start. # # Only safe if database is regularly backed up. Only SQLite databases will be # backed up to a local file automatically. # Default: False # c.JupyterHub.upgrade_db = False ## Callable to affect behavior of /user-redirect/ # # Receives 4 parameters: 1. path - URL path that was provided after /user- # redirect/ 2. 
request - A Tornado HTTPServerRequest representing the current # request. 3. user - The currently authenticated user. 4. base_url - The # base_url of the current hub, for relative redirects # # It should return the new URL to redirect to, or None to preserve current # behavior. # Default: None # c.JupyterHub.user_redirect_hook = None #------------------------------------------------------------------------------ # Spawner(LoggingConfigurable) configuration #------------------------------------------------------------------------------ ## Base class for spawning single-user notebook servers. # # Subclass this, and override the following methods: # # - load_state - get_state - start - stop - poll # # As JupyterHub supports multiple users, an instance of the Spawner subclass is # created for each user. If there are 20 JupyterHub users, there will be 20 # instances of the subclass. ## Extra arguments to be passed to the single-user server. # # Some spawners allow shell-style expansion here, allowing you to use # environment variables here. Most, including the default, do not. Consult the # documentation for your spawner to verify! # Default: [] # c.Spawner.args = [] ## An optional hook function that you can implement to pass `auth_state` to the # spawner after it has been initialized but before it starts. The `auth_state` # dictionary may be set by the `.authenticate()` method of the authenticator. # This hook enables you to pass some or all of that information to your spawner. # # Example:: # # def userdata_hook(spawner, auth_state): # spawner.userdata = auth_state["userdata"] # # c.Spawner.auth_state_hook = userdata_hook # Default: None # c.Spawner.auth_state_hook = None ## The command used for starting the single-user server. # # Provide either a string or a list containing the path to the startup script # command. Extra arguments, other than this path, should be provided via `args`. 
# # This is usually set if you want to start the single-user server in a different # python environment (with virtualenv/conda) than JupyterHub itself. # # Some spawners allow shell-style expansion here, allowing you to use # environment variables. Most, including the default, do not. Consult the # documentation for your spawner to verify! # Default: ['jupyterhub-singleuser'] # c.Spawner.cmd = ['jupyterhub-singleuser'] ## Maximum number of consecutive failures to allow before shutting down # JupyterHub. # # This helps JupyterHub recover from a certain class of problem preventing # launch in contexts where the Hub is automatically restarted (e.g. systemd, # docker, kubernetes). # # A limit of 0 means no limit and consecutive failures will not be tracked. # Default: 0 # c.Spawner.consecutive_failure_limit = 0 ## Minimum number of cpu-cores a single-user notebook server is guaranteed to # have available. # # If this value is set to 0.5, allows use of 50% of one CPU. If this value is # set to 2, allows use of up to 2 CPUs. # # **This is a configuration setting. Your spawner must implement support for the # limit to work.** The default spawner, `LocalProcessSpawner`, does **not** # implement this support. A custom spawner **must** add support for this setting # for it to be enforced. # Default: None # c.Spawner.cpu_guarantee = None ## Maximum number of cpu-cores a single-user notebook server is allowed to use. # # If this value is set to 0.5, allows use of 50% of one CPU. If this value is # set to 2, allows use of up to 2 CPUs. # # The single-user notebook server will never be scheduled by the kernel to use # more cpu-cores than this. There is no guarantee that it can access this many # cpu-cores. # # **This is a configuration setting. Your spawner must implement support for the # limit to work.** The default spawner, `LocalProcessSpawner`, does **not** # implement this support. A custom spawner **must** add support for this setting # for it to be enforced. 
# Default: None # c.Spawner.cpu_limit = None ## Enable debug-logging of the single-user server # Default: False # c.Spawner.debug = False ## The URL the single-user server should start in. # # `{username}` will be expanded to the user's username # # Example uses: # # - You can set `notebook_dir` to `/` and `default_url` to `/tree/home/{username}` to allow people to # navigate the whole filesystem from their notebook server, but still start in their home directory. # - Start with `/notebooks` instead of `/tree` if `default_url` points to a notebook instead of a directory. # - You can set this to `/lab` to have JupyterLab start by default, rather than Jupyter Notebook. # Default: '' # c.Spawner.default_url = '' ## Disable per-user configuration of single-user servers. # # When starting the user's single-user server, any config file found in the # user's $HOME directory will be ignored. # # Note: a user could circumvent this if the user modifies their Python # environment, such as when they have their own conda environments / virtualenvs # / containers. # Default: False # c.Spawner.disable_user_config = False ## List of environment variables for the single-user server to inherit from the # JupyterHub process. # # This list is used to ensure that sensitive information in the JupyterHub # process's environment (such as `CONFIGPROXY_AUTH_TOKEN`) is not passed to the # single-user server's process. # Default: ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VIRTUAL_ENV', 'LANG', 'LC_ALL', 'JUPYTERHUB_SINGLEUSER_APP'] # c.Spawner.env_keep = ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VIRTUAL_ENV', 'LANG', 'LC_ALL', 'JUPYTERHUB_SINGLEUSER_APP'] ## Extra environment variables to set for the single-user server's process. 
# # Environment variables that end up in the single-user server's process come from 3 sources: # - This `environment` configurable # - The JupyterHub process' environment variables that are listed in `env_keep` # - Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN) # # The `environment` configurable should be set by JupyterHub administrators to # add installation specific environment variables. It is a dict where the key is # the name of the environment variable, and the value can be a string or a # callable. If it is a callable, it will be called with one parameter (the # spawner instance), and should return a string fairly quickly (no blocking # operations please!). # # Note that the spawner class' interface is not guaranteed to be exactly same # across upgrades, so if you are using the callable take care to verify it # continues to work after upgrades! # # .. versionchanged:: 1.2 # environment from this configuration has highest priority, # allowing override of 'default' env variables, # such as JUPYTERHUB_API_URL. # Default: {} # c.Spawner.environment = {} ## Timeout (in seconds) before giving up on a spawned HTTP server # # Once a server has successfully been spawned, this is the amount of time we # wait before assuming that the server is unable to accept connections. # Default: 30 # c.Spawner.http_timeout = 30 ## The IP address (or hostname) the single-user server should listen on. # # The JupyterHub proxy implementation should be able to send packets to this # interface. # Default: '' # c.Spawner.ip = '' ## Minimum number of bytes a single-user notebook server is guaranteed to have # available. # # Allows the following suffixes: # - K -> Kilobytes # - M -> Megabytes # - G -> Gigabytes # - T -> Terabytes # # **This is a configuration setting. Your spawner must implement support for the # limit to work.** The default spawner, `LocalProcessSpawner`, does **not** # implement this support. 
A custom spawner **must** add support for this setting # for it to be enforced. # Default: None # c.Spawner.mem_guarantee = None ## Maximum number of bytes a single-user notebook server is allowed to use. # # Allows the following suffixes: # - K -> Kilobytes # - M -> Megabytes # - G -> Gigabytes # - T -> Terabytes # # If the single user server tries to allocate more memory than this, it will # fail. There is no guarantee that the single-user notebook server will be able # to allocate this much memory - only that it can not allocate more than this. # # **This is a configuration setting. Your spawner must implement support for the # limit to work.** The default spawner, `LocalProcessSpawner`, does **not** # implement this support. A custom spawner **must** add support for this setting # for it to be enforced. # Default: None # c.Spawner.mem_limit = None ## Path to the notebook directory for the single-user server. # # The user sees a file listing of this directory when the notebook interface is # started. The current interface does not easily allow browsing beyond the # subdirectories in this directory's tree. # # `~` will be expanded to the home directory of the user, and {username} will be # replaced with the name of the user. # # Note that this does *not* prevent users from accessing files outside of this # path! They can do so with many other means. # Default: '' # c.Spawner.notebook_dir = '' ## An HTML form for options a user can specify on launching their server. # # The surrounding `<form>` element and the submit button are already provided. # # For example: # # .. 
code:: html # # Set your key: # <input name="key" val="default_key"></input> # <br> # Choose a letter: # <select name="letter" multiple="true"> # <option value="A">The letter A</option> # <option value="B">The letter B</option> # </select> # # The data from this form submission will be passed on to your spawner in # `self.user_options` # # Instead of a form snippet string, this could also be a callable that takes as # one parameter the current spawner instance and returns a string. The callable # will be called asynchronously if it returns a future, rather than a str. Note # that the interface of the spawner class is not deemed stable across versions, # so using this functionality might cause your JupyterHub upgrades to break. # Default: traitlets.Undefined # c.Spawner.options_form = traitlets.Undefined ## Interpret HTTP form data # # Form data will always arrive as a dict of lists of strings. Override this # function to understand single values, numbers, etc. # # This should coerce form data into the structure expected by self.user_options, # which must be a dict, and should be JSON-serializable, though it can contain # bytes in addition to standard JSON data types. # # This method should not have any side effects. Any handling of `user_options` # should be done in `.start()` to ensure consistent behavior across servers # spawned via the API and form submission page. # # Instances will receive this data on self.user_options, after passing through # this function, prior to `Spawner.start`. # # .. versionchanged:: 1.0 # user_options are persisted in the JupyterHub database to be reused # on subsequent spawns if no options are given. # user_options is serialized to JSON as part of this persistence # (with additional support for bytes in case of uploaded file data), # and any non-bytes non-jsonable values will be replaced with None # if the user_options are re-used.
# Default: traitlets.Undefined # c.Spawner.options_from_form = traitlets.Undefined ## Interval (in seconds) on which to poll the spawner for single-user server's # status. # # At every poll interval, each spawner's `.poll` method is called, which checks # if the single-user server is still running. If it isn't running, then # JupyterHub modifies its own state accordingly and removes appropriate routes # from the configurable proxy. # Default: 30 # c.Spawner.poll_interval = 30 ## The port for single-user servers to listen on. # # Defaults to `0`, which uses a randomly allocated port number each time. # # If set to a non-zero value, all Spawners will use the same port, which only # makes sense if each server is on a different address, e.g. in containers. # # New in version 0.7. # Default: 0 # c.Spawner.port = 0 ## An optional hook function that you can implement to do work after the spawner # stops. # # This can be set independent of any concrete spawner implementation. # Default: None # c.Spawner.post_stop_hook = None ## An optional hook function that you can implement to do some bootstrapping work # before the spawner starts. For example, create a directory for your user or # load initial content. # # This can be set independent of any concrete spawner implementation. # # This may be a coroutine. # # Example:: # # from subprocess import check_call # def my_hook(spawner): # username = spawner.user.name # check_call(['./examples/bootstrap-script/bootstrap.sh', username]) # # c.Spawner.pre_spawn_hook = my_hook # Default: None # c.Spawner.pre_spawn_hook = None ## List of SSL alt names # # May be set in config if all spawners should have the same value(s), or set at # runtime by Spawners that know their names. # Default: [] # c.Spawner.ssl_alt_names = [] ## Whether to include DNS:localhost, IP:127.0.0.1 in alt names # Default: True # c.Spawner.ssl_alt_names_include_local = True ## Timeout (in seconds) before giving up on starting the single-user server.
# # This is the timeout for start to return, not the timeout for the server to # respond. Callers of spawner.start will assume that startup has failed if it # takes longer than this. start should return when the server process is started # and its location is known. # Default: 60 # c.Spawner.start_timeout = 60 #------------------------------------------------------------------------------ # Authenticator(LoggingConfigurable) configuration #------------------------------------------------------------------------------ ## Base class for implementing an authentication provider for JupyterHub ## Set of users that will have admin rights on this JupyterHub. # # Admin users have extra privileges: # - Use the admin panel to see the list of users logged in # - Add / remove users in some authenticators # - Restart / halt the hub # - Start / stop users' single-user servers # - Can access each individual user's single-user server (if configured) # # Admin access should be treated the same way root access is. # # Defaults to an empty set, in which case no user has admin access. # Default: set() # c.Authenticator.admin_users = set() ## Set of usernames that are allowed to log in. # # Use this with supported authenticators to restrict which users can log in. # This is an additional list that further restricts users, beyond whatever # restrictions the authenticator has in place. # # If empty, does not perform any additional restriction. # # .. versionchanged:: 1.2 # `Authenticator.whitelist` renamed to `allowed_users` # Default: set() # c.Authenticator.allowed_users = set() ## The max age (in seconds) of authentication info before forcing a refresh of # user auth info. # # Refreshing auth info allows, e.g. requesting/re-validating auth tokens. # # See :meth:`.refresh_user` for what happens when user auth info is refreshed # (nothing by default).
# Default: 300 # c.Authenticator.auth_refresh_age = 300 ## Automatically begin the login process # # rather than starting with a "Login with..." link at `/hub/login` # # To work, `.login_url()` must give a URL other than the default `/hub/login`, # such as an oauth handler or another automatic login handler, registered with # `.get_handlers()`. # # .. versionadded:: 0.8 # Default: False # c.Authenticator.auto_login = False ## Set of usernames that are not allowed to log in. # # Use this with supported authenticators to restrict which users can not log in. # This is an additional block list that further restricts users, beyond whatever # restrictions the authenticator has in place. # # If empty, does not perform any additional restriction. # # .. versionadded: 0.9 # # .. versionchanged:: 1.2 # `Authenticator.blacklist` renamed to `blocked_users` # Default: set() # c.Authenticator.blocked_users = set() ## Delete any users from the database that do not pass validation # # When JupyterHub starts, `.add_user` will be called on each user in the # database to verify that all users are still valid. # # If `delete_invalid_users` is True, any users that do not pass validation will # be deleted from the database. Use this if users might be deleted from an # external system, such as local user accounts. # # If False (default), invalid users remain in the Hub's database and a warning # will be issued. This is the default to avoid data loss due to config changes. # Default: False # c.Authenticator.delete_invalid_users = False ## Enable persisting auth_state (if available). # # auth_state will be encrypted and stored in the Hub's database. This can # include things like authentication tokens, etc. to be passed to Spawners as # environment variables. # # Encrypting auth_state requires the cryptography package. # # Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one # (or more, separated by ;) 32B encryption keys. 
These can be either base64 or # hex-encoded. # # If encryption is unavailable, auth_state cannot be persisted. # # New in JupyterHub 0.8 # Default: False # c.Authenticator.enable_auth_state = False ## An optional hook function that you can implement to do some bootstrapping work # during authentication. For example, loading user account details from an # external system. # # This function is called after the user has passed all authentication checks # and is ready to successfully authenticate. This function must return the # authentication dict regardless of changes to it. # # This may be a coroutine. # # .. versionadded: 1.0 # # Example:: # # import os, pwd # def my_hook(authenticator, handler, authentication): # user_data = pwd.getpwnam(authentication['name']) # spawn_data = { # 'pw_data': user_data, # 'gid_list': os.getgrouplist(authentication['name'], user_data.pw_gid) # } # # if authentication['auth_state'] is None: # authentication['auth_state'] = {} # authentication['auth_state']['spawn_data'] = spawn_data # # return authentication # # c.Authenticator.post_auth_hook = my_hook # Default: None # c.Authenticator.post_auth_hook = None ## Force refresh of auth prior to spawn. # # This forces :meth:`.refresh_user` to be called prior to launching a server, to # ensure that auth state is up-to-date. # # This can be important when e.g. auth tokens that may have expired are passed # to the spawner via environment variables from auth_state. # # If refresh_user cannot refresh the user auth data, launch will fail until the # user logs in again. # Default: False # c.Authenticator.refresh_pre_spawn = False ## Dictionary mapping authenticator usernames to JupyterHub users. # # Primarily used to normalize OAuth user names to local users. # Default: {} # c.Authenticator.username_map = {} ## Regular expression pattern that all valid usernames must match. # # If a username does not match the pattern specified here, authentication will # not be attempted.
# # If not set, allow any username. # Default: '' # c.Authenticator.username_pattern = '' ## Deprecated, use `Authenticator.allowed_users` # Default: set() # c.Authenticator.whitelist = set() #------------------------------------------------------------------------------ # CryptKeeper(SingletonConfigurable) configuration #------------------------------------------------------------------------------ ## Encapsulate encryption configuration # # Use via the encryption_config singleton below. # Default: [] # c.CryptKeeper.keys = [] ## The number of threads to allocate for encryption # Default: 2 # c.CryptKeeper.n_threads = 2 #------------------------------------------------------------------------------ # Pagination(Configurable) configuration #------------------------------------------------------------------------------ ## Default number of entries per page for paginated results. # Default: 100 # c.Pagination.default_per_page = 100 ## Maximum number of entries per page for paginated results. # Default: 250 # c.Pagination.max_per_page = 250
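As a worked illustration of several of the options documented above, the fragment below is a minimal `jupyterhub_config.py` sketch. It is an assumption-laden example, not a recommended deployment: the username `alice`, the path `/srv/workspaces`, and the hook name `bootstrap_user` are all hypothetical, and `c` is injected by JupyterHub's traitlets config loader when the file is read.

```python
# Hypothetical jupyterhub_config.py fragment illustrating options from this
# reference. `get_config()` is provided by the traitlets config loader at
# startup; this file is not meant to be run standalone.
import os

c = get_config()  # noqa: F821

# Use PAM authentication (the default) and grant one (hypothetical) admin user.
c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'
c.Authenticator.admin_users = {'alice'}

# Start single-user servers in JupyterLab and cap memory per server.
# Note: mem_limit is only enforced by spawners that implement it;
# the default LocalProcessSpawner does not.
c.Spawner.default_url = '/lab'
c.Spawner.mem_limit = '2G'

# Hypothetical pre_spawn_hook: ensure each user has a workspace directory
# before their server starts.
def bootstrap_user(spawner):
    username = spawner.user.name
    os.makedirs(os.path.join('/srv/workspaces', username), exist_ok=True)

c.Spawner.pre_spawn_hook = bootstrap_user
```

Because `pre_spawn_hook` can be set independently of any concrete spawner implementation, this kind of bootstrapping works unchanged whether servers are spawned as local processes or in containers.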
This section contains the output of the command `jupyterhub --help-all`.
Start a multi-user Jupyter Notebook server Spawns a configurable-http-proxy and multi-user Hub, which authenticates users and spawns single-user Notebook servers on behalf of users. Subcommands =========== Subcommands are launched as `jupyterhub cmd [args]`. For information on using subcommand 'cmd', do: `jupyterhub cmd -h`. token Generate an API token for a user upgrade-db Upgrade your JupyterHub state database to the current version. Options ======= The options below are convenience aliases to configurable class-options, as listed in the "Equivalent to" description-line of the aliases. To see all configurable class-options for some <cmd>, use: <cmd> --help-all --debug set log level to logging.DEBUG (maximize logging output) Equivalent to: [--Application.log_level=10] --generate-config generate default config file Equivalent to: [--JupyterHub.generate_config=True] --generate-certs generate certificates used for internal ssl Equivalent to: [--JupyterHub.generate_certs=True] --no-db disable persisting state database to disk Equivalent to: [--JupyterHub.db_url=sqlite:///:memory:] --upgrade-db Automatically upgrade the database if needed on startup. Only safe if the database has been backed up. Only SQLite database files will be backed up automatically. Equivalent to: [--JupyterHub.upgrade_db=True] --no-ssl [DEPRECATED in 0.7: does nothing] Equivalent to: [--JupyterHub.confirm_no_ssl=True] --base-url=<URLPrefix> The base URL of the entire application. Add this to the beginning of all JupyterHub URLs. Use base_url to run JupyterHub within an existing website. .. deprecated: 0.9 Use JupyterHub.bind_url Default: '/' Equivalent to: [--JupyterHub.base_url] -y=<Bool> Answer yes to any questions (e.g. 
confirm overwrite) Default: False Equivalent to: [--JupyterHub.answer_yes] --ssl-key=<Unicode> Path to SSL key file for the public facing interface of the proxy When setting this, you should also set ssl_cert Default: '' Equivalent to: [--JupyterHub.ssl_key] --ssl-cert=<Unicode> Path to SSL certificate file for the public facing interface of the proxy When setting this, you should also set ssl_key Default: '' Equivalent to: [--JupyterHub.ssl_cert] --url=<Unicode> The public facing URL of the whole JupyterHub application. This is the address on which the proxy will bind. Sets protocol, ip, base_url Default: 'http://:8000' Equivalent to: [--JupyterHub.bind_url] --ip=<Unicode> The public facing ip of the whole JupyterHub application (specifically referred to as the proxy). This is the address on which the proxy will listen. The default is to listen on all interfaces. This is the only address through which JupyterHub should be accessed by users. .. deprecated: 0.9 Use JupyterHub.bind_url Default: '' Equivalent to: [--JupyterHub.ip] --port=<Int> The public facing port of the proxy. This is the port on which the proxy will listen. This is the only port through which JupyterHub should be accessed by users. .. deprecated: 0.9 Use JupyterHub.bind_url Default: 8000 Equivalent to: [--JupyterHub.port] --pid-file=<Unicode> File to write PID Useful for daemonizing JupyterHub. Default: '' Equivalent to: [--JupyterHub.pid_file] --log-file=<Unicode> DEPRECATED: use output redirection instead, e.g. jupyterhub &>> /var/log/jupyterhub.log Default: '' Equivalent to: [--JupyterHub.extra_log_file] --log-level=<Enum> Set the log level by value or name. 
Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL'] Default: 30 Equivalent to: [--Application.log_level] -f=<Unicode> The config file to load Default: 'jupyterhub_config.py' Equivalent to: [--JupyterHub.config_file] --config=<Unicode> The config file to load Default: 'jupyterhub_config.py' Equivalent to: [--JupyterHub.config_file] --db=<Unicode> url for the database. e.g. `sqlite:///jupyterhub.sqlite` Default: 'sqlite:///jupyterhub.sqlite' Equivalent to: [--JupyterHub.db_url] Class options ============= The command-line option below sets the respective configurable class-parameter: --Class.parameter=value This line is evaluated in Python, so simple expressions are allowed. For instance, to set `C.a=[0,1,2]`, you may type this: --C.a='range(3)' Application(SingletonConfigurable) options ------------------------------------------ --Application.log_datefmt=<Unicode> The date format used by logging formatters for %(asctime)s Default: '%Y-%m-%d %H:%M:%S' --Application.log_format=<Unicode> The Logging format template Default: '[%(name)s]%(highlevel)s %(message)s' --Application.log_level=<Enum> Set the log level by value or name. Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL'] Default: 30 --Application.show_config=<Bool> Instead of starting the Application, dump configuration to stdout Default: False --Application.show_config_json=<Bool> Instead of starting the Application, dump configuration to stdout (as JSON) Default: False JupyterHub(Application) options ------------------------------- --JupyterHub.active_server_limit=<Int> Maximum number of concurrent servers that can be active at a time. Setting this can limit the total resources your users can consume. An active server is any server that's not fully stopped. It is considered active from the time it has been requested until the time that it has completely stopped. 
If this many user servers are active, users will not be able to launch new servers until a server is shutdown. Spawn requests will be rejected with a 429 error asking them to try again. If set to 0, no limit is enforced. Default: 0 --JupyterHub.active_user_window=<Int> Duration (in seconds) to determine the number of active users. Default: 1800 --JupyterHub.activity_resolution=<Int> Resolution (in seconds) for updating activity If activity is registered that is less than activity_resolution seconds more recent than the current value, the new value will be ignored. This avoids too many writes to the Hub database. Default: 30 --JupyterHub.admin_access=<Bool> Grant admin users permission to access single-user servers. Users should be properly informed if this is enabled. Default: False --JupyterHub.admin_users=<set-item-1>... DEPRECATED since version 0.7.2, use Authenticator.admin_users instead. Default: set() --JupyterHub.allow_named_servers=<Bool> Allow named single-user servers per user Default: False --JupyterHub.answer_yes=<Bool> Answer yes to any questions (e.g. confirm overwrite) Default: False --JupyterHub.api_tokens=<key-1>=<value-1>... PENDING DEPRECATION: consider using services Dict of token:username to be loaded into the database. Allows ahead-of-time generation of API tokens for use by externally managed services, which authenticate as JupyterHub users. Consider using services for general services that talk to the JupyterHub API. Default: {} --JupyterHub.authenticate_prometheus=<Bool> Authentication for prometheus metrics Default: True --JupyterHub.authenticator_class=<EntryPointType> Class for authenticating users. This should be a subclass of :class:`jupyterhub.auth.Authenticator` with an :meth:`authenticate` method that: - is a coroutine (asyncio or tornado) - returns username on success, None on failure - takes two arguments: (handler, data), where `handler` is the calling web.RequestHandler, and `data` is the POST form data from the login page. .. 
versionchanged:: 1.0 authenticators may be registered via entry points, e.g. `c.JupyterHub.authenticator_class = 'pam'` Currently installed: - default: jupyterhub.auth.PAMAuthenticator - dummy: jupyterhub.auth.DummyAuthenticator - pam: jupyterhub.auth.PAMAuthenticator Default: 'jupyterhub.auth.PAMAuthenticator' --JupyterHub.base_url=<URLPrefix> The base URL of the entire application. Add this to the beginning of all JupyterHub URLs. Use base_url to run JupyterHub within an existing website. .. deprecated: 0.9 Use JupyterHub.bind_url Default: '/' --JupyterHub.bind_url=<Unicode> The public facing URL of the whole JupyterHub application. This is the address on which the proxy will bind. Sets protocol, ip, base_url Default: 'http://:8000' --JupyterHub.cleanup_proxy=<Bool> Whether to shut down the proxy when the Hub shuts down. Disable if you want to be able to tear down the Hub while leaving the proxy running. Only valid if the proxy was started by the Hub process. If both this and cleanup_servers are False, sending SIGINT to the Hub will only shut down the Hub, leaving everything else running. The Hub should be able to resume from database state. Default: True --JupyterHub.cleanup_servers=<Bool> Whether to shut down single-user servers when the Hub shuts down. Disable if you want to be able to tear down the Hub while leaving the single-user servers running. If both this and cleanup_proxy are False, sending SIGINT to the Hub will only shut down the Hub, leaving everything else running. The Hub should be able to resume from database state. Default: True --JupyterHub.concurrent_spawn_limit=<Int> Maximum number of concurrent users that can be spawning at a time. Spawning lots of servers at the same time can cause performance problems for the Hub or the underlying spawning system. Set this limit to prevent bursts of logins from attempting to spawn too many servers at the same time. This does not limit the number of total running servers. See active_server_limit for that.
If more than this many users attempt to spawn at a time, their requests will be rejected with a 429 error asking them to try again. Users will have to wait for some of the spawning services to finish starting before they can start their own. If set to 0, no limit is enforced. Default: 100 --JupyterHub.config_file=<Unicode> The config file to load Default: 'jupyterhub_config.py' --JupyterHub.confirm_no_ssl=<Bool> DEPRECATED: does nothing Default: False --JupyterHub.cookie_max_age_days=<Float> Number of days for a login cookie to be valid. Default is two weeks. Default: 14 --JupyterHub.cookie_secret=<Bytes> The cookie secret to use to encrypt cookies. Loaded from the JPY_COOKIE_SECRET env variable by default. Should be exactly 256 bits (32 bytes). Default: b'' --JupyterHub.cookie_secret_file=<Unicode> File in which to store the cookie secret. Default: 'jupyterhub_cookie_secret' --JupyterHub.data_files_path=<Unicode> The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub) Default: '$HOME/checkouts/readthedocs.org/user_builds/jupyterhub/... --JupyterHub.db_kwargs=<key-1>=<value-1>... Include any kwargs to pass to the database connection. See sqlalchemy.create_engine for details. Default: {} --JupyterHub.db_url=<Unicode> url for the database. e.g. `sqlite:///jupyterhub.sqlite` Default: 'sqlite:///jupyterhub.sqlite' --JupyterHub.debug_db=<Bool> log all database transactions. This has A LOT of output Default: False --JupyterHub.debug_proxy=<Bool> DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug Default: False --JupyterHub.default_server_name=<Unicode> If named servers are enabled, default name of server to spawn or open, e.g. by user-redirect. Default: '' --JupyterHub.default_url=<Union> The default URL for users when they arrive (e.g. when user directs to "/") By default, redirects users to their own server. Can be a Unicode string (e.g. 
'/hub/home') or a callable based on the handler object: :: def default_url_fn(handler): user = handler.current_user if user and user.admin: return '/hub/admin' return '/hub/home' c.JupyterHub.default_url = default_url_fn Default: traitlets.Undefined --JupyterHub.external_ssl_authorities=<key-1>=<value-1>... Dict authority:dict(files). Specify the key, cert, and/or ca file for an authority. This is useful for externally managed proxies that wish to use internal_ssl. The files dict has this format (you must specify at least a cert):: { 'key': '/path/to/key.key', 'cert': '/path/to/cert.crt', 'ca': '/path/to/ca.crt' } The authorities you can override: 'hub-ca', 'notebooks-ca', 'proxy-api-ca', 'proxy-client-ca', and 'services-ca'. Use with internal_ssl Default: {} --JupyterHub.extra_handlers=<list-item-1>... Register extra tornado Handlers for jupyterhub. Should be of the form ``("<regex>", Handler)`` The Hub prefix will be added, so `/my-page` will be served at `/hub/my- page`. Default: [] --JupyterHub.extra_log_file=<Unicode> DEPRECATED: use output redirection instead, e.g. jupyterhub &>> /var/log/jupyterhub.log Default: '' --JupyterHub.extra_log_handlers=<list-item-1>... Extra log handlers to set on JupyterHub logger Default: [] --JupyterHub.generate_certs=<Bool> Generate certs used for internal ssl Default: False --JupyterHub.generate_config=<Bool> Generate default config file Default: False --JupyterHub.hub_bind_url=<Unicode> The URL on which the Hub will listen. This is a private URL for internal communication. Typically set in combination with hub_connect_url. If a unix socket, hub_connect_url **must** also be set. For example: "http://127.0.0.1:8081" "unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock" .. versionadded:: 0.9 Default: '' --JupyterHub.hub_connect_ip=<Unicode> The ip or hostname for proxies and spawners to use for connecting to the Hub. Use when the bind address (`hub_ip`) is 0.0.0.0, :: or otherwise different from the connect address. 
Default: when `hub_ip` is 0.0.0.0 or ::, use `socket.gethostname()`, otherwise use `hub_ip`. Note: Some spawners or proxy implementations might not support hostnames. Check your spawner or proxy documentation to see if they have extra requirements. .. versionadded:: 0.8 Default: '' --JupyterHub.hub_connect_port=<Int> DEPRECATED Use hub_connect_url .. versionadded:: 0.8 .. deprecated:: 0.9 Use hub_connect_url Default: 0 --JupyterHub.hub_connect_url=<Unicode> The URL for connecting to the Hub. Spawners, services, and the proxy will use this URL to talk to the Hub. Only needs to be specified if the default hub URL is not connectable (e.g. using a unix+http:// bind url). .. seealso:: JupyterHub.hub_connect_ip JupyterHub.hub_bind_url .. versionadded:: 0.9 Default: '' --JupyterHub.hub_ip=<Unicode> The ip address for the Hub process to *bind* to. By default, the hub listens on localhost only. This address must be accessible from the proxy and user servers. You may need to set this to a public ip or '' for all interfaces if the proxy or user servers are in containers or on a different host. See `hub_connect_ip` for cases where the bind and connect address should differ, or `hub_bind_url` for setting the full bind URL. Default: '127.0.0.1' --JupyterHub.hub_port=<Int> The internal port for the Hub process. This is the internal port of the hub itself. It should never be accessed directly. See JupyterHub.port for the public port to use when accessing jupyterhub. It is rare that this port should be set except in cases of port conflict. See also `hub_ip` for the ip and `hub_bind_url` for setting the full bind URL. Default: 8081 --JupyterHub.implicit_spawn_seconds=<Float> Trigger implicit spawns after this many seconds. When a user visits a URL for a server that's not running, they are shown a page indicating that the requested server is not running with a button to spawn the server. 
Setting this to a positive value will redirect the user after this many seconds, effectively clicking this button automatically for the users, automatically beginning the spawn process. Warning: this can result in errors and surprising behavior when sharing access URLs to actual servers, since the wrong server is likely to be started. Default: 0 --JupyterHub.init_spawners_timeout=<Int> Timeout (in seconds) to wait for spawners to initialize Checking if spawners are healthy can take a long time if many spawners are active at hub start time. If it takes longer than this timeout to check, init_spawner will be left to complete in the background and the http server is allowed to start. A timeout of -1 means wait forever, which can mean a slow startup of the Hub but ensures that the Hub is fully consistent by the time it starts responding to requests. This matches the behavior of jupyterhub 1.0. .. versionadded: 1.1.0 Default: 10 --JupyterHub.internal_certs_location=<Unicode> The location to store certificates automatically created by JupyterHub. Use with internal_ssl Default: 'internal-ssl' --JupyterHub.internal_ssl=<Bool> Enable SSL for all internal communication This enables end-to-end encryption between all JupyterHub components. JupyterHub will automatically create the necessary certificate authority and sign notebook certificates as they're created. Default: False --JupyterHub.ip=<Unicode> The public facing ip of the whole JupyterHub application (specifically referred to as the proxy). This is the address on which the proxy will listen. The default is to listen on all interfaces. This is the only address through which JupyterHub should be accessed by users. .. deprecated: 0.9 Use JupyterHub.bind_url Default: '' --JupyterHub.jinja_environment_options=<key-1>=<value-1>... Supply extra arguments that will be passed to Jinja environment. Default: {} --JupyterHub.last_activity_interval=<Int> Interval (in seconds) at which to update last-activity timestamps. 
Default: 300 --JupyterHub.load_groups=<key-1>=<value-1>... Dict of 'group': ['usernames'] to load at startup. This strictly *adds* groups and users to groups. Loading one set of groups, then starting JupyterHub again with a different set will not remove users or groups from previous launches. That must be done through the API. Default: {} --JupyterHub.log_datefmt=<Unicode> The date format used by logging formatters for %(asctime)s Default: '%Y-%m-%d %H:%M:%S' --JupyterHub.log_format=<Unicode> The Logging format template Default: '[%(name)s]%(highlevel)s %(message)s' --JupyterHub.log_level=<Enum> Set the log level by value or name. Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL'] Default: 30 --JupyterHub.logo_file=<Unicode> Specify path to a logo image to override the Jupyter logo in the banner. Default: '' --JupyterHub.named_server_limit_per_user=<Int> Maximum number of concurrent named servers that can be created by a user at a time. Setting this can limit the total resources a user can consume. If set to 0, no limit is enforced. Default: 0 --JupyterHub.pid_file=<Unicode> File to write PID Useful for daemonizing JupyterHub. Default: '' --JupyterHub.port=<Int> The public facing port of the proxy. This is the port on which the proxy will listen. This is the only port through which JupyterHub should be accessed by users. .. deprecated: 0.9 Use JupyterHub.bind_url Default: 8000 --JupyterHub.proxy_api_ip=<Unicode> DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url Default: '' --JupyterHub.proxy_api_port=<Int> DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url Default: 0 --JupyterHub.proxy_auth_token=<Unicode> DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token Default: '' --JupyterHub.proxy_check_interval=<Int> Interval (in seconds) at which to check if the proxy is running. Default: 30 --JupyterHub.proxy_class=<EntryPointType> The class to use for configuring the JupyterHub proxy. 
Should be a subclass of :class:`jupyterhub.proxy.Proxy`.

.. versionchanged:: 1.0
    proxies may be registered via entry points, e.g. `c.JupyterHub.proxy_class = 'traefik'`

Currently installed:
  - configurable-http-proxy: jupyterhub.proxy.ConfigurableHTTPProxy
  - default: jupyterhub.proxy.ConfigurableHTTPProxy
Default: 'jupyterhub.proxy.ConfigurableHTTPProxy'
--JupyterHub.proxy_cmd=<command-item-1>...
    DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command
    Default: []
--JupyterHub.recreate_internal_certs=<Bool>
    Recreate all certificates used within JupyterHub on restart.
    Note: enabling this feature requires restarting all notebook servers.
    Use with internal_ssl
    Default: False
--JupyterHub.redirect_to_server=<Bool>
    Redirect user to server (if running), instead of control panel.
    Default: True
--JupyterHub.reset_db=<Bool>
    Purge and reset the database.
    Default: False
--JupyterHub.service_check_interval=<Int>
    Interval (in seconds) at which to check connectivity of services with web endpoints.
    Default: 60
--JupyterHub.service_tokens=<key-1>=<value-1>...
    Dict of token:servicename to be loaded into the database.
    Allows ahead-of-time generation of API tokens for use by externally managed services.
    Default: {}
--JupyterHub.services=<list-item-1>...
    List of service specification dictionaries.
    A service is described by fields such as `name`, `command`, `url`, `api_token`, and `environment`. For instance::

        services = [
            {
                'name': 'cull_idle',
                'command': ['/path/to/cull_idle_servers.py'],
            },
            {
                'name': 'formgrader',
                'url': 'http://127.0.0.1:1234',
                'api_token': 'super-secret',
                'environment': {},
            }
        ]

    Default: []
--JupyterHub.show_config=<Bool>
    Instead of starting the Application, dump configuration to stdout
    Default: False
--JupyterHub.show_config_json=<Bool>
    Instead of starting the Application, dump configuration to stdout (as JSON)
    Default: False
--JupyterHub.shutdown_on_logout=<Bool>
    Shuts down all user servers on logout
    Default: False
--JupyterHub.spawner_class=<EntryPointType>
    The class to use for spawning single-user servers.
Should be a subclass of :class:`jupyterhub.spawner.Spawner`. .. versionchanged:: 1.0 spawners may be registered via entry points, e.g. `c.JupyterHub.spawner_class = 'localprocess'` Currently installed: - default: jupyterhub.spawner.LocalProcessSpawner - localprocess: jupyterhub.spawner.LocalProcessSpawner - simple: jupyterhub.spawner.SimpleLocalProcessSpawner Default: 'jupyterhub.spawner.LocalProcessSpawner' --JupyterHub.ssl_cert=<Unicode> Path to SSL certificate file for the public facing interface of the proxy When setting this, you should also set ssl_key Default: '' --JupyterHub.ssl_key=<Unicode> Path to SSL key file for the public facing interface of the proxy When setting this, you should also set ssl_cert Default: '' --JupyterHub.statsd_host=<Unicode> Host to send statsd metrics to. An empty string (the default) disables sending metrics. Default: '' --JupyterHub.statsd_port=<Int> Port on which to send statsd metrics about the hub Default: 8125 --JupyterHub.statsd_prefix=<Unicode> Prefix to use for all metrics sent by jupyterhub to statsd Default: 'jupyterhub' --JupyterHub.subdomain_host=<Unicode> Run single-user servers on subdomains of this host. This should be the full `https://hub.domain.tld[:port]`. Provides additional cross-site protections for javascript served by single- user servers. Requires `<username>.hub.domain.tld` to resolve to the same host as `hub.domain.tld`. In general, this is most easily achieved with wildcard DNS. When using SSL (i.e. always) this also requires a wildcard SSL certificate. Default: '' --JupyterHub.template_paths=<list-item-1>... Paths to search for jinja templates, before using the default templates. Default: [] --JupyterHub.template_vars=<key-1>=<value-1>... Extra variables to be passed into jinja templates Default: {} --JupyterHub.tornado_settings=<key-1>=<value-1>... Extra settings overrides to pass to the tornado application. 
Default: {} --JupyterHub.trust_user_provided_tokens=<Bool> Trust user-provided tokens (via JupyterHub.service_tokens) to have good entropy. If you are not inserting additional tokens via configuration file, this flag has no effect. In JupyterHub 0.8, internally generated tokens do not pass through additional hashing because the hashing is costly and does not increase the entropy of already-good UUIDs. User-provided tokens, on the other hand, are not trusted to have good entropy by default, and are passed through many rounds of hashing to stretch the entropy of the key (i.e. user-provided tokens are treated as passwords instead of random keys). These keys are more costly to check. If your inserted tokens are generated by a good-quality mechanism, e.g. `openssl rand -hex 32`, then you can set this flag to True to reduce the cost of checking authentication tokens. Default: False --JupyterHub.trusted_alt_names=<list-item-1>... Names to include in the subject alternative name. These names will be used for server name verification. This is useful if JupyterHub is being run behind a reverse proxy or services using ssl are on different hosts. Use with internal_ssl Default: [] --JupyterHub.trusted_downstream_ips=<list-item-1>... Downstream proxy IP addresses to trust. This sets the list of IP addresses that are trusted and skipped when processing the `X-Forwarded-For` header. For example, if an external proxy is used for TLS termination, its IP address should be added to this list to ensure the correct client IP addresses are recorded in the logs instead of the proxy server's IP address. Default: [] --JupyterHub.upgrade_db=<Bool> Upgrade the database automatically on start. Only safe if database is regularly backed up. Only SQLite databases will be backed up to a local file automatically. Default: False --JupyterHub.user_redirect_hook=<Callable> Callable to affect behavior of /user-redirect/ Receives 4 parameters: 1. 
path - URL path that was provided after /user- redirect/ 2. request - A Tornado HTTPServerRequest representing the current request. 3. user - The currently authenticated user. 4. base_url - The base_url of the current hub, for relative redirects It should return the new URL to redirect to, or None to preserve current behavior. Default: None Spawner(LoggingConfigurable) options ------------------------------------ --Spawner.args=<list-item-1>... Extra arguments to be passed to the single-user server. Some spawners allow shell-style expansion here, allowing you to use environment variables here. Most, including the default, do not. Consult the documentation for your spawner to verify! Default: [] --Spawner.auth_state_hook=<Any> An optional hook function that you can implement to pass `auth_state` to the spawner after it has been initialized but before it starts. The `auth_state` dictionary may be set by the `.authenticate()` method of the authenticator. This hook enables you to pass some or all of that information to your spawner. Example:: def userdata_hook(spawner, auth_state): spawner.userdata = auth_state["userdata"] c.Spawner.auth_state_hook = userdata_hook Default: None --Spawner.cmd=<command-item-1>... The command used for starting the single-user server. Provide either a string or a list containing the path to the startup script command. Extra arguments, other than this path, should be provided via `args`. This is usually set if you want to start the single-user server in a different python environment (with virtualenv/conda) than JupyterHub itself. Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify! Default: ['jupyterhub-singleuser'] --Spawner.consecutive_failure_limit=<Int> Maximum number of consecutive failures to allow before shutting down JupyterHub. 
This helps JupyterHub recover from a certain class of problem preventing launch in contexts where the Hub is automatically restarted (e.g. systemd, docker, kubernetes). A limit of 0 means no limit and consecutive failures will not be tracked. Default: 0 --Spawner.cpu_guarantee=<Float> Minimum number of cpu-cores a single-user notebook server is guaranteed to have available. If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs. **This is a configuration setting. Your spawner must implement support for the limit to work.** The default spawner, `LocalProcessSpawner`, does **not** implement this support. A custom spawner **must** add support for this setting for it to be enforced. Default: None --Spawner.cpu_limit=<Float> Maximum number of cpu-cores a single-user notebook server is allowed to use. If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs. The single-user notebook server will never be scheduled by the kernel to use more cpu-cores than this. There is no guarantee that it can access this many cpu-cores. **This is a configuration setting. Your spawner must implement support for the limit to work.** The default spawner, `LocalProcessSpawner`, does **not** implement this support. A custom spawner **must** add support for this setting for it to be enforced. Default: None --Spawner.debug=<Bool> Enable debug-logging of the single-user server Default: False --Spawner.default_url=<Unicode> The URL the single-user server should start in. `{username}` will be expanded to the user's username Example uses: - You can set `notebook_dir` to `/` and `default_url` to `/tree/home/{username}` to allow people to navigate the whole filesystem from their notebook server, but still start in their home directory. - Start with `/notebooks` instead of `/tree` if `default_url` points to a notebook instead of a directory. 
- You can set this to `/lab` to have JupyterLab start by default, rather than Jupyter Notebook. Default: '' --Spawner.disable_user_config=<Bool> Disable per-user configuration of single-user servers. When starting the user's single-user server, any config file found in the user's $HOME directory will be ignored. Note: a user could circumvent this if the user modifies their Python environment, such as when they have their own conda environments / virtualenvs / containers. Default: False --Spawner.env_keep=<list-item-1>... List of environment variables for the single-user server to inherit from the JupyterHub process. This list is used to ensure that sensitive information in the JupyterHub process's environment (such as `CONFIGPROXY_AUTH_TOKEN`) is not passed to the single-user server's process. Default: ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VI... --Spawner.environment=<key-1>=<value-1>... Extra environment variables to set for the single-user server's process. Environment variables that end up in the single-user server's process come from 3 sources: - This `environment` configurable - The JupyterHub process' environment variables that are listed in `env_keep` - Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN) The `environment` configurable should be set by JupyterHub administrators to add installation specific environment variables. It is a dict where the key is the name of the environment variable, and the value can be a string or a callable. If it is a callable, it will be called with one parameter (the spawner instance), and should return a string fairly quickly (no blocking operations please!). Note that the spawner class' interface is not guaranteed to be exactly same across upgrades, so if you are using the callable take care to verify it continues to work after upgrades! .. 
versionchanged:: 1.2 environment from this configuration has highest priority, allowing override of 'default' env variables, such as JUPYTERHUB_API_URL. Default: {} --Spawner.http_timeout=<Int> Timeout (in seconds) before giving up on a spawned HTTP server Once a server has successfully been spawned, this is the amount of time we wait before assuming that the server is unable to accept connections. Default: 30 --Spawner.ip=<Unicode> The IP address (or hostname) the single-user server should listen on. The JupyterHub proxy implementation should be able to send packets to this interface. Default: '' --Spawner.mem_guarantee=<ByteSpecification> Minimum number of bytes a single-user notebook server is guaranteed to have available. Allows the following suffixes: - K -> Kilobytes - M -> Megabytes - G -> Gigabytes - T -> Terabytes **This is a configuration setting. Your spawner must implement support for the limit to work.** The default spawner, `LocalProcessSpawner`, does **not** implement this support. A custom spawner **must** add support for this setting for it to be enforced. Default: None --Spawner.mem_limit=<ByteSpecification> Maximum number of bytes a single-user notebook server is allowed to use. Allows the following suffixes: - K -> Kilobytes - M -> Megabytes - G -> Gigabytes - T -> Terabytes If the single user server tries to allocate more memory than this, it will fail. There is no guarantee that the single-user notebook server will be able to allocate this much memory - only that it can not allocate more than this. **This is a configuration setting. Your spawner must implement support for the limit to work.** The default spawner, `LocalProcessSpawner`, does **not** implement this support. A custom spawner **must** add support for this setting for it to be enforced. Default: None --Spawner.notebook_dir=<Unicode> Path to the notebook directory for the single-user server. The user sees a file listing of this directory when the notebook interface is started. 
The current interface does not easily allow browsing beyond the subdirectories in this directory's tree.
`~` will be expanded to the home directory of the user, and {username} will be replaced with the name of the user.
Note that this does *not* prevent users from accessing files outside of this path! They can do so with many other means.
Default: ''
--Spawner.options_form=<Union>
    An HTML form for options a user can specify on launching their server.
    The surrounding `<form>` element and the submit button are already provided.
    For example:

    .. code:: html

        Set your key:
        <input name="key" val="default_key"></input>
        <br>
        Choose a letter:
        <select name="letter" multiple="true">
            <option value="A">The letter A</option>
            <option value="B">The letter B</option>
        </select>

    The data from this form submission will be passed on to your spawner in `self.user_options`.
    Instead of a form snippet string, this could also be a callable that takes as one parameter the current spawner instance and returns a string. The callable will be called asynchronously if it returns a future, rather than a str.
    Note that the interface of the spawner class is not deemed stable across versions, so using this functionality might cause your JupyterHub upgrades to break.
    Default: traitlets.Undefined
--Spawner.options_from_form=<Callable>
    Interpret HTTP form data.
    Form data will always arrive as a dict of lists of strings. Override this function to understand single-values, numbers, etc.
    This should coerce form data into the structure expected by self.user_options, which must be a dict, and should be JSON-serializable, though it can contain bytes in addition to standard JSON data types.
    This method should not have any side effects. Any handling of `user_options` should be done in `.start()` to ensure consistent behavior across servers spawned via the API and form submission page.
    Instances will receive this data on self.user_options, after passing through this function, prior to `Spawner.start`.

    .. versionchanged:: 1.0
        user_options are persisted in the JupyterHub database to be reused on subsequent spawns if no options are given. user_options is serialized to JSON as part of this persistence (with additional support for bytes in case of uploaded file data), and any non-bytes non-jsonable values will be replaced with None if the user_options are re-used.
    Default: traitlets.Undefined
--Spawner.poll_interval=<Int>
    Interval (in seconds) on which to poll the spawner for the single-user server's status.
    At every poll interval, each spawner's `.poll` method is called, which checks if the single-user server is still running. If it isn't running, then JupyterHub modifies its own state accordingly and removes appropriate routes from the configurable proxy.
    Default: 30
--Spawner.port=<Int>
    The port for single-user servers to listen on.
    Defaults to `0`, which uses a randomly allocated port number each time.
    If set to a non-zero value, all Spawners will use the same port, which only makes sense if each server is on a different address, e.g. in containers.
    New in version 0.7.
    Default: 0
--Spawner.post_stop_hook=<Any>
    An optional hook function that you can implement to do work after the spawner stops.
    This can be set independent of any concrete spawner implementation.
    Default: None
--Spawner.pre_spawn_hook=<Any>
    An optional hook function that you can implement to do some bootstrapping work before the spawner starts. For example, create a directory for your user or load initial content.
    This can be set independent of any concrete spawner implementation.
    This may be a coroutine.
    Example::

        from subprocess import check_call

        def my_hook(spawner):
            username = spawner.user.name
            check_call(['./examples/bootstrap-script/bootstrap.sh', username])

        c.Spawner.pre_spawn_hook = my_hook

    Default: None
--Spawner.ssl_alt_names=<list-item-1>...
    List of SSL alt names.
    May be set in config if all spawners should have the same value(s), or set at runtime by Spawners that know their names.
Default: [] --Spawner.ssl_alt_names_include_local=<Bool> Whether to include DNS:localhost, IP:127.0.0.1 in alt names Default: True --Spawner.start_timeout=<Int> Timeout (in seconds) before giving up on starting of single-user server. This is the timeout for start to return, not the timeout for the server to respond. Callers of spawner.start will assume that startup has failed if it takes longer than this. start should return when the server process is started and its location is known. Default: 60 Authenticator(LoggingConfigurable) options ------------------------------------------ --Authenticator.admin_users=<set-item-1>... Set of users that will have admin rights on this JupyterHub. Admin users have extra privileges: - Use the admin panel to see list of users logged in - Add / remove users in some authenticators - Restart / halt the hub - Start / stop users' single-user servers - Can access each individual users' single-user server (if configured) Admin access should be treated the same way root access is. Defaults to an empty set, in which case no user has admin access. Default: set() --Authenticator.allowed_users=<set-item-1>... Set of usernames that are allowed to log in. Use this with supported authenticators to restrict which users can log in. This is an additional list that further restricts users, beyond whatever restrictions the authenticator has in place. If empty, does not perform any additional restriction. .. versionchanged:: 1.2 `Authenticator.whitelist` renamed to `allowed_users` Default: set() --Authenticator.auth_refresh_age=<Int> The max age (in seconds) of authentication info before forcing a refresh of user auth info. Refreshing auth info allows, e.g. requesting/re-validating auth tokens. See :meth:`.refresh_user` for what happens when user auth info is refreshed (nothing by default). Default: 300 --Authenticator.auto_login=<Bool> Automatically begin the login process rather than starting with a "Login with..." 
link at `/hub/login` To work, `.login_url()` must give a URL other than the default `/hub/login`, such as an oauth handler or another automatic login handler, registered with `.get_handlers()`. .. versionadded:: 0.8 Default: False --Authenticator.blocked_users=<set-item-1>... Set of usernames that are not allowed to log in. Use this with supported authenticators to restrict which users can not log in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place. If empty, does not perform any additional restriction. .. versionadded: 0.9 .. versionchanged:: 1.2 `Authenticator.blacklist` renamed to `blocked_users` Default: set() --Authenticator.delete_invalid_users=<Bool> Delete any users from the database that do not pass validation When JupyterHub starts, `.add_user` will be called on each user in the database to verify that all users are still valid. If `delete_invalid_users` is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts. If False (default), invalid users remain in the Hub's database and a warning will be issued. This is the default to avoid data loss due to config changes. Default: False --Authenticator.enable_auth_state=<Bool> Enable persisting auth_state (if available). auth_state will be encrypted and stored in the Hub's database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables. Encrypting auth_state requires the cryptography package. Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded. If encryption is unavailable, auth_state cannot be persisted. 
New in JupyterHub 0.8
Default: False
--Authenticator.post_auth_hook=<Any>
    An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
    This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the authentication dict regardless of changes to it.
    This may be a coroutine.

    .. versionadded: 1.0

    Example::

        import os, pwd

        def my_hook(authenticator, handler, authentication):
            user_data = pwd.getpwnam(authentication['name'])
            spawn_data = {
                'pw_data': user_data,
                'gid_list': os.getgrouplist(authentication['name'], user_data.pw_gid)
            }
            if authentication['auth_state'] is None:
                authentication['auth_state'] = {}
            authentication['auth_state']['spawn_data'] = spawn_data
            return authentication

        c.Authenticator.post_auth_hook = my_hook

    Default: None
--Authenticator.refresh_pre_spawn=<Bool>
    Force refresh of auth prior to spawn.
    This forces :meth:`.refresh_user` to be called prior to launching a server, to ensure that auth state is up-to-date. This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
    If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
    Default: False
--Authenticator.username_map=<key-1>=<value-1>...
    Dictionary mapping authenticator usernames to JupyterHub users.
    Primarily used to normalize OAuth user names to local users.
    Default: {}
--Authenticator.username_pattern=<Unicode>
    Regular expression pattern that all valid usernames must match.
    If a username does not match the pattern specified here, authentication will not be attempted. If not set, allow any username.
    Default: ''
--Authenticator.whitelist=<set-item-1>...
Deprecated, use `Authenticator.allowed_users`
Default: set()

CryptKeeper(SingletonConfigurable) options
------------------------------------------
--CryptKeeper.keys=<list-item-1>...
    Default: []
--CryptKeeper.n_threads=<Int>
    The number of threads to allocate for encryption
    Default: 2

Pagination(Configurable) options
--------------------------------
--Pagination.default_per_page=<Int>
    Default number of entries per page for paginated results.
    Default: 100
--Pagination.max_per_page=<Int>
    Maximum number of entries per page for paginated results.
    Default: 250

Examples
--------
Generate a default config file:

    jupyterhub --generate-config -f /etc/jupyterhub/jupyterhub_config.py

Spawn the server on 10.0.1.2:443 with https:

    jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
This guide covers best-practices, tips, common questions and operations, as well as other information relevant to running your own JupyterHub over time.
When troubleshooting, you may see unexpected behaviors or receive an error message. This section provides links to help you identify the cause of the problem and resolve it.
Behavior
Errors
How do I…?
Troubleshooting commands
If you have tried to start the JupyterHub proxy and it fails to start:
- check whether your configuration sets c.JupyterHub.ip = '*'; if it does, try c.JupyterHub.ip = '' instead
- try starting with jupyterhub --ip=0.0.0.0
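The same settings can also be placed in jupyterhub_config.py rather than passed on the command line; a minimal sketch (the address and port values here are examples, not requirements):

```python
# jupyterhub_config.py -- example values; adjust for your deployment
c.JupyterHub.ip = '0.0.0.0'   # proxy listens on all interfaces
c.JupyterHub.port = 8000      # public-facing port of the proxy
```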
Note: If this occurs on Ubuntu/Debian, check that you are using a recent version of node. Some Ubuntu/Debian releases ship a very old version of node, and it is necessary to update it.
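To confirm which node binary will be picked up and its version, a small stdlib helper can be used (a sketch; node_version is an illustrative name, not part of JupyterHub):

```python
import shutil
import subprocess

def node_version():
    """Return the output of `node -v` (e.g. 'v18.17.0'), or None if node is not on PATH."""
    node = shutil.which("node") or shutil.which("nodejs")
    if node is None:
        return None
    result = subprocess.run([node, "-v"], capture_output=True, text=True)
    return result.stdout.strip() or None

print(node_version())
```

If this prints None or a very old version, install or update node via your package manager before retrying.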
If the sudospawner script is not found in the path, sudospawner will not run. To avoid this, specify sudospawner’s absolute path. For example, start jupyterhub with:
jupyterhub --SudoSpawner.sudospawner_path='/absolute/path/to/sudospawner'
or add:
c.SudoSpawner.sudospawner_path = '/absolute/path/to/sudospawner'
to the config file, jupyterhub_config.py.
When nothing is given for these lists, there will be no admins, and all users who can authenticate on the system (i.e. all the Unix users on the server with a password) will be allowed to start a server. The allowed-users set lets you limit logins to a particular set of users, and admin_users specifies which of them may use the admin interface (not necessary unless you need to do things like inspect other users’ servers or modify the user list at runtime).
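For example, a jupyterhub_config.py fragment that restricts logins and grants admin rights (the usernames are placeholders):

```python
# jupyterhub_config.py -- placeholder usernames
c.Authenticator.allowed_users = {'alice', 'bob'}  # only these users may log in
c.Authenticator.admin_users = {'alice'}           # subset with admin access
```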
Even though the command to start your Docker container exposes port 8000 (docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub), it is possible that the IP address itself is not accessible/visible. As a result, when you try http://localhost:8000 in your browser, you cannot connect even though the container is running properly. One workaround is to explicitly tell JupyterHub to listen on 0.0.0.0, which is visible to everyone. Try this command: docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub --ip 0.0.0.0 --port 8000
docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub
0.0.0.0
docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub --ip 0.0.0.0 --port 8000
I started JupyterHub + nbgrader on the same host without containers. When I try to restart JupyterHub + nbgrader with this configuration, errors appear that the service accounts cannot start because the ports are being used.
How can I kill the processes that are using these ports?
Run the following command:
sudo kill -9 $(sudo lsof -t -i:<service_port>)
Where <service_port> is the port used by the nbgrader course service. This configuration is specified in jupyterhub_config.py.
After successfully logging in to JupyterHub with a compatible authenticator, I get a 'Spawn failed' error message in the browser, and the JupyterHub logs have:
jupyterhub KeyError: "getpwnam(): name not found: <my_user_name>
This issue occurs when the authenticator requires a local system user to exist. In these cases, you need to use a spawner that does not require an existing system user account, such as DockerSpawner or KubeSpawner.
When launching JupyterHub with sudo jupyterhub I get import errors and my environment variables don’t work.
When launching services with sudo ... the shell won’t have the same environment variables or PATHs in place. The most direct way to solve this issue is to use the full path to your python environment and add environment variables. For example:
sudo MY_ENV=abc123 \
  /home/foo/venv/bin/python3 \
  /srv/jupyterhub/jupyterhub
Use docker logs <container> where <container> is the container name defined within docker-compose.yml. For example, to view the logs of the JupyterHub container use:
docker logs hub
By default, the user’s notebook server is named jupyter-<username> where username is the user’s username within JupyterHub’s db. So if you wanted to see the logs for user foo you would use:
docker logs jupyter-foo
You can also tail logs to view them in real time using the -f option:
docker logs -f hub
You receive a 500 error when accessing the URL /user/<your_name>/.... This is often seen when your single-user server cannot verify your user cookie with the Hub.
There are two likely reasons for this:
The main symptom is a failure to load any page served by the single-user server, met with a 500 error. This is typically the first page at /user/<your_name> after logging in or clicking “Start my server”. When a single-user notebook server receives a request, the notebook server makes an API request to the Hub to check if the cookie corresponds to the right user. This request is logged.
If everything is working, the response logged will be similar to this:
200 GET /hub/api/authorizations/cookie/jupyterhub-token-name/[secret] (@10.0.1.4) 6.10ms
You should see a similar 200 message, as above, in the Hub log when you first visit your single-user notebook server. If you don’t see this message in the log, it may mean that your single-user notebook server isn’t connecting to your Hub.
If you see 403 (forbidden) like this, it’s likely a token problem:
403 GET /hub/api/authorizations/cookie/jupyterhub-token-name/[secret] (@10.0.1.4) 4.14ms
Check the logs of the single-user notebook server, which may have more detailed information on the cause.
If you make an API request and it is not received by the server, you likely have a network configuration issue. Often, this happens when the Hub is only listening on 127.0.0.1 (default) and the single-user servers are not on the same ‘machine’ (can be physically remote, or in a docker container or VM). The fix for this case is to make sure that c.JupyterHub.hub_ip is an address that all single-user servers can connect to, e.g.:
c.JupyterHub.hub_ip = '10.0.0.1'
If you receive a 403 error, the API token for the single-user server is likely invalid. Commonly, the 403 error is caused by resetting the JupyterHub database (either by removing jupyterhub.sqlite or some other action) while leaving single-user servers running. This happens most frequently when using DockerSpawner, because Docker's default behavior is to stop/start containers rather than destroy and recreate them each time. As a result, the same API token is used by the server for its whole life, until the container is rebuilt.
The fix for this Docker case is to remove any Docker containers seeing this issue (typically all containers created before a certain point in time):
docker rm -f jupyter-name
After this, when you start your server via JupyterHub, it will build a new container. If this was the underlying cause of the issue, you should see your server again.
When your whole JupyterHub sits behind an organization proxy (not a reverse proxy like NGINX that is part of your setup, and not the configurable-http-proxy), the environment variables HTTP_PROXY, HTTPS_PROXY, http_proxy, and https_proxy might be set. This confuses the jupyterhub-singleuser servers: when connecting to the Hub for authorization, they connect via the proxy instead of directly to the Hub on localhost. The proxy might deny the request (403 GET), which leads the singleuser server to think it has a wrong auth token. To circumvent this, you should add <hub_url>,<hub_ip>,localhost,127.0.0.1 to the environment variables NO_PROXY and no_proxy.
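One way to set these for the single-user servers is via Spawner.environment in jupyterhub_config.py. A sketch, with placeholder hub addresses that you should replace with your own:

```python
# jupyterhub_config.py -- placeholder hostname/IP; use your Hub's actual address
c.Spawner.environment = {
    'NO_PROXY': 'myhub.example.org,10.0.0.1,localhost,127.0.0.1',
    'no_proxy': 'myhub.example.org,10.0.0.1,localhost,127.0.0.1',
}
```

Remember to also export NO_PROXY/no_proxy in the environment of the Hub process itself if it makes outbound requests.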
JupyterHub services allow processes to interact with JupyterHub’s REST API. Example use-cases include:
If possible, try to run the Jupyter Notebook as an externally managed service with one of the provided jupyter/docker-stacks.
Standard JupyterHub installations include a jupyterhub-singleuser command which is built from the jupyterhub.singleuser:main method. The jupyterhub-singleuser command is the default command when JupyterHub launches single-user Jupyter Notebooks. One of the goals of this command is to make sure the version of JupyterHub installed within the Jupyter Notebook coincides with the version of the JupyterHub server itself.
If you launch a Jupyter Notebook with the jupyterhub-singleuser command directly from the command line, the Jupyter Notebook won't have access to the JUPYTERHUB_API_TOKEN environment variable and will return:
JUPYTERHUB_API_TOKEN env is required to run jupyterhub-singleuser. Did you launch it manually?
If you plan on testing jupyterhub-singleuser independently from JupyterHub, then you can set the API token environment variable. For example, if you were to run the single-user Jupyter Notebook on the host, then:
export JUPYTERHUB_API_TOKEN=my_secret_token
jupyterhub-singleuser
With a docker container, pass in the environment variable with the run command:
docker run -d \
  -p 8888:8888 \
  -e JUPYTERHUB_API_TOKEN=my_secret_token \
  jupyter/datascience-notebook:latest
This example demonstrates how to combine the use of the jupyterhub-singleuser environment variables when launching a Notebook as an externally managed service.
Some certificate providers, e.g. Entrust, may provide you with a chained certificate that contains multiple files. If you are using a chained certificate, you will need to concatenate the individual files by appending the chain cert and root cert to your host cert:
cat your_host.crt chain.crt root.crt > your_host-chained.crt
You would then set in your jupyterhub_config.py file the ssl_key and ssl_cert as follows:
c.JupyterHub.ssl_cert = 'your_host-chained.crt'
c.JupyterHub.ssl_key = 'your_host.key'
Your certificate provider gives you the following files: example_host.crt, Entrust_L1Kroot.txt and Entrust_Root.txt.
example_host.crt
Entrust_L1Kroot.txt
Entrust_Root.txt
Concatenate the files appending the chain cert and root cert to your host cert:
cat example_host.crt Entrust_L1Kroot.txt Entrust_Root.txt > example_host-chained.crt
You would then use the example_host-chained.crt as the value for JupyterHub’s ssl_cert. You may pass this value as a command line option when starting JupyterHub or more conveniently set the ssl_cert variable in JupyterHub’s configuration file, jupyterhub_config.py. In jupyterhub_config.py, set:
c.JupyterHub.ssl_cert = '/path/to/example_host-chained.crt'
c.JupyterHub.ssl_key = '/path/to/example_host.key'
where ssl_cert is set to example_host-chained.crt and ssl_key to your private key example_host.key.
Then restart JupyterHub.
See also JupyterHub SSL encryption.
Both conda and pip can be used without a network connection. You can make your own repository (directory) of conda packages and/or wheels, and then install from there instead of the internet.
For instance, you can install JupyterHub with pip and configurable-http-proxy with npmbox:
python3 -m pip wheel jupyterhub
npmbox configurable-http-proxy
Setting the following in jupyterhub_config.py will configure access to the entire filesystem and set the default to the user’s home directory.
c.Spawner.notebook_dir = '/'
c.Spawner.default_url = '/home/%U'  # %U will be replaced with the username
From the command line, pySpark executors can be configured using a command similar to this one:
pyspark --total-executor-cores 2 --executor-memory 1G
Cloudera documentation for configuring spark on YARN applications provides additional information. The pySpark configuration documentation is also helpful for programmatic configuration examples.
While JupyterLab is still under active development, we have had users ask about how to try out JupyterLab with JupyterHub.
You need to install and enable the JupyterLab extension system-wide, then you can change the default URL to /lab.
For instance:
python3 -m pip install jupyterlab
jupyter serverextension enable --py jupyterlab --sys-prefix
The important thing is that jupyterlab is installed and enabled in the single-user notebook server environment. For system users, this means system-wide, as indicated above. For Docker containers, it means inside the single-user docker image, etc.
In jupyterhub_config.py, configure the Spawner to tell the single-user notebook servers to default to JupyterLab:
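The setting itself is a one-liner; for instance:

```python
# jupyterhub_config.py -- open single-user servers in JupyterLab by default
c.Spawner.default_url = '/lab'
```

This applies to whichever Spawner class your deployment uses, since default_url is defined on the base Spawner.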
Users will need a GitHub account to login and be authenticated by the Hub.
You can do this with logrotate, or pipe to logger to use syslog instead of directly to a file.
For example, with this logrotate config file:
/var/log/jupyterhub.log {
    copytruncate
    daily
}
and run this daily by putting a script in /etc/cron.daily/:
logrotate /path/to/above-config
Or use syslog:
jupyterhub | logger -t jupyterhub
The following commands provide additional detail about installed packages, versions, and system information that may be helpful when troubleshooting a JupyterHub deployment. The commands are:
jupyter troubleshooting
jupyterhub --debug
The Apache Toree kernel will raise an issue when running with JupyterHub if the standard HDFS rack awareness script is used. This will materialize in the logs as a repeated WARN:
16/11/29 16:24:20 WARN ScriptBasedMapping: Exception running /etc/hadoop/conf/topology_script.py some.ip.address
ExitCodeException exitCode=1:   File "/etc/hadoop/conf/topology_script.py", line 63
    print rack
          ^
SyntaxError: Missing parentheses in call to 'print'
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
In order to resolve this issue, there are two potential options.
Docker images can be found at the JupyterHub organization on DockerHub. The Docker image jupyterhub/singleuser provides an example single user notebook server for use with DockerSpawner.
Additional single user notebook server images can be found at the Jupyter organization on DockerHub and information about each image at the jupyter/docker-stacks repo.
JupyterHub offers easy upgrade pathways between minor versions. This document describes how to do these upgrades.
If you are using a JupyterHub distribution, you should consult the distribution’s documentation on how to upgrade. This document is if you have set up your own JupyterHub without using a distribution.
It is long because it is pretty detailed! Most likely, upgrading JupyterHub is painless, quick, and causes minimal user interruption.
The changelog contains information on what has changed with the new JupyterHub release, and any deprecation warnings. Read these notes to familiarize yourself with the coming changes. There might be new releases of authenticators & spawners you are using, so read the changelogs for those too!
If you are using the default configuration where configurable-http-proxy is managed by JupyterHub, your users will see service disruption during the upgrade process. You should notify them, and pick a time to do the upgrade where they will be least disrupted.
If you are using a different proxy, or running configurable-http-proxy independent of JupyterHub, your users will be able to continue using notebook servers they had already launched, but will not be able to launch new servers nor sign in.
Before doing an upgrade, it is critical to back up:
Shut down the JupyterHub process. How to do this will vary depending on how you have set up JupyterHub to run. Most likely, it is using a process supervisor of some sort (systemd or supervisord or even docker). Use the supervisor-specific command to stop the JupyterHub process.
There are two environments where the jupyterhub package is installed:
You need to make sure the version of the jupyterhub package matches in both these environments. If you installed jupyterhub with pip, you can upgrade it with:
python3 -m pip install --upgrade jupyterhub==<version>
Where <version> is the version of JupyterHub you are upgrading to.
If you used conda to install jupyterhub, you should upgrade it with:
conda install -c conda-forge jupyterhub==<version>
You should also check for new releases of the authenticator & spawner you are using. You might wish to upgrade those packages too along with JupyterHub, or upgrade them separately.
Once new packages are installed, you need to upgrade the JupyterHub database. From the hub environment, in the same directory as your jupyterhub_config.py file, you should run:
jupyterhub upgrade-db
This should find the location of your database, and run necessary upgrades for it.
SQLite has some disadvantages when it comes to upgrading JupyterHub. These are:
Losing the Hub database is often not a big deal. Information that resides only in the Hub database includes:
If the following conditions are true, you should be fine clearing the Hub database and starting over:
Once the database upgrade is completed, start the jupyterhub process again.
Congratulations, your JupyterHub has been upgraded!
For detailed changes from the prior release, click on the version number, and its link will bring up a GitHub listing of changes. Use git log on the command line for details.
JupyterHub 1.3 is a small feature release. Highlights include:
?state=
state
active
inactive
ready
jupyterhub_
(full changelog)
(GitHub contributors page for this release)
@0mar | @agp8x | @alexweav | @belfhi | @betatim | @cbanek | @cmd-ntrf | @coffeebenzene | @consideRatio | @danlester | @fcollonval | @GeorgianaElena | @ianabc | @IvanaH8 | @manics | @meeseeksmachine | @mhwasil | @minrk | @mriedem | @mxjeff | @olifre | @rcthomas | @rgbkrk | @rkdarst | @Sangarshanan | @slemonide | @support | @tlvu | @welcome | @yuvipanda
@alexweav | @belfhi | @betatim | @cmd-ntrf | @consideRatio | @danlester | @fcollonval | @GeorgianaElena | @ianabc | @IvanaH8 | @manics | @meeseeksmachine | @minrk | @mriedem | @olifre | @rcthomas | @rgbkrk | @rkdarst | @slemonide | @support | @welcome | @yuvipanda
@bitnik
JupyterHub 1.2 is an incremental release with lots of small improvements. It is unlikely that users will have to change much to upgrade, but lots of new things are possible and/or better!
There are no database schema changes requiring migration from 1.1 to 1.2.
Highlights:
c.Authenticator.allowed_users = {'user', ...}
delete_invalid_users
python:3.8 + master dependencies
key in UserDict
@0nebody | @1kastner | @ahkui | @alexdriedger | @alexweav | @AlJohri | @Analect | @analytically | @aneagoe | @AngelOnFira | @barrachri | @basvandervlies | @betatim | @bigbosst | @blink1073 | @Cadair | @Carreau | @cbjuan | @ceocoder | @chancez | @choldgraf | @Chrisjw42 | @cmd-ntrf | @consideRatio | @danlester | @diurnalist | @Dmitry1987 | @dsblank | @dylex | @echarles | @elgalu | @fcollonval | @gatoniel | @GeorgianaElena | @hnykda | @itssimon | @jgwerner | @JohnPaton | @joshmeek | @jtpio | @kinow | @kreuzert | @kxiao-fn | @lesiano | @limimiking | @lydian | @mabbasi90 | @maluhoss | @manics | @matteoipri | @mbmilligan | @meeseeksmachine | @mhwasil | @minrk | @mriedem | @nscozzaro | @pabepadu | @possiblyMikeB | @psyvision | @rabsr | @rainwoodman | @rajat404 | @rcthomas | @reneluria | @rgbkrk | @rkdarst | @rkevin-arch | @romainx | @ryanlovett | @ryogesh | @sdague | @snickell | @SonakshiGrover | @ssanderson | @stefanvangastel | @steinad | @stephen-a2z | @stevegore | @stv0g | @subgero | @sudi007 | @summerswallow | @support | @synchronizing | @thuvh | @tritemio | @twalcari | @vchandvankar | @vilhelmen | @vlizanae | @weimin | @welcome | @willingc | @xlotlu | @yhal-nesi | @ynnelson | @yuvipanda | @zonca | @Zsailer
1.1 is a release with lots of accumulated fixes and improvements, especially in performance, metrics, and customization. There are no database changes in 1.1, so no database upgrade is required when upgrading from 1.0 to 1.1.
Of particular interest to deployments with automatic health checking and/or large numbers of users is that the slow startup time introduced in 1.0 by additional spawner validation can now be mitigated by JupyterHub.init_spawners_timeout, allowing the Hub to become responsive before the spawners may have finished validating.
Several new Prometheus metrics are added (and others fixed!) to measure sources of common performance issues, such as proxy interactions and startup.
1.1 also begins adoption of the Jupyter telemetry project in JupyterHub, See The Jupyter Telemetry docs for more info. The only events so far are starting and stopping servers, but more will be added in future releases.
There are many more fixes and improvements listed below. Thanks to everyone who has contributed to this release!
JupyterHub.user_redirect_hook
PROXY_DELETE_DURATION_SECONDS
Service.oauth_no_confirm
JupyterHub.default_server_name
uids
JupyterHub.activity_resolution
pre_spawn_start
TOTAL_USERS
RUNNING_SERVERS
--help
--version
8001
JupyterHub 1.0 is a major milestone for JupyterHub. Huge thanks to the many people who have contributed to this release, whether it was through discussion, testing, documentation, or development.
Support TLS encryption and authentication of all internal communication. Spawners must implement the .move_certs method to make certificates available to the notebook server if it is not local to the Hub.
There is now full UI support for managing named servers. With named servers, each jupyterhub user may have access to more than one named server. For example, a professor may access a server named research and another named teaching.
Authenticators can now expire and refresh authentication data by implementing Authenticator.refresh_user(user). This allows things like OAuth data and access tokens to be refreshed. When used together with Authenticator.refresh_pre_spawn = True, auth refresh can be forced prior to Spawn, allowing the Authenticator to require that authentication data is fresh immediately before the user’s server is launched.
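A hedged sketch of what such a refresh might look like. The class, the token names, and the dict-based user model below are all illustrative assumptions, not JupyterHub's real Authenticator API; in a real deployment you would subclass jupyterhub.auth.Authenticator and call your identity provider:

```python
import asyncio
import time

class MyAuthenticator:
    """Illustrative stand-in for an Authenticator subclass."""
    auth_refresh_age = 300  # seconds before auth data is considered stale

    async def refresh_user(self, user, handler=None):
        """Return True if auth is still fresh, or an updated auth model.

        A real implementation would e.g. renew an OAuth access token here.
        """
        auth_state = user.get('auth_state') or {}
        expires_at = auth_state.get('expires_at', 0)
        if expires_at - time.time() > self.auth_refresh_age:
            return True  # still fresh; nothing to do
        # hypothetical token renewal; replace with your provider's API call
        new_state = {'access_token': 'renewed-token',
                     'expires_at': time.time() + 3600}
        return {'name': user['name'], 'auth_state': new_state}
```

With Authenticator.refresh_pre_spawn = True, JupyterHub would invoke this refresh immediately before each spawn.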
See also
Authenticator.refresh_user()
Spawner.create_certs()
Spawner.move_certs()
allow custom spawners, authenticators, and proxies to register themselves via ‘entry points’, enabling more convenient configuration such as:
c.JupyterHub.authenticator_class = 'github'
c.JupyterHub.spawner_class = 'docker'
c.JupyterHub.proxy_class = 'traefik_etcd'
Spawners are passed the tornado Handler object that requested their spawn (as self.handler), so they can do things like make decisions based on query arguments in the request.
SimpleSpawner and DummyAuthenticator, which are useful for testing, have been merged into JupyterHub itself:
# For testing purposes only. Should not be used in production.
c.JupyterHub.authenticator_class = 'dummy'
c.JupyterHub.spawner_class = 'simple'
These classes are not appropriate for production use. Only testing.
Add health check endpoint at /hub/health
Several prometheus metrics have been added (thanks to Outreachy applicants!)
A new API for registering user activity. To prepare for the addition of alternate proxy implementations, responsibility for tracking activity is taken away from the proxy and moved to the notebook server (which already has activity tracking features). Activity is now tracked by pushing it to the Hub from user servers instead of polling the proxy API.
Dynamic options_form callables may now return an empty string which will result in no options form being rendered.
Spawner.user_options is persisted to the database to be re-used, so that a server spawned once via the form can be re-spawned via the API with the same options.
Added c.PAMAuthenticator.pam_normalize_username option for round-tripping usernames through PAM to retrieve the normalized form.
Added c.JupyterHub.named_server_limit_per_user configuration to limit the number of named servers each user can have. The default is 0, for no limit.
API requests to HubAuthenticated services (e.g. single-user servers) may pass a token in the Authorization header, matching authentication with the Hub API itself.
Added Authenticator.is_admin(handler, authentication) method and Authenticator.admin_groups configuration for automatically determining that a member of a group should be considered an admin.
New c.Authenticator.post_auth_hook configuration that can be any callable of the form async def hook(authenticator, handler, authentication=None):. This hook may transform the return value of Authenticator.authenticate() and return a new authentication dictionary, e.g. specifying admin privileges, group membership, or custom allowed/blocked logic. This hook is called after existing normalization and allowed-username checking.
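For example, a sketch of such a hook that grants admin based on a hypothetical 'groups' key in the authentication dict (what your authenticate() actually returns depends on your Authenticator; the group name and key here are assumptions):

```python
import asyncio

async def my_post_auth_hook(authenticator, handler, authentication=None):
    """Illustrative post_auth_hook: mark members of a hypothetical
    'staff' group as admins before the authentication dict is used."""
    if authentication and 'staff' in authentication.get('groups', ()):
        authentication['admin'] = True
    return authentication

# in jupyterhub_config.py:
# c.Authenticator.post_auth_hook = my_post_auth_hook
```

Because the hook runs after normalization and allowed-username checking, it sees the final username and may veto or enrich the result.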
Spawner.options_from_form may now be async
Added JupyterHub.shutdown_on_logout option to trigger shutdown of a user’s servers when they log out.
When Spawner.start raises an Exception, a message can be passed on to the user if the exception has a .jupyterhub_message attribute.
Authentication methods such as check_whitelist should now take an additional authentication argument that will be a dictionary (default: None) of authentication data, as returned by Authenticator.authenticate():
def check_whitelist(self, username, authentication=None): ...
authentication should have a default value of None for backward-compatibility with jupyterhub < 1.0.
Prometheus metrics page is now authenticated. Any authenticated user may see the prometheus metrics. To disable prometheus authentication, set JupyterHub.authenticate_prometheus = False.
Visits to /user/:name no longer trigger an implicit launch of the user’s server. Instead, a page is shown indicating that the server is not running with a link to request the spawn.
API requests to /user/:name for a not-running server will have status 503 instead of 404.
OAuth includes a confirmation page when attempting to visit another user’s server, so that users can choose to cancel authentication with the single-user server. Confirmation is still skipped when accessing your own server.
JupyterHub.db_url
There have been several changes to the development process that shouldn’t generally affect users of JupyterHub, but may affect contributors. In general, see CONTRIBUTING.md for contribution info or ask if you have questions.
black
pre-commit
pytest-asyncio
pytest-tornado
oauthlib
python-oauth2
JupyterHub 0.9.6 is a security release.
JupyterHub 0.9.5 included a partial fix for this issue.
JupyterHub 0.9.4 is a small bugfix release.
application/json
text/html
JupyterHub 0.9.3 contains small bugfixes and improvements
expires_at
JupyterHub 0.9.2 contains small bugfixes and improvements.
Spawner.consecutive_failure_limit
JupyterHub 0.9.1 contains a number of small bugfixes on top of 0.9.
c.LocalProcessSpawner.shell_cmd
/user/:name/api/...
JupyterHub.base_url
JupyterHub 0.9 is a major upgrade of JupyterHub. There are several changes to the database schema, so make sure to backup your database and run:
jupyterhub upgrade-db
after upgrading jupyterhub.
The biggest change for 0.9 is the switch to asyncio coroutines everywhere instead of tornado coroutines. Custom Spawners and Authenticators are still free to use tornado coroutines for async methods, as they will continue to work. As part of this upgrade, JupyterHub 0.9 drops support for Python < 3.5 and tornado < 5.0.
Require Python >= 3.5
Require tornado >= 5.0
Use asyncio coroutines throughout
Set status 409 for conflicting actions instead of 400, e.g. creating users or groups that already exist.
timestamps in REST API continue to be UTC, but now include ‘Z’ suffix to identify them as such.
REST API User model always includes servers dict, not just when named servers are enabled.
server info is no longer available to oauth identification endpoints, only user info and group membership.
User.last_activity may be None if a user has not been seen, rather than starting with the user creation time which is now separately stored as User.created.
static resources are now found in $PREFIX/share/jupyterhub instead of share/jupyter/hub for improved consistency.
Deprecate .extra_log_file config. Use pipe redirection instead:
jupyterhub &>> /var/log/jupyterhub.log
Add JupyterHub.bind_url config for setting the full bind URL of the proxy. Sets ip, port, base_url all at once.
Add JupyterHub.hub_bind_url for setting the full host+port of the Hub. hub_bind_url supports unix domain sockets, e.g. unix+http://%2Fsrv%2Fjupyterhub.sock
Deprecate JupyterHub.hub_connect_port config in favor of JupyterHub.hub_connect_url. hub_connect_ip is not deprecated and can still be used in the common case where only the ip address of the hub differs from the bind ip.
Spawners can define a .progress method which should be an async generator. The generator should yield events of the form:
{
  "message": "some-state-message",
  "progress": 50,
}
These messages will be shown with a progress bar on the spawn-pending page. The async_generator package can be used to make async generators compatible with Python 3.5.
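A sketch of such a .progress method, shown as a standalone async generator for illustration (in practice it would be defined on your Spawner subclass, and the messages and percentages here are made up):

```python
import asyncio

async def progress():
    """Illustrative Spawner.progress implementation: yield event dicts
    that JupyterHub renders as a progress bar on the spawn-pending page."""
    yield {'message': 'Allocating resources...', 'progress': 10}
    await asyncio.sleep(0)  # a real spawner would await actual work here
    yield {'message': 'Starting server...', 'progress': 55}
    yield {'message': 'Server ready', 'progress': 100}
```

Each yielded dict updates the page; the generator should finish once the server is up.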
track activity of individual API tokens
new REST API for managing API tokens at /hub/api/user/tokens[/token-id]
allow viewing/revoking tokens via token page
User creation time is available in the REST API as User.created
Server start time is stored as Server.started
Spawner.start may return a URL for connecting to a notebook instead of (ip, port). This enables Spawners to launch servers that setup their own HTTPS.
Optimize database performance by disabling sqlalchemy expire_on_commit by default.
Add python -m jupyterhub.dbutil shell entrypoint for quickly launching an IPython session connected to your JupyterHub database.
Include User.auth_state in user model on single-user REST endpoints for admins only.
Include Server.state in server model on REST endpoints for admins only.
Add Authenticator.blacklist for blocking users instead of allowing.
Pass c.JupyterHub.tornado_settings['cookie_options'] down to Spawners so that cookie options (e.g. expires_days) can be set globally for the whole application.
SIGINFO (ctrl-t) handler showing the current status of all running threads, coroutines, and CPU/memory/FD consumption.
Add async Spawner.get_options_form alternative to .options_form, so it can be a coroutine.
Add JupyterHub.redirect_to_server config to govern whether users should be sent to their server on login or the JupyterHub home page.
html page templates can be more easily customized and extended.
Allow registering external OAuth clients for using the Hub as an OAuth provider.
Add basic prometheus metrics at /hub/metrics endpoint.
Add session-id cookie, enabling immediate revocation of login tokens.
Authenticators may specify that users are admins by specifying the admin key when returning the user model as a dict.
Added “Start All” button to admin page for launching all user servers at once.
Services have an info field which is a dictionary. This is accessible via the REST API.
JupyterHub.extra_handlers allows defining additional tornado RequestHandlers attached to the Hub.
API tokens may now expire. Expiry is available in the REST model as expires_at, and settable when creating API tokens by specifying expires_in.
?redirects
.form-control
getpass.getuser()
JupyterHub 0.8.1 is a collection of bugfixes and small improvements on 0.8.
jupyterhub --upgrade-db
bower
JupyterHub 0.8 is a big release!
Perhaps the biggest change is the use of OAuth to negotiate authentication between the Hub and single-user services. Due to this change, it is important that the single-user server and Hub are both running the same version of JupyterHub. If you are using containers (e.g. via DockerSpawner or KubeSpawner), this means upgrading jupyterhub in your user images at the same time as the Hub. In most cases, a
pip install jupyterhub==version
in your Dockerfile is sufficient.
JupyterHub now defines a Proxy API for custom proxy implementations other than the default. The defaults are unchanged, but configuration of the proxy is now done on the ConfigurableHTTPProxy class instead of the top-level JupyterHub. TODO: docs for writing a custom proxy.
Single-user servers and services (anything that uses HubAuth) can now accept token-authenticated requests via the Authorization header.
Authenticators can now store state in the Hub’s database. To do so, the authenticate method should return a dict of the form
{
  'name': 'username',
  'auth_state': {
    'key': 'value',
  }
}
This data will be encrypted and requires JUPYTERHUB_CRYPT_KEY environment variable to be set and the Authenticator.enable_auth_state flag to be True. If these are not set, auth_state returned by the Authenticator will not be stored.
There is preliminary support for multiple (named) servers per user in the REST API. Named servers can be created via API requests, but there is currently no UI for managing them.
Add LocalProcessSpawner.popen_kwargs and LocalProcessSpawner.shell_cmd for customizing how user server processes are launched.
Add Authenticator.auto_login flag for skipping the “Login with…” page explicitly.
Add JupyterHub.hub_connect_ip configuration for the ip that should be used when connecting to the Hub. This is promoting (and deprecating) DockerSpawner.hub_ip_connect for use by all Spawners.
Add Spawner.pre_spawn_hook(spawner) hook for customizing pre-spawn events.
Add JupyterHub.active_server_limit and JupyterHub.concurrent_spawn_limit for limiting the total number of running user servers and the number of pending spawns, respectively.
.get_env()
.get_args()
So many things fixed!
httponly
--group
/user-redirect/
Spawner.will_resume
DockerSpawner.remove_containers = False
set('string')
next_url
/api/
/api/info
pip install .
--no-ssl
%U
{username}
Bugfixes on 0.6:
jupyter/jupyterhub
jupyterhub/jupyterhub-onbuild
c.JupyterHub.statsd_{host,port,prefix}
@default
@observe
c.PAMAuthenticator.open_sessions = False
Spawner.environment
JupyterHub.api_tokens
token: username
/api/authorizations/token
JupyterHub.subdomain_host = 'https://jupyterhub.domain.tld[:port]'
127.0.0.1
Fix removal of /login page in 0.4.0, breaking some OAuth providers.
Spawner.user_options_form
Authenticator.post_spawn_stop
First preview release
JupyterHub also provides a REST API for administration of the Hub and users. The documentation on Using JupyterHub’s REST API provides information on:
The same JupyterHub API spec, as found here, is available in an interactive form here (on swagger’s petstore). The OpenAPI Initiative (fka Swagger™) is a project used to describe and document RESTful APIs.
JupyterHub API Reference:
jupyterhub.app
The multi-user notebook application
JupyterHub
jupyterhub.app.
An Application for starting a Multi-User Jupyter Notebook server.
active_server_limit
Maximum number of concurrent servers that can be active at a time.
Setting this can limit the total resources your users can consume.
An active server is any server that’s not fully stopped. It is considered active from the time it has been requested until the time that it has completely stopped.
If this many user servers are active, users will not be able to launch new servers until a server is shutdown. Spawn requests will be rejected with a 429 error asking them to try again.
If set to 0, no limit is enforced.
active_user_window
Duration (in seconds) to determine the number of active users.
activity_resolution
Resolution (in seconds) for updating activity
If activity is registered that is less than activity_resolution seconds more recent than the current value, the new value will be ignored.
This avoids too many writes to the Hub database.
Grant admin users permission to access single-user servers.
Users should be properly informed if this is enabled.
DEPRECATED since version 0.7.2, use Authenticator.admin_users instead.
allow_named_servers
Allow named single-user servers per user
answer_yes
Answer yes to any questions (e.g. confirm overwrite)
PENDING DEPRECATION: consider using services
Dict of token:username to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally managed services, which authenticate as JupyterHub users.
Consider using services for general services that talk to the JupyterHub API.
authenticate_prometheus
Authentication for prometheus metrics
authenticator_class
Class for authenticating users.
This should be a subclass of jupyterhub.auth.Authenticator with an authenticate() method that:

- is a coroutine (asyncio or tornado)
- returns the username on success, or None on failure
- takes two arguments: (handler, data), where handler is the calling web.RequestHandler and data is the POST form data from the login page.

Changed in version 1.0: authenticators may be registered via entry points, e.g. c.JupyterHub.authenticator_class = 'pam'
base_url
The base URL of the entire application.
Add this to the beginning of all JupyterHub URLs. Use base_url to run JupyterHub within an existing website.
bind_url
The public facing URL of the whole JupyterHub application.
This is the address on which the proxy will bind. Sets protocol, ip, base_url
cleanup_proxy
Whether to shutdown the proxy when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the proxy running.
Only valid if the proxy was started by the Hub process.
If both this and cleanup_servers are False, sending SIGINT to the Hub will only shutdown the Hub, leaving everything else running.
The Hub should be able to resume from database state.
cleanup_servers
Whether to shutdown single-user servers when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the single-user servers running.
If both this and cleanup_proxy are False, sending SIGINT to the Hub will only shutdown the Hub, leaving everything else running.
concurrent_spawn_limit
Maximum number of concurrent users that can be spawning at a time.
Spawning lots of servers at the same time can cause performance problems for the Hub or the underlying spawning system. Set this limit to prevent bursts of logins from attempting to spawn too many servers at the same time.
This does not limit the number of total running servers. See active_server_limit for that.
If more than this many users attempt to spawn at a time, their requests will be rejected with a 429 error asking them to try again. Users will have to wait for some of the spawning services to finish starting before they can start their own.
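In jupyterhub_config.py, this limit can be combined with active_server_limit; the numbers below are illustrative, not recommendations:

```python
# jupyterhub_config.py (the `c` object is provided by JupyterHub's
# config loader when this file is read)
c.JupyterHub.active_server_limit = 200    # cap on total running servers
c.JupyterHub.concurrent_spawn_limit = 20  # cap on simultaneous pending spawns
```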
config_file
The config file to load
confirm_no_ssl
DEPRECATED: does nothing
cookie_max_age_days
Number of days for a login cookie to be valid. Default is two weeks.
cookie_secret
The cookie secret to use to encrypt cookies.
Loaded from the JPY_COOKIE_SECRET env variable by default.
Should be exactly 256 bits (32 bytes).
cookie_secret_file
File in which to store the cookie secret.
data_files_path
The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub)
db_kwargs
Include any kwargs to pass to the database connection. See sqlalchemy.create_engine for details.
db_url
url for the database. e.g. sqlite:///jupyterhub.sqlite
debug_db
log all database transactions. This has A LOT of output
debug_proxy
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug
default_server_name
If named servers are enabled, default name of server to spawn or open, e.g. by user-redirect.
default_url
The default URL for users when they arrive (e.g. when user directs to “/”)
By default, redirects users to their own server.
Can be a Unicode string (e.g. ‘/hub/home’) or a callable based on the handler object:
def default_url_fn(handler):
    user = handler.current_user
    if user and user.admin:
        return '/hub/admin'
    return '/hub/home'

c.JupyterHub.default_url = default_url_fn
external_ssl_authorities
Dict authority:dict(files). Specify the key, cert, and/or ca file for an authority. This is useful for externally managed proxies that wish to use internal_ssl.
The files dict has this format (you must specify at least a cert):
{
    'key': '/path/to/key.key',
    'cert': '/path/to/cert.crt',
    'ca': '/path/to/ca.crt',
}
The authorities you can override: ‘hub-ca’, ‘notebooks-ca’, ‘proxy-api-ca’, ‘proxy-client-ca’, and ‘services-ca’.
Use with internal_ssl
extra_handlers
Register extra tornado Handlers for jupyterhub.
Should be of the form ("<regex>", Handler)
The Hub prefix will be added, so /my-page will be served at /hub/my-page.
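A minimal sketch in jupyterhub_config.py; the handler class and its response text below are made-up examples, not part of JupyterHub:

```python
from tornado.web import RequestHandler

class MyPageHandler(RequestHandler):
    """Illustrative handler; with the registration below it is
    served at /hub/my-page."""
    def get(self):
        self.write("Hello from the Hub")

c.JupyterHub.extra_handlers = [(r"/my-page", MyPageHandler)]
```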
extra_log_file
DEPRECATED: use output redirection instead, e.g.
extra_log_handlers
Extra log handlers to set on JupyterHub logger
generate_certs
Generate certs used for internal ssl
generate_config
Generate default config file
The URL on which the Hub will listen. This is a private URL for internal communication. Typically set in combination with hub_connect_url. If a unix socket, hub_connect_url must also be set.
Examples:

    http://127.0.0.1:8081
    unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock
New in version 0.9.
The ip or hostname for proxies and spawners to use for connecting to the Hub.
Use when the bind address (hub_ip) is 0.0.0.0, :: or otherwise different from the connect address.
hub_ip
Default: when hub_ip is 0.0.0.0 or ::, use socket.gethostname(), otherwise use hub_ip.
Note: Some spawners or proxy implementations might not support hostnames. Check your spawner or proxy documentation to see if they have extra requirements.
New in version 0.8.
hub_connect_port
DEPRECATED
Use hub_connect_url
Deprecated since version 0.9: Use hub_connect_url
hub_connect_url
The URL for connecting to the Hub. Spawners, services, and the proxy will use this URL to talk to the Hub.
Only needs to be specified if the default hub URL is not connectable (e.g. using a unix+http:// bind url).
See also: JupyterHub.hub_connect_ip, JupyterHub.hub_bind_url
The ip address for the Hub process to bind to.
By default, the hub listens on localhost only. This address must be accessible from the proxy and user servers. You may need to set this to a public ip or ‘’ for all interfaces if the proxy or user servers are in containers or on a different host.
See hub_connect_ip for cases where the bind and connect address should differ, or hub_bind_url for setting the full bind URL.
hub_port
The internal port for the Hub process.
This is the internal port of the hub itself. It should never be accessed directly. See JupyterHub.port for the public port to use when accessing jupyterhub. It is rare that this port should be set except in cases of port conflict.
See also hub_ip for the ip and hub_bind_url for setting the full bind URL.
implicit_spawn_seconds
Trigger implicit spawns after this many seconds.
When a user visits a URL for a server that’s not running, they are shown a page indicating that the requested server is not running with a button to spawn the server.
Setting this to a positive value will redirect the user after this many seconds, effectively clicking the spawn button for them and beginning the spawn process automatically.
Warning: this can result in errors and surprising behavior when sharing access URLs to actual servers, since the wrong server is likely to be started.
init_spawners_timeout
Timeout (in seconds) to wait for spawners to initialize
Checking if spawners are healthy can take a long time if many spawners are active at hub start time.
If it takes longer than this timeout to check, init_spawner will be left to complete in the background and the http server is allowed to start.
A timeout of -1 means wait forever, which can mean a slow startup of the Hub but ensures that the Hub is fully consistent by the time it starts responding to requests. This matches the behavior of jupyterhub 1.0.
The location to store certificates automatically created by JupyterHub.
Enable SSL for all internal communication
This enables end-to-end encryption between all JupyterHub components. JupyterHub will automatically create the necessary certificate authority and sign notebook certificates as they’re created.
The public facing ip of the whole JupyterHub application (specifically referred to as the proxy).
This is the address on which the proxy will listen. The default is to listen on all interfaces. This is the only address through which JupyterHub should be accessed by users.
jinja_environment_options
Supply extra arguments that will be passed to Jinja environment.
last_activity_interval
Interval (in seconds) at which to update last-activity timestamps.
load_groups
Dict of ‘group’: [‘usernames’] to load at startup.
This strictly adds groups and users to groups.
Loading one set of groups, then starting JupyterHub again with a different set will not remove users or groups from previous launches. That must be done through the API.
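A config sketch; the group names and usernames below are illustrative:

```python
# jupyterhub_config.py -- groups and members to create at startup
c.JupyterHub.load_groups = {
    "instructors": ["rivka", "omar"],
    "students": ["sam", "alex"],
}
```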
log_datefmt
The date format used by logging formatters for %(asctime)s
log_format
The Logging format template
log_level
Set the log level by value or name.
logo_file
Specify path to a logo image to override the Jupyter logo in the banner.
named_server_limit_per_user
Maximum number of concurrent named servers that can be created by a user at a time.
Setting this can limit the total resources a user can consume.
pid_file
File in which to write the PID. Useful for daemonizing JupyterHub.
port
The public facing port of the proxy.
This is the port on which the proxy will listen. This is the only port through which JupyterHub should be accessed by users.
DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
proxy_auth_token
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token
proxy_check_interval
Interval (in seconds) at which to check if the proxy is running.
proxy_class
The class to use for configuring the JupyterHub proxy.
Should be a subclass of jupyterhub.proxy.Proxy.

Changed in version 1.0: proxies may be registered via entry points, e.g. c.JupyterHub.proxy_class = 'traefik'
proxy_cmd
DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command
recreate_internal_certs
Recreate all certificates used within JupyterHub on restart.
Note: enabling this feature requires restarting all notebook servers.
Redirect user to server (if running), instead of control panel.
reset_db
Purge and reset the database.
service_check_interval
Interval (in seconds) at which to check connectivity of services with web endpoints.
service_tokens
Dict of token:servicename to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally managed services.
services
List of service specification dictionaries.
A service
services = [
    {
        'name': 'cull_idle',
        'command': ['/path/to/cull_idle_servers.py'],
    },
    {
        'name': 'formgrader',
        'url': 'http://127.0.0.1:1234',
        'api_token': 'super-secret',
        'environment':
    }
]
show_config
Instead of starting the Application, dump configuration to stdout
show_config_json
Instead of starting the Application, dump configuration to stdout (as JSON)
shutdown_on_logout
Shuts down all user servers on logout
The class to use for spawning single-user servers.
Should be a subclass of jupyterhub.spawner.Spawner.

Changed in version 1.0: spawners may be registered via entry points, e.g. c.JupyterHub.spawner_class = 'localprocess'
Path to SSL certificate file for the public facing interface of the proxy
When setting this, you should also set ssl_key
Path to SSL key file for the public facing interface of the proxy
When setting this, you should also set ssl_cert
statsd_host
Host to send statsd metrics to. An empty string (the default) disables sending metrics.
statsd_port
Port on which to send statsd metrics about the hub
statsd_prefix
Prefix to use for all metrics sent by jupyterhub to statsd
subdomain_host
Run single-user servers on subdomains of this host.
This should be the full https://hub.domain.tld[:port].
Provides additional cross-site protections for javascript served by single-user servers.
Requires <username>.hub.domain.tld to resolve to the same host as hub.domain.tld.
In general, this is most easily achieved with wildcard DNS.
When using SSL (i.e. always) this also requires a wildcard SSL certificate.
Paths to search for jinja templates, before using the default templates.
template_vars
Extra variables to be passed into jinja templates
tornado_settings
Extra settings overrides to pass to the tornado application.
trust_user_provided_tokens
Trust user-provided tokens (via JupyterHub.service_tokens) to have good entropy.
If you are not inserting additional tokens via configuration file, this flag has no effect.
In JupyterHub 0.8, internally generated tokens do not pass through additional hashing because the hashing is costly and does not increase the entropy of already-good UUIDs.
User-provided tokens, on the other hand, are not trusted to have good entropy by default, and are passed through many rounds of hashing to stretch the entropy of the key (i.e. user-provided tokens are treated as passwords instead of random keys). These keys are more costly to check.
If your inserted tokens are generated by a good-quality mechanism, e.g. openssl rand -hex 32, then you can set this flag to True to reduce the cost of checking authentication tokens.
trusted_alt_names
Names to include in the subject alternative name.
These names will be used for server name verification. This is useful if JupyterHub is being run behind a reverse proxy or services using ssl are on different hosts.
trusted_downstream_ips
Downstream proxy IP addresses to trust.
This sets the list of IP addresses that are trusted and skipped when processing the X-Forwarded-For header. For example, if an external proxy is used for TLS termination, its IP address should be added to this list to ensure the correct client IP addresses are recorded in the logs instead of the proxy server’s IP address.
upgrade_db
Upgrade the database automatically on start.
Only safe if database is regularly backed up. Only SQLite databases will be backed up to a local file automatically.
user_redirect_hook
Callable to affect behavior of /user-redirect/
Receives 4 parameters:

1. path - URL path that was provided after /user-redirect/
2. request - A Tornado HTTPServerRequest representing the current request.
3. user - The currently authenticated user.
4. base_url - The base_url of the current hub, for relative redirects
It should return the new URL to redirect to, or None to preserve current behavior.
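A sketch of such a hook; the "retired-tutorial" path and the redirect target are an illustrative policy, not a default:

```python
# Illustrative user_redirect_hook: send one retired path to the Hub
# home page, and fall back to the default behavior for everything else.
def user_redirect_hook(path, request, user, base_url):
    if path == "retired-tutorial":
        return base_url + "hub/home"
    return None  # None preserves the default /user-redirect/ behavior

# In jupyterhub_config.py:
# c.JupyterHub.user_redirect_hook = user_redirect_hook
```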
jupyterhub.auth
Base Authenticator class and the default PAM Authenticator
jupyterhub.auth.
Base class for implementing an authentication provider for JupyterHub
Set of users that will have admin rights on this JupyterHub.
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
Set of usernames that are allowed to log in.
Use this with supported authenticators to restrict which users can log in. This is an additional list that further restricts users, beyond whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
Changed in version 1.2: Authenticator.whitelist renamed to allowed_users
auth_refresh_age
The max age (in seconds) of authentication info before forcing a refresh of user auth info.
Refreshing auth info allows, e.g. requesting/re-validating auth tokens.
See refresh_user() for what happens when user auth info is refreshed (nothing by default).
auto_login
Automatically begin the login process
rather than starting with a “Login with…” link at /hub/login
To work, .login_url() must give a URL other than the default /hub/login, such as an oauth handler or another automatic login handler, registered with .get_handlers().
blocked_users
Set of usernames that are not allowed to log in.
Use this with supported authenticators to block specific users from logging in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place.
Changed in version 1.2: Authenticator.blacklist renamed to blocked_users
Delete any users from the database that do not pass validation
When JupyterHub starts, .add_user will be called on each user in the database to verify that all users are still valid.
If delete_invalid_users is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts.
If False (default), invalid users remain in the Hub’s database and a warning will be issued. This is the default to avoid data loss due to config changes.
enable_auth_state
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
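One way to generate a suitable key with the standard library (the variable name is just for illustration):

```python
import secrets

# One 32-byte encryption key, hex-encoded, suitable for JUPYTERHUB_CRYPT_KEY.
crypt_key = secrets.token_hex(32)
print(crypt_key)
# Then, before starting the Hub:
#   export JUPYTERHUB_CRYPT_KEY=<printed value>
```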
post_auth_hook
An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the authentication dict regardless of changes to it.

This may be a coroutine.
Example:
import os, pwd

def my_hook(authenticator, handler, authentication):
    user_data = pwd.getpwnam(authentication['name'])
    spawn_data = {
        'pw_data': user_data,
        'gid_list': os.getgrouplist(authentication['name'], user_data.pw_gid)
    }

    if authentication['auth_state'] is None:
        authentication['auth_state'] = {}
    authentication['auth_state']['spawn_data'] = spawn_data

    return authentication

c.Authenticator.post_auth_hook = my_hook
refresh_pre_spawn
Force refresh of auth prior to spawn.
This forces refresh_user() to be called prior to launching a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
username_map
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
username_pattern
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will not be attempted.
If not set, allow any username.
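For example, a hypothetical site policy of 3-16 characters, starting with a lowercase letter, could be expressed as:

```python
import re

# Hypothetical policy: lowercase letter first, then lowercase letters,
# digits, '-' or '_', for a total of 3-16 characters.
username_pattern = r"^[a-z][a-z0-9_-]{2,15}$"
pat = re.compile(username_pattern)

# In jupyterhub_config.py this would be set with:
# c.Authenticator.username_pattern = username_pattern
```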
whitelist
Deprecated, use Authenticator.allowed_users
add_user
Hook called when a user is added to JupyterHub
This method may be a coroutine.
By default, this just adds the user to the allowed_users set.
Subclasses may do more extensive things, such as adding actual unix users, but they should call super to ensure the allowed_users set is updated.
Note that this should be idempotent, since it is called whenever the hub restarts for all users.
Authenticate a user with login form data
This must be a coroutine.
It must return the username on successful authentication, and return None on failed authentication.
Checking allowed_users/blocked_users is handled separately by the caller.
Changed in version 0.8: Allow authenticate to return a dict containing auth_state.
The username of the authenticated user, or None if Authentication failed.
The Authenticator may return a dict instead, which MUST have a key name holding the username, and MAY have two optional keys set: auth_state, a dictionary of auth state that will be persisted; and admin, the admin setting value for the user.
name
user (str or dict or None)
check_allowed
Check if a username is allowed to authenticate based on configuration
Return True if username is allowed, False otherwise. No allowed_users set means any username is allowed.
Names are normalized before being checked against the allowed set.
Changed in version 1.0: Signature updated to accept authentication data and any future changes
Changed in version 1.2: Renamed check_whitelist to check_allowed
check_blocked_users
Check if a username is blocked from authenticating, based on the Authenticator.blocked_users configuration
Return True if username is allowed, False otherwise. No block list means any username is allowed.
Names are normalized before being checked against the block list.
Changed in version 1.0: Signature updated to accept authentication data as second argument
Changed in version 1.2: Renamed check_blacklist to check_blocked_users
delete_user
Hook called when a user is deleted
Removes the user from the allowed_users set. Subclasses should call super to ensure the allowed_users set is updated.
get_authenticated_user
Authenticate the user who is attempting to log in
Returns user dict if successful, None otherwise.
This calls authenticate, which should be overridden in subclasses, normalizes the username if any normalization should be done, and then validates the name in the allowed set.
This is the outer API for authenticating a user. Subclasses should not override this method.
Changed in version 0.8: return dict instead of username
get_handlers
Return any custom handlers the authenticator needs to register
Used in conjunction with login_url and logout_url.
('/url', Handler)
is_admin
Authentication helper to determine a user’s admin status.
The admin status of the user, or None if it could not be determined or should not change.
admin_status (Bool or None)
Override this when registering a custom login handler
Generally used by authenticators that do not use simple form-based authentication.
The subclass overriding this is responsible for making sure there is a handler available to handle the URL returned from this method, using the get_handlers method.
Override when registering a custom logout handler
Normalize the given username and return it
Override in subclasses if usernames need different normalization rules.
The default attempts to lowercase the username and apply username_map if it is set.
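A sketch mimicking that default behavior (the mapping below is illustrative; in a real Authenticator this logic lives in an overridden normalize_username method):

```python
# Lowercase first, then apply username_map, as the default does.
username_map = {"jsmith": "john.smith"}

def normalize_username(username):
    username = username.lower()
    return username_map.get(username, username)
```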
post_spawn_stop
Hook called after stopping a user container
Can be used to do auth-related cleanup, e.g. closing PAM sessions.
Hook called before spawning a user’s server
Can be used to do auth-related startup, e.g. opening PAM sessions.
refresh_user
Refresh auth data for a given user
Allows refreshing or invalidating auth data.
Only override if your authenticator needs to refresh its data about users once in a while.
Return True if auth data for the user is up-to-date and no updates are required.
Return False if the user’s auth data has expired, and they should be required to login again.
Return a dict of auth data if some values should be updated. This dict should have the same structure as that returned by authenticate() when it returns a dict. Any fields present will refresh the value for the user. Any fields not present will be left unchanged. This can include updating .admin or .auth_state fields.
auth_data (bool or dict)
run_post_auth_hook
Run the post_auth_hook if defined
The hook must always return the authentication dict
Authentication (dict)
validate_username
Validate a normalized username
Return True if username is valid, False otherwise.
Base class for Authenticators that work with local Linux/UNIX users
Checks for local users, and can attempt to create them if they do not exist.
add_user_cmd
The command to use for creating users as a list of strings
For each element in the list, the string USERNAME will be replaced with the user’s username. The username will also be appended as the final argument.
For Linux, the default value is:
['adduser', '-q', '--gecos', '""', '--disabled-password']
To specify a custom home directory, set this to:
['adduser', '-q', '--gecos', '""', '--home', '/customhome/USERNAME', '--disabled-password']
This will run the command:
adduser -q --gecos "" --home /customhome/river --disabled-password river
when the user ‘river’ is created.
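Written out as a config value in jupyterhub_config.py, the custom-home variant above looks like:

```python
# jupyterhub_config.py -- USERNAME is replaced with each user's name
c.LocalAuthenticator.add_user_cmd = [
    "adduser", "-q",
    "--gecos", '""',
    "--home", "/customhome/USERNAME",
    "--disabled-password",
]
```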
allowed_groups
Allow login from all users in these UNIX groups.
If set, allowed username set is ignored.
If set to True, will attempt to create local system users if they do not exist already.
Supports Linux and BSD variants only.
group_whitelist
DEPRECATED: use allowed_groups
Dictionary of uids to use at user creation time. This helps ensure that users created from the database get the same uid each time they are created in temporary deployments or containers.
add_system_user
Create a new local UNIX user on the system.
Tested to work on FreeBSD and Linux, at least.
Hook called whenever a new user is added
If self.create_system_users is True, the user will be created on the system if it doesn't exist.
check_allowed_groups
If allowed_groups is configured, check if authenticating user is part of group.
system_user_exists
Check if the user exists on the system
Authenticate local UNIX users with PAM
Authoritative list of user groups that determine admin access. Users not in these groups can still be granted admin status through admin_users.
allowed/blocked rules still apply.
check_account
Whether to check the user’s account status via PAM during authentication.
The PAM account stack performs non-authentication based account management. It is typically used to restrict/permit access to a service and this step is needed to access the host’s user access control.
Disabling this can be dangerous as authenticated but unauthorized users may be granted access and, therefore, arbitrary execution on the system.
encoding
The text encoding to use when communicating with PAM
open_sessions
Whether to open a new PAM session when spawners are started.
This may trigger things like mounting shared filesystems, loading credentials, etc. depending on system configuration, but it does not always work.
If any errors are encountered when opening/closing PAM sessions, this is automatically set to False.
Round-trip the username via PAM lookups to make sure it is unique
PAM can accept multiple usernames that map to the same user, for example DOMAIN\username in some cases. To prevent this, convert the username into a uid, then back to a username, to normalize.
service
The name of the PAM service to use for authentication
DummyAuthenticator
Dummy Authenticator for testing
By default, any username + password is allowed. If a non-empty password is set, any username will be allowed if it logs in with that password.
New in version 1.0.
Set a global password for all users wanting to log in.
This allows users with any username to log in with the same static password.
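A minimal testing setup in jupyterhub_config.py (the password value is illustrative; never use DummyAuthenticator in production). This assumes the 'dummy' entry-point name available in JupyterHub 1.0+:

```python
# jupyterhub_config.py -- local testing only
c.JupyterHub.authenticator_class = "dummy"
c.DummyAuthenticator.password = "a-shared-demo-password"
```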
jupyterhub.spawner
Contains base Spawner class & default implementation
jupyterhub.spawner.
Base class for spawning single-user notebook servers.
Subclass this, and override the following methods:
As JupyterHub supports multiple users, an instance of the Spawner subclass is created for each user. If there are 20 JupyterHub users, there will be 20 instances of the subclass.
args
Extra arguments to be passed to the single-user server.
Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify!
auth_state_hook
An optional hook function that you can implement to pass auth_state to the spawner after it has been initialized but before it starts. The auth_state dictionary may be set by the .authenticate() method of the authenticator. This hook enables you to pass some or all of that information to your spawner.
def userdata_hook(spawner, auth_state):
    spawner.userdata = auth_state["userdata"]

c.Spawner.auth_state_hook = userdata_hook
cmd
The command used for starting the single-user server.
Provide either a string or a list containing the path to the startup script command. Extra arguments, other than this path, should be provided via args.
This is usually set if you want to start the single-user server in a different python environment (with virtualenv/conda) than JupyterHub itself.
Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify!
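For example, to launch single-user servers from a separate environment (the path below is a made-up example for a hypothetical conda env):

```python
# jupyterhub_config.py -- point cmd at the jupyterhub-singleuser
# executable inside the target environment; extra flags go in args.
c.Spawner.cmd = ["/opt/envs/userenv/bin/jupyterhub-singleuser"]
c.Spawner.args = ["--debug"]
```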
consecutive_failure_limit
Maximum number of consecutive failures to allow before shutting down JupyterHub.
This helps JupyterHub recover from a certain class of problem preventing launch in contexts where the Hub is automatically restarted (e.g. systemd, docker, kubernetes).
A limit of 0 means no limit and consecutive failures will not be tracked.
cpu_guarantee
Minimum number of cpu-cores a single-user notebook server is guaranteed to have available.
If this value is set to 0.5, it allows use of 50% of one CPU. If set to 2, it allows use of up to 2 CPUs.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
cpu_limit
Maximum number of cpu-cores a single-user notebook server is allowed to use.
The single-user notebook server will never be scheduled by the kernel to use more cpu-cores than this. There is no guarantee that it can access this many cpu-cores.
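A configuration sketch combining the guarantee and the limit; remember that the default LocalProcessSpawner does not enforce these, so this only takes effect with a spawner that implements them:

```python
# jupyterhub_config.py -- CPU resource sketch; only honored by
# spawners that implement these settings (not LocalProcessSpawner)
c.Spawner.cpu_guarantee = 0.5  # guarantee half of one CPU core
c.Spawner.cpu_limit = 2.0      # never schedule more than 2 cores
```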
debug
Enable debug-logging of the single-user server
The URL the single-user server should start in.
{username} will be expanded to the user’s username
Example uses:
notebook_dir
/tree/home/{username}
/notebooks
/tree
disable_user_config
Disable per-user configuration of single-user servers.
When starting the user’s single-user server, any config file found in the user’s $HOME directory will be ignored.
Note: a user could circumvent this if the user modifies their Python environment, such as when they have their own conda environments / virtualenvs / containers.
env_keep
List of environment variables for the single-user server to inherit from the JupyterHub process.
This list is used to ensure that sensitive information in the JupyterHub process’s environment (such as CONFIGPROXY_AUTH_TOKEN) is not passed to the single-user server’s process.
environment
Extra environment variables to set for the single-user server’s process.
The environment configurable should be set by JupyterHub administrators to add installation specific environment variables. It is a dict where the key is the name of the environment variable, and the value can be a string or a callable. If it is a callable, it will be called with one parameter (the spawner instance), and should return a string fairly quickly (no blocking operations please!).
Note that the spawner class’ interface is not guaranteed to be exactly same across upgrades, so if you are using the callable take care to verify it continues to work after upgrades!
Changed in version 1.2: environment from this configuration has highest priority, allowing override of ‘default’ env variables, such as JUPYTERHUB_API_URL.
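A sketch of both value forms described above; the variable names are hypothetical examples:

```python
# jupyterhub_config.py -- extra environment variables for the
# single-user server; names here are hypothetical examples
c.Spawner.environment = {
    # plain string value
    'DATA_DIR': '/srv/data',
    # callable value: called with the spawner instance, must
    # return a string quickly (no blocking operations)
    'GREETING': lambda spawner: 'Hello ' + spawner.user.name,
}
```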
http_timeout
Timeout (in seconds) before giving up on a spawned HTTP server
Once a server has successfully been spawned, this is the amount of time we wait before assuming that the server is unable to accept connections.
The IP address (or hostname) the single-user server should listen on.
The JupyterHub proxy implementation should be able to send packets to this interface.
mem_guarantee
Minimum number of bytes a single-user notebook server is guaranteed to have available.
mem_limit
Maximum number of bytes a single-user notebook server is allowed to use.
If the single user server tries to allocate more memory than this, it will fail. There is no guarantee that the single-user notebook server will be able to allocate this much memory - only that it can not allocate more than this.
Path to the notebook directory for the single-user server.
The user sees a file listing of this directory when the notebook interface is started. The current interface does not easily allow browsing beyond the subdirectories in this directory’s tree.
~ will be expanded to the home directory of the user, and {username} will be replaced with the name of the user.
Note that this does not prevent users from accessing files outside of this path! They can do so with many other means.
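A minimal sketch; ~ and {username} are expanded per user as described above, and the default_url line assumes JupyterLab is installed:

```python
# jupyterhub_config.py -- per-user notebook directory sketch
c.Spawner.notebook_dir = '~/notebooks'

# Land users in JupyterLab by default (assumes JupyterLab is installed)
c.Spawner.default_url = '/lab'
```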
An HTML form for options a user can specify on launching their server.
The surrounding <form> element and the submit button are already provided.
<form>
Set your key: <input name="key" val="default_key"></input>
<br>
Choose a letter:
<select name="letter" multiple="true">
  <option value="A">The letter A</option>
  <option value="B">The letter B</option>
</select>
The data from this form submission will be passed on to your spawner in self.user_options
Instead of a form snippet string, this could also be a callable that takes as one parameter the current spawner instance and returns a string. The callable will be called asynchronously if it returns a future, rather than a str. Note that the interface of the spawner class is not deemed stable across versions, so using this functionality might cause your JupyterHub upgrades to break.
Interpret HTTP form data
Form data will always arrive as a dict of lists of strings. Override this function to understand single-values, numbers, etc.
This should coerce form data into the structure expected by self.user_options, which must be a dict, and should be JSON-serializable, though it can contain bytes in addition to standard JSON data types.
This method should not have any side effects. Any handling of user_options should be done in .start() to ensure consistent behavior across servers spawned via the API and form submission page.
user_options
.start()
Instances will receive this data on self.user_options, after passing through this function, prior to Spawner.start.
Changed in version 1.0: user_options are persisted in the JupyterHub database to be reused on subsequent spawns if no options are given. user_options is serialized to JSON as part of this persistence (with additional support for bytes in case of uploaded file data), and any non-bytes non-jsonable values will be replaced with None if the user_options are re-used.
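The coercion such an override performs can be illustrated with a standalone function mirroring the method's logic; the field names ("letter", "count") are hypothetical examples matching a custom options_form:

```python
def options_from_form(formdata):
    """Coerce tornado form data (a dict of lists of strings) into the
    dict expected on self.user_options.

    Field names here ("letter", "count") are hypothetical examples.
    """
    options = {}
    # form values always arrive as lists of strings, e.g. {'count': ['2']}
    options['letter'] = formdata.get('letter', ['A'])[0]
    options['count'] = int(formdata.get('count', ['1'])[0])
    return options
```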
poll_interval
Interval (in seconds) on which to poll the spawner for single-user server’s status.
At every poll interval, each spawner’s .poll method is called, which checks if the single-user server is still running. If it isn’t running, then JupyterHub modifies its own state accordingly and removes appropriate routes from the configurable proxy.
.poll
The port for single-user servers to listen on.
Defaults to 0, which uses a randomly allocated port number each time.
0
If set to a non-zero value, all Spawners will use the same port, which only makes sense if each server is on a different address, e.g. in containers.
New in version 0.7.
post_stop_hook
An optional hook function that you can implement to do work after the spawner stops.
This can be set independent of any concrete spawner implementation.
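A sketch of such a hook, analogous in shape to a pre_spawn_hook; the scratch-directory path is a hypothetical example:

```python
# jupyterhub_config.py -- clean up per-user scratch space after the
# spawner stops; the path is a hypothetical example
import shutil

def my_cleanup_hook(spawner):
    shutil.rmtree('/tmp/scratch/%s' % spawner.user.name,
                  ignore_errors=True)

c.Spawner.post_stop_hook = my_cleanup_hook
```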
pre_spawn_hook
An optional hook function that you can implement to do some bootstrapping work before the spawner starts. For example, create a directory for your user or load initial content.
from subprocess import check_call

def my_hook(spawner):
    username = spawner.user.name
    check_call(['./examples/bootstrap-script/bootstrap.sh', username])

c.Spawner.pre_spawn_hook = my_hook
ssl_alt_names
List of SSL alt names
May be set in config if all spawners should have the same value(s), or set at runtime by Spawners that know their names.
ssl_alt_names_include_local
Whether to include DNS:localhost, IP:127.0.0.1 in alt names
start_timeout
Timeout (in seconds) before giving up on starting of single-user server.
This is the timeout for start to return, not the timeout for the server to respond. Callers of spawner.start will assume that startup has failed if it takes longer than this. start should return when the server process is started and its location is known.
create_certs
Create and set ownership for the certs to be used for internal ssl
Path to cert files and CA
This method creates certs for use with the singleuser notebook. It enables SSL and ensures that the notebook can perform bi-directional SSL auth with the hub (verification based on CA).
If the singleuser host has a name or IP other than localhost, appropriate alternative names must be passed for SSL verification by the Hub to work. For example, for Jupyter hosts with an IP of 10.10.10.10 or DNS name of jupyter.example.com, this would be:
alt_names=["IP:10.10.10.10"]
alt_names=["DNS:jupyter.example.com"]
respectively. The list can contain both the IP and DNS names to refer to the host by either IP or DNS name (note the default_names below).
default_names
format_string
Render a Python format string
Uses Spawner.template_namespace() to populate format namespace.
Spawner.template_namespace()
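A standalone illustration of this contract: templates are rendered with str.format against a namespace that includes at least 'username' and 'base_url':

```python
def format_string(template, username, base_url):
    """Render a format string against a Spawner-style template
    namespace (a simplified standalone illustration)."""
    ns = {'username': username, 'base_url': base_url}
    return template.format(**ns)
```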
get_args
Return the arguments to be passed after self.cmd
Doesn’t expect shell expansion to happen.
get_env
Return the environment dict to use for the Spawner.
This applies things like env_keep, anything defined in Spawner.environment, and adds the API token to the env.
When overriding in subclasses, subclasses must call super().get_env(), extend the returned dict and return it.
super().get_env()
Use this to access the env in Spawner.start to allow extension in subclasses.
get_state
Save state of spawner into database.
A black box of extra state for custom spawners. The returned value of this is passed to load_state.
load_state
Subclasses should call super().get_state(), augment the state returned from there, and return that state.
super().get_state()
move_certs
Takes certificate paths and makes them available to the notebook server
.move_certs is called after certs for the singleuser notebook have been created by create_certs.
By default, certs are created in a standard, central location defined by internal_certs_location. For a local, single-host deployment of JupyterHub, this should suffice. If, however, singleuser notebooks are spawned on other hosts, .move_certs should be overridden to move these files appropriately. This could mean using scp to copy them to another host, moving them to a volume mounted in a docker container, or exporting them as a secret in kubernetes.
scp
poll
Check if the single-user process is running
State transitions, behavior, and return response:
Design assumptions about when poll may be called:
Start the single-user server
Changed in version 0.7: Return ip, port instead of setting on self.user.server directly.
Stop the single-user server
If now is False (default), shutdown the server as gracefully as possible, e.g. starting with SIGINT, then SIGTERM, then SIGKILL. If now is True, terminate the server immediately.
now
The coroutine should return when the single-user server process is no longer running.
Must be a coroutine.
template_namespace
Return the template namespace for format-string formatting.
Currently used on default_url and notebook_dir.
Subclasses may add items to the available namespace.
The default implementation includes:
{
    'username': user.name,
    'base_url': users_base_url,
}
A Spawner that uses subprocess.Popen to start single-user servers as local processes.
subprocess.Popen
Requires local UNIX users matching the authenticated users to exist. Does not work on Windows.
This is the default spawner for JupyterHub.
Note: This spawner does not implement CPU / memory guarantees and limits.
interrupt_timeout
Seconds to wait for single-user server process to halt after SIGINT.
If the process has not exited cleanly after this many seconds, a SIGTERM is sent.
kill_timeout
Seconds to wait for process to halt after SIGKILL before giving up.
If the process does not exit cleanly after this many seconds of SIGKILL, it becomes a zombie process. The hub process will log a warning and then give up.
popen_kwargs
Extra keyword arguments to pass to Popen
when spawning single-user servers.
popen_kwargs = dict(shell=True)
shell_cmd
Specify a shell command to launch.
The single-user command will be appended to this list, so it should end with -c (for bash) or equivalent.
-c
c.LocalProcessSpawner.shell_cmd = ['bash', '-l', '-c']
to launch with a bash login shell, which would set up the user’s own complete environment.
Warning
Using shell_cmd gives users control over PATH, etc., which could change what the jupyterhub-singleuser launch command does. Only use this for trusted users.
term_timeout
Seconds to wait for single-user server process to halt after SIGTERM.
If the process does not exit cleanly after this many seconds of SIGTERM, a SIGKILL is sent.
jupyterhub.proxy
API for JupyterHub’s proxy.
Custom proxy implementations can subclass Proxy and register in JupyterHub config:
from mymodule import MyProxy

c.JupyterHub.proxy_class = MyProxy
Route Specification:
jupyterhub.proxy.
Base class for configurable proxies that JupyterHub can use.
A proxy implementation should subclass this and must define the following methods:
get_all_routes()
add_route()
In addition to these, the following method(s) may need to be implemented:
start()
should_start
stop()
And the following method(s) are optional, but can be provided:
get_route()
Should the Hub start the proxy
If True, the Hub will start the proxy when it starts, and stop it when it exits. Set to False if the proxy is managed externally, such as by systemd, docker, or another service manager.
add_all_services
Update the proxy table from the database.
Used when loading up a new proxy.
add_all_users
add_hub_route
Add the default route for the Hub
Add a route to the proxy.
Subclasses must define this method
Will raise an appropriate Exception if the route could not be added.
The proxy implementation should also have a way to associate the fact that a route came from JupyterHub.
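The route-table contract a Proxy subclass must satisfy can be illustrated with a minimal in-memory stand-in; a real implementation would forward each call to an actual proxy server:

```python
class DictRouteTable:
    """In-memory illustration of the Proxy route-table contract.

    Not a real Proxy subclass: a real implementation would forward
    these calls to a proxy server rather than a local dict.
    """

    def __init__(self):
        self._routes = {}

    def add_route(self, routespec, target, data):
        # store the route in the documented shape
        self._routes[routespec] = {
            'routespec': routespec,
            'target': target,
            'data': data,
        }

    def delete_route(self, routespec):
        # deleting a nonexistent route is a no-op
        self._routes.pop(routespec, None)

    def get_all_routes(self):
        # keys are routespecs, values are the dicts stored above
        return dict(self._routes)
```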
add_service
Add a service’s server to the proxy table.
Add a user’s server to the proxy table.
check_routes
Check that all users are properly routed on the proxy.
Delete a route with a given routespec if it exists.
delete_service
Remove a service’s server from the proxy table.
Remove a user’s server from the proxy table.
get_all_routes
Fetch and return all the routes associated by JupyterHub from the proxy.
Should return a dictionary of routes, where the keys are routespecs and each value is a dict of the form:
{
    'routespec': the route specification ([host]/path/),
    'target': the target host URL (proto://host) for this route,
    'data': the attached data dict for this route (as specified in add_route),
}
get_route
Return the route info for a given routespec.
host.tld/path/
'routespec': The normalized route specification passed in to add_route ([host]/path/)
'target': The target host for this route (proto://host)
'data': The arbitrary data dict that was passed in by JupyterHub when adding this route.
None: if there are no routes matching the given routespec
Start the proxy.
Will be called during startup if should_start is True.
Subclasses must define this method if the proxy is to be started by the Hub
Stop the proxy.
Will be called during teardown if should_start is True.
validate_routespec
Validate a routespec
Proxy implementation for the default configurable-http-proxy.
This is the default proxy implementation for running the nodejs proxy configurable-http-proxy.
If the proxy should not be run as a subprocess of the Hub, (e.g. in a separate container), set:
The ip (or hostname) of the proxy’s API endpoint
auth_token
The Proxy auth token
Loaded from the CONFIGPROXY_AUTH_TOKEN env variable by default.
The command to start the proxy
concurrency
The number of requests allowed to be concurrently outstanding to the proxy
Limiting this number avoids potential timeout errors from sending too many requests to update the proxy at once.
Add debug-level logging to the Proxy.
File in which to write the PID of the proxy process.
jupyterhub.user
UserDict
jupyterhub.user.
Like defaultdict, but for users
Users can be retrieved by:
A User wrapper object is always returned.
This dict contains at least all active users, but not necessarily all users in the database.
Checking key in userdict returns whether an item is already in the cache, not whether it is in the database.
key in userdict
Changed in version 1.2: 'username' in userdict pattern is now supported
'username' in userdict
add
Add a user to the UserDict
count_active_users
Count the number of user servers that are active/pending/ready
Returns dict with counts of active/pending/ready servers
delete
Delete a user from the cache and the database
get
Retrieve a User object if it can be found, else default
Lookup can be by User object, id, or name
Changed in version 1.2: get() accesses the database instead of just the cache by integer id, so is equivalent to catching KeyErrors on attempted lookup.
get()
User
High-level wrapper around an orm.User object
The user’s name
The user’s Server data object if running, None otherwise. Has ip, port attributes.
spawner
The user’s Spawner instance.
escaped_name
My name, escaped for use in URLs, cookies, etc.
jupyterhub.services.service
A service is a process that talks to JupyterHub.
An externally managed service running on a URL:
{
    'name': 'my-service',
    'url': 'https://host:8888',
    'admin': True,
    'api_token': 'super-secret',
}
A hub-managed service with no URL:
{
    'name': 'cull-idle',
    'command': ['python', '/path/to/cull-idle'],
    'admin': True,
}
Service
jupyterhub.services.service.
An object wrapping a service specification for Hub API consumers.
A service has inputs:
If a service is to be managed by the Hub, it has a few extra options:
kind
The name of the kind of service as a string
managed
Am I managed by the Hub?
jupyterhub.services.auth
Authenticating services with JupyterHub.
Cookies are sent to the Hub for verification. The Hub replies with a JSON model describing the authenticated user.
HubAuth can be used in any application, even outside tornado.
HubAuthenticated is a mixin class for tornado handlers that should authenticate with the Hub.
jupyterhub.services.auth.
A class for authenticating with JupyterHub
This can be used by any application.
If using tornado, use via HubAuthenticated mixin. If using manually, use the .user_for_cookie(cookie_value) method to identify the user corresponding to a given cookie value.
.user_for_cookie(cookie_value)
The following config must be set:
The following config MAY be set:
API key for accessing Hub API.
Generate with jupyterhub token [username] or add to JupyterHub.services config.
jupyterhub token [username]
The base API URL of the Hub.
Typically http://hub-ip:hub-port/hub/api
http://hub-ip:hub-port/hub/api
The base URL prefix of this application
e.g. /services/service-name/ or /user/name/
Default: get from JUPYTERHUB_SERVICE_PREFIX
cache_max_age
The maximum time (in seconds) to cache the Hub’s responses for authentication.
A larger value reduces load on the Hub and occasional response lag. A smaller value reduces propagation time of changes on the Hub (rare).
Default: 300 (five minutes)
certfile
The ssl cert to use for requests
Use with keyfile
client_ca
The ssl certificate authority to use to verify requests
Use with keyfile and certfile
cookie_name
The name of the cookie I should be looking for
cookie_options
Additional options to pass when setting cookies.
Can include things like expires_days=None for session-expiry or secure=True if served on HTTPS and default HTTPS discovery fails (e.g. behind some proxies).
expires_days=None
secure=True
hub_host
The public host of JupyterHub
Only used if JupyterHub is spreading servers across subdomains.
hub_prefix
The URL prefix for the Hub itself.
Typically /hub/
keyfile
The ssl key to use for requests
Use with certfile
The login URL to use
Typically /hub/login
get_session_id
Get the jupyterhub session id
from the jupyterhub-session-id cookie.
get_token
Get the user token from a request
get_user
Get the Hub user for a given tornado handler.
Checks cookie with the Hub to identify the current user.
The ‘name’ field contains the user’s name.
user_for_cookie
Ask the Hub to identify the user for a given cookie.
The user model, if a user is identified, None if authentication fails.
user_model (dict)
user_for_token
Ask the Hub to identify the user for a given token.
HubOAuth
HubAuth using OAuth for login instead of cookies set by the Hub.
oauth_authorization_url
The URL to redirect to when starting the OAuth process
oauth_client_id
The OAuth client ID for this application.
Use JUPYTERHUB_CLIENT_ID by default.
oauth_redirect_uri
OAuth redirect URI
Should generally be /base_url/oauth_callback
oauth_token_url
The URL for requesting an OAuth token from JupyterHub
clear_cookie
Clear the OAuth cookie
Use OAuth client_id for cookie name
because we don’t want to use the same cookie name across OAuth clients.
generate_state
Generate a state string, given a next_url redirect target
get_next_url
Get the next_url for redirection, given an encoded OAuth state
get_state_cookie_name
Get the cookie name for oauth state, given an encoded OAuth state
Cookie name is stored in the state itself because the cookie name is randomized to deal with races between concurrent oauth sequences.
set_cookie
Set a cookie recording OAuth result
set_state_cookie
Generate an OAuth state and store it in a cookie
state (str)
The OAuth state that has been stored in the cookie (url safe, base64-encoded)
state_cookie_name
The cookie name for storing OAuth state
This cookie is only live for the duration of the OAuth handshake.
token_for_code
Get token for OAuth temporary code
This is the last step of OAuth login. Should be called in OAuth Callback handler.
Mixin for tornado handlers that are authenticated with JupyterHub
A handler that mixes this in must have the following attributes/properties:
Examples:
allow_all
Property indicating that all successfully identified users or services should be allowed.
check_hub_user
Check whether Hub-authenticated user or service should be allowed.
Returns the input if the user should be allowed, None otherwise.
Override if you want to check anything other than the username’s presence in hub_users list.
Tornado’s authentication method
get_login_url
Return the Hub’s login URL
hub_auth_class
alias of HubAuth
HubOAuthenticated
Simple subclass of HubAuthenticated using OAuth instead of old shared cookies
HubOAuthCallbackHandler
OAuth Callback handler
Finishes the OAuth flow, setting a cookie to record the user’s info.
Should be registered at SERVICE_PREFIX/oauth_callback
SERVICE_PREFIX/oauth_callback
We want you to contribute to JupyterHub in ways that are most exciting & useful to you. We value documentation, testing, bug reporting & code equally, and are glad to have your contributions in whatever form you wish :)
Our Code of Conduct (reporting guidelines) helps keep our community welcoming to as many people as possible.
We use Discourse (https://discourse.jupyter.org) for online discussion. Everyone in the Jupyter community is welcome to bring ideas and questions there. In addition, we use Gitter for online, real-time text chat, a place for more ephemeral discussions. The primary Gitter channel for JupyterHub is jupyterhub/jupyterhub. Gitter isn’t archived or searchable, so we recommend starting discussions on Discourse to make sure they are most useful and accessible to the community. Remember that our community is distributed across the world in various timezones, so be patient if you do not get an answer immediately!
GitHub issues are used for most long-form project discussions, bug reports and feature requests. Issues related to a specific authenticator or spawner should be directed to the appropriate repository for the authenticator or spawner. If you are using a specific JupyterHub distribution (such as Zero to JupyterHub on Kubernetes or The Littlest JupyterHub), you should open issues directly in their repository. If you can not find a repository to open your issue in, do not worry! Create it in the main JupyterHub repository and our community will help you figure it out.
A mailing list for all of Project Jupyter exists, along with one for teaching with Jupyter.
JupyterHub can only run on macOS or Linux operating systems. If you are using Windows, we recommend using VirtualBox or a similar system to run Ubuntu Linux for development.
JupyterHub is written in the Python programming language, and requires you have at least version 3.5 installed locally. If you haven’t installed Python before, the recommended way to install it is to use miniconda. Remember to get the ‘Python 3’ version, and not the ‘Python 2’ version!
configurable-http-proxy, the default proxy implementation for JupyterHub, is written in Javascript to run on NodeJS. If you have not installed nodejs before, we recommend installing it in the miniconda environment you set up for Python. You can do so with conda install nodejs.
miniconda
conda install nodejs
JupyterHub uses git & GitHub for development & collaboration. You need to install git to work on JupyterHub. We also recommend getting a free account on GitHub.com.
When developing JupyterHub, you need to make changes to the code & see their effects quickly. You need to do a developer install to make that happen.
This guide does not attempt to dictate how development environments should be isolated, since that is a personal preference and can be achieved in many ways, for example with tox, conda, or docker. See this forum thread for a more detailed discussion.
tox
Clone the JupyterHub git repository to your computer.
git clone https://github.com/jupyterhub/jupyterhub
cd jupyterhub
Make sure the python and npm you installed are available to you on the command line.
python
python -V
This should return a version number greater than or equal to 3.5.
npm -v
This should return a version number greater than or equal to 5.0.
Install configurable-http-proxy. This is required to run JupyterHub.
npm install -g configurable-http-proxy
If you get an error that says Error: EACCES: permission denied, you might need to prefix the command with sudo. If you do not have access to sudo, you may instead run the following commands:
Error: EACCES: permission denied
npm install configurable-http-proxy
export PATH=$PATH:$(pwd)/node_modules/.bin
The second line needs to be run every time you open a new terminal.
Install the python packages required for JupyterHub development.
python3 -m pip install -r dev-requirements.txt
python3 -m pip install -r requirements.txt
Setup a database.
The default database engine is sqlite, so if you are just trying to get up and running quickly for local development, it should be available out of the box via Python. See The Hub’s Database for details on other supported databases.
sqlite
Install the development version of JupyterHub. This lets you edit JupyterHub code in a text editor & restart the JupyterHub process to see your code changes immediately.
python3 -m pip install --editable .
You are now ready to start JupyterHub!
You can now access JupyterHub from your browser at http://localhost:8000.
Happy developing!
To simplify testing of JupyterHub, it’s helpful to use DummyAuthenticator instead of the default JupyterHub authenticator and SimpleLocalProcessSpawner instead of the default spawner.
There is a sample configuration file that does this in testing/jupyterhub_config.py. To launch jupyterhub with this configuration:
testing/jupyterhub_config.py
jupyterhub -f testing/jupyterhub_config.py
The default JupyterHub authenticator & spawner require your system to have user accounts for each user you want to log in to JupyterHub as.
DummyAuthenticator allows you to log in with any username & password, while SimpleLocalProcessSpawner allows you to start servers without having to create a unix user for each JupyterHub user. Together, these make it much easier to test JupyterHub.
Tip: If you are working on parts of JupyterHub that are common to all authenticators & spawners, we recommend using both DummyAuthenticator & SimpleLocalProcessSpawner. If you are working on just authenticator related parts, use only SimpleLocalProcessSpawner. Similarly, if you are working on just spawner related parts, use only DummyAuthenticator.
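A sketch of the kind of configuration testing/jupyterhub_config.py provides (the actual file may set additional options):

```python
# jupyterhub_config.py sketch for local testing: any username and
# password log in, and no local unix users are required
c.JupyterHub.authenticator_class = 'jupyterhub.auth.DummyAuthenticator'
c.JupyterHub.spawner_class = 'jupyterhub.spawner.SimpleLocalProcessSpawner'
```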
This section lists common ways setting up your development environment may fail, and how to fix them. Please add to the list if you encounter yet another way it can fail!
lessc
If the python3 -m pip install --editable . command fails and complains about lessc being unavailable, you may need to explicitly install some additional JavaScript dependencies:
npm install
This will fetch client-side JavaScript dependencies necessary to compile CSS.
You may also need to manually update JavaScript and CSS after some development updates, with:
python3 setup.py js   # fetch updated client-side js
python3 setup.py css  # recompile CSS from LESS sources
Documentation is often more important than code. This page helps you get set up on how to contribute documentation to JupyterHub.
We use sphinx to build our documentation. It takes our documentation source files (written in markdown or reStructuredText & stored under the docs/source directory) and converts them into various formats for people to read. To make sure the documentation you write or change renders correctly, it is good practice to test it locally.
docs/source
Make sure you have successfully completed Setting up a development install.
Install the packages required to build the docs.
python3 -m pip install -r docs/requirements.txt
Build the html version of the docs. This is the most commonly used output format, so verifying that it renders as you expect is usually good enough.
cd docs
make html
This step will display any syntax or formatting errors in the documentation, along with the filename / line number in which they occurred. Fix them, and re-run the make html command to re-render the documentation.
make html
View the rendered documentation by opening build/html/index.html in a web browser.
build/html/index.html
Tip
On macOS, you can open a file from the terminal with open <path-to-file>. On Linux, you can do the same with xdg-open <path-to-file>.
open <path-to-file>
xdg-open <path-to-file>
This section lists various conventions we use in our documentation. This is a living document that grows over time, so feel free to add to it / change it!
Our documentation does not yet fully conform to these conventions, so help in making it do so would be appreciated!
There are many ways to invoke a pip command; we recommend the following approach:
python3 -m pip
This invokes pip explicitly using the python3 binary that you are currently using. This is the recommended way to invoke pip in our documentation, since it is least likely to cause problems with python3 and pip being from different environments.
For more information on how to invoke pip commands, see the pip documentation.
Unit tests help validate that JupyterHub works the way we think it does, and continues to do so when changes occur. They also help communicate precisely what we expect our code to do.
JupyterHub uses pytest for all our tests. You can find them under jupyterhub/tests directory in the git repository.
jupyterhub/tests
Make sure you have completed Setting up a development install. You should be able to start jupyterhub from the commandline & access it from your web browser. This ensures that the dev environment is properly set up for tests to run.
You can run all tests in JupyterHub
pytest -v jupyterhub/tests
This should display progress as it runs all the tests, printing information about any test failures as they occur.
If you wish to confirm test coverage, run the tests with the --cov flag:
--cov
pytest -v --cov=jupyterhub jupyterhub/tests
You can also run tests in just a specific file:
pytest -v jupyterhub/tests/<test-file-name>
To run a specific test only, you can do:
pytest -v jupyterhub/tests/<test-file-name>::<test-name>
This runs the test with function name <test-name> defined in <test-file-name>. This is very useful when you are iteratively developing a single test.
<test-name>
<test-file-name>
For example, to run the test test_shutdown in the file test_api.py, you would run:
test_shutdown
test_api.py
pytest -v jupyterhub/tests/test_api.py::test_shutdown
Make sure you have completed all the steps in Setting up a development install successfully, and can launch jupyterhub from the terminal.
This roadmap collects “next steps” for JupyterHub. It is about creating a shared understanding of the project’s vision and direction amongst the community of users, contributors, and maintainers. The goal is to communicate priorities and upcoming release plans. It is not aimed at limiting contributions to what is listed here.
All of the community is encouraged to provide feedback and share new ideas. Please do so by submitting an issue. If you want to have an informal conversation first, use one of the other communication channels. After you submit an issue, others in the community will likely respond with clarifying questions or comments, and the maintainers will help identify a good next step for the issue.
When submitting an issue, think about what “next step” category best describes your issue:
The roadmap will be updated as time passes (next review by 1st December) based on discussions and ideas captured as issues. This means the list is not exhaustive; it represents only the “top of the stack” of ideas. It should not function as a wish list, a collection of feature requests, or a todo list. For those, please create a new issue.
The roadmap should give the reader an idea of what is happening next, what needs input and discussion before it can happen and what has been postponed.
JupyterHub is a dependable tool used by humans that reduces the complexity of creating the environment in which a piece of software can be executed.
These “Now” items are considered active areas of focus for the project:
These “Soon” items are under discussion. Once an item reaches the point of an actionable plan, the item will be moved to the “Now” section. Typically, these will be moved at a future review of the roadmap.
The “Later” items are things that are at the back of the project’s mind. At this time there is no active plan for an item. The project would like to find the resources and time to discuss these ideas.
If you find a security vulnerability in Jupyter or JupyterHub, whether it is a failure of the security model described in Security Overview or a failure in implementation, please report it to security@ipython.org.
If you prefer to encrypt your security reports, you can use this PGP public key.
JupyterHub is an open source project and community. It is a part of the Jupyter Project. JupyterHub is an open and inclusive community, and invites contributions from anyone. This section covers information about our community, as well as ways that you can connect and get involved.
Project Jupyter thanks the following people for their help and contribution on JupyterHub:
A JupyterHub Community Resource
We’ve compiled this list of JupyterHub deployments to help the community see the breadth and growth of JupyterHub’s use in education, research, and high performance computing.
Please submit pull requests to update information or to add new institutions or uses.
BIDS - Berkeley Institute for Data Science
Data 8
NERSC
Research IT
Although not technically a JupyterHub deployment, this tutorial setup may be helpful to others in the Jupyter community.
Thank you C. Titus Brown for sharing this with the Software Carpentry mailing list.
* I started a big Amazon machine;
* I installed Docker and built a custom image containing my software of interest;
* I ran multiple containers, one connected to port 8000, one on 8001, etc., and gave each student a different port;
* students could connect in and use the Terminal program in Jupyter to execute commands, and could upload/download files via the Jupyter console interface;
* in theory I could have used notebooks too, but for this I didn’t have need.

I am aware that JupyterHub can probably do all of this including manage the containers, but I’m still a bit shy of diving into that; this was fairly straightforward, gave me disposable containers that were isolated for each individual student, and worked almost flawlessly. Should be easy to do with RStudio too.
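The port-per-student scheme described above can be scripted. This sketch only prints the docker run commands rather than executing them; the image and container names are made up, and 8888 is the default Jupyter notebook port inside the container:

```shell
# Print one "docker run" command per student, mapping host port 8000+i
# to the container's Jupyter port. Purely illustrative; nothing is run.
for i in 0 1 2; do
  port=$((8000 + i))
  echo "docker run -d -p ${port}:8888 --name student${i} my-jupyter-image"
done
```

Each student then connects to the host on their assigned port (8000, 8001, ...), while every container listens on the same internal port.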
Advanced Computing
(CU Research Computing) CURC
Earth Lab at CU
San Diego Supercomputer Center - Andrea Zonca
Educational Technology Services - Paul Jamason
Kristen Thyng - Oceanography
Everware: Reproducible and reusable science powered by JupyterHub and Docker. Like nbviewer, but executable. CERN, Geneva (website)