Isolating a Jupyterhub deployment

In the research group that I am a part of, we use Jupyter and its associated projects a lot. In addition to the local Jupyter instances that people may run on their own machines, we have a Jupyterhub deployment that spawns Jupyter servers in Docker containers for research use, as well as other deployments for guest researchers and teaching, among other things.

One really useful recent addition to the Jupyter ecosystem is an authenticator plugin for Jupyterhub by yuvipanda that gives users a temporary account which expires when they log out. Combined with the idle notebook culler, this effectively lets us set up a tmpnb-style deployment, but using all the existing infrastructure we have for deploying and managing Jupyterhub instances. We want to use this to host an interactive tutorial for our quantum transport simulation tool, Kwant, that anyone can try out from wherever they are!

While this would be really awesome, there is currently one problem: we run everything on our own hardware in the university, so giving random people on the internet access to Jupyter notebook servers inside the university firewall is a recipe for disaster. To get around this problem we will use the networking capabilities of Docker, along with a few iptables rules, to secure our deployment.

Docker networking

When you create a new Docker container it will, by default, be attached to the default network bridge used by Docker. All containers connected to the same bridge will be on the same IP subnet. Restricting access between containers in this configuration is possible but cumbersome (you'd need to write firewalls rules targeting each container individually). It is much simpler to first create a new "docker network", to which you attach all the containers you want to have a similar network configuration.

$ docker network create --driver=bridge my_new_network
48d08d196dc853e58c6115a6fab96ce84028ab68d6fa5d596c91adb406efb3ac

The above command creates a "network" called my_new_network, which we can attach newly created containers to when invoking docker run:

$ docker run --network=my_new_network debian:latest

In the context of Jupyterhub, this last step is actually done with the following configuration in jupyterhub_config.py:

c.DockerSpawner.network_name = 'my_new_network'
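For context, here is a minimal sketch of how that line might sit in a fuller jupyterhub_config.py using DockerSpawner. The image name is a placeholder, not our actual deployment configuration:

```python
# jupyterhub_config.py -- minimal sketch, assuming the dockerspawner
# package is installed. The image is a placeholder for illustration.
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
c.DockerSpawner.image = 'jupyter/base-notebook'  # hypothetical image
c.DockerSpawner.network_name = 'my_new_network'
# Have the hub talk to spawned servers over their docker-network IP,
# rather than a port published on the host.
c.DockerSpawner.use_internal_ip = True
```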

When we execute docker network create, the Docker daemon actually creates a virtual ethernet bridge in the kernel. We can inspect this with brctl.

$ brctl show
bridge name bridge id       STP enabled interfaces
br-48d08d196dc8     8000.024245cf35a7   no
docker0     8000.0242874f9221   no

We can see that our new docker network actually corresponds to the bridge interface br-48d08d196dc8. When a new Docker container is created its virtual network interface is attached to this bridge interface; just like if a physical machine was plugged into an ethernet switch.
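We can see which containers are attached to the network (and hence to the bridge) with docker network inspect. A sketch, using a throwaway busybox container (the container name and image are assumptions for illustration):

```shell
# Start a throwaway container on the network.
$ docker run -d --rm --network=my_new_network --name=test busybox sleep 600

# List the names of all containers attached to the network.
$ docker network inspect -f '{{range .Containers}}{{.Name}} {{end}}' my_new_network
```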

If we want a more manageable name for the virtual bridge, say my_bridge, we can pass it as an argument to docker network create:

$ docker network create --driver=bridge -o "com.docker.network.bridge.name"="my_bridge" my_network

Applying IPTables rules

We can now use the bridge interface in IPTables rules to control access to docker containers connected to it. For example, if we want to prevent all containers on the network from accessing the internet, we could apply the following IPTables rule:

$ iptables -I DOCKER-ISOLATION -i my_bridge ! -o my_bridge -m conntrack --ctstate NEW -j REJECT

The above command says the following: please reject packets that arrive on my_bridge and are destined for a different interface, and which correspond to a new connection (for TCP, this means packets with the SYN flag set), and insert this rule before any others on the DOCKER-ISOLATION chain. The DOCKER-ISOLATION chain is created by the Docker daemon, and is jumped to from the FORWARD chain.
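We can sanity-check that the rule is in place and actually blocks outbound traffic. A sketch, assuming the my_network/my_bridge setup from above and using the debian image (which ships with bash, so we can use its /dev/tcp feature without installing anything):

```shell
# Confirm our rule sits at the top of the DOCKER-ISOLATION chain.
$ iptables -L DOCKER-ISOLATION --line-numbers -n

# From inside a container on the network, a new outbound TCP connection
# should now be rejected (we use a bare IP to sidestep DNS).
$ docker run --rm --network=my_network debian:latest \
      timeout 5 bash -c 'echo > /dev/tcp/1.1.1.1/80' || echo "blocked"
```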

One final thing to be aware of is the kernel setting net.bridge.bridge-nf-call-iptables. The docker containers are connected to the same network bridge, which operates on the link layer. This means that packets destined for hosts attached to the same bridge don't need to go up to the IP layer of the network stack for the kernel to process them, so in principle IPTables does not act on packets exchanged between containers on the docker network. When net.bridge.bridge-nf-call-iptables is set to 1, however, bridged packets do traverse IPTables. This is useful if, for example, we want to prevent any traffic between containers on the network:

$ sysctl net.bridge.bridge-nf-call-iptables=1
$ iptables -I DOCKER-ISOLATION -i my_bridge -o my_bridge -j DROP
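To verify the isolation, we can start two containers on the network behind my_bridge and check that one cannot reach the other. A sketch; the container names and the busybox image are assumptions for illustration (note that Docker's embedded DNS resolves container names on user-defined networks):

```shell
# Two throwaway containers on the network attached to my_bridge.
$ docker run -d --rm --network=my_network --name=a busybox sleep 600
$ docker run -d --rm --network=my_network --name=b busybox sleep 600

# With the DROP rule in place this ping should time out.
$ docker exec a ping -c 1 -W 2 b || echo "isolated"
```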