Why and how to use a custom Docker environment on DeepNote
To ensure that dependencies are installed before we run a notebook, it is common to see a bunch of pip install and apt-get commands at the beginning.
DeepNote offers an elegant way to manage these installation scripts with init.ipynb, so we can keep our main notebook file clean and ensure that the dependencies are installed every time we spin up the virtual machine.
This works well when the number of packages is relatively small and when we need to experiment with different packages. However, there are some drawbacks.
Problem of dependency versions
It is good practice to pin the dependency versions in our installation scripts. However, most of us would rather work on the fun part of the notebook than test dependency versions. When the versions are not specified, the installed versions may differ each time we run the notebook, and the notebook may break someday when a library deprecates or removes something it relies on.
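As a minimal sketch of what pinning could look like, the snippet below writes a requirements file with exact versions and counts any entries that are not pinned. The package names and versions are illustrative assumptions, not taken from a real project; substitute your own.

```shell
# Write a requirements file with exact versions.
# The packages and versions below are illustrative only.
cat > requirements.txt <<'EOF'
numpy==1.19.5
pandas==1.1.5
EOF

# Count the entries that are NOT pinned with '=='.
# grep exits non-zero when the count is 0, hence the '|| true'.
unpinned=$(grep -vc '==' requirements.txt || true)
echo "unpinned packages: $unpinned"

# To install from the file, run:
#   pip install -r requirements.txt
```

With every entry pinned, each run of the installation script resolves to the same versions instead of whatever happens to be the newest release that day.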
Startup speed
Installing dependencies takes time. As the number of required packages grows, so does the startup time of the environment.
Using custom Docker images
Consider the following cases:
- the installed packages are unlikely to change, for example you need the AWS CLI in the environment and are unlikely to remove it, or
- you would like to share the environment across multiple projects.
In either case, using a custom Docker image for the environment should be a better choice than installing dependencies every time you run the notebook, and it is one of the unique features that DeepNote offers.
How to create and use a custom Docker environment?
Using a custom Docker environment is easy on DeepNote. According to the documentation, you just have to fulfill these 2 requirements:
- Have the python command with Python version > 3.6
Building the environment with a Dockerfile
Since I use Google Cloud, I will demonstrate how to use the dockerized Google Cloud SDK custom environment.
First, click the Environment tab on the left. Choose Local ./Dockerfile and click the Dockerfile anchor to edit your custom image.
Note: Make sure the machine is turned off or the Dockerfile cannot be edited.
Then, copy the following lines into the editor and don’t forget to press Build.
By default, the python command links to Python 2 in this image. To fulfill the above-mentioned requirement, line 3 (ln -s ...) creates a symbolic link that points the python command to python3.
FROM google/cloud-sdk:latest

RUN ln -s /usr/bin/python3 /usr/local/bin/python
Note: Using the latest tag may not be ideal if you want to keep the same version between builds.
Once you start the machine, you can try the gcloud commands in the Terminal on DeepNote. It seems that the Terminal cannot handle color codes yet. I believe improvements will be made.
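A quick way to confirm the build did what we wanted is to check both commands from the Terminal. This is only a sketch: the python_status and gcloud_status variables are names made up for this example, and the exact version output depends on the image you built.

```shell
# Check which interpreter the `python` command resolves to.
if command -v python >/dev/null 2>&1; then
    python_status="found"
    python --version
else
    python_status="missing"
    echo "python command not found; the symbolic link may be missing"
fi

# Confirm that the Google Cloud SDK is on the PATH.
if command -v gcloud >/dev/null 2>&1; then
    gcloud_status="found"
    gcloud --version
else
    gcloud_status="missing"
    echo "gcloud not found; check that the custom image is in use"
fi

echo "python: $python_status, gcloud: $gcloud_status"
```

If python reports a 3.x version and gcloud prints its component list, the custom environment is ready to use.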
Hope this post helps. Thank you for reading.