How to NOT embed credentials in a Jupyter notebook

Heang Yuthakarn
4 min read · Feb 10, 2020

No more hard-coding your credentials in your Jupyter notebook

Photo by Erik Mclean on Unsplash

TL;DR: This article shows how to use environment variables to keep your credentials outside your Jupyter notebook.

  1. It shows how to add environment variables manually on Windows and on Mac (or Linux).
  2. It also shows how to use the python-dotenv package to make adding environment variables more flexible and independent of the OS.
  3. The last part shows how to automatically load the credentials set up with python-dotenv every time you open a Jupyter notebook.

Have you ever seen code like this in Jupyter?
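A minimal sketch of what such a cell might look like, using the same example values that appear later in this article. The connection URL format is only an illustration and is not tied to any particular database library:

```python
# Hard-coded credentials: anyone who can see this cell can read them.
MYSQL_USER = "john"
MYSQL_PASSWORD = "WeakPassw0rd"
MYSQL_HOST = "192.168.1.1"
MYSQL_DB = "db01"

# Illustrative connection URL built directly from the literals above.
connection_url = f"mysql://{MYSQL_USER}:{MYSQL_PASSWORD}@{MYSQL_HOST}/{MYSQL_DB}"
print(connection_url)
```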

You can see that the username and password are shown in plain text inside the code. As data engineers or data scientists, we work with data almost all the time, and hard-coding credentials is bad practice because:

  1. It's not secure, e.g. people can see your credentials when they pass by your computer while you're working.
  2. To share your code with others (e.g. colleagues) or to publish it to GitHub, you need to delete, replace, or comment out those credentials, which is extra work. And sometimes you miss some of them.

Environment Variable

To keep credentials out of the code, the common approach is to store them in environment variables.

Windows

You can add them through Control Panel → System → Advanced system settings → Environment Variables.

Click New…, then enter a variable name and a variable value.

Based on the example at the top of this article, we will create four environment variables:

Variable name: MYSQL_USER     | Variable value: john
Variable name: MYSQL_PASSWORD | Variable value: WeakPassw0rd
Variable name: MYSQL_HOST     | Variable value: 192.168.1.1
Variable name: MYSQL_DB       | Variable value: db01

When you are done, all four variables appear in the list of environment variables.

Mac and Linux

You can add environment variables to the ~/.bashrc file (or ~/.zshrc, depending on the shell you're using) by appending the lines below to the end of the file:

export MYSQL_USER=john
export MYSQL_PASSWORD=WeakPassw0rd
export MYSQL_HOST=192.168.1.1
export MYSQL_DB=db01

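To use these environment variables in our Jupyter Notebook, we read them with os.getenv() from the os module. A process only sees the variables that existed when it was launched, so restart Jupyter (from a new terminal on Mac or Linux) after adding them.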
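A minimal sketch of reading the four variables, assuming they were created as described above. The function name mysql_url_from_env and the URL format are hypothetical and used only for illustration:

```python
import os


def mysql_url_from_env():
    """Build an illustrative connection URL from environment variables."""
    # os.getenv() returns None when a variable is not set.
    user = os.getenv("MYSQL_USER")
    password = os.getenv("MYSQL_PASSWORD")
    host = os.getenv("MYSQL_HOST")
    database = os.getenv("MYSQL_DB")
    return f"mysql://{user}:{password}@{host}/{database}"
```

The notebook now contains only variable names, so it can be shared without editing out any secrets.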

The drawbacks of storing credentials in environment variables are:

  1. It depends on the OS. Each OS needs a different way to add environment variables.
  2. It's difficult to maintain when the list grows, especially on Windows.

It would be better to keep the credentials in a file and load them into environment variables. This is where python-dotenv comes in.

python-dotenv

python-dotenv is a library that reads key-value pairs from a .env file and adds them to the environment variables.

To get started, install the latest version with:

pip install python-dotenv

Then create a .env file alongside your .ipynb file.

.
├── .env
└── your_notebook.ipynb

Note: you may not be able to create a .env file in Windows Explorer. Use Command Prompt or another tool, e.g. an IDE such as VS Code, to create it.

Inside the .env file, add the credentials in the format below:

MYSQL_USER=john
MYSQL_PASSWORD=WeakPassw0rd
MYSQL_HOST=192.168.1.1
MYSQL_DB=db01

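So, in our Jupyter Notebook, we just need to call load_dotenv() from the dotenv library and then use os.getenv() to get the environment variables. The rest stays the same. If the notebook lives in a Git repository, add .env to .gitignore so the file is never committed.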

How to automatically load environment variables when opening Jupyter Notebook

What if we don't want to manually copy and paste the load_dotenv() call into every new Jupyter notebook?

The easiest way is to add Python (.py) or IPython (.ipy) scripts to your profile_default/startup/ directory. Files here are executed as soon as the IPython shell is constructed, before any other code or scripts you have specified. Here are the detailed steps:

  1. Navigate to the folder %USERPROFILE%\.ipython\profile_default\startup on Windows or ~/.ipython/profile_default/startup on Mac or Linux. If the startup folder doesn't exist, create it.
  2. Copy the .env file from the last section to the startup folder. (Remember, the .env file needs to sit alongside your script file.)
  3. Add a new Python file called environmentvariables.py that calls load_dotenv().
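A sketch of what environmentvariables.py could contain, assuming the default IPython directory under the home folder (it can be relocated, for example with the IPYTHONDIR environment variable, in which case the path needs adjusting). The explicit path and the guarded import are choices made for this sketch, not requirements from python-dotenv:

```python
# environmentvariables.py, placed in the IPython startup folder
from pathlib import Path

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    # Let kernels start normally in environments without python-dotenv.
    load_dotenv = None

# Assumption: the default location of the startup folder from step 1.
ENV_FILE = Path.home() / ".ipython" / "profile_default" / "startup" / ".env"

if load_dotenv is not None and ENV_FILE.is_file():
    # Point at the file explicitly instead of relying on the automatic search.
    load_dotenv(dotenv_path=ENV_FILE)
```

The startup folder belongs to the user profile rather than to a single Python environment, so the guarded import keeps a kernel without python-dotenv from reporting an import error each time it starts.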

The next time you launch Jupyter notebook, those variables will be ready to use.

So in a new Jupyter notebook, when you want to connect to the database, just read the variables with os.getenv() as before.
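A minimal sketch of such a cell, assuming the startup script has already loaded the four MYSQL_* variables. The helper name read_required_env is hypothetical, and the check for missing variables is an addition for illustration, useful because the loading now happens out of sight:

```python
import os

REQUIRED = ("MYSQL_USER", "MYSQL_PASSWORD", "MYSQL_HOST", "MYSQL_DB")


def read_required_env(names=REQUIRED):
    """Return the named variables, or fail loudly if any are missing."""
    missing = [name for name in names if os.getenv(name) is None]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in names}
```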

Notes

  • We can put the .env file in a different folder from the script file by using Path from the pathlib library and passing dotenv_path to load_dotenv().
  • The startup script can be named anything, and you can have multiple files in the startup folder. The files are run in order of their names, so you can control the ordering with prefixes, like 10-myimports.py.


Written by Heang Yuthakarn

Data Engineer | Infrastructure | Gadget Crazier | Drama King