# How to store sensitive data in your code

Now that you've copied all the code from the Movie Watchlist project into the PyCharm project, it's time to being making the changes necessary to use psycopg2 instead of sqlite3.

In this chapter we'll look at how we can effectively connect to PostgreSQL. We won't change any of the functions that interact with the database, leaving that for the next chapter instead.

There are a couple of changes we'll need to make:

  • We'll have to change the connection code, since now we'll be connecting to Elephant SQL instead of a local SQLite file.
  • We'll learn about environment variables to remove the database username and password from our code, so that if we share it with others they can't access our database!

# Changing the connection code

At the moment our connection code looks like this:

import sqlite3

...

connection = sqlite3.connect("data.db")

We will remove the sqlite3 import, replacing it by psycopg2, and connect to our ElephantSQL URL instead:

import psycopg2

...

connection = psycopg2.connect("postgres://...")

# Using environment variables

However, often you want to avoid putting sensitive data inside your code. For example, your database's username and password.

That's because, if you share your code with other people (or with everyone, on something like GitHub), you don't want them to be able to access and modify your database.

Instead what you want is to tell those people that they need a database URL, and that they can get their own.

In order to enable this though, we need to be able to give our program the database URL somehow, but not inside the code.

That's where environment variables come into play.

# What is an environment variable?

An environment variable is a variable whose value is set outside the program. They have a name and a value. Environment variables only exist within a context.

For example, a context might be "the whole computer", or it might be "this particular program". For example you might want to define an environment variable that you'll use in many programs, and that could be defined in a very global context (for example, the PATH environment variable which is present in all operating systems).

If you want to define an environment variable for use in just one program, you can do that too!

# How do we define environment variables?

Every terminal (e.g. Bash, cmd.exe, PowerShell) can define environment variables in their own way.

In order to make our life easy, what we normally do with Python is we create a file that contains the environment variables we want. We never share this file with anybody else.

Then, we tell Python to read that file and convert it to environment variables.

We do this with most programming languages actually. The file in question is almost always called .env.

So, go ahead and create a file called .env in your PyCharm project, and inside it put this:

DATABASE_URL=postgres://...

Make sure to put your database URL from ElephantSQL in there, after the equal sign.

TIP

Since we're not going to share this file with anyone, there's two things we should do:

  1. Create a .env.example file that we do share with others, that contains just the names of the environment variables. This is to tell others what variables they will need, but not the values of them:
DATABASE_URL=
  1. If you're using Git, add .env to the .gitignore file. This will prevent Git from automatically trying to include the .env file in your commits.

# How do we read environment variables with Python?

Now that we've got that file, we can tell Python to read it and load each line as an environment variable.

To do that, we'll use another library called python-dotenv. Go ahead and install it like we did with psycopg2-binary.

WARNING

Make sure to install python-dotenv and not dotenv.

When that's installed, we can go into our database.py file and use the library to load up the .env file before we connect to the database. Just underneath our current imports we'll add two lines of code:

import os
import datetime
import psycopg2

from dotenv import load_dotenv

load_dotenv()

We also need to import os, as shown above.

Next, when we connect to the database, we'll use the environment variable instead of the hard-coded string:

connection = psycopg2.connect(os.environ["DATABASE_URL"])

os.environ gives us a dictionary of environment variables currently defined. We're accessing the DATABASE_URL variable in there, which should give us the correct value if we loaded the .env file correctly.