# How to store sensitive data in your code
Now that you've copied all the code from the Movie Watchlist project into the PyCharm project, it's time to being making the changes necessary to use psycopg2
instead of sqlite3
.
In this chapter we'll look at how we can effectively connect to PostgreSQL. We won't change any of the functions that interact with the database, leaving that for the next chapter instead.
There are a couple of changes we'll need to make:
- We'll have to change the connection code, since now we'll be connecting to Elephant SQL instead of a local SQLite file.
- We'll learn about environment variables to remove the database username and password from our code, so that if we share it with others they can't access our database!
# Changing the connection code
At the moment our connection code looks like this:
import sqlite3
...
connection = sqlite3.connect("data.db")
We will remove the sqlite3
import, replacing it by psycopg2
, and connect to our ElephantSQL URL instead:
import psycopg2
...
connection = psycopg2.connect("postgres://...")
# Using environment variables
However, often you want to avoid putting sensitive data inside your code. For example, your database's username and password.
That's because, if you share your code with other people (or with everyone, on something like GitHub), you don't want them to be able to access and modify your database.
Instead what you want is to tell those people that they need a database URL, and that they can get their own.
In order to enable this though, we need to be able to give our program the database URL somehow, but not inside the code.
That's where environment variables come into play.
# What is an environment variable?
An environment variable is a variable whose value is set outside the program. They have a name and a value. Environment variables only exist within a context.
For example, a context might be "the whole computer", or it might be "this particular program". For example you might want to define an environment variable that you'll use in many programs, and that could be defined in a very global context (for example, the PATH
environment variable which is present in all operating systems).
If you want to define an environment variable for use in just one program, you can do that too!
# How do we define environment variables?
Every terminal (e.g. Bash, cmd.exe, PowerShell) can define environment variables in their own way.
In order to make our life easy, what we normally do with Python is we create a file that contains the environment variables we want. We never share this file with anybody else.
Then, we tell Python to read that file and convert it to environment variables.
We do this with most programming languages actually. The file in question is almost always called .env
.
So, go ahead and create a file called .env
in your PyCharm project, and inside it put this:
DATABASE_URL=postgres://...
Make sure to put your database URL from ElephantSQL in there, after the equal sign.
TIP
Since we're not going to share this file with anyone, there's two things we should do:
- Create a
.env.example
file that we do share with others, that contains just the names of the environment variables. This is to tell others what variables they will need, but not the values of them:
DATABASE_URL=
- If you're using Git, add
.env
to the.gitignore
file. This will prevent Git from automatically trying to include the.env
file in your commits.
# How do we read environment variables with Python?
Now that we've got that file, we can tell Python to read it and load each line as an environment variable.
To do that, we'll use another library called python-dotenv
. Go ahead and install it like we did with psycopg2-binary
.
WARNING
Make sure to install python-dotenv
and not dotenv
.
When that's installed, we can go into our database.py
file and use the library to load up the .env
file before we connect to the database. Just underneath our current imports we'll add two lines of code:
import os
import datetime
import psycopg2
from dotenv import load_dotenv
load_dotenv()
We also need to import os
, as shown above.
Next, when we connect to the database, we'll use the environment variable instead of the hard-coded string:
connection = psycopg2.connect(os.environ["DATABASE_URL"])
os.environ
gives us a dictionary of environment variables currently defined. We're accessing the DATABASE_URL
variable in there, which should give us the correct value if we loaded the .env
file correctly.