# Separating our code into models

Table of Contents (long chapter!)

# What are models?

I like to think of models as "things our program deals with internally". The purpose of model classes is to make it easier for different parts of our application to communicate. By making this easier, we should also make it easier for us as developers to read and reason about the code.

At the moment our code deals in:

  • Polls
  • Options
  • Votes
  • Users

As developers, we must decide whether to create model classes for all these, or just for some.

In my case I'm going to create a model class for Poll and Option. I consider votes to be sufficiently simple (and encapsulated within options) that they don't need their own class. Also, our use of users at the moment is reserved to a username string attached to a vote. I won't create a model for them for that reason.

If we were storing more information about users (or votes!), then I might reconsider this. The main point though is: will creating an extra model make the code simpler, or more complicated?

# Creating our models

# Poll

I'll begin creating the Poll model class by:

  • Writing a __init__ method that contains the parameters that we would store for a poll in the database.
  • Writing a __repr__ method that faithfully represents how we could re-create this object.
from typing import List
from connections import create_connection  # More on this below
from models.option import Option  # More on this in a moment
import database


class Poll:
    def __init__(self, title: str, owner: str, _id: int = None):
        self.id = _id
        self.title = title
        self.owner = owner
    
    def __repr__(self) -> str:
        return f"Poll({self.name!r}, {self.owner!r}, {self.id!r})"

Note that the __init__ method has a parameter _id=None, because it is the database that is responsible for assigning new ID values (through the SERIAL data type).

Therefore, if I want to create a Poll object myself, I won't have an _id value to provide here. However, when I'm creating a Poll object as a result of a database operation, I will have an _id value to provide here.

Next up, we'll add methods for the main things we want to do with a Poll. Here we should go to app.py and see what sort of things a Poll does there. For example:

def save(self):
    connection = create_connection()
    new_poll_id = database.create_poll(connection, self.title, self.owner)
    connection.close()
    self.id = new_poll_id

We'll look at the create_connection() function in a little bit, but essentially it just creates a new PostgreSQL connection and gives us it.

Both Poll and Option model classes will have this save() method that they can call in order to save themselves to the database.

In order to make adding options to a poll easier, I'll add a method here for that:

def add_option(self, option_text: str):
    Option(option_text, self.id).save()

Then I'll make use of the @property decorator to write a method that allows me to easily find the options for a given poll:

@property
def options(self) -> List[Option]:
    connection = create_connection()
    options = database.get_poll_options(connection, self.id)
    connection.close()
    return [Option(option[1], option[2], option[0]) for option in options]

Next up, three class-methods that let me:

  • Get a specific poll by ID.
  • Get all polls.
  • Get the latest poll.

These methods are created because we require this in app.py.

@classmethod
def get(cls, poll_id: int) -> "Poll":
    connection = create_connection()
    poll = database.get_poll(connection, poll_id)
    connection.close()
    return cls(poll[1], poll[2], poll[0])

@classmethod
def all(cls) -> List["Poll"]:
    connection = create_connection()
    polls = database.get_polls(connection)
    connection.close()
    return [cls(poll[1], poll[2], poll[0]) for poll in polls]

@classmethod
def latest(cls) -> "Poll":
    connection = create_connection()
    poll = database.get_latest_poll()
    connection.close()
    return cls(poll[1], poll[2], poll[0])

The "Poll" return types are necessary (instead of just Poll) because when the type hinting is evaluated, the class hasn't resolved yet. Behind the scenes, it uses typing.ForwardRef[1].

# Option

The Option model class is similar: it encapsulates the data we need and actions we do in our application.

I'll start with the class definition and __init__ and __repr__ methods:

from typing import List
from connections import create_connection
import database


class Option:
    def __init__(self, option_text: str, poll_id: int, _id: int = None):
        self.id = _id
        self.text = option_text
        self.poll_id = poll_id
    
    def __repr__(self) -> str:
        return f"Option({self.text!r}, {self.poll_id!r}, {self.id!r})"

We'll need the save() method:

def save(self):
    connection = create_connection()
    new_option_id = database.add_option(self.text, self.poll_id)
    connection.close()
    self.id = new_option_id

And then I'll also add a method so that, given an Option, it's easy to vote on it:

def vote(self, username: str):
    connection = create_connection()
    database.add_poll_vote(connection, username, self.id)
    connection.close()

A method to get all the votes for an option:

@property
def votes(self) -> List[database.Vote]:
    connection = create_connection()
    votes = database.get_votes_for_option(connection, self.id)
    connection.close()
    return votes

And finally, a method to get a specific option by id:

@classmethod
def get(cls, option_id: int) -> "Option":
    connection = create_connection()
    option = database.get_option(connection, option_id)
    connection.close()
    return cls(option[1], option[2], option[0])

# create_connection(): a problem with this approach

This is what create_connection() looks like, inside an otherwise empty connections.py file:

import os
import psycopg2
from dotenv import load_dotenv

load_dotenv()


def create_connection():
    return psycopg2.connect(os.environ.get("DATABASE_URI"))

You may think this should be in database.py... And you'd be right! But we're going to delete it in just a moment, as it is terribly inefficient. Creating connections is actually quite an "expensive" operation: it takes a while to do. We want to avoid creating connections over an over, which is why we'll be using connection pooling (more on that in the next chapter!).

# Changes to database.py

You may have noticed that we're not using all the functions in database.py. For example, the one to calculate the percentage of votes, or to get a random winner.

This makes database.py simpler: all we need are functions to get or insert polls, options, and votes.

This is what database.py looks like now:

from typing import List, Tuple

from psycopg2.extras import execute_values


Poll = Tuple[int, str, str]
Option = Tuple[int, str, int]
Vote = Tuple[str, int]
PollResults = Tuple[int, str, int, float]


CREATE_POLLS = "CREATE TABLE IF NOT EXISTS polls (id SERIAL PRIMARY KEY, title TEXT, owner_username TEXT);"
CREATE_OPTIONS = "CREATE TABLE IF NOT EXISTS options (id SERIAL PRIMARY KEY, option_text TEXT, poll_id INTEGER);"
CREATE_VOTES = "CREATE TABLE IF NOT EXISTS votes (username TEXT, option_id INTEGER);"


SELECT_ALL_POLLS = "SELECT * FROM polls;"
SELECT_POLL = "SELECT * FROM polls WHERE id = %s;"
SELECT_LATEST_POLL = """SELECT * FROM polls
WHERE polls.id = (
    SELECT id FROM polls ORDER BY id DESC LIMIT 1
);"""

SELECT_POLL_OPTIONS = "SELECT * FROM options WHERE poll_id = %s;"
SELECT_OPTION = "SELECT * FROM options WHERE id = %s;"

SELECT_VOTES_FOR_OPTION = "SELECT * FROM votes WHERE option_id = %s;"

INSERT_POLL_RETURN_ID = "INSERT INTO polls (title, owner_username) VALUES (%s, %s) RETURNING id;"
INSERT_OPTION = "INSERT INTO options (option_text, poll_id) VALUES (%s, %s) RETURNING id;"
INSERT_VOTE = "INSERT INTO votes (username, option_id) VALUES (%s, %s);"


def create_tables(connection):
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(CREATE_POLLS)
            cursor.execute(CREATE_OPTIONS)
            cursor.execute(CREATE_VOTES)


# -- polls --


def create_poll(connection, title: str, owner: str) -> int:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(INSERT_POLL_RETURN_ID, (title, owner))

            poll_id = cursor.fetchone()[0]
            return poll_id


def get_polls(connection) -> List[Poll]:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_ALL_POLLS)
            return cursor.fetchall()


def get_poll(connection, poll_id: int) -> Poll:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_POLL, (poll_id,))
            return cursor.fetchone()


def get_latest_poll(connection) -> List[Poll]:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_LATEST_POLL)
            return cursor.fetchall()


def get_poll_options(connection, poll_id: int) -> List[Option]:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_POLL_OPTIONS, (poll_id,))
            return cursor.fetchall()


# -- options --


def get_option(connection, option_id: int) -> Option:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_OPTION, (option_id,))
            return cursor.fetchone()


def add_option(connection, option_text: str, poll_id: int):
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(INSERT_OPTION, (option_text, poll_id))


# -- votes --


def get_votes_for_option(connection, option_id: int) -> List[Vote]:
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(SELECT_VOTES_FOR_OPTION, (option_id,))
            return cursor.fetchall()


def add_poll_vote(connection, username: str, option_id: int):
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(INSERT_VOTE, (username, option_id))

Even the return values and custom types we define at the top have changed, and been made simpler.

However, as we'll see, this comes at the cost of using the database to its fullest extent. To make database interactions simpler, we will sacrifice the ability to use custom queries where we could benefit from them.

# Changes to app.py

There's quite a lot of changes to app.py[2] as a result of this. But most changes are simplifications: they allow us to reason about the code more easily.

Now, we'll be dealing with objects and actions (methods) rather than database operations.

# No more passing connections about

In our menu, we're currently creating a connection and passing it around to every function.

Now our models take care of getting a connection, so we no longer need to do this.

This:

def menu():	
    database_uri = input(DATABASE_PROMPT)	
    if not database_uri:	
        load_dotenv()	
        database_uri = os.environ["DATABASE_URI"]	
        	
    connection = psycopg2.connect(database_uri)	
    database.create_tables(connection)	
    while (selection := input(MENU_PROMPT)) != "6":	
        try:	
            MENU_OPTIONS[selection](connection)	
        except KeyError:	
            print("Invalid input selected. Please try again.")

Changes to this:

def menu():
    connection = pool.getconn()
    database.create_tables(connection)
    pool.putconn(connection)

    while (selection := input(MENU_PROMPT)) != "6":
        try:
            MENU_OPTIONS[selection]()
        except KeyError:
            print("Invalid input selected. Please try again.")

Which is quite a bit simpler! However, we have lost the ability to select a different Database URI.

I'm OK with losing that, and we'll bring it back later. If you're not, then I'd recommend asking once in the connections.py file, and reusing that for every new connection created.

As a result of this, we no longer need this:

-import os
-from dotenv import load_dotenv

-DATABASE_PROMPT = "Enter the DATABASE_URI value or leave empty to load from .env file: "

# Changes to creating polls

When we're creating a poll, now we'll create a Poll object and use the .save() and .add_option() methods.

This:

def prompt_create_poll(connection):	
    poll_title = input("Enter poll title: ")
    poll_owner = input("Enter poll owner: ")
    options = []

    while (new_option := input(NEW_OPTION_PROMPT)):
        options.append(new_option)

    database.create_poll(connection, poll_title, poll_owner, options)

Becomes this:

def prompt_create_poll():
    title = input("Enter poll title: ")
    owner = input("Enter poll owner: ")
    poll = Poll(title, owner)
    poll.save()

    while (new_option := input(NEW_OPTION_PROMPT)):
        poll.add_option(new_option)

# Changes to listing polls

The list_open_polls() function is currently accessing the polls as if they were coming from the database: with [0], [1], and so on.

No we'll use Poll.all() to get all the polls as Poll objects, and so we have to start using attribute access instead.

This:

def list_open_polls(connection):	
    polls = database.get_polls(connection)
    for poll in polls:
        print(f"{poll[0]}: {poll[1]} (created by {poll[2]})")

Becomes this:

def list_open_polls():
    for poll in Poll.all():
        print(f"{poll.id}: {poll.title} (created by {poll.owner})")

# Changes to voting on polls

The prompt_vote_poll() function is using the database functions directly. However, we have the model counterparts.

This:

def prompt_vote_poll(connection):
    poll_id = int(input("Enter poll would you like to vote on: "))

    poll_options = database.get_poll_details(connection, poll_id)
    _print_poll_options(poll_options)

    option_id = int(input("Enter option you'd like to vote for: "))
    username = input("Enter the username you'd like to vote as: ")

    database.add_poll_vote(connection, username, option_id)

Becomes this:

def prompt_vote_poll():
    poll_id = int(input("Enter poll would you like to vote on: "))

    _print_poll_options(Poll.get(poll_id).options)

    option_id = int(input("Enter option you'd like to vote for: "))
    username = input("Enter the username you'd like to vote as: ")

    Option.get(option_id).vote(username)

# Changes to printing poll options

Like before, the _print_poll_options() function is using subscripts to access data, but now we have the attributes in each Option object.

This:

def _print_poll_options(poll_with_options: List[database.PollWithOption]):	
    for option in poll_with_options:
        print(f"{option[3]}: {option[4]}")

Becomes this:

def _print_poll_options(options: List[Option]):
    for option in options:
        print(f"{option.id}: {option.text}")

Note that previously, that function had a parameter called poll_with_options, because we had the database return a poll with all its options. Now we have just options, since our queries have been modified to only return data pertinent to the model at play.

# Changes to showing poll votes

Showing the poll votes sees the biggest change. With our new model structure, we have no place for getting votes and percentages directly from the database. Instead, all we have are options and votes as raw data.

We'll have to use them to find the percentages using Python.

This is one of the key drawbacks of this model-based approach: consistency and simplicity of the interactions with models comes at a cost of offloading work to the database via custom queries.

This:

def show_poll_votes(connection):	
    poll_id = int(input("Enter poll you would like to see votes for: "))	

    try:	
        poll_and_votes = database.get_poll_and_vote_results(connection, poll_id)	
    except DivisionByZero:	
        print("No votes yet cast for this poll.")	
    else:	
        for result in poll_and_votes:	
            print(f"{result[1]} got {result[2]} votes ({result[3]:.2f}% of total)")

Becomes this:

def show_poll_votes():
    poll_id = int(input("Enter poll you would like to see votes for: "))
    poll = Poll.get(poll_id)
    options = poll.options
    votes_per_option = [len(option.votes) for option in options]
    total_votes = sum(votes_per_option)

    try:
        for option, votes in zip(options, votes_per_option):
            percentage = votes / total_votes * 100
            print(f"{option.text} for {votes} ({percentage:.2f}% of total)")
    except ZeroDivisionError:
        print("No votes yet cast for this poll.")

# Changes to randomizing a poll winner

Finally, randomizing a poll winner changes slightly. Instead of accessing the database directly, we go through the model classes.

Instead of using ORDER BY RANDOM() in the database query, we get all the voters and use random.choice() to select a random voter.

This:

def randomize_poll_winner(connection):	
    poll_id = int(input("Enter poll you'd like to pick a winner for: "))
    poll_options = database.get_poll_details(connection, poll_id)
    _print_poll_options(poll_options)

    option_id = int(input("Enter which is the winning option, we'll pick a random winner from voters: "))
    winner = database.get_random_poll_vote(connection, option_id)

    print(f"The randomly selected winner is {winner[0]}.")

Becomes this:

import random

...

def randomize_poll_winner():
    poll_id = int(input("Enter poll you'd like to pick a winner for: "))
    poll = Poll.get(poll_id)
    _print_poll_options(poll.options)

    option_id = int(input("Enter which is the winning option, we'll pick a random winner from voters: "))
    votes = Option.get(option_id).votes
    winner = random.choice(votes)
    print(f"The randomly selected winner is {winner[0]}.")

Here's the complete summary of changes: https://www.diffchecker.com/AJCwOBqm (opens new window)


  1. class typing.ForwardRef (Python Official Documentation) (opens new window) ↩︎

  2. Full changes to app.py (opens new window) ↩︎