# SQL VIEW
: virtual tables
A view[1] allows us to store a query with a name so that every time we access the view, the query is executed.
This is particularly useful for long and complex queries that we want to have accessible without having to type them over and over again.
In addition, you can use the view as if it were a table, which can simplify using the view itself in other queries.
Something important to remember when working with views though, is that every time you access the view, the underlying query is executed. Another way of saying that is that view is never materialized.
Here's a simple view definition using our data:
CREATE VIEW option_names AS
SELECT option_text FROM options;
After creating the view, we can access it:
SELECT * FROM option_names;
option_text |
---|
Flask |
Django |
It depends |
Views can be deleted using DROP VIEW option_names
.
If you want a view to be available only during the current transaction, you can use CREATE TEMP VIEW
. Then the view is automatically dropped at the end of the transaction. This can be helpful if you are using the view query multiple times and you want to save yourself some typing.
# A view for our most voted for options
We can retrieve the most voted for options with this query:
SELECT
DISTINCT ON (options.poll_id) poll_id,
options.id,
options.option_text,
COUNT(votes.option_id) as vote_count
FROM options
LEFT JOIN votes on options.id = votes.option_id
GROUP BY options.id
ORDER BY poll_id, vote_count DESC;
If you wanted to access this often, you could create a view for it:
CREATE VIEW most_voted_options AS
SELECT
DISTINCT ON (options.poll_id) poll_id,
options.id,
options.option_text,
COUNT(votes.option_id) as vote_count
FROM options
LEFT JOIN votes on options.id = votes.option_id
GROUP BY options.id
ORDER BY poll_id, vote_count DESC;
Then in the future, you could access the results like this:
SELECT * FROM most_voted_options;
# Updatable views (inserting data into views)
You can INSERT
, UPDATE
, and DELETE
on a simple view, the same way that you can do into a regular table.
A "simple" view is one that:
- Has exactly one entry in its FROM list, which must be a table or another updatable view.
- Is not defined with
WITH
,DISTINCT
,GROUP BY
,HAVING
,LIMIT
, orOFFSET
clauses at the top level. - Is not defined with set operations (
UNION
,INTERSECT
orEXCEPT
) at the top level. - Does not
SELECT
any aggregates, window functions or set-returning functions.
Quite a few rules there! I don't recommend trying everything out right now, but remember this page (bookmark it!) for when you need to create an updatable view in the future. The official documentation also has more examples[2].
Now let's talk a bit more about what we can do with updatable views. Until the end of the chapter, imagine you've got this table, employees
, to work with:
id INTEGER | name TEXT | salary INTEGER | department TEXT |
---|---|---|---|
1 | Bob Smith | 35000 | Research |
2 | Rolf Smith | 55000 | Product |
3 | Anne Pun | 98750 | Product |
4 | Jose Salvatierra | 60000 | Product |
5 | Charlie Robert | 53000 | Research |
6 | Susan Williams | 150000 | Product |
# Inserting into a view
When you create a view that follows the rules above, you can insert into it. Let's create the simplest view we can:
CREATE VIEW useless_employee_view AS
SELECT * FROM employees;
This is useless because you might as well just use the SELECT
query, but bear with me. I just want to show you that you can do this:
INSERT INTO useless_employee_view VALUES (7, 'Adam Smith', 78000, 'Research');
That would totally work, because the view follows the rules described earlier, none of the columns are read-only (more on that later), and we've inserted into all of the columns that we need.
If any columns had default values or were nullable, then you could skip those when inserting into the view and they would get the default value or NULL
as appropriate.
# Views with WHERE
Next up, a slightly more useful view:
CREATE VIEW high_earners AS
SELECT * FROM employees WHERE salary > 80000;
You can still insert into this view, and the INSERT
does not have to follow the WHERE
. This is fine:
INSERT INTO high_earners VALUES (7, 'Adam Smith', 50000, 'Research');
It just won't appear in the view when you next use it, but PostgreSQL doesn't mind.
However, you won't be able to UPDATE
or DELETE
rows that aren't inside the view already. So if you wanted to delete "Charlie Robert", you can't because he's not in the view.
This is not OK:
DELETE FROM high_earners WHERE id = 5;
You can still update the view such that performing the update removes things from the view. This is OK:
UPDATE high_earners SET salary = 50000 WHERE id = 6;
That would take Susan out of the view the next time the view is ran, but PostgreSQL doesn't care about it.
# LOCAL CHECK OPTION
PostgreSQL allows us to use CHECK OPTION
to restrict INSERT
or UPDATE
from resulting in data ending up outside the view.
If we created the view like this:
CREATE VIEW high_earners AS
SELECT * FROM employees WHERE salary > 80000
WITH LOCAL CHECK OPTION;
Then doing this would result in an error:
UPDATE high_earners SET salary = 50000 WHERE id = 6;
Also this would result in an error:
INSERT INTO high_earners VALUES (7, 'Adam Smith', 50000, 'Research');
# CASCADED CHECK OPTION
Let's go back and define our view like this:
CREATE VIEW high_earners AS
SELECT * FROM employees WHERE salary > 80000;
And then let's define another view that uses that view, using WITH LOCAL CHECK OPTION
:
CREATE VIEW high_salary_researchers AS
SELECT * FROM high_earners WHERE department = 'Research'
WITH LOCAL CHECK OPTION;
Now we can only insert into high_salary_researchers
someone in the "Research" department.
Because this is a LOCAL CHECK OPTION
, we can still insert into this view low earners.
To conclude, LOCAL CHECK OPTION
applies checks on the current view. It does not apply the checks of any deeper level view.
If you wanted to apply the checks of all views, you'd use this:
CREATE VIEW high_salary_researchers AS
SELECT * FROM high_earners WHERE department = 'Research'
WITH CASCADED CHECK OPTION;
Now you can only insert into this view people in the "Research" department with a salary over 80000.
# Non-updatable columns
If you generate any new columns in the SELECT
portion of the view, these will be read-only.
Let's add a new table to our data set, projects
:
id INTEGER | employee_id INTEGER | project_name TEXT |
---|---|---|
1 | 2 | New website |
2 | 3 | New website |
3 | 4 | New website |
4 | 3 | Open-source project |
5 | 1 | Open-source project |
Let's create a view that shows us, for each high-earning employee, how many projects they've worked on:
CREATE VIEW projects_per_high_earner AS
SELECT
employees.*,
(SELECT COUNT(*)
FROM projects
WHERE projects.employee_id = employees.id) AS project_count
FROM employees
WHERE salary > 80000;
Notice that project_count
is a generated column. When you try to INSERT
into projects_per_high_earner
, you must exclude that column from the inserted values.
It doesn't make sense to insert anything into that column, since that column is generated from an entirely different dataset.
# Materialized views
Materialized views are just views where the data is saved to the database permanently. This means that accessing a materialized view does not run the query again. They are created with CREATE MATERIALIZED VIEW
.
Clearly though, if the underlying data changes, then the materialized view will be outdated. That's because the query doesn't run again.
Therefore materialized views are particularly useful when:
- The query behind the view is very computationally intense; and
- You're OK with not having the absolute latest data in your view.
You can refresh the data in the materialized view (run the query again and update the view) by issuing a command:
REFRESH MATERIALIZED VIEW most_voted_options;
You may want to do this periodically (e.g. every day), or when the load in your database is at its lowest point (probably when most of your users are sleeping).
# Wrapping Up
I hope this chapter has been useful, and not overly confusing!
When you start creating views, particularly if you have subqueries like in that last example, or if you're doing many levels of JOIN
, it can get quite complicated.
If one of your view definitions is overly complicated due to nesting, remember you can always extract the nested parts into its own view! Alternatively as we've learned earlier on, you can use a WITH
statement to extract subqueries to before the SELECT
.