An Introduction to PostgreSQL
The world’s most advanced open source relational database
Goals and audience
This post is an introductory guide to getting started with PostgreSQL (PSQL). Future sections of this guide will build on this introduction and discuss other key concepts of PSQL. The complete guide will be a reference that can help with day-to-day development involving PSQL.
Pronunciation
While PostgreSQL is the formal name of the database, it is usually referenced like so (with my pronunciation in parentheses):
- Postgres (post-gress)
- PSQL (pee-sequel)
Structured Query Language (SQL) is the programming language that a developer uses to view and modify data in the database. There are other databases that support using SQL as well, and they are not discussed in this guide.
Getting Started
What’s a relational database?
A relational database such as PSQL is very similar to a collection of spreadsheets. In PSQL these spreadsheets are called tables; they store headers, columns, and rows. Relational means that you can define relationships between data stored in these tables.
Installation
Download the PostgreSQL installer for your operating system. This website calls it a binary package, which is another way to say program.
Example
Imagine being a business owner that keeps track of what your customers purchase by using two spreadsheets:
- A spreadsheet of customers who have a name and an email address.
- A spreadsheet of purchases that have a customer name, an amount and a purchase date.
Instead of using spreadsheets, putting this information into a database lets you inspect this data and ask questions like:
- What’s the average purchase amount?
- Which customers made multiple purchases last year?
Try it out!
-
Open up the command line.
- On Windows, run
cmd. - On macOS, open the
Terminalapp (you can use thecmd + spaceshortcut and type “Terminal”). - On Ubuntu, open the
Terminalapp or use theCtrl + Alt + Tshortcut.
- On Windows, run
-
The installer should have installed a
createdbcommand. Runcreatedb businessto create a database namedbusiness. -
Run the command
psql businessto start the interactive terminal and interact with thebusinessdatabase. Here is the officialpsqldocumentation for reference, although you probably won’t need to use it. -
Create the customers and purchases tables by entering the following code. The formatting doesn’t matter, but you will typically see tables created this way for readability:
create table customers ( name text, email_address text ); create table purchases ( customer_name text, amount numeric, purchase_date date ); -
Create (or insert) two customers and some purchases for them:
insert into customers (name, email_address) values ('Eric', 'eric@gmail.com'), ('Melanie', 'melanie@gmail.com'); insert into purchases (customer_name, amount, purchase_date) values ('Eric', 55.01, '2020-04-20'), ('Eric', 86.57, '2020-04-21'), ('Melanie', 57.50, '2020-04-23'); -
Read (or select) the data you just created:
select * from customers;This statement is shorthand for writing
select name, email_address from customers;You can do the same for
purchases:select * from purchases;Or use this equivalent code:
select customer_name, amount, purchase_date from purchases; -
Update Eric’s purchase of $86.57 to apply a $10 discount for being a loyal customer.
update purchases set amount = (amount - 10) where customer_name = 'Eric' and amount = '86.57'; -- These lines starting with -- are comments. -- The PSQL terminal ignores these lines. -- Run the code below to see Eric's purchases. -- You should now see that one of them is now $76.57 rather than $86.57. select * from purchases where customer_name = 'Eric'; -
Delete Eric’s purchase of $76.57 since he returned the product and you refunded him.
delete from purchases where customer_name = 'Eric' and amount = '76.57';
Inspecting your data
With the data in our database, we can now answer these questions:
-
What’s the average purchase amount?
select avg(amount) from purchases;The built-in
avgfunction returns the average value of the given column. -
Which customers made multiple purchases last year?
select customers.name, count(*) from customers join purchases on customers.name = purchases.customer_name where purchases.purchase_date > now() - interval '1 year' group by customers.name order by count(*) desc;count(*)is a built-in function that shows the number of rows your query found.joincombines data from multiple tables.wherefilters the data by the predicate (or condition) you provide.group byis used withcount(*)to tell the database how to count rows.order bysorts the results in order:- descending (highest value first)
- ascending (lowest value first)
Summary
In this guide, you learned:
- PostgreSQL is a type of relational database.
- A relational database is like a collection of spreadsheets.
- You define relationships between data in these spreadsheets, such as connecting a spreadsheet of customers to a spreadsheet of their purchases.
- You manipulate data using CRUD, which is a common acronym that refers to:
- Create (or insert)
- Read (or select)
- Update
- Delete
count(*)withgroup byshows us how many rows have the same value, such as all of the purchases belonging to the same customer.joincombines data from multiple tables.wherecan be used to select only certain rows that match some condition.order bysorts data in one of two ways:- descending (highest value first)
- ascending (lowest value first)