SQL Joins
Learning Goals
- Visualize and practice SQL join queries
- Use WHERE and JOIN together
- Understand the difference between an inner join and a left join
Warm Up Challenge
Let’s keep practicing with our Songs and Artists tables.
For each of the following, what is the SQL command we would use to get this information:
- Get all song titles
- Get the songs with a play count greater than 40
- Get the artists that have a hometown of Minneapolis, sorted alphabetically by name
Extra challenge!
- Get the 5 longest songs that have the letter p in the title
Joining Tables
The SQL Join Query
So far, we have looked at SQL queries that deal only with one table. But you will sometimes need to run queries based on information from more than one table. Maybe our query involves both songs and artists!
For example:
- Find the artists with songs that have been played more than 100 times
- Find all songs where the artist names contains “Prince”
When we come up against this problem, we rely on JOIN
queries to accomplish this goal.
A JOIN
pulls information from multiple tables into one temporary table. Let’s see how this works.
In PG Admin, let’s take a look at our songs
and artists
tables:
SELECT * FROM songs; 434 | 2
id | title | length | play_count | artist_id
---+-----------------+--------+------------+-----------
1 | Purple Rain | 345 | 23 | 1
2 | Raspberry Beret | 432 | 12 | 1
3 | Wild Wild Life | 367 | 367 | 2
(3 rows)
SELECT * FROM artists;
id | name | hometown | years_active
----+---------------+--------------+-------------
1 | Prince | Minneapolis | 1975-2016
2 | Talking Heads | New York | 1975-1991
(2 rows)
Above, we see our songs and artists table - what happens if we JOIN
these tables together?
SELECT artists.*, songs.* FROM songs JOIN artists ON artists.id = songs.artist_id;
id | name | hometown | years_active | id | title | length | play_count | artist_id
---+---------------+--------------+--------------+----+-----------------+------------+------------+----------
1 | Prince | Minneapolis | 1975-2016 | 1 | Purple Rain | 345 | 23 | 1
1 | Prince | Minneapolis | 1975-2016 | 2 | Raspberry Beret | 432 | 12 | 1
2 | Talking Heads | New York | 1975-1991 | 3 | Wild Wild Life | 367 | 367 | 2
(3 rows)
Turn and Talk What did this query do? How might you describe the return value of this query?
When we run this JOIN
, we are joining the songs and artists tables together to return a table that includes all the information from both tables. For each artist, we see a row for each song that they have, with the information from both the artists table and the songs table.
When creating a JOIN
query, there are three essential parts:
SELECT
- this is what indicates which columns will be included in the resulting tableSELECT * FROM songs
JOIN
- the command to join in data from another tableJOIN artists
ON
- this tells the join how to join the two tables together, or what is the relationship between the two tables (most often, primary key = foreign key)ON artists.id = songs.artist_id
Let’s now add another artist, Zac Brown Band.
INSERT INTO
artists (
name, hometown, years_active
)
VALUES
('Zach Brown Band','Atlanta','2002-Present');
With Your Partner Run the
JOIN
again,Zac Brown Band
is not returned in the data! Why are we not seeing all our artists on this joined table?
Let’s now add a song where we don’t know the artist info.
INSERT INTO
songs (
title, length, play_count
)
VALUES
('Wow such a great song', 10, 1000);
Do you think this data will be returned when we run our JOIN
? Let’s pause and predict and then run it!
Spoiler
The song without an artist was not returned in the join.
Venn Diagram Review
When we create JOIN
queries, there are a handful of different join types that we can declare that will affect the resulting table. We can use Venn Diagrams to represent what data will be returned for each type of join.
Let’s review the images you looked at in preparation for this lesson.
With Your Breakout Room Explain what each of the shaded regions represents. What animals did you put in each region?
Types of Join Queries
While there are many different types of joins, today, we are going to cover the most common two: Inner Join and Left Join
Inner Join
The default JOIN
type in SQL is an Inner JOIN.
- Default join type (This is what we’ve been doing so far when we use the SQL keyword
JOIN
) - Only returns records that match the ON condition
- Only returns records from table A that have a corresponding record in table B
Left Join
The next most common JOIN
type is a Left Join.
- Will get all records from table A regardless of if they have corresponding rows in table B
- “Left” refers to the first table referenced in the query
- FROM artists LEFT JOIN songs artists is the “LEFT” table
Example:
SELECT artists.id, artists.name, songs.id, songs.title
FROM artists
LEFT JOIN songs
ON songs.artist_id = artists.id;
Now, we see bands even if they have no songs.
Combining JOIN
and WHERE
With your partner: Write a SQL query would to retrieve a list of artists who have songs longer than ‘400’.
Completed Query
```SQL SELECT artists.name FROM artists JOIN songs ON artists.id = songs.artist_id WHERE songs.length > 400; ```
More Advanced Joins
Take a minute to look at this query. What tables are being joined here?
SELECT * FROM songs JOIN playlists_songs ON playlists_songs.song_id = songs.id
JOIN playlists ON playlists_songs.playlist_id = playlists.id
Spoiler
Did you see how we start by selecting everything form our songs table and then we join in data from two additional tables. We first bring in the data from our playlists_songs join table and then we bring in the data from our playlists table. Don't worry about fully understanding a query with multiple joins, but it's good to know this is an option. You can research this more if in the future you need to join the data in a many-to-many relationship.
Checks for Understanding
- What are the two types of joins covered today? And, what do they return?
- What is the default type of join used when you just type
JOIN
? - What is the SQL query to get a list of Artists who have songs that have been played more than 20 times?
- Which of the following queries would require a join.
- Find all of the songs with titles that start with the letter z written by artists from New York.
- Find the 10 most listened to songs
- Where is the most popular artist from?
- How would you summarize when/why you need to use a join?