SQL Joins without Venn diagrams
SQL Joins combine data from two tables on related keys.
Initial situation
Imagine having two tables: people
and tasks
.
Each person can be assigned one task and each task can be assigned to multiple people (1:n relationship). The table people
contains 100 entries. 50 people have a task assigned. The table tasks
has 50 entries and 30 of them are assigned to one or more people.
Cross Join
Cross join combines each row from table 1 with each row from table 2.
SELECT * FROM people CROSS JOIN tasks;
-- This yields the same result as:
SELECT * FROM people, tasks;
The result has 5000 rows (100 people x 50 tasks).
(Inner) Join
Inner join selects all rows where the condition applies. In our case this condition is people.task_id = tasks.task_id
. Rows from the first table that are unmatched in the other table and vice versa are not returned.
SELECT * FROM people INNER JOIN tasks
ON people.task_id = tasks.task_id;
-- This would yield the same:
SELECT * FROM tasks INNER JOIN people
ON tasks.task_id = people.task_id;
The result has 50 rows. This includes all people that have a task assigned. All people without task and all tasks without assignee are omitted.
Btw you can also write JOIN
instead of INNER JOIN
.
Natural Join
This is basically the same as Inner Join but it works only if the columns for the condition have the same name. In our case Natural Join works because both columns are called task_id
. If the id column would be named id
instead of task_id
, Natural Join would not work.
SELECT * FROM people NATURAL JOIN tasks;
-- This is the same:
SELECT * FROM tasks INNER JOIN people USING(task_id);
-- USING(attribute) also assumes both columns are called the same
The result is again 50 rows.
Outer Join
Outer Joins differ from the Inner Join in that they include also unmatched rows from either the left, right or both tables where the condition does not apply.
Left
and right
corresponds with the tables that are either on the left or the right from ___ OUTER JOIN
keyword.
Left Outer Join
SELECT * FROM people LEFT OUTER JOIN tasks USING(task_id);
The result are 100 rows. All people if they have a task assigned or not.
Right Outer Join
SELECT * FROM people RIGHT OUTER JOIN tasks USING(task_id);
The result are 70 rows. All the tasks with their assignees and all tasks which are not assigned. Note that there are tasks which have more than one person assigned!
Full Outer Join
SELECT * FROM people FULL OUTER JOIN tasks USING(task_id);
The result are 120 rows. Why? Because there are 50 people which have a total of 30 tasks assigned. Additionally there are 50 people with no task assigned and 20 tasks with no assignee.
Feedback
Spotted a mistake or got some feedback on the code? Leave a comment or reach out to me on Twitter ๐