SQL Joins without Venn diagrams

ยท

4 min read

SQL Joins combine data from two tables on related keys.

Initial situation

Imagine having two tables: people and tasks. image.png Each person can be assigned one task and each task can be assigned to multiple people (1:n relationship). The table people contains 100 entries. 50 people have a task assigned. The table tasks has 50 entries and 30 of them are assigned to one or more people.

Cross Join

Cross join combines each row from table 1 with each row from table 2.

SELECT * FROM people CROSS JOIN tasks;

-- This yields the same result as:
SELECT * FROM people, tasks;

The result has 5000 rows (100 people x 50 tasks).

(Inner) Join

Inner join selects all rows where the condition applies. In our case this condition is people.task_id = tasks.task_id. Rows from the first table that are unmatched in the other table and vice versa are not returned.

SELECT * FROM people INNER JOIN tasks
      ON people.task_id = tasks.task_id;

-- This would yield the same:
SELECT * FROM tasks INNER JOIN people
      ON tasks.task_id = people.task_id;

The result has 50 rows. This includes all people that have a task assigned. All people without task and all tasks without assignee are omitted.

Btw you can also write JOIN instead of INNER JOIN.

Natural Join

This is basically the same as Inner Join but it works only if the columns for the condition have the same name. In our case Natural Join works because both columns are called task_id. If the id column would be named id instead of task_id, Natural Join would not work.

SELECT * FROM people NATURAL JOIN tasks;

-- This is the same:
SELECT * FROM tasks INNER JOIN people USING(task_id);
-- USING(attribute) also assumes both columns are called the same

The result is again 50 rows.

Outer Join

Outer Joins differ from the Inner Join in that they include also unmatched rows from either the left, right or both tables where the condition does not apply.

Left and right corresponds with the tables that are either on the left or the right from ___ OUTER JOIN keyword.

Left Outer Join

SELECT * FROM people LEFT OUTER JOIN tasks USING(task_id);

The result are 100 rows. All people if they have a task assigned or not.

Right Outer Join

SELECT * FROM people RIGHT OUTER JOIN tasks USING(task_id);

The result are 70 rows. All the tasks with their assignees and all tasks which are not assigned. Note that there are tasks which have more than one person assigned!

Full Outer Join

SELECT * FROM people FULL OUTER JOIN tasks USING(task_id);

The result are 120 rows. Why? Because there are 50 people which have a total of 30 tasks assigned. Additionally there are 50 people with no task assigned and 20 tasks with no assignee.

Feedback

Spotted a mistake or got some feedback on the code? Leave a comment or reach out to me on Twitter ๐Ÿ˜Š