SQL: LEFT JOIN and multiple matches on a key

Author: Jacek Trociński, Last Revision: 2020.01.20, Topic area: SQL

LEFT JOINs can return multiple matches on a single key value, depending on your requirements this may or may not be desirable. In Oracle, you can use the analytical function ROW_NUMBER to choose a single value when multiple matches on a key occur. You can use a similar solution in other RDBMSs that offer the ROW_NUMBER function, such as PostrgeSQL.

Example (in Oracle):

CREATE TABLE users(
	user_id NUMBER,
	firstname VARCHAR2(10 CHAR),
	lastname VARCHAR2(10 CHAR)
);

CREATE TABLE user_email_addresses(
	user_id NUMBER,
	email VARCHAR2(30 CHAR)
);

INSERT INTO users(user_id, firstname, lastname) VALUES(1, 'Bill', 'Gates');
INSERT INTO users(user_id, firstname, lastname) VALUES(2, 'Linus', 'Torvalds');

INSERT INTO user_email_addresses(user_id, email) VALUES(1, '[email protected]');
INSERT INTO user_email_addresses(user_id, email) VALUES(1, '[email protected]');
INSERT INTO user_email_addresses(user_id, email) VALUES(2, '[email protected]');

COMMIT;

Table USERS:

   USER_ID FIRSTNAME  LASTNAME
---------- ---------- ----------
         1 Bill       Gates
         2 Linus      Torvalds  

Table USER_EMAIL_ADDRESSES:

   USER_ID EMAIL
---------- ------------------------------
         1 [email protected]
         1 [email protected]
         2 [email protected] 

In the following query notice that multiple matches are found for USER_ID = 1 in the table USER_EMAIL_ADDRESSES:

SELECT
	users.*,
	emails.email
FROM users
LEFT JOIN user_email_addresses emails ON users.user_id = emails.user_id
;

Result:

   USER_ID FIRSTNAME  LASTNAME   EMAIL
---------- ---------- ---------- ------------------------------
         1 Bill       Gates      [email protected]
         1 Bill       Gates      [email protected]
         2 Linus      Torvalds   [email protected] 

In the following query ROW_NUMBER is used to eliminate multiple matches on a key by choosing the first value sorted by EMAIL for each USER_ID in USER_EMAIL_ADDRESSES:

SELECT
	users.*,
	emails.email
FROM users
LEFT JOIN (
	SELECT
		e.*,
		ROW_NUMBER() OVER (PARTITION BY e.user_id ORDER BY e.email) AS emails_rn
	FROM user_email_addresses e
) emails ON users.user_id = emails.user_id AND emails.emails_rn = 1
;

Result:

   USER_ID FIRSTNAME  LASTNAME   EMAIL
---------- ---------- ---------- ------------------------------
         1 Bill       Gates      [email protected]
         2 Linus      Torvalds   [email protected] 

The takeaway here is that multiple matches on a key can occur when using a LEFT JOIN, ROW_NUMBER can help you deal with them if you need to.