SQL UNION


When working with databases, you often need to combine data from different tables or queries. The UNION operator in SQL is a powerful tool for this purpose. It allows you to combine the results of two or more SELECT queries into a single result set.


1. What is SQL UNION?

The UNION operator in SQL combines the results of two or more SELECT statements into a single result set. It eliminates duplicate records from the result set by default. Each SELECT query within the UNION must have the same number of columns and compatible data types.

Basic Syntax of SQL UNION:

SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
  • column1, column2, ...: The columns that you want to retrieve from each table. These columns must be of the same data type and be in the same order.
  • table1 and table2: The tables from which you want to retrieve data.

2. SQL UNION Example

Let’s say you have two tables: employees and contractors. Both tables contain information about individuals, including their first_name and last_name, and you want to create a unified list of names, without duplicates.

Example Query:

SELECT first_name, last_name
FROM employees
UNION
SELECT first_name, last_name
FROM contractors;

Explanation:

  • The query combines the results from the employees and contractors tables.
  • The UNION operator ensures that duplicate rows (if any) are removed from the final result.

Result:

first_name last_name
Alice Johnson
Bob Smith
Charlie Brown

3. Difference Between UNION and UNION ALL

  • UNION: Combines the results of two queries and removes duplicate records. It performs a distinct operation on the result set.
  • UNION ALL: Combines the results of two queries but does not remove duplicates. All records, including duplicates, are included in the final result set.

Example: Using UNION ALL

Let’s look at an example where you want to combine the results of two tables, but you want to include all records, even duplicates.

SELECT first_name, last_name
FROM employees
UNION ALL
SELECT first_name, last_name
FROM contractors;

Explanation:

  • The UNION ALL operator combines the data from both employees and contractors without removing any duplicates.

Result:

first_name last_name
Alice Johnson
Bob Smith
Charlie Brown
Alice Johnson
Charlie Brown

4. SQL UNION with Different Columns

In order for a UNION operation to work, the number of columns in both queries must be the same. However, the column names do not need to match. The data types of corresponding columns in each SELECT query must also be compatible.

Example:

If you are selecting data from two different tables with the same column data types but different column names:

SELECT first_name AS name, last_name
FROM employees
UNION
SELECT first_name AS name, surname
FROM contractors;

Explanation:

  • In this case, the columns first_name and last_name have different names in the two tables. But using AS name makes the column names consistent, and the query works as expected.

Result:

name last_name
Alice Johnson
Bob Smith
Charlie Brown

5. Combining Data from Multiple Tables Using UNION

The UNION operator can also be used to combine data from multiple tables. For instance, you may want to retrieve a unified list of customer_id from two different tables, orders and returns, to analyze customers who made both orders and returns.

Example:

SELECT customer_id
FROM orders
UNION
SELECT customer_id
FROM returns;

Explanation:

  • This query combines the customer_id data from both the orders and returns tables.
  • Duplicate customer IDs are removed from the final result, leaving only unique customer IDs who have either made an order or return.

6. Using UNION to Combine Aggregate Results

You can also use the UNION operator with aggregate functions such as COUNT(), SUM(), or AVG() to combine the results of two or more aggregated queries.

Example:

SELECT COUNT(*) AS total_orders
FROM orders
UNION
SELECT COUNT(*) AS total_returns
FROM returns;

Explanation:

  • This query returns the total number of orders and the total number of returns.
  • The UNION operator combines the two aggregated results into a single result set.

Result:

total_orders total_returns
120 50

7. Sorting Results with UNION

When using UNION, the sorting of the results can be applied by adding an ORDER BY clause at the end of the combined query.

Example:

SELECT first_name, last_name
FROM employees
UNION
SELECT first_name, last_name
FROM contractors
ORDER BY last_name;

Explanation:

  • The ORDER BY clause sorts the final result set by the last_name column after combining the data from both tables.

Result:

first_name last_name
Alice Johnson
Charlie Brown
Bob Smith

8. Performance Considerations

  • UNION: Since it removes duplicate records, it may take more processing time and memory, especially with large datasets.
  • UNION ALL: It generally performs better than UNION because it doesn't have to check for and remove duplicates.

If you’re confident that the results of the SELECT statements won’t contain duplicates, it’s often better to use UNION ALL for better performance.