When working with databases, you often need to combine data from different tables or queries. The UNION
operator in SQL is a powerful tool for this purpose. It allows you to combine the results of two or more SELECT
queries into a single result set.
The UNION
operator in SQL combines the results of two or more SELECT
statements into a single result set. It eliminates duplicate records from the result set by default. Each SELECT
query within the UNION
must have the same number of columns and compatible data types.
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
column1, column2, ...
: The columns that you want to retrieve from each table. These columns must be of the same data type and be in the same order.table1
and table2
: The tables from which you want to retrieve data.Let’s say you have two tables: employees
and contractors
. Both tables contain information about individuals, including their first_name
and last_name
, and you want to create a unified list of names, without duplicates.
SELECT first_name, last_name
FROM employees
UNION
SELECT first_name, last_name
FROM contractors;
Explanation:
employees
and contractors
tables.UNION
operator ensures that duplicate rows (if any) are removed from the final result.Result:
first_name | last_name |
---|---|
Alice | Johnson |
Bob | Smith |
Charlie | Brown |
UNION
: Combines the results of two queries and removes duplicate records. It performs a distinct operation on the result set.UNION ALL
: Combines the results of two queries but does not remove duplicates. All records, including duplicates, are included in the final result set.Let’s look at an example where you want to combine the results of two tables, but you want to include all records, even duplicates.
SELECT first_name, last_name
FROM employees
UNION ALL
SELECT first_name, last_name
FROM contractors;
Explanation:
UNION ALL
operator combines the data from both employees
and contractors
without removing any duplicates.Result:
first_name | last_name |
---|---|
Alice | Johnson |
Bob | Smith |
Charlie | Brown |
Alice | Johnson |
Charlie | Brown |
In order for a UNION
operation to work, the number of columns in both queries must be the same. However, the column names do not need to match. The data types of corresponding columns in each SELECT
query must also be compatible.
If you are selecting data from two different tables with the same column data types but different column names:
SELECT first_name AS name, last_name
FROM employees
UNION
SELECT first_name AS name, surname
FROM contractors;
Explanation:
first_name
and last_name
have different names in the two tables. But using AS name
makes the column names consistent, and the query works as expected.Result:
name | last_name |
---|---|
Alice | Johnson |
Bob | Smith |
Charlie | Brown |
The UNION
operator can also be used to combine data from multiple tables. For instance, you may want to retrieve a unified list of customer_id
from two different tables, orders
and returns
, to analyze customers who made both orders and returns.
SELECT customer_id
FROM orders
UNION
SELECT customer_id
FROM returns;
Explanation:
customer_id
data from both the orders
and returns
tables.You can also use the UNION
operator with aggregate functions such as COUNT()
, SUM()
, or AVG()
to combine the results of two or more aggregated queries.
SELECT COUNT(*) AS total_orders
FROM orders
UNION
SELECT COUNT(*) AS total_returns
FROM returns;
Explanation:
UNION
operator combines the two aggregated results into a single result set.Result:
total_orders | total_returns |
---|---|
120 | 50 |
When using UNION
, the sorting of the results can be applied by adding an ORDER BY
clause at the end of the combined query.
SELECT first_name, last_name
FROM employees
UNION
SELECT first_name, last_name
FROM contractors
ORDER BY last_name;
Explanation:
ORDER BY
clause sorts the final result set by the last_name
column after combining the data from both tables.Result:
first_name | last_name |
---|---|
Alice | Johnson |
Charlie | Brown |
Bob | Smith |
UNION
: Since it removes duplicate records, it may take more processing time and memory, especially with large datasets.UNION ALL
: It generally performs better than UNION
because it doesn't have to check for and remove duplicates.If you’re confident that the results of the SELECT
statements won’t contain duplicates, it’s often better to use UNION ALL
for better performance.