SQL Subqueries


Subqueries, also known as nested queries, are one of the most powerful features in SQL. A subquery is a query embedded within another query, typically in the WHERE, FROM, or SELECT clauses. They allow you to perform more complex operations and retrieve data based on the results of another query, enabling you to work with multiple layers of data in a single SQL statement.


1. What is a Subquery in SQL?

A subquery is a SQL query nested inside another query. The outer query is the main query, and the subquery provides the values needed to execute the main query. Subqueries can be used to:

  • Retrieve data for comparison.
  • Return a value used in a filter condition.
  • Supply data to a column in the result set.

Subqueries are often used to simplify complex queries that would otherwise require multiple steps or joins.

Basic Syntax of a Subquery:

SELECT column1, column2
FROM table
WHERE column1 = (SELECT column1 FROM another_table WHERE condition);
  • The inner query (subquery) is placed within parentheses.
  • The outer query uses the results from the subquery to filter or manipulate data.

2. Types of Subqueries

Subqueries can be classified based on where they are used and their behavior. Here are the most common types:

1. Single-Row Subquery

A single-row subquery returns a single value, which is typically used in comparisons (=, >, <, <=, >=, <>) in the WHERE clause.

Example:

You want to find the employee with the highest salary.

SELECT first_name, last_name, salary
FROM employees
WHERE salary = (SELECT MAX(salary) FROM employees);

Explanation:

  • The subquery (SELECT MAX(salary) FROM employees) finds the highest salary.
  • The outer query uses that value to find the employee with that salary.

Result:

first_name last_name salary
John Doe 120000

2. Multi-Row Subquery

A multi-row subquery returns more than one value, and is used with operators like IN, NOT IN, ANY, or ALL.

Example:

To find employees who work in departments located in New York:

SELECT first_name, last_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE location = 'New York');

Explanation:

  • The subquery returns multiple department IDs where the location is "New York".
  • The outer query finds all employees working in those departments.

Result:

first_name last_name
Alice Johnson
Bob Smith

3. Correlated Subquery

A correlated subquery refers to columns from the outer query. It is evaluated once for each row processed by the outer query.

Example:

Find employees who earn more than the average salary in their respective departments:

SELECT first_name, last_name, department_id, salary
FROM employees e
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);

Explanation:

  • The subquery is correlated because it references e.department_id, which comes from the outer query.
  • For each employee, the subquery calculates the average salary of their department and compares it to the employee's salary.

Result:

first_name last_name department_id salary
Alice Johnson 1 80000
Bob Smith 2 90000

3. Using Subqueries in Different SQL Clauses

Subqueries can be used in different parts of a SQL query:

1. In the SELECT Clause

You can use subqueries in the SELECT clause to return a value derived from another table.

Example:
SELECT first_name, last_name, 
       (SELECT department_name FROM departments WHERE department_id = employees.department_id) AS department_name
FROM employees;

Explanation:

  • The subquery in the SELECT clause fetches the department name for each employee based on their department_id.

2. In the WHERE Clause

The WHERE clause is the most common place to use subqueries. Subqueries in the WHERE clause are typically used to filter records based on the results of another query.

Example:

Find employees who earn more than the average salary:

SELECT first_name, last_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

3. In the FROM Clause

Subqueries in the FROM clause are often called inline views or derived tables. These subqueries return a result set that is treated like a table.

Example:
SELECT department_id, MAX(salary) AS highest_salary
FROM (SELECT department_id, salary FROM employees WHERE salary > 50000) AS filtered_employees
GROUP BY department_id;

Explanation:

  • The subquery in the FROM clause creates a temporary table (filtered_employees) with only employees earning more than 50,000.
  • The outer query then calculates the maximum salary for each department from this filtered set.

4. Advantages of Using Subqueries

  • Simplifies Complex Queries: Subqueries allow you to break down complex queries into smaller, more manageable components.
  • Improved Readability: By using subqueries, you can write queries that are easier to understand without the need for multiple joins.
  • Flexibility: Subqueries enable you to use dynamic data from the results of other queries, making them useful for comparison and filtering.

5. Limitations of SQL Subqueries

  • Performance: Subqueries can be slower than using joins, especially when they are correlated. In these cases, SQL Server (and other databases) may evaluate the subquery multiple times, leading to performance issues.
  • Readability: While subqueries can improve readability in some cases, they can make a query harder to understand when nested too deeply or used incorrectly.

6. Best Practices for Using Subqueries

  • Optimize Performance: If performance is a concern, try to replace correlated subqueries with joins or use EXISTS or IN where possible.
  • Keep it Simple: Avoid deeply nested subqueries. If a subquery becomes too complex, consider breaking it down into multiple steps or using a temporary table.
  • Use Subqueries Where Necessary: Not every query needs a subquery. Use subqueries when they make your SQL query more readable or when you need to reference a result from another query.