When working with strings in SQL, one of the most useful functions is the SUBSTRING()
function. It allows you to extract a specific portion of a string, making it invaluable for string manipulation tasks like extracting area codes from phone numbers, getting the first name from full names, or analyzing specific patterns in text data.
The SUBSTRING()
function is used to extract a substring from a string. You specify the starting position and the length of the substring you want to extract.
Syntax:
SUBSTRING(string, start_position, length)
string
: The string expression from which to extract the substring.start_position
: The position to start extracting from (1-based index).length
: The number of characters to extract (optional, defaults to the end of the string).Let's say you have a column containing full names, and you want to extract the first name.
Extracting the first 5 characters of a string:
SELECT SUBSTRING('John Doe', 1, 4) AS first_name;
Result:
first_name |
---|
John |
In this example:
'John Doe'
is the source.1
(the first character).4
, so it extracts 'John'
.If you want to extract the substring starting from the 6th character and take the next 3 characters:
SELECT SUBSTRING('John Doe', 6, 3) AS middle_name;
Result:
middle_name |
---|
Doe |
In this example:
6
(which is the character 'D' in 'Doe').3
, so it extracts 'Doe'
.You can use the SUBSTRING()
function with data from a column. For example, extracting area codes from phone numbers stored as strings.
SELECT SUBSTRING(phone_number, 1, 3) AS area_code
FROM contacts;
Explanation:
phone_number
column in the contacts
table.If you have a string where the length varies (e.g., a date in the YYYY-MM-DD
format), you can use SUBSTRING()
to extract specific parts of the string.
Extracting the year from a date string:
SELECT SUBSTRING('2024-12-22', 1, 4) AS year;
Result:
year |
---|
2024 |
The SUBSTRING()
function can be useful when you want to extract specific patterns in a string.
Extracting the domain from an email address:
SELECT SUBSTRING(email, POSITION('@' IN email) + 1) AS domain
FROM users;
Explanation:
POSITION()
to find the location of the @
symbol, then extract everything that comes after it using SUBSTRING()
.While the SUBSTRING()
function is powerful, you should be aware of some potential edge cases:
If the start position is greater than the length of the string, SUBSTRING()
will return an empty result.
SELECT SUBSTRING('Hello', 10, 3);
Result:
''
(Empty string)
If the length specified exceeds the available length in the string, SUBSTRING()
will return the rest of the string starting from the specified position.
SELECT SUBSTRING('Hello', 2, 10);
Result:
SUBSTRING('Hello', 2, 10) |
---|
ello |
Here, it extracts all characters from position 2 onwards, even though the length was set to 10.
You can combine the CONCAT()
function with SUBSTRING()
to manipulate strings and concatenate the results.
SELECT CONCAT(SUBSTRING(first_name, 1, 3), SUBSTRING(last_name, 1, 2)) AS nickname
FROM users;
Explanation:
first_name
and the first 2 characters of the last_name
to create a nickname.SUBSTRING()
is powerful, using it excessively can lead to inefficient queries, especially on large datasets. Consider other methods or string manipulation techniques if possible.SUBSTRING()
with WHERE
clauses to filter records based on string patterns.
SELECT *
FROM employees
WHERE SUBSTRING(email, 1, 3) = 'joh';
This query filters employees whose email starts with "joh".