Working With Strings

An introduction to working with strings.

SQL has several built-in functions for manipulating string data in fields. Some of the more common ones are CONCAT, LEFT, LOWER, UPPER, SUBSTRING, STRING_SPLIT, and LENGTH.

For a full list of string functions see the reference section.

CONCAT

CONCAT takes two or more string values and combines them into one value. These values can be defined in the query or selected from the table. In the following example the query concatenates manager name, a separating space (the value specified in the query) and the sales person name:

SELECT CONCAT(manager, ", ", sales_agent)
  FROM sales_teams

The resulting table looks like this:

concat
Dustin Brinkmann, Anna Snelling
Dustin Brinkmann, Cecily Lampkin
Melvin Marxen, Mei-Mei Johns
Cara Losch, Violet Mclelland
Cara Losch, Corliss Cosme
Cara Losch, Rosie Papadopoulos
Cara Losch, Garret Kinder
Cara Losch, Wilburn Farren
Cara Losch, Elizabeth Anderson
Rocco Neubert, Daniell Hammack
Rocco Neubert, Cassey Cress

As a convenience, SQL also supports combining strings using the concatenation operator ||. Using the concatenation operator, the previous query could be written like this:

SELECT manager || ", " || sales_agent
  FROM sales_teams

LEFT

The LEFT string function returns the specified number of leftmost characters from a value. If we wanted to use a one-letter identifier for each of the regional offices for the sales agents we could run a query like this:

SELECT sales_agent,
       LEFT(regional_office, 1)
  FROM sales_teams

and we would get back:

sales_agent left
Anna Snelling C
Cecily Lampkin C
Mei-Mei Johns C
Violet Mclelland E
Corliss Cosme E
Rosie Papadopoulos E
Garret Kinder E
Wilburn Farren E
Elizabeth Anderson E
Daniell Hammack E
Cassey Cress E
Donn Cantrell E
Versie Hillebrand C

LOWER

LOWER is a useful string function for cleaning up the presentation of data as it transforms all characters in a value to lower case. For example we could change all the product names to lower case in the following query:

SELECT LOWER(product)
  FROM products
lower
gtx basic
gtx pro
mg special
mg advanced
gtx plus pro
gtx plus basic
gtk 500
mg mono
alpha

UPPER

Similar to LOWER, UPPER returns all the characters in a string as upper case. If we needed a list of the employees in all upper case we could run the query:

SELECT UPPER(sales_agent)
  FROM sales_teams
upper
ANNA SNELLING
CECILY LAMPKIN
MEI-MEI JOHNS
VIOLET MCLELLAND
CORLISS COSME
ROSIE PAPADOPOULOS
GARRET KINDER
WILBURN FARREN
ELIZABETH ANDERSON

SUBSTRING

SUBSTRING is a function that is used to return a portion of a string based on it’s placement in the string. For example if we were looking at data only for the GTX product line and we wanted to remove the GTX prefix from our query results we could use the following query:

SELECT SUBSTRING(product, 5),
       sales_price
  FROM products
 WHERE product LIKE "GTX%"

and the results returned would look like this:

substring sales_price
Basic 550
Pro 4,821
Plus Pro 5,482
Plus Basic 1,096

The SUBSTRINGfunction requires two arguments and can take a third. The first argument is the string to be found and it can be stated either as a string or as a column. The second argument is the placement in the string of the first character to be returned. The third (optional) argument is the number of characters to be returned. If the third argument is omitted, the query returns all of the string from the first returned character to the end of the string.

LENGTH

The LENGTH function returns the number of characters in a string. It may seem like an odd thing to want to know, but it is useful when you want to remove characters from the beginning of a string leaving a specified number of characters from the end. Our dataset doesn’t have a particularly good string where this operation would make sense, but as an example we could run a query to return only the last five letters of each sales agent name and it would look like this:

SELECT sales_agent,
       SUBSTRING(sales_agent, LENGTH(sales_agent) -5)
  FROM sales_teams
sales_agent substring
Anna Snelling elling
Cecily Lampkin ampkin
Mei-Mei Johns Johns
Violet Mclelland elland
Corliss Cosme Cosme
Rosie Papadopoulos poulos
Garret Kinder Kinder
Wilburn Farren Farren
Elizabeth Anderson derson
Daniell Hammack ammack
Cassey Cress Cress
Donn Cantrell ntrell

STRING_SPLIT

STRING_SPLIT is a function that is used to split a string into multiple strings, based on some delimiter. Unlike other functions, STRING_SPLIT returns multiple rows for each string input. For example, if you were looking to get the first and last names of your sales agents in a single column, you could split their names using the space character as a delimiter :

SELECT sales_agent,
       STRING_SPLIT(sales_agent, " ")
  FROM sales_teams
sales_agent substring
Anna Snelling Anna
Anna Snelling Snelling
Cecily Lampkin Cecily
Cecily Lampkin Lampkin
Mei-Mei Johns Mei-Mei
Mei-Mei Johns Johns

The STRING_SPLIT function requires two arguments. The first argument is the string to be split and it can be stated either as a string or as a column. The second argument is the delimiter character used to split the string.