Working With Arrays
An introduction to working with arrays.
Traditional SQL databases store data as one value per field. More modern SQL databases can store multiple, indexed values of the same data type in a single field called an array. There is no standard set of SQL functions for arrays. Instead, different flavors of SQL each have their own functions and aggregations. The ones we support in our SQL implementation include: ARRAY
, ARRAY_AGG
, ARRAY_APPEND
, ARRAY_CONCAT
, ARRAY_JOIN
, ARRAY_LENGTH
, ARRAY_PREPEND
, ARRAY_CONTAINS
, and ELEMENT_AT
.
For a full list of array functions see the reference section.
Other features of our array implementation include:
We use a comma as an array separator. Note that (for the time being) commas in array values will result in an incorrect count of array items as we do not differentiate between commas in strings and the comma separator.
Array functions can be used with joins to other data sources, but at present the array functions must occur before joining or federation.
Indexing is one-based (the index starts with 1, not 0), and negative indexing is supported (See
ELEMENT_AT
below for more details).
The following examples are run against a small table called city_array that looks like this:
country_id | cities |
---|---|
UK | London, Stretford |
SG | Singapore |
US | Southlake, South San Francisco, South Brunswick, Seattle |
CN | Beijing |
CH | Geneva, Bern |
BR | Sao Paulo |
NL | Utrechtv |
IN | Bombayv |
IT | Roma, Venicev |
AU | Sydney |
CA | Toronto, Whitehorse |
JP | Tokyo, Hiroshima |
DE | Munich |
ARRAY
The ARRAY
command allows you to create an array from a group of values. The values can be field values, calculated values, literals, or a combination thereof. The only requirement is that all the values are of the same data type. The following is an example of creating an array using the city_array table and literals:
SELECT ARRAY(country_id, "kind", "of", "town")
FROM city_array
A partial set of the results would be:
array |
---|
UK, kind, of, town |
SG, kind, of, town |
US, kind, of, town |
CN, kind, of, town |
CH, kind, of, town |
BR, kind, of, town |
ARRAY_AGG
The sole aggregation for arrays takes all the values in a column and aggregates them into one field. For example
SELECT ARRAY_AGG(country_id)
FROM city_array
returns
array_agg |
---|
UK, SG, US, CN, CH, BR, NL, IN, IT, AU, CA, JP, DE |
ARRAY_APPEND
The ARRAY_APPEND
function is used to append values to an array. As with all array functions the values can be field values, calculated values, literals, or a combination thereof. The only requirement is that all the values are of the same data type. The following is an example of a query using ARRAY_APPEND
:
SELECT country_id,
ARRAY_APPEND(cities, "foo")
FROM city_array
country_id | array_append |
---|---|
UK | London, Stretford, foo |
SG | Singapore, foo |
US | Southlake, South San Francisco, South Brunswick, Seattle, foo |
CN | Beijing, foo |
CH | Geneva, Bern, foo |
ARRAY_CONCAT
The ARRAY_CONCAT
function combines multiple arrays into a single array:
SELECT country_id,
ARRAY_CONCAT(cities, ARRAY("my", "kind", "of", "town"))
FROM city_array
country_id | array_concat |
---|---|
UK | London, Stretford, my, kind, of, town |
SG | Singapore, my, kind, of, town |
US | Southlake, South San Francisco, South Brunswick, Seattle, my, kind, of, town |
CN | Beijing, my, kind, of, town |
ARRAY_JOIN
This is the function to use if you want to concatenate all the values in an array field into one string value. You can specify an optional argument as a separator, and it can be any string. If you do not specify a separator, there will be nothing aded between the values. In this example we chose to use a forward slash:
SELECT country_id,
ARRAY_JOIN(cities, "/")
FROM city_array
country_id | array_concat |
---|---|
UK | London/Stretford |
SG | Singapore |
US | Southlake/South San Francisco/South Brunswick/Seattle |
CN | Beijing |
CH | Geneva/Bern |
BR | Sao Paulo |
ARRAY_LENGTH
If you would like to know the number of items in an array you can use ARRAY_LENGTH
:
SELECT country_id,
cities,
ARRAY_LENGTH(cities)
FROM city_array
country_id | cities | array_length |
---|---|---|
UK | London, Stretford | 2 |
SG | Singapore | 1 |
US | Southlake, South San Francisco, South Brunswick, Seattle | 4 |
CN | Beijing | 1 |
CH | Geneva, Bern | 2 |
BR | Sao Paulo | 1 |
ARRAY_PREPEND
Similar to ARRAY_APPEND
, ARRAY_PREPEND
adds values to an array. Because the array always has to come first in the command, this function allows you to specify the output order:
SELECT country_id,
ARRAY_PREPEND(cities, "foo")
FROM city_array
country_id | array_prepend |
---|---|
UK | foo, London, Stretford |
SG | foo, Singapore |
US | foo, Southlake, South San Francisco, South Brunswick, Seattle |
CN | foo, Beijing |
CH | foo, Geneva, Bern |
BR | foo, Sao Paulo |
NL | foo, Utrecht |
ARRAY_CONTAINS
The ARRAY_CONTAINS
function evaluates a column for a specific value and returns true if the value exists in a row and false if it does not:
SELECT country_id,
ARRAY_CONTAINS(cities, "London")
FROM city_array
returns:
country_id | array_contains |
---|---|
UK | true |
SG | false |
US | false |
CN | false |
CH | false |
ELEMENT_AT
Because the values in arrays are indexed, structured values (not strings as they can appear in a tabular format) you can find the values at different positions in the array. If you want to count from the beginning of the array, you use a positive integer If you want to count from the end of the array, use a negative integer:
SELECT country_id,
ELEMENT_AT(cities, 1)
FROM city_array
returns the first value in the array:
country_id | element_at |
---|---|
UK | London |
SG | Singapore |
US | Southlake |
CN | Beijing |
CH | Geneva |
While
SELECT country_id,
ELEMENT_AT(cities, -1)
FROM city_array
returns the last value:
country_id | element_at |
---|---|
UK | Stretford |
SG | Singapore |
US | Seattle |
CN | Beijing |
CH | Bern |
The ELEMENT_AT
function comes in both a long version and a syntactically sugared light version. Both work equally well. The sugared versions are written like this:
SELECT country_id,
cities[1]
FROM city_array
SELECT country_id,
cities[-1]
FROM city_array
An introduction to WITH
.