Split function in BigQuery with examples

Split function in BigQuery

Split is a part of String function in BigQuery which helps to split the value based on given delimiter. The String and Bytes values can be used in this function.

  • For String data type, the default delimiter is comma(,).
  • For Bytes values, the delimit should be specified.

Syntax

The function return data type is

  • ARRAY of type STRING for string value
    • If the given input string is empty, the function returns ARRAY with a single empty STRING.
  • ARRAY of type BYTES for byte value

Examples

Let’s look at the SPLIT function with some examples in BigQuery. For these examples, we have created a table as customer_info in BigQuery. This table contains the personal details of bank customers.

As mentioned below, the table has information about customer such as customer_id, name, date_of_birth, address, email_id, phone_numer, state and zip_code.

customer_info table in BigQuery
customer_info table in BigQuery

Example 1:

The column “name” has both first and last name of the customer with the separator of comma (,). Let’s separate the first name and last name using SPLIT function.

In the query, we have used SPLIT function for the column “name“. Also we mentioned the delimiter as comma (,). As a result, it returned the first name and last name in an ARRAY of STRING.

Example: Patricia,Hamilton is split into Patricia and Hamilton.

SPLIT function example in BigQuery
SPLIT function example in BigQuery

If we want to split the string into columns, we have to use the OFFSET (for zero-based indexes) or ORDINAL (for one-based indexes) operator. With that, we can fetch the specific value from the ARRAY.

SPLIT the string into columns using OFFSET

To get the first element from the array, we need to use the index value as 0 in OFFSET.

SPLIT the string into columns using ORDINAL

To get the first element from the array, we need to use the index value as 1 in ORDINAL.

As shown below, both the queries split the name and returned the first name and last name in a column.

Split the string into columns in BigQuery
Split the string into columns in BigQuery

Please note that, if the given index value is not found in the array, it will throw an error as “Array index is out of bounds (overflow)“. To avoid those errors, we can use SAFE_OFFSET or SAFE_ORDINAL function. Instead of error, it will return a NULL value for array index out of bounds.

In the query, we have used the index value as 3 using safe_offset operator.

  • The split function returns only two values(first, last name) in the array.
  • It means that it can return the values only for index 0 and 1.
  • Since there is no value for index 3 in array, the safe_offset operator returned the NULL values for both first_name and last_name columns.
Split function with safe_offset in BigQuery
Split function with safe_offset in BigQuery

Recommended Articles

References from GCP official documentation