How to split a string based on a pattern in Hive

Split function in Hive

Hive provides many string functions for manipulating strings. Split is one of them: it splits a string around matches of a pattern and returns an array of strings.

Syntax of Split function in Hive

split(string str, string pat) splits the input string str around matches of pat (pat is a regular expression) and returns an array of strings.
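As a minimal sketch of the call shape, the query below splits a literal string on commas. The input value 'a,b,c' is only an illustrative placeholder, and the query assumes a Hive version that allows SELECT without a FROM clause.

```sql
-- Split a literal string around each comma.
-- The second argument is treated as a regular expression.
SELECT split('a,b,c', ',') AS parts;
-- returns ["a","b","c"]
```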

Example 1: Split date values in Hive

Here the customer_transactions table contains a transaction date field. Let's split this date column into year, month, and day using the split function in Hive.

Since we gave a hyphen (-) as the pattern in the split function, it returns the year, month, and day as elements of a string array.
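A query along these lines could produce that result. The column name transaction_date and the 'yyyy-MM-dd' string format are assumptions made for illustration; the actual column name and format may differ.

```sql
-- Assumption: customer_transactions has a string column named
-- transaction_date (hypothetical name) holding values like '2021-03-15'.
SELECT
  transaction_date,
  split(transaction_date, '-')    AS date_parts,  -- e.g. ["2021","03","15"]
  split(transaction_date, '-')[0] AS txn_year,    -- first element: year
  split(transaction_date, '-')[1] AS txn_month,   -- second element: month
  split(transaction_date, '-')[2] AS txn_day      -- third element: day
FROM customer_transactions;
```

If the column is stored as a DATE rather than a string, wrapping it as CAST(transaction_date AS STRING) before splitting is one way to keep the argument types explicit.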

Example 2: Split the URL in Hive

In this example, we split an organization's URL into an array of strings. Because the dot (.) has a special meaning in regular expressions (it matches any character), we need to put a double backslash (\\) before it in the pattern so that the URL is split on literal dots.
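The escaping can be sketched with a literal value. The URL 'www.example.com' is a placeholder rather than a value from the original table.

```sql
-- '\\.' in a Hive string literal becomes the regex \. (a literal dot).
SELECT split('www.example.com', '\\.') AS url_parts;
-- returns ["www","example","com"]

-- A character class is an alternative that needs no backslashes.
SELECT split('www.example.com', '[.]') AS url_parts;
-- returns ["www","example","com"]
```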

Split function with array index in Hive

If we want to fetch only the website name, without www and .info or .com, we can give an index value to pick a single element of the array. Array indexes start at zero, so split(url,'\\.')[1] returns the second element, which is the website name.
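As a small illustration of indexing into the result, again using a placeholder URL instead of a real table column:

```sql
-- Index 0 is "www", index 1 is the website name, index 2 is the suffix.
SELECT split('www.example.com', '\\.')[1] AS site_name;
-- returns example
```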
