How to split a string based on a pattern in Hive
Split function in Hive
Hive provides many string functions for manipulating strings. Split is one of them: it splits a string around matches of a pattern and returns an array of strings.
Syntax of Split function in Hive
    split(string str, string pat)
It splits the input string str around matches of pat, where pat is a regular expression in Java regex syntax.
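Because pat is a regular expression, it can match more than a single fixed character. The queries below are a minimal sketch on string literals, assuming a Hive version that allows SELECT without a FROM clause; the literal values are made up for illustration.

```sql
-- Split on a plain comma.
select split('a,b,c', ',');
-- returns ["a","b","c"]

-- Split on any run of digits, using a regular expression as the pattern.
select split('a1b22c', '[0-9]+');
-- returns ["a","b","c"]
```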
Example 1: Split date values in Hive
Here the customer_transactions table contains a transaction date field, txn_date. Let's split this date column into year, month and day using the split function in Hive.
    hive> select split(txn_date,'-') from customer_transactions;
    OK
    ["2019","04","14"]
    ["2019","05","12"]
    ["2019","09","20"]
    Time taken: 0.241 seconds, Fetched: 3 row(s)
Since we gave a hyphen (-) as the pattern in the split function, it returns the year, month and day as an array of strings.
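If the three parts are needed as separate columns rather than one array, each element can be read by its zero-based index. This is a sketch only: the aliases txn_year, txn_month and txn_day are hypothetical names chosen for illustration.

```sql
-- Read each element of the split result by index (indexes start at 0).
select
  split(txn_date, '-')[0] as txn_year,
  split(txn_date, '-')[1] as txn_month,
  split(txn_date, '-')[2] as txn_day
from customer_transactions;
```

The elements are still strings. If a numeric value is needed, an expression such as cast(split(txn_date, '-')[0] as int) converts it.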
Example 2: Split the URL in Hive
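In this example, we are going to split the organization URL into an array of strings. Since the dot (.) has a special meaning in regular expressions (it matches any character), we need to put a double backslash (\\) before it to split the URL on literal dots. The first backslash escapes the second inside the Hive string literal, so the regular expression receives \. as the pattern.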
    hive> select split(url,'\\.') from organization;
    OK
    ["www","abc","info"]
    ["www","eyecare","com"]
    Time taken: 0.081 seconds, Fetched: 2 row(s)
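An alternative to the backslash escape is a character class, which some readers find easier to read. The sketch below assumes the same organization table and url column used above.

```sql
-- '[.]' matches a literal dot, so no backslashes are needed.
select split(url, '[.]') from organization;

-- By contrast, an unescaped '.' matches every character, so this
-- does not return the www / name / domain parts.
select split(url, '.') from organization;
```

The first query should give the same arrays as the '\\.' version.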
Split function with array index in Hive
If we want to fetch only the website name, without www and .info or .com, we can give an index value to pick one element of the array. Array indexes start at zero, so split(url,'\\.')[1] returns the second element, which is the required output.
    hive> select split(url,'\\.')[1] from organization;
    OK
    abc
    eyecare
    Time taken: 0.094 seconds, Fetched: 2 row(s)
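A fixed index works when every URL has the same number of parts. To pick the last element regardless of length, the index can be computed with Hive's size function. This is a sketch under that assumption; the alias top_level_domain is a hypothetical name.

```sql
-- size() returns the number of elements; the last index is size - 1.
select split(url, '\\.')[size(split(url, '\\.')) - 1] as top_level_domain
from organization;
```

For the two rows shown above, this picks info and com.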