Explode function in Hive with examples

Explode function in Hive

Explode is a User Defined Table generating Function(UDTF) in Hive. It takes an array (or a map) as an input and outputs the elements of the array (or a map) as separate rows. UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.

LATERAL VIEW statement is used with UDTF such as explode(). It creates a virtual table by applying the UDTF to each row of the base table and then joins resulting output rows to the input row.

Explode function syntax

It will return n number of rows where n is the size of the array/map. This function represent each element of array/map as a row.

Lateral view syntax

Example

Consider that we have a table investment_report in Hive. It has the columns such as user_id, account_value and invested_stocks. The column invested_stocks is an array data type.

Describe column names of hive table
Describe column names of hive table

It has the invested stock details of users as below. Let’s use the explode function on this column.

investment_report table in Hive
investment_report table in Hive

Now we will write a select query with explode function. It will accept the one row as input and return multiple rows as output.

Explode function example in Hive
Explode function example in Hive

As mentioned above, the explode function flatten the array. The select query returned each stock name as a separate row. Similarly we can use the explode function for the map data type values.

Let’s try to add the column user_id in the select query. It is throwing the semantic exception [Error 10081] due to explode function in the select.

Semantic exception for explode function in Hive
Semantic exception for explode function in Hive

Lateral view with explode in Hive

Lateral view with explode function creates a virtual table for the flatten rows. Then we can apply the join to the base table.

As per the syntax, we have added the lateral view statement with explode function after the from clause. The alias name stock_tble is given for this results. Similarly we have provided the column alias as stock_column.

Now we can select the base table column user_id with exploded values as below.

Lateral view with explode function in Hive
Lateral view with explode function in Hive

If we use the Select * statement, it will include the lateral view results also. As we shown below, the column stock_column is included in the result set.

Select * with Lateral view in Hive
Select * with Lateral view in Hive

Let’s use the Where condition to filter the results based on the stock names. As shown below, we have fetched only Costco stock.

Lateral view explode with Where condition in Hive
Lateral view explode with Where condition in Hive

From this lateral view results, we can get the count of each stock names using group by and count.

Lateral view with group by and count in Hive
Lateral view with group by and count in Hive

If we have multiple array/map columns in the table, we can create the multiple lateral view statement as below.

multiple lateral view syntax

Moreover, the join operation can be performed on the lateral view results. For that, we can write the lateral view statement as inner query. From the outer query, we can do the join on different tables.

Recommended Articles

References from Apache Hive documentation