Length function in Hive with examples

Length function in Hive

The length function returns the length of a specified string. In other words, it gives the number of characters in the string. It is available in every relational database systems. In Hive, Length function include the leading and trailing whitespaces when calculating the length of the string.

Syntax of Length function in Hive

The function is able to find the length for different values such as string, varchar, int, decimal, date & timestamp. To calculate length, either the value or column name should be passed as argument inside the length function.

If the value is NULL, the length function also returns NULL in Hive.

Example

Lets show the length function with few examples. For that, we have a table called “student_info” in Hive.

Student table in Hive
Student table in Hive

As we can see that the table contains the student details such as student_id, name, email_id, date_of_joining and last_login_time. Also please find the data types of those columns in the below screenshot.

Structure of student table in Hive
Structure of student table in Hive

Lets find the length of student_id column from student_info table. We have used length function on the student_id and it returns the length as 7. Even this column data type is bigint, the function returns the length of the value.

Length function example in Hive
Length function example in Hive

Next lets find the length of email_id column. The data type of the column is String and it contains the NULL value for one student.

Length function example with string value
Length function example with string value

As we can see above, the length function returned the number of characters in the email_id column. Since the email_id is NULL for one student, the length is also NULL.

Substring with Length function in Hive

We can use the Length function with other String function such as SubString. Lets use both length and Substring function on the email_id column to find the user name without domain name.

In the email_id, the character ‘@’ is splitting the user name and domain name. So we need to find the index/location of that character in email_id column. For that we have used INSTR and REVERSE function.

The REVERSE function is used to reverse the entire email_id string as below. In that String, INSTR function finds the index of ‘@’ character.

Reverse string in Hive
Reverse string in Hive

Then we are subtracting the index of ‘@’ character from the length of email_id column. So the starting index for the substring is 1 and ending index of the substring is the index value before the ‘@’ character.

As we have given the required index value in the substring function, it returned the user name from the email_id column as below.

Substring with Length function in Hive
Substring with Length function in Hive