Trim function in Hive with examples

Trim function in Hive

The trim function removes extra spaces from both ends of a string value. In other words, it strips the leading and trailing spaces from the string. Hive also provides two related functions:

  • Ltrim – Returns the string after trimming the spaces from its beginning (left-hand side).
  • Rtrim – Returns the string after trimming the spaces from its end (right-hand side).

Trim function syntax

The trim function accepts only the STRING, CHAR, and VARCHAR data types. If we pass a numeric value to this function, Hive throws an error.
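As a minimal sketch of the call shape, the three functions each take a single string argument. The literal '  hive  ' and the column aliases are made up for illustration, and the brackets added with concat only make the remaining spaces visible.

```sql
-- Illustrative literals only; each function takes one string argument.
SELECT
  concat('[', trim('  hive  '),  ']') AS trimmed,       -- [hive]
  concat('[', ltrim('  hive  '), ']') AS left_trimmed,  -- [hive  ]
  concat('[', rtrim('  hive  '), ']') AS right_trimmed; -- [  hive]
```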

Trim function example

Let’s consider a Hive table named package_tracking. It contains the columns package_id, tracking_id, current_city and expected_delivery_date.

Trim function example in Hive

The current_city column holds the name of the city where the package is currently located, but some of the city names contain extra spaces. For example, Seattle has a leading space, and Miami has both leading and trailing spaces.
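To make the following queries easy to reproduce, here is a rough sketch of such a table. The column types, the tracking IDs, the dates and the exact rows are all assumptions made for illustration; only the column names and the stray spaces in the city names come from the description in this article.

```sql
-- Hypothetical schema: the column types are assumptions, not the original table definition.
CREATE TABLE IF NOT EXISTS package_tracking (
  package_id             INT,
  tracking_id            STRING,
  current_city           STRING,
  expected_delivery_date STRING  -- kept as STRING for simplicity
);

-- Hypothetical sample rows; note the stray spaces inside the city values.
INSERT INTO TABLE package_tracking VALUES
  (1, 'TRK001', ' Seattle', '2021-03-01'),  -- leading space
  (2, 'TRK002', ' Miami ',  '2021-03-02'),  -- leading and trailing space
  (3, 'TRK003', 'Boston',   '2021-03-03'),  -- clean value
  (4, 'TRK004', 'Boston ',  '2021-03-04');  -- trailing space
```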

If we filter this table by city name, the query will not return accurate results. Let’s first try it without the trim function.
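A query of this kind might look like the following sketch. It assumes current_city is a STRING column, so the comparison is exact and the surrounding spaces count.

```sql
-- No trimming: the stored value ' Miami ' is not equal to 'Miami'.
SELECT *
FROM package_tracking
WHERE current_city = 'Miami';
```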

Leading space in string value

Because the city name Miami has spaces at both the beginning and the end of the value, the query returns no rows. To fetch the correct results, we need to apply the trim function to the current_city column.
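The corrected filter can be sketched as below, with the same table and column names as before.

```sql
-- trim() strips the leading and trailing spaces before the comparison.
SELECT *
FROM package_tracking
WHERE trim(current_city) = 'Miami';
```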

With the trim function applied, the extra spaces are removed from the current_city values before the comparison, and the query returns the required results.

trim function to remove space in Hive

To track the number of packages in each city, let’s write a GROUP BY query that uses the trim function.
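One possible form of this query is sketched here. The aliases city and package_count are names chosen for illustration.

```sql
-- Group on the trimmed value so that ' Miami ' and 'Miami' fall into the same bucket.
SELECT trim(current_city) AS city,
       count(*)           AS package_count
FROM package_tracking
GROUP BY trim(current_city);
```

Grouping on the raw column instead would treat values that differ only by surrounding spaces as separate cities.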

group by with trim function in Hive

Ltrim function example

As mentioned earlier, the ltrim function removes spaces from the beginning of a string. To demonstrate it, we will compare queries with and without ltrim.

In the current_city column, the value Seattle has a leading space. Let’s write a query to fetch the package count for Seattle.

Query without Ltrim function

First, we try to fetch the city Seattle without the ltrim function. Because the stored value begins with a space, it does not match the given condition, so no records are matched.
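A sketch of the unmodified count query, assuming the same table and a STRING column:

```sql
-- The stored value ' Seattle' starts with a space, so this condition matches nothing.
SELECT count(*) AS package_count
FROM package_tracking
WHERE current_city = 'Seattle';
```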

ltrim function example in Hive

Query with Ltrim function

Now we run the same query with ltrim applied to the current_city column. Because ltrim removes the leading spaces from the column values, the condition matches and the query returns the correct output.
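The ltrim version differs only in the WHERE clause, roughly as follows:

```sql
-- ltrim() removes only the leading spaces, which is enough for ' Seattle'.
SELECT count(*) AS package_count
FROM package_tracking
WHERE ltrim(current_city) = 'Seattle';
```

Note that ltrim would not help with a value such as ' Miami ', because the trailing space would remain.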

ltrim function in Hive

Rtrim function example

The rtrim function removes spaces from the end of a string. In the package_tracking table, one of the Boston values in the current_city column has a trailing space. Let’s write the query with and without the rtrim function to see the difference in output.

Query without Rtrim function

Here we try to fetch the package count for Boston. The package_tracking table has two entries for Boston, but the query returns a count of 1. This is because one of the two values has a trailing space in the current_city column.
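The undercounting query could be written like this sketch:

```sql
-- Only the row stored exactly as 'Boston' matches; 'Boston ' is skipped.
SELECT count(*) AS package_count
FROM package_tracking
WHERE current_city = 'Boston';
```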

rtrim function in hive

Query with Rtrim function

Now we run the same query with the rtrim function, using the condition rtrim(current_city) = 'Boston'. It removes the trailing space from the current_city values and returns the correct result.
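Written out in full, and under the same assumptions about the table, the query would be along these lines:

```sql
-- rtrim() removes the trailing space, so both Boston rows are counted.
SELECT count(*) AS package_count
FROM package_tracking
WHERE rtrim(current_city) = 'Boston';
```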

rtrim function example in Hive
