How to execute HiveQL queries using Shell script?


Shell script can be used to run the Hive queries in batch mode. It will handle the input values/arguments , execution of queries and errors during the execution. In this tutorial, we will write the hive queries in a file. Then the file name is given as argument to the shell script.

Also we are using hive -f option and hivevar option for executing the hive queries using shell script

Hive query to run in batch mode

Lets consider that we have a below Hive query to run in batch mode. This is saved in a file with the name of input_hive_query.q. The query is inserting the customer order details into product_order_master table. Here the source table is customer_orders which present in the cust_DB database.

Also the variables ${database} and {run_date} are used in the Hive query. We need to set the values for these variables during the execution.

Shell script to run the Hive query

The below shell script has three functions

  • main () – The execution of the script is starts from here. It calls the other two functions one by one.
  • setup() – It will process the arguments that we pass while executing the script. Also it set the values for the variables ( run_date & database ).
  • run() – This function execute the hive query which is present in the file. The below command is used for the execution

We saved the below shell script in the name of

Command to execute the shell script

We need two arguments to execute our shell script .

  • HiveQL file name – The file name input_hive_query.q is given as a first argument with the name of -f.
  • Batch date – The batch date is given as second argument with the name of -d.

sh -f input_hive_query.q -d ‘2021-07-25’

After we executing this command, the shell script process the arguments. Then it called hive command to execute the hive query. If there is any error in the execution, it will write the errors in the log file => /x/home/user/revisit_class/hive_query.${NOW}.log.