How to use Qualify Row_number in BigQuery?

ROW_NUMBER() function in BigQuery

Row_number is a Numbering function which is a subset of Analytic function in BigQuery. In the analytic function, the OVER clause should be included to define a window of rows within a query result set. For each selected window of rows, Row_number function assigns a unique number.

Syntax of Row number in BigQuery

  • PARTITIONED BY: breaks up the input rows into separate partitions, over which the analytic function is independently evaluated.
  • column_reference is a column name which is used to define the window of the rows.
  • ORDER BY: Defines how rows are ordered within a partition.
  • value_expression is a column name that is used for sorting the rows within each partition.

Filtering results with the QUALIFY clause

The QUALIFY clause is used to filter the results of analytic function. As mentioned earlier, Row number functions assigns the unique number(1,2,3..) for each row in the partition, From that result set, Qualify filter the rows based on condition.

Example

Let’s understand Qualify row_number with example. We have a table house_rent_details in BigQuery. It contains the monthly rent details of the houses as below. Here rc_fin_test_tables is a dataset name. The columns of this table are Property_id, Address, Monthly_rent_in_dollars and City.

House rent details table in BigQuery
House rent details table in BigQuery

From this table, we want to find the low rent house details for each city. The column reference for partition is city. The sort_expression can be defined using the Monthly_rent_in_dollars column.

Let’s write the query with the Row number function. It will assign the unique integer number for each row within a partition. To sort the rent in ascending order within each partition, we are giving the ORDER BY clause in the query.

Output

Row number function in BigQuery
Row number function in BigQuery

After executing the query, we can see that each city is considered as a partition. For example, One partition has the New York city. Another partition has the Boston city.

In each partition, the rows are sorted in ascending order based on the Monthly_rent_in_dollars values. After that the row numbers are assigned to each row in the partition.

Filter the results using QUALIFY Clause

Now we can apply the Qualify clause to filter the results based on row number. Since we want to find the low rent house details for each city, we can fetch only the row which has row number as 1. Let’s write a query with Qualify as below

Output

Qualify row number example in BigQuery
Qualify row number example in BigQuery

Finally the QUALIFY clause returned the low rent house details for each city. Similarly we can get the highest rent house details for each city. To get that results, we just need to sort the rent details in descending order.

Output

Qualify row number in BigQuery
Qualify row number in BigQuery