How to create Partitioned primary index in Teradata?

Partitioned primary index in Teradata

Partitioning is an indexing technique in Teradata which helps to improve the query performance on the large tables. In general, the rows are distributed among the AMP(Access Module Processor) based on primary index. Then within each AMP, the records are sorted by row Id which is generated by Teradata.

If we create the partitioned primary index(PPI) for a table, the rows are sorted by partition value in AMP. Row distribution among the AMP still happens based on the primary index only. Using PPI, we can avoid full table scan in Teradata especially on Range Queries.

Example: Table without partitioned column in Teradata

Lets look at the Customer transaction table below which doesn’t have the PPI column. Here the primary index column is Customer_Id. So the records are distributed among the AMP based on the Customer_Id.

Records distributed among AMP without PPI
Records distributed among AMP without PPI

Assume that we are going to run the delete query with the filter condition of Transaction date.

Since we don’t have the index on Transaction date, PE will instruct AMP’s to do a full table scan.

Example: Table with partitioned column in Teradata

Lets create the Partitioned by column for the transaction date in Customer_Transaction table and then insert the required records.

As we defined the PPI for the Transaction_Date with INTERVAL ‘1’ MONTH, partitions are created for each month within the AMP. Each partitioning level is defined independently using a RANGE_N expression.

 Records distributed among AMP with PPI
Records distributed among AMP with PPI

Now if we run the delete query on this table, each AMP reads records from their respective partition(monthly). Here Parsing engine will not instruct AMP to do a full table scan.

  • AMP1 – reads January and February month partition. It doesn’t need other months data as per the delete query.
  • AMP2 – reads January,March & May month partition. It doesn’t need other months data as per the delete query.

Create Partitioned primary index in Teradata

Lets consider that Online payment industry in which millions of transactions happens for each day. If we bring those data into Teradata system, we need to define the partitioned primary index for the table to improve the performance of the query.we can create the partitions for each day, month and year as below.

  • For historical records, we can specify range with the interval of 1 year partitions.
  • Intermediate records – It is good to have range with the interval of 1 month partitions
  • Recent records(frequently accessing data) – we can specify range with the interval of 1 day partitions.