The Effective Use of Partition Pruning for the Optimisation of Retrieval Speed (Part 3)

The Effective Use of Partition Pruning for the Optimisation of Retrieval Speed (Part 3)

The Devil in the Detail - What Details Decide on the Effectiveness of Partition Pruning?

In the previous article of this series, a practical and effective approach to using partition pruning was explained in detail. This easy-to-implement method can significantly optimize query times. However, as is often the case, some details need to be taken into account to ensure the efficient and effective use of the presented method. In this regard, we echo Theodor Fontane, who observed as far back as the 19th century that the magic always lies in the details.

Table of Contents

Below, we will delve into some essential details that must be considered when employing partition pruning:

  1. History depth. As history gets longer, partitioning becomes more effective.
  2. Old reports. If you change an interval table to a partitioned interval one, then all queries against it remain valid (of course, to see the performance advantages you need to modify the filter condition).
  3. Outdated keys. Number of keys in historical table is usually changed over the time, in our example new accounts are opened and some are closed. Balances for the closed accounts could not be changed, but because of described logic, corresponding rows should be copied to each new month causing unnecessary growth of the table. In order to minimize this effect it is possible to define a default value for historized attribute. It means absence of a rows should be interpreted as the attribute having that value. In our example it is reasonable to assume, that closed account has a balance of zero, therefore we can implement a logic to not keep rows with zero balance; instead, in selects, we should count absence of a row as zero. Another example: if we keep track of statuses of account, then "Inactive" could be such a default value.
  4. ETL performance. ETL is going to be a bit more complicated for partitioned tables. But it is not necessarily slower, because you don't have to read the whole target table to detect differences.
  5. Given date in the future. If you select balances for the date which lie in a month not yet loaded in the table, then you receive no rows. If you use a classical interval table then the result is balances for the last loaded date. "No rows" is in my view more accurate, because we don't know the status in the future, but it is indeed the difference which should be taken to account.
  6. ETL for partitioned interval tables should correctly process the situation when no changes were made in the first day of a new month. Current values from previous month should be copied anyway.
  7. Local index for PK. You can use local index for the primary key if you partition the table. It speeds up its maintenance and allows you to drop/rebuild single partitions instead of the whole index for the time of ETL run.
  8. Partition level statistics. Oracle optimizer can use partition level statistics in case of partitioned interval tables, because just one partition being accessed each time. Therefore, you can save resources on collecting table level statistics for huge tables.

Want To Learn More? Contact Us!

Helene Fuchs

Your contact person

Helene Fuchs

Domain Lead Data Platform & Data Management

Pia Ehrnlechner

Your contact person

Pia Ehrnlechner

Domain Lead Data Platform & Data Management

Related Posts

chevron left icon
Previous post
Next post
chevron right icon

No previous post

No next post