Pages

Monday, 24 September 2012

Partitioning a large table with existing records




Best practice when partitioning a large table with existing records.

When trying to partition a large table that contain data, the main thing is trying to minimise data movements. When creating a new partition, if data exists in the table that falls in the new partition, Sql Server will delete the data from the old partition and insert it into the new partition. This can be a very expensive operation and can lead to locking the table.

It might not be obvious at first glance, but creating the partitions in a certain order will have a significant impact on the amount of data moved and the time it takes to partition a large table. I will try to illustrate this tip with an example;

Let’s assume you have a table that contains 12 million records of historic orders made in 2012 which you want to partition by Order Date. To make the calculation easier, lets also assume that the records are distributed evenly so for each month of 2012 there are 1 million orders.

Option 1: Start by creating a partition for each month starting from February 2012 using partition right on the order date column, the result will be as shown in Table 1 below:


Steps performed to partition the table
What happens (Assuming partition right is used)
Split Partition at FEB 2012
11 Million records DELETED from the previous partition
11 Million records INSERTED into the new partition
Split Partition at MAR 2012
10 Million records DELETED from the previous partition
10 Million records INSERTED into the new partition
Split Partition at APR 2012
9 Million records DELETED from the previous partition
9 Million records INSERTED into the new partition
Split Partition at MAY 2012
8 Million records DELETED from the previous partition
8 Million records INSERTED into the new partition
Split Partition at JUN 2012
7 Million records DELETED from the previous partition
7 Million records INSERTED into the new partition
Split Partition at JUL 2012
6 Million records DELETED from the previous partition
6 Million records INSERTED into the new partition
Split Partition at AUG 2012
5 Million records DELETED from the previous partition
5 Million records INSERTED into the new partition
Split Partition at SEP 2012
4 Million records DELETED from the previous partition
4 Million records INSERTED into the new partition
Split Partition at OCT 2012
3 Million records DELETED from the previous partition
3 Million records INSERTED into the new partition
Split Partition at NOV 2012
2 Million records DELETED from the previous partition
2 Million records INSERTED into the new partition
Split Partition at DEC 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
TOTAL RECORDS DELETED + INSERTED

132 million records


Option 2: Create the partitions in a reverse order starting from the last partition DEC 2012, will result in far less data movement as illustrated in table 2 below:

Table 2 : creating the partitions in a different order

Steps performed to partition the table
What happens (Assuming partition right is used)
Split Partition at DEC 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at NOV 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at OCT 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at SEP 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at AUG 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at JUL 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at JUN 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at MAY 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at APR 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at MAR 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
Split Partition at FEB 2012
1 Million records DELETED from the previous partition
1 Million records INSERTED into the new partition
TOTAL RECORDS DELETED + INSERTED

22 million records



By creating the partitions in the reverse order the number of records moved is drastically reduced (132 vs 22) resulting in a more efficient way to partition large tables with existing data.


No comments: