Microsoft Business Intelligence TOOLS(SSIS,SSRS and SSAS): Table Partitioning

SQL Server Database Table Partitioning technique is fully available in SQL Server 2005 and SQL Server 2008 but logically it was available in SQL Server 2000 as a "Partitioned View". Partitioned View had several limitations. Lets looks at few facts of Partitioned View implementation:

Partitioning a table horizontally by replacing original table with several smaller member tables.
Data separation loaded into table was based on CHECK Constraint created on one of their column
In case of distributed partitioned view, each member table is an individual member server
You can not exceed 256 member tables
One member table can not have more than one range.
If you are implementing distributed partitioned view then add a linked server
Any linked server cannot be a loopback linked server, that is, a linked server that points to the same instance of SQL Server
Data type mapping limitation in Distributed partitioned view for example smallmoney columns in remote tables is mapped as money.
Data placement has to be planned properly as it is not controlled automatically
Partitioned views are transparent to the application
Limitations of data types and column types.
Having views on every table in your database could be a maintenance nightmare
SQL Server 2005 and 2008 extended horizontal partitioning concept that 2000 was using by allowing partition placement on the same table within a single database. Hence, planning, implementation and maintenance is very easy and manageable. It also improves the performance, reduce contention and increase availability of data.

How to implement Partitioning in SQL Server 2005/2008?

Lets assume you have Sales database that has a table named SalesLog. Your goal is to to keep 3 years of data at any given point and purge old data. You also know that your mostly report requires data for 6 month to analyze sales detail. There is no update or delete operation on table. Table structure is:

SaleDate datetime,

.....................
.....................

Below is step by step to implement partitioning on SalesLog table to meet requirement and also to gain performance.

Step 1

As we identified, 3 years is life cycle of data, mostly report requires 6 month of data and table has only insert and select operation. So, it would be ideal to have 7 partitions with separate file groups. Main Reasons:

Performance
Data Backup Strategy. Since, data is static/read only so we can implement File group backup instead of performing full backup. Number of filegroups and files per database could be up to 32767 in both 32 bit and 64 bit SQL Server
Lets Alter database to add filegroups using command below:

Alter Database Sales Add FileGroup [PartitionFG1]
Alter Database Sales Add FileGroup [PartitionFG2]
Alter Database Sales Add FileGroup [PartitionFG3]
Alter Database Sales Add FileGroup [PartitionFG4]
Alter Database Sales Add FileGroup [PartitionFG5]
Alter Database Sales Add FileGroup [PartitionFG6]
Alter Database Sales Add FileGroup [PartitionFG7]

Step 2

Add files to filegroups:

Alter Database Sales Add File
(
Name = 'Partition_Data1',
FileName = 'F:\MSSQL\Data\Partition_Data1.ndf',
Size = 10240 MB
)
To FileGroup [PartitionFG1]

.
.
.
.
.
.
.

Alter Database Sales Add File
(
Name = 'Partition_Data7',
FileName = 'F:\MSSQL\Data\Partition_Data7.ndf',
Size = 10240 MB
)
To FileGroup [PartitionFG7]

Step 3

Create Partition Function to define data range. It will map row of SalesLog table into partitions based on the values of SaleDate column.

use Sales
GO
Create Partition Function SalesLogPFN (datetime)
as
range RIGHT for values
(
'2008-01-01',
'2008-07-01',
'2009-01-01',
'2009-07-01',
'2010-01-01',
'2010-07-01'
)

Data Type specified in Partition Function must be the same as Partition Column in Table that you are going to partition. Partition function as such is not directly related to any table. Possible value for range is RIGHT and LEFT. In our case, 1st Partition will contain rows that has SaleDate less than 2008-01-01, 2nd Partition will have row starting from 2008-01-01 up to 2008-06-30 23:59:59:900 and similarly 3rd and others.

Step 4

Create Partition Scheme to link partition function to proper file groups that we have created as part of Step 2. So, in our case, we created 7 file groups which is correct as we have 6 value for partition function.

use Sales
GO

Create Partition Scheme SalesLogScheme
AS
PARTITION SalesLogPFN
TO ([PartitionFG1], [PartitionFG2], [PartitionFG3], [PartitionFG4], [PartitionFG5], [PartitionFG6], [PartitionFG7])

Step 5

Now, we need to create clustered Index on SaleDate column that will place data to corresponding file groups automatically based on definition of SalesLogScheme

CREATE CLUSTERED INDEX [IDX_SalesLog_SaleDate] ON [dbo].[SalesLog]([SaleDate]) ON SalesLogScheme ([SaleDate])

It is important to create clustered index aligned which is by creating index on column that is partitioned. If a clustered index is not aligned then any merging of partition will require us to drop the clustered index along with the non-clustered indexes and rebuild them.

Now your are all set.

Conclusion

Partitioning a table divides the table and its indexes into smaller chunks so that it can be easily maintained as maintenance operation can be applied partition by partition basis and also improves query performance as optimizer can use proper queries to appropriate partitions instead of querying entire table.

Microsoft Business Intelligence TOOLS(SSIS,SSRS and SSAS)

http://reddymsbitools.blogspot.com

Wednesday, 3 November 2010

Table Partitioning

No comments:

Post a Comment

Followers

Popular Posts

Blog Archive