WebsiteSpark

Thursday, 12 January 2012

Table partition

First and foremost requirement for table partition is a very large table in terms of data it holds generally the individual table data is more than 50 GB.
Next is file groups and files to hold the data and then partition function and a partition scheme and finally apply the partition scheme on the table which needs partitioning.
For this we will again use AdventureWorksDW2008R2 database and will be creating a new table for this demonstration. Data used is from FactInternetSales table, hence table schema for our table remains same as that of FactInternetSales.
Let’s start by altering database to add three additional file groups in addition to PRIMARY

USE AdventureWorksDW2008R2;
GO

-- Adding file groups
-- file group 1
ALTER DATABASE AdventureWorksDW2008R2
ADD FILEGROUP PVFG1 ;
GO

ALTER DATABASE AdventureWorksDW2008R2
ADD FILE (
      NAME = N'PVFG1',
      FILENAME = N'K:\LiveDatabase\PVFG1.ndf',
      SIZE = 50, FILEGROWTH = 10% )
TO FILEGROUP PVFG1 ;
go

-- file group 2
ALTER DATABASE AdventureWorksDW2008R2
ADD FILEGROUP PVFG2 ;
GO

ALTER DATABASE AdventureWorksDW2008R2
ADD FILE (
      NAME = N'PVFG2',
      FILENAME = N'I:\LiveDatabase\PVFG2.ndf',
      SIZE = 50, FILEGROWTH = 10% )
TO FILEGROUP PVFG2 ;
GO

-- file group 3
ALTER DATABASE AdventureWorksDW2008R2
ADD FILEGROUP PVFG3 ;
GO

ALTER DATABASE AdventureWorksDW2008R2
ADD FILE (
      NAME = N'PVFG3',
      FILENAME = N'J:\LiveDatabase\PVFG3.ndf',
      SIZE = 50, FILEGROWTH = 10% )
TO FILEGROUP PVFG3 ;
GO


Next is to create the partition function and partition scheme, nothing happens yet.

-- create partition function
-- this does nothing though
CREATE PARTITION FUNCTION PF4Year (Int)
AS
RANGE LEFT FOR VALUES (20060101,20070101,20080101,20090101)
GO

--Drop partition function PF4Year

-- create partition scheme over partition function
CREATE PARTITION SCHEME PF4YearScheme
As
PARTITION PF4Year TO
([Primary],[PVFG1],[PVFG2],[PVFG3],[Primary]);
GO


Now create the table over the partition scheme and function.
Pay close attention to where (“ON”) this table is created.


--Create the table
-- pay close attention to where its been created
CREATE TABLE [dbo].[PFSI](
      [ProductKey] [int] NOT NULL,
      [OrderDateKey] [int] NOT NULL,
      [DueDateKey] [int] NOT NULL,
      [ShipDateKey] [int] NOT NULL,
      [CustomerKey] [int] NOT NULL,
      [PromotionKey] [int] NOT NULL,
      [CurrencyKey] [int] NOT NULL,
      [SalesTerritoryKey] [int] NOT NULL,
      [SalesOrderNumber] [nvarchar](20) NOT NULL,
      [SalesOrderLineNumber] [tinyint] NOT NULL,
      [RevisionNumber] [tinyint] NOT NULL,
      [OrderQuantity] [smallint] NOT NULL,
      [UnitPrice] [money] NOT NULL,
      [ExtendedAmount] [money] NOT NULL,
      [UnitPriceDiscountPct] [float] NOT NULL,
      [DiscountAmount] [float] NOT NULL,
      [ProductStandardCost] [money] NOT NULL,
      [TotalProductCost] [money] NOT NULL,
      [SalesAmount] [money] NOT NULL,
      [TaxAmt] [money] NOT NULL,
      [Freight] [money] NOT NULL,
      [CarrierTrackingNumber] [nvarchar](25) NULL,
      [CustomerPONumber] [nvarchar](25) NULL,
 CONSTRAINT [PK_PFSI]  PRIMARY KEY CLUSTERED
(     [OrderDateKey] ,
      [SalesOrderNumber] ,
      [SalesOrderLineNumber]
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON PF4YearScheme(OrderDateKey)
) ON PF4YearScheme(OrderDateKey);


Time to insert data into our table.

-- pump in data
INSERT INTO
      PFSI
SELECT
      *
FROM
      FactInternetSales;
GO   


Select data from our table and view execution plan in detail.

SET STATISTICS IO ON;
GO

SELECT
      *
FROM
      PFSI
WHERE
      OrderDateKey = 20070701
-- watchout for partition accessed in property window within plan window     








The plan clearly indicates that, SQL server identified the partition where the data reside and a clustered index seek operation was performed to arrive at the data (as we have the OrderDataKey as one of the constituent in the clustering key).
If we drill further on the property window for the seek operation, we can see that there are 3 partition for the table but only one was actually accessed.
Now if we want to verify on which partition our data might reside, use the following SQL.

-- to view number of rows held by each partition
SELECT
      $PARTITION.PF4Year(OrderDateKey) as PartitionNo,
      MIN(OrderDateKey) as Min,
      MAX(OrderDateKey) as Max,
      COUNT(1) as Count
FROM
      PFSI
GROUP BY
      $PARTITION.PF4Year(OrderDateKey)
ORDER BY
      1;


Hope enough practical insight is delivered by this post on table partitioning.

Enjoy!

No comments:

Post a Comment