SQL Merge statement is specifically targeted towards ETL operation for a data warehouse. This facilitates insert update and delete in a single statement and with conditions depending upon match / unmatched between source and target.
Typically a MERGE statement works on two tables, namely source and target.
In similar to Join statement, there has to be a join criteria for MERGE.
It does the following
1. Update when there is a match between source & target.
2. Insert on target, when there is no matching record on target and
3. Optionally delete when there are extra records on target.
Let’s create a scenario for demonstrating MERGE.
I’ve a database with customer table filled with some hypothetical data assuming this to be a data warehouse Customers dimension.
I create another table tmpCustomers as a copy, and then I perform some updates and deletes on the table to make them out of sync and finally execute MERGE to bring them back in sync.
Here is the code
/*
-- script for customer table
CREATE TABLE [dbo].[Customers](
[CustomerID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](40) NOT NULL,
[MiddleInitial] [nvarchar](40) NULL,
[LastName] [nvarchar](40) NOT NULL,
CONSTRAINT [CustomerPK] PRIMARY KEY CLUSTERED
(
[CustomerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
*/
-- we don't want unnecessary count getting displayed
SET NOCOUNT ON;
GO
-- copy data into tmpCustomers
SELECT * INTO
tmpCustomers
FROM
Customers;
-- 19759 rows in all
-- lets update customer First name & Last name for all the customers
UPDATE
tmpCustomers
SET
FirstName = 'Some',
LastName = 'Somewhere'
-- lets delete some rows from tmpCustomers
DECLARE @i INT;
SET @i = 1;
WHILE (@i < 19760)
BEGIN
IF( @i % 3 = 0 or @i % 5 = 0)
DELETE FROM tmpCustomers
WHERE CustomerId = @i;
SELECT @i = @i + 1;
END
GO
SELECT COUNT(1) FROM tmpCustomers;
GO
--10539 rows now
-- lets insert some data into tmpCustomer
DECLARE @i INT;
SET @i = 19760
-- need to set IDENTITY_INSERT as we are inserting into PK, which is identity col.
SET IDENTITY_INSERT tmpCustomers ON;
WHILE(@i < 20001)
BEGIN
INSERT INTO tmpCustomers
(CustomerID, FirstName,MiddleInitial,LastName)
VALUES
(@i, N'New', NULL,N'Insert');
SELECT @i = @i + 1;
END;
Go
Set IDENTITY_INSERT tmpCustomers off;
SELECT COUNT(1) FROM tmpCustomers;
--10780 rows in all
-- using merge to update the titles back
SET IDENTITY_INSERT tmpCustomers ON;
-- seting identity insert because the insert part of MERGE will insert identity values.
MERGE dbo.tmpCustomers AS Target
USING (SELECT * FROM dbo.Customers) AS Source
ON (TARGET.CustomerID = SOURCE.CustomerID)
WHEN MATCHED THEN
UPDATE SET
Target.FirstName = Source.FirstName,
Target.MiddleInitial = Source.MiddleInitial,
Target.LastName = Source.LastName
WHEN NOT MATCHED BY Target THEN
INSERT
(CustomerID, FirstName,MiddleInitial,LastName)
VALUES
(source.CustomerID, source.FirstName,source.MiddleInitial,source.LastName)
WHEN NOT MATCHED BY Source THEN
DELETE
OUTPUT $ACTION, INSERTED.*, DELETED.*;
SET IDENTITY_INSERT tmpCustomers OFF;
SELECT COUNT(1) FROM Customers;
SELECT COUNT(1) FROM tmpCustomers;
|
No comments:
Post a Comment