A clarification of MERGE best practices

by Scott Chamberlain   Last Updated January 11, 2018 18:06 PM

I just wanted to ask a clarification about something on the MSDN from the "Optimizing MERGE Statement Performance" page.

I am working with a data warehouse that takes in records from many different databases and stores the data. All of the tables in my warehouse database basically follow this same pattern:

CREATE TABLE Foo (
    database_guid UNIQUEIDENTIFIER
    ,FooPk BIGINT
    ,Bar NVARCHAR(20)
    ,Qix NCHAR(10)
    ,CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED (
        database_guid ASC
        ,FooPk ASC
        )
    )
GO

CREATE PROCEDURE [iv].[LoadSomeTable] 
    @databaseGUID UNIQUEIDENTIFIER
AS
BEGIN
    SET NOCOUNT ON

    MERGE Foo
    USING #FooStaging AS Source
    ON Foo.FooPk = Source.FooPk AND Foo.database_guid = @databaseGUID
    WHEN MATCHED THEN
        UPDATE SET Bar = Source.Bar
                  ,Qix = Source.Qix
    WHEN NOT MATCHED THEN
        INSERT (database_guid, FooPk, Bar, Qix)
            VALUES (@databaseGUID, FooPk, Bar, Qix);
END
GO

CREATE TABLE #FooStaging (
    FooPk BIGINT
    ,Bar NVARCHAR(20)
    ,Qix NCHAR(10)
    )

--Data gets loaded in to #FooStaging from a C# call to SqlBulkCopy then calls iv.LoadSomeTable

The thing I am now concerned about is I just read this statement from that MSDN page

Specify only search conditions in the ON <merge_search_condition> clause that determine the criteria for matching data in the source and target tables. That is, specify only columns from the target table that are compared to the corresponding columns of the source table. Do not include comparisons to other values such as a constant.

After reading that I am thinking I did my query wrong and my merge statement should be

MERGE Foo
USING #FooStaging AS Source
ON Foo.FooPk = Source.FooPk
WHEN MATCHED AND Foo.database_guid = @databaseGUID THEN
    UPDATE SET Bar = Source.Bar
              ,Qix = Source.Qix
WHEN NOT MATCHED THEN
    INSERT (database_guid, FooPk, Bar, Qix)
        VALUES (@databaseGUID, FooPk, Bar, Qix);

But that does not "feel" right to me because the database_guid field is part of the primary key so shouldn't it be included with the on? If I had it in the WHEN MATCHED and I upload one database with a FooPk of 1 then I upload a second database with a FooPk and a different @databaseGUID I am not sure if NOT MATCHED will fire or not (Just tested, it does not).

Which way is the correct way to use MERGE?



Answers 1


I would venture that your best approach is to use separate statements for each of the potential actions, and put them in a serializable transaction. You get to use tried and true statements with no funny semantics or "best practices" violations, and you get to avoid all of the problems I outline in this post, including wrong results bugs and potential index corruption:

Aaron Bertrand
Aaron Bertrand
March 18, 2014 21:42 PM

Related Questions



How to administrate large databases?

Updated May 04, 2017 19:06 PM



Concurrent Merge Issues

Updated July 19, 2016 08:02 AM