September 15, 2021 at 8:17 am
Hi All,
We have a current ETL process that takes a delta and uses MERGE to decide whether the data needs to be updated. The current process uses an audit table (effectively a hand-built CDC) to drive the delta but this has proved to be unreliable. Most tables do have a ROWSTAMP field so we are going to use this to drive the delta instead. (applying CDC is not an option unfortunately)
We have about 120 tables to migrate each run, The current process has a separate stored procedure for each table that has a hard-coded list of fields doing a null = null or value - value comparison E.g. (and this is a fairly narrow table...)
AND
(
SRC.[Asset_ID] <> TGT.[Asset_ID] OR (SRC.[Asset_ID] IS NULL AND TGT.[Asset_ID] IS NOT NULL) OR (SRC.[Asset_ID] IS NOT NULL AND TGT.[Asset_ID] IS NULL)
OR SRC.[Attribute_ID] <> TGT.[Attribute_ID] OR (SRC.[Attribute_ID] IS NULL AND TGT.[Attribute_ID] IS NOT NULL) OR (SRC.[Attribute_ID] IS NOT NULL AND TGT.[Attribute_ID] IS NULL)
OR SRC.[IsDeleted_BT] <> TGT.[IsDeleted_BT] OR (SRC.[IsDeleted_BT] IS NULL AND TGT.[IsDeleted_BT] IS NOT NULL) OR (SRC.[IsDeleted_BT] IS NOT NULL AND TGT.[IsDeleted_BT] IS NULL)
OR SRC.[Location_ID] <> TGT.[Location_ID] OR (SRC.[Location_ID] IS NULL AND TGT.[Location_ID] IS NOT NULL) OR (SRC.[Location_ID] IS NOT NULL AND TGT.[Location_ID] IS NULL)
OR SRC.[Position_ID] <> TGT.[Position_ID] OR (SRC.[Position_ID] IS NULL AND TGT.[Position_ID] IS NOT NULL) OR (SRC.[Position_ID] IS NOT NULL AND TGT.[Position_ID] IS NULL)
OR SRC.[Type_ID] <> TGT.[Type_ID] OR (SRC.[Type_ID] IS NULL AND TGT.[Type_ID] IS NOT NULL) OR (SRC.[Type_ID] IS NOT NULL AND TGT.[Type_ID] IS NULL)
OR SRC.[Condition_ID] <> TGT.[Condition_ID] OR (SRC.[Condition_ID] IS NULL AND TGT.[Condition_ID] IS NOT NULL) OR (SRC.[Condition_ID] IS NOT NULL AND TGT.[Condition_ID] IS NULL)
OR SRC.[Condition_AsAt_DT] <> TGT.[Condition_AsAt_DT] OR (SRC.[Condition_AsAt_DT] IS NULL AND TGT.[Condition_AsAt_DT] IS NOT NULL) OR (SRC.[Condition_AsAt_DT] IS NOT NULL AND TGT.[Condition_AsAt_DT] IS NULL)
OR SRC.[Priority_ID] <> TGT.[Priority_ID] OR (SRC.[Priority_ID] IS NULL AND TGT.[Priority_ID] IS NOT NULL) OR (SRC.[Priority_ID] IS NOT NULL AND TGT.[Priority_ID] IS NULL)
OR SRC.[Manufacturer_ID] <> TGT.[Manufacturer_ID] OR (SRC.[Manufacturer_ID] IS NULL AND TGT.[Manufacturer_ID] IS NOT NULL) OR (SRC.[Manufacturer_ID] IS NOT NULL AND TGT.[Manufacturer_ID] IS NULL)
OR SRC.[Pattern_ID] <> TGT.[Pattern_ID] OR (SRC.[Pattern_ID] IS NULL AND TGT.[Pattern_ID] IS NOT NULL) OR (SRC.[Pattern_ID] IS NOT NULL AND TGT.[Pattern_ID] IS NULL)
OR SRC.[Material_ID] <> TGT.[Material_ID] OR (SRC.[Material_ID] IS NULL AND TGT.[Material_ID] IS NOT NULL) OR (SRC.[Material_ID] IS NOT NULL AND TGT.[Material_ID] IS NULL)
OR SRC.[Colour_ID] <> TGT.[Colour_ID] OR (SRC.[Colour_ID] IS NULL AND TGT.[Colour_ID] IS NOT NULL) OR (SRC.[Colour_ID] IS NOT NULL AND TGT.[Colour_ID] IS NULL)
OR SRC.[Comment_Lines_VC] <> TGT.[Comment_Lines_VC] OR (SRC.[Comment_Lines_VC] IS NULL AND TGT.[Comment_Lines_VC] IS NOT NULL) OR (SRC.[Comment_Lines_VC] IS NOT NULL AND TGT.[Comment_Lines_VC] IS NULL)
OR SRC.[Risk_ID] <> TGT.[Risk_ID] OR (SRC.[Risk_ID] IS NULL AND TGT.[Risk_ID] IS NOT NULL) OR (SRC.[Risk_ID] IS NOT NULL AND TGT.[Risk_ID] IS NULL)
OR SRC.[Group_ID] <> TGT.[Group_ID] OR (SRC.[Group_ID] IS NULL AND TGT.[Group_ID] IS NOT NULL) OR (SRC.[Group_ID] IS NOT NULL AND TGT.[Group_ID] IS NULL)
OR SRC.[Image_Reference_VC] <> TGT.[Image_Reference_VC] OR (SRC.[Image_Reference_VC] IS NULL AND TGT.[Image_Reference_VC] IS NOT NULL) OR (SRC.[Image_Reference_VC] IS NOT NULL AND TGT.[Image_Reference_VC] IS NULL)
OR SRC.[Media_Reference_VC] <> TGT.[Media_Reference_VC] OR (SRC.[Media_Reference_VC] IS NULL AND TGT.[Media_Reference_VC] IS NOT NULL) OR (SRC.[Media_Reference_VC] IS NOT NULL AND TGT.[Media_Reference_VC] IS NULL)
OR SRC.[Current_Value_VC] <> TGT.[Current_Value_VC] OR (SRC.[Current_Value_VC] IS NULL AND TGT.[Current_Value_VC] IS NOT NULL) OR (SRC.[Current_Value_VC] IS NOT NULL AND TGT.[Current_Value_VC] IS NULL)
OR SRC.[Quantity_DC] <> TGT.[Quantity_DC] OR (SRC.[Quantity_DC] IS NULL AND TGT.[Quantity_DC] IS NOT NULL) OR (SRC.[Quantity_DC] IS NOT NULL AND TGT.[Quantity_DC] IS NULL)
OR SRC.[Fitted_Renewed_Date_DT] <> TGT.[Fitted_Renewed_Date_DT] OR (SRC.[Fitted_Renewed_Date_DT] IS NULL AND TGT.[Fitted_Renewed_Date_DT] IS NOT NULL) OR (SRC.[Fitted_Renewed_Date_DT] IS NOT NULL AND TGT.[Fitted_Renewed_Date_DT] IS NULL)
OR SRC.[Default_Expected_Life_IN] <> TGT.[Default_Expected_Life_IN] OR (SRC.[Default_Expected_Life_IN] IS NULL AND TGT.[Default_Expected_Life_IN] IS NOT NULL) OR (SRC.[Default_Expected_Life_IN] IS NOT NULL AND TGT.[Default_Expected_Life_IN] IS NULL)
OR SRC.[Expected_Life_IN] <> TGT.[Expected_Life_IN] OR (SRC.[Expected_Life_IN] IS NULL AND TGT.[Expected_Life_IN] IS NOT NULL) OR (SRC.[Expected_Life_IN] IS NOT NULL AND TGT.[Expected_Life_IN] IS NULL)
OR SRC.[Default_Replacement_Type_ID] <> TGT.[Default_Replacement_Type_ID] OR (SRC.[Default_Replacement_Type_ID] IS NULL AND TGT.[Default_Replacement_Type_ID] IS NOT NULL) OR (SRC.[Default_Replacement_Type_ID] IS NOT NULL AND TGT.[Default_Replacement_Type_ID] IS NULL)
OR SRC.[Replacement_Type_ID] <> TGT.[Replacement_Type_ID] OR (SRC.[Replacement_Type_ID] IS NULL AND TGT.[Replacement_Type_ID] IS NOT NULL) OR (SRC.[Replacement_Type_ID] IS NOT NULL AND TGT.[Replacement_Type_ID] IS NULL)
OR SRC.[Default_Replacement_Pattern_ID] <> TGT.[Default_Replacement_Pattern_ID] OR (SRC.[Default_Replacement_Pattern_ID] IS NULL AND TGT.[Default_Replacement_Pattern_ID] IS NOT NULL) OR (SRC.[Default_Replacement_Pattern_ID] IS NOT NULL AND TGT.[Default_Replacement_Pattern_ID] IS NULL)
OR SRC.[Replacement_Pattern_ID] <> TGT.[Replacement_Pattern_ID] OR (SRC.[Replacement_Pattern_ID] IS NULL AND TGT.[Replacement_Pattern_ID] IS NOT NULL) OR (SRC.[Replacement_Pattern_ID] IS NOT NULL AND TGT.[Replacement_Pattern_ID] IS NULL)
OR SRC.[Default_Replacement_Expected_Life_IN] <> TGT.[Default_Replacement_Expected_Life_IN] OR (SRC.[Default_Replacement_Expected_Life_IN] IS NULL AND TGT.[Default_Replacement_Expected_Life_IN] IS NOT NULL) OR (SRC.[Default_Replacement_Expected_Life_IN] IS NOT NULL AND TGT.[Default_Replacement_Expected_Life_IN] IS NULL)
OR SRC.[Replacement_Expected_Life_IN] <> TGT.[Replacement_Expected_Life_IN] OR (SRC.[Replacement_Expected_Life_IN] IS NULL AND TGT.[Replacement_Expected_Life_IN] IS NOT NULL) OR (SRC.[Replacement_Expected_Life_IN] IS NOT NULL AND TGT.[Replacement_Expected_Life_IN] IS NULL)
OR SRC.[Initial_Override_Replacement_Date_DT] <> TGT.[Initial_Override_Replacement_Date_DT] OR (SRC.[Initial_Override_Replacement_Date_DT] IS NULL AND TGT.[Initial_Override_Replacement_Date_DT] IS NOT NULL) OR (SRC.[Initial_Override_Replacement_Date_DT] IS NOT NULL AND TGT.[Initial_Override_Replacement_Date_DT] IS NULL)
OR SRC.[Subsequent_Override_Replacement_Date_DT] <> TGT.[Subsequent_Override_Replacement_Date_DT] OR (SRC.[Subsequent_Override_Replacement_Date_DT] IS NULL AND TGT.[Subsequent_Override_Replacement_Date_DT] IS NOT NULL) OR (SRC.[Subsequent_Override_Replacement_Date_DT] IS NOT NULL AND TGT.[Subsequent_Override_Replacement_Date_DT] IS NULL)
OR SRC.[Default_Replacement_Cost_DC] <> TGT.[Default_Replacement_Cost_DC] OR (SRC.[Default_Replacement_Cost_DC] IS NULL AND TGT.[Default_Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Default_Replacement_Cost_DC] IS NOT NULL AND TGT.[Default_Replacement_Cost_DC] IS NULL)
OR SRC.[Replacement_Cost_DC] <> TGT.[Replacement_Cost_DC] OR (SRC.[Replacement_Cost_DC] IS NULL AND TGT.[Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Replacement_Cost_DC] IS NOT NULL AND TGT.[Replacement_Cost_DC] IS NULL)
OR SRC.[Default_Replacement_Cost_AsAt_DT] <> TGT.[Default_Replacement_Cost_AsAt_DT] OR (SRC.[Default_Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Default_Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Replacement_Cost_AsAt_DT] <> TGT.[Replacement_Cost_AsAt_DT] OR (SRC.[Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Default_Cyclical_Cost_DC] <> TGT.[Default_Cyclical_Cost_DC] OR (SRC.[Default_Cyclical_Cost_DC] IS NULL AND TGT.[Default_Cyclical_Cost_DC] IS NOT NULL) OR (SRC.[Default_Cyclical_Cost_DC] IS NOT NULL AND TGT.[Default_Cyclical_Cost_DC] IS NULL)
OR SRC.[Cyclical_Cost_DC] <> TGT.[Cyclical_Cost_DC] OR (SRC.[Cyclical_Cost_DC] IS NULL AND TGT.[Cyclical_Cost_DC] IS NOT NULL) OR (SRC.[Cyclical_Cost_DC] IS NOT NULL AND TGT.[Cyclical_Cost_DC] IS NULL)
OR SRC.[Default_Cyclical_Cost_AsAt_DT] <> TGT.[Default_Cyclical_Cost_AsAt_DT] OR (SRC.[Default_Cyclical_Cost_AsAt_DT] IS NULL AND TGT.[Default_Cyclical_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Cyclical_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Cyclical_Cost_AsAt_DT] IS NULL)
OR SRC.[Cyclical_Cost_AsAt_DT] <> TGT.[Cyclical_Cost_AsAt_DT] OR (SRC.[Cyclical_Cost_AsAt_DT] IS NULL AND TGT.[Cyclical_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Cyclical_Cost_AsAt_DT] IS NOT NULL AND TGT.[Cyclical_Cost_AsAt_DT] IS NULL)
OR SRC.[Cyclical_Date_DT] <> TGT.[Cyclical_Date_DT] OR (SRC.[Cyclical_Date_DT] IS NULL AND TGT.[Cyclical_Date_DT] IS NOT NULL) OR (SRC.[Cyclical_Date_DT] IS NOT NULL AND TGT.[Cyclical_Date_DT] IS NULL)
OR SRC.[Override_Cyclical_Date_DT] <> TGT.[Override_Cyclical_Date_DT] OR (SRC.[Override_Cyclical_Date_DT] IS NULL AND TGT.[Override_Cyclical_Date_DT] IS NOT NULL) OR (SRC.[Override_Cyclical_Date_DT] IS NOT NULL AND TGT.[Override_Cyclical_Date_DT] IS NULL)
OR SRC.[Default_Cyclical_Period_IN] <> TGT.[Default_Cyclical_Period_IN] OR (SRC.[Default_Cyclical_Period_IN] IS NULL AND TGT.[Default_Cyclical_Period_IN] IS NOT NULL) OR (SRC.[Default_Cyclical_Period_IN] IS NOT NULL AND TGT.[Default_Cyclical_Period_IN] IS NULL)
OR SRC.[Cyclical_Period_IN] <> TGT.[Cyclical_Period_IN] OR (SRC.[Cyclical_Period_IN] IS NULL AND TGT.[Cyclical_Period_IN] IS NOT NULL) OR (SRC.[Cyclical_Period_IN] IS NOT NULL AND TGT.[Cyclical_Period_IN] IS NULL)
OR SRC.[Last_Cyclical_Cost_ID] <> TGT.[Last_Cyclical_Cost_ID] OR (SRC.[Last_Cyclical_Cost_ID] IS NULL AND TGT.[Last_Cyclical_Cost_ID] IS NOT NULL) OR (SRC.[Last_Cyclical_Cost_ID] IS NOT NULL AND TGT.[Last_Cyclical_Cost_ID] IS NULL)
OR SRC.[Last_Planned_Cost_ID] <> TGT.[Last_Planned_Cost_ID] OR (SRC.[Last_Planned_Cost_ID] IS NULL AND TGT.[Last_Planned_Cost_ID] IS NOT NULL) OR (SRC.[Last_Planned_Cost_ID] IS NOT NULL AND TGT.[Last_Planned_Cost_ID] IS NULL)
OR SRC.[User_ID] <> TGT.[User_ID] OR (SRC.[User_ID] IS NULL AND TGT.[User_ID] IS NOT NULL) OR (SRC.[User_ID] IS NOT NULL AND TGT.[User_ID] IS NULL)
OR SRC.[Created_Date_DT] <> TGT.[Created_Date_DT] OR (SRC.[Created_Date_DT] IS NULL AND TGT.[Created_Date_DT] IS NOT NULL) OR (SRC.[Created_Date_DT] IS NOT NULL AND TGT.[Created_Date_DT] IS NULL)
OR SRC.[Feature_ID] <> TGT.[Feature_ID] OR (SRC.[Feature_ID] IS NULL AND TGT.[Feature_ID] IS NOT NULL) OR (SRC.[Feature_ID] IS NOT NULL AND TGT.[Feature_ID] IS NULL)
OR SRC.[Damage_ID] <> TGT.[Damage_ID] OR (SRC.[Damage_ID] IS NULL AND TGT.[Damage_ID] IS NOT NULL) OR (SRC.[Damage_ID] IS NOT NULL AND TGT.[Damage_ID] IS NULL)
OR SRC.[Accessibility_ID] <> TGT.[Accessibility_ID] OR (SRC.[Accessibility_ID] IS NULL AND TGT.[Accessibility_ID] IS NOT NULL) OR (SRC.[Accessibility_ID] IS NOT NULL AND TGT.[Accessibility_ID] IS NULL)
OR SRC.[Access_Restrictions_VC] <> TGT.[Access_Restrictions_VC] OR (SRC.[Access_Restrictions_VC] IS NULL AND TGT.[Access_Restrictions_VC] IS NOT NULL) OR (SRC.[Access_Restrictions_VC] IS NOT NULL AND TGT.[Access_Restrictions_VC] IS NULL)
OR SRC.[Special_Instructions_VC] <> TGT.[Special_Instructions_VC] OR (SRC.[Special_Instructions_VC] IS NULL AND TGT.[Special_Instructions_VC] IS NOT NULL) OR (SRC.[Special_Instructions_VC] IS NOT NULL AND TGT.[Special_Instructions_VC] IS NULL)
OR SRC.[Default_Subsequent_Replacement_Cost_DC] <> TGT.[Default_Subsequent_Replacement_Cost_DC] OR (SRC.[Default_Subsequent_Replacement_Cost_DC] IS NULL AND TGT.[Default_Subsequent_Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Default_Subsequent_Replacement_Cost_DC] IS NOT NULL AND TGT.[Default_Subsequent_Replacement_Cost_DC] IS NULL)
OR SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] <> TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] OR (SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Max_Smooth_Date_DT] <> TGT.[Max_Smooth_Date_DT] OR (SRC.[Max_Smooth_Date_DT] IS NULL AND TGT.[Max_Smooth_Date_DT] IS NOT NULL) OR (SRC.[Max_Smooth_Date_DT] IS NOT NULL AND TGT.[Max_Smooth_Date_DT] IS NULL)
)
The business is unwilling to allow me to use a 3rd party SSIS plugins like Konesan's Hash task so I am looking for alternatives that mean we don't have to maintain a complex and labour intensive set of stored procedures. I have two broad strategies which I am considering.
I like 1 but I think it could be a real performance hog as for each row in the ETL process it is effectively going to have to rebuild the dynamic SQL but means the process is more robust because it is not sensitive to changes in schema for the source tables (They would however give a different HASH result so all would get updated on first delta)
We are currently running on-prem VMs on SQL 2017 for the OLTP and Azure hosted VMs on SQL 2016 for the DWH, however there is a strong possibility that the DWH will move to SQL Azure in the near future just in case that changes the proposed solution.
September 15, 2021 at 9:50 am
Not answering your question, but a quick note about your existing SQL, as it can be made considerably more concise, using the following structure:
SELECT *
FROM src
JOIN trg
ON src.Id = trg.Id
WHERE NOT EXISTS
(
SELECT src.col1, src.col2 INTERSECT SELECT trg.col1, trg.col2
);
If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.
September 15, 2021 at 2:41 pm
#1 is (way) too much overhead.
#2 is better. You want static code that is generated dynamically.
SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!
September 27, 2021 at 11:12 pm
check the following link(s) if it helps:
1) https://michaeljswart.com/2012/05/t-sql-merge-statement-syntax-generator/
2) https://www.pigeonsql.com/single-post/Dynamic-SQL-Merge-Script-for-Generate-Merge-statement
CDC would have been better as there are only 4 operation code to find which rows are altered viz. Insert, Delete, Before Update and After Update. Definitely, each table needs a separate data flow in SSIS.
Good luck.
---------------------------------------------------------------------------------------------------------------------------------
IrfanHyd
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply