• To use a checksum to determine row changes you need to have a unique key so that individual rows can be identified. (Checksum does not work on text, ntext, image, timestamp or ROWGUID data types.)

    One manner in which I have used checksum is to create a checksum table for each table I wish to check for changes. The table contains a column for each field of the unique key and a column for the calculated checksum. For instance, a checksum table for an employee may look like:

    CREATE TABLE EmployeeCheckSum(

    EmployeeNumber CHAR(10) NOT NULL,

    EmployeeCheckSum INT NOT NULL)

    GO

    ALTER TABLE EmployeeCheckSum ADD CONSTRAINT PK_EmployeeCheckSum PRIMARY KEY (EmployeeNumber)

    GO

    I then make an initial load of the checksum table.

    SELECT EmployeeNumber, EmployeeCheckSum=checksum(*)

    INTO #TempEmployeeCheckSum

    FROM Employee

    GO

    INSERT EmployeeCheckSum

    SELECT * FROM #TempEmployeeCheckSum

    GO

    Finding rows that exist in one table but not the other is a simple query I will not cover here. Finding changed rows is simply a matter of creating a new checksum table and comparing the old checksums with the new ones. For example:

    CREATE TABLE EmployeeCheckSumNew(

    EmployeeNumber CHAR(10) NOT NULL,

    EmployeeCheckSum INT NOT NULL)

    GO

    ALTER TABLE EmployeeCheckSumNew ADD CONSTRAINT PK_EmployeeCheckSumNew PRIMARY KEY (EmployeeNumber)

    GO

    SELECT EmployeeNumber, EmployeeCheckSum=checksum(*)

    INTO #TempNewEmployeeCheckSum

    FROM NewEmployee

    GO

    INSERT EmployeeCheckSumNew

    SELECT * FROM #TempNewEmployeeCheckSum

    GO

    SELECT C1.EmployeeNumber

    FROM EmployeeCheckSum C1, EmployeeCheckSum C2

    WHERE C1.EmployeeNumber=C2.EmployeeNumber AND C1.EmployeeCheckSum<>C2.EmployeeCheckSum

    GO

    For my purposes I only wish to determine if a row has changed and I don't need to know what changed, I just replace the old row with the new one.

    The function which calculates checksum is very fast so for finding changes to very large tables where only a small portion of the rows get modified frequently the checksum method is a good option to have.

    If you just wish to see if a table has had a change you can calculate a table level checksum. For example, the following will generate a table level checksum for the employee table:

    SELECT CHECKSUM_AGG(CHECKSUM(*))

    FROM Employee

    GO

    There is also a BINARY_CHECKSUM function which can be used to check for case-sensitive values. Checksums can also be used to build a hash index on a table by adding a checksum column to a table and indexing the checksum column.

    I believe SQL Server uses checksums internally for hash indexing and replication verification.