To use a checksum to determine row changes you need to have a unique key so that individual rows can be identified. (Checksum does not work on text, ntext, image, timestamp or ROWGUID data types.)
One manner in which I have used checksum is to create a checksum table for each table I wish to check for changes. The table contains a column for each field of the unique key and a column for the calculated checksum. For instance, a checksum table for an employee may look like:
CREATE TABLE EmployeeCheckSum(
EmployeeNumber CHAR(10) NOT NULL,
EmployeeCheckSum INT NOT NULL)
GO
ALTER TABLE EmployeeCheckSum ADD CONSTRAINT PK_EmployeeCheckSum PRIMARY KEY (EmployeeNumber)
GO
I then make an initial load of the checksum table.
SELECT EmployeeNumber, EmployeeCheckSum=checksum(*)
INTO #TempEmployeeCheckSum
FROM Employee
GO
INSERT EmployeeCheckSum
SELECT * FROM #TempEmployeeCheckSum
GO
Finding rows that exist in one table but not the other is a simple query I will not cover here. Finding changed rows is simply a matter of creating a new checksum table and comparing the old checksums with the new ones. For example:
CREATE TABLE EmployeeCheckSumNew(
EmployeeNumber CHAR(10) NOT NULL,
EmployeeCheckSum INT NOT NULL)
GO
ALTER TABLE EmployeeCheckSumNew ADD CONSTRAINT PK_EmployeeCheckSumNew PRIMARY KEY (EmployeeNumber)
GO
SELECT EmployeeNumber, EmployeeCheckSum=checksum(*)
INTO #TempNewEmployeeCheckSum
FROM NewEmployee
GO
INSERT EmployeeCheckSumNew
SELECT * FROM #TempNewEmployeeCheckSum
GO
SELECT C1.EmployeeNumber
FROM EmployeeCheckSum C1, EmployeeCheckSum C2
WHERE C1.EmployeeNumber=C2.EmployeeNumber AND C1.EmployeeCheckSum<>C2.EmployeeCheckSum
GO
For my purposes I only wish to determine if a row has changed and I don't need to know what changed, I just replace the old row with the new one.
The function which calculates checksum is very fast so for finding changes to very large tables where only a small portion of the rows get modified frequently the checksum method is a good option to have.
If you just wish to see if a table has had a change you can calculate a table level checksum. For example, the following will generate a table level checksum for the employee table:
SELECT CHECKSUM_AGG(CHECKSUM(*))
FROM Employee
GO
There is also a BINARY_CHECKSUM function which can be used to check for case-sensitive values. Checksums can also be used to build a hash index on a table by adding a checksum column to a table and indexing the checksum column.
I believe SQL Server uses checksums internally for hash indexing and replication verification.