Problem: Equal Checksums for different values

  • Hello,

    we need to determine, if contents over several tables are equal. We used the functions checksum() and checksum_agg().

    But now we discovered equal checksums for different values. I show you 2 statements which result in the same checksums. Do you know why?

    Thanks,

    Tobias

    select checksum_agg(checksum(v))

    from

    (

    select 'DZNAKNL' as v

    UNION

    select 'DZNZINL' as v

    ) ua

    select checksum_agg(checksum(v))

    from

    (

    select 'DZNAKFO' as v

    UNION

    select 'DZNZIFO' as v

    ) ua

  • From Books Online:

    "CHECKSUM satisfies the properties of a hash function: CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For the purpose of this definition, NULL values of a given type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also usually changes. However, there is a small chance that the checksum will not change."

    "CHECKSUM_AGG can be used along with BINARY_CHECKSUM to detect changes in a table. The order of the rows in the table does not affect the result of CHECKSUM_AGG. In addition, CHECKSUM_AGG functions may be used with the DISTINCT keyword and the GROUP BY clause. If one of the values in the expression list changes, the checksum of the list also usually changes. However, there is a small chance that the checksum will not change."

  • The problem is that CheckSum_Agg just does a simple XOR. You can cheat... it's no guarantee, but it's more likely to find something different...

    DECLARE @CS1 bigint

    SET @CS1 = 0

    DECLARE @Sign1 INT

    SET @Sign1 = 1

    select @CS1 = @CS1^(CHECKSUM(v)*sign(@Sign1)),

    @Sign1 = @Sign1*(-1)

    from

    (

    select 'DZNAKNL' as v

    UNION

    select 'DZNZINL' as v

    ) ua

    DECLARE @cs2 bigint

    SET @cs2 = 0

    DECLARE @Sign2 INT

    SET @Sign2 = 1

    select @cs2 = @cs2^(CHECKSUM(v)*sign(@Sign2)),

    @Sign2 = @Sign2*(-1)

    from

    (

    select 'DZNAKFO' as v

    UNION

    select 'DZNZIFO' as v

    ) ua

    SELECT @CS1,@CS2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply