Performance Help: Binary Comparisons vs Full Normalization?

  • I have a pretty intensive query that I need performance help on. There are ~1 million de-normalized 'adjustment rows' that I am checking about 20 different conditions on, but each of these conditions has multiple possibile entries.

    For example, one condition is 'what counties apply' to each row? Now I could cross-join a table listing every county that applies to every row, which would mean 1 million rows X 3,000 potential counties. And for every one of these 20 condition, I'd need to be joining tables for each of these lookups.

    Instead, I was told to do a binary comparison of some sort, but I'm not exactly sure of how to do it. This way, I'm not needing to do any joins, but just have a large binary string, with bits representing each county.

    Since each query I know the exact county searched, I can see if each row applies (along with each of the other conditions I must check vs the other binary strings).

    I accomplished this using:

    AND Substring(County, @CountyIndex, 1) = '1'

    I have a character string for county, which is painfully slow when running all of these checks.

    My hope is if the county in the lookup is 872, I can just scan the table, looking at bit #872 for the county field in each record, rather than joining huge tables for every one of these fixed fields I need to test.

    My guess is the fastest way is some sort of binary string comparisons, but I can't find any good resources on the subject. PLEASE HELP!

  • I still have only an incredibly foggy of what you're trying to do, so I have no clue if you're being given good advice or not. For one - you're going to need a lot of varbinary data to track 3000 flags, never mind building the binary data tracking the flags. Sounds to me that you'd have to do your cross join just to get to the point where you can BUILD the flags, etc... But with no specifics - no way to tell.

    That being said, once you have multiple bits organized as bitflags, running binary logic to run single or multi-criteria filters is simply through the use of a binary AND.

    For example - assuming that in a given byte, you were tracking the following counties:

    bit# County

    0 Johnson

    1 Wake

    2 Alameda

    3 Durham

    4 Philadelphia

    5 DeKalb

    6 Chester

    7 York

    A record relevant to York, Alameda and Johnson only would look like 10000101 (decimal = 133). Finding everyone that matches a specific combination (like - York and Alameda) is simply a matter of doing a binary AND between the column and the filter. and you compare the result to the filter to know. So in this case:

    1 0 0 0 0 1 0 1 (The column value)

    & (bitwise AND is & in T-SQL)

    1 0 0 0 0 1 0 0 (the filter=York +Alameda)

    ---------------

    1 0 0 0 0 1 0 0 RESULT

    Since the result=the filter - then the record qualifies.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • thx for the tip on the logical AND. Yes it does seem like quite a bit of work...not sure if its worth it.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply