Search for partial credit card number

  • Hi,

    We have several million credit card numbers in a table.  In using the application, our operators want to type in just the last x digits of the card no, and then select the correct card from a list.

    Using WHERE CardNumber LIKE %PartOfCardNumber% is too slow (nice table scan there...).  What ways have others come up to achieve this with acceptable performance?

    Thanks,

    Steve

  • I got 2 ideas.

    1 - you can make another field that keeps lets say the 8 last numbers of the card and do a search on those number.. however you might need to check for only the last 6 or worst the last n so you'd be back to square one.

    2 - You could keep the card number in the reversed order and index that field. That way you could do a search with "MyFiled like '1234%'" which will at least buy you an index scan if not an index seek (sorry can't do test at the moment). But since you kill one wildcard, the search will already be much faster.

  • Steve,

    Just checking - There is an index on the card number field isn't there?

    Have fun

    Steve

    We need men who can dream of things that never were.

  • WHERE CardNumber LIKE %PartOfCardNumber%

    With a search condition like that, no index is terribly useful.

    To add to Remi, what about a computed column? You can index it like this

    CREATE TABLE t2

    (

     c1 INT PRIMARY KEY

     , c2 VARCHAR(20)

     , my_dummy AS RIGHT(c2,9)

    )

    CREATE UNIQUE NONCLUSTERED INDEX ix_my_dummy ON t2(my_dummy)

    INSERT INTO t2 VALUES(1,'3900-4000-2378')

    SELECT * FROM t2

    DROP TABLE t2

    I must admit, I don't know how this will perform on a table with millions of rows.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Hi Guys,

    Just a thought as I haven't got too much time to test it - how much difference to performance (If any) would this give:-

    SELECT CardNumber FROM CreditCardDetails

    WHERE RIGHT(CardNumber, 4) LIKE 1234

    Removing both wildcards...

    Have fun

    Steve

    We need men who can dream of things that never were.

  • Using

    CardNumber LIKE '%1234'

    or

    RIGHT(CardNumber,4) = '1234'

    will at best do an Index Scan

    The only way I can see to achieve an Index Seek to do like Frank's suggestion, a slight variation

    CREATE TABLE t2

    (c1 INT PRIMARY KEY

     , c2 VARCHAR(20)

     , my_dummy AS REVERSE(c2))

    CREATE UNIQUE NONCLUSTERED INDEX ix_my_dummy ON t2(my_dummy)

    INSERT INTO t2 VALUES(1,'3900-4000-2378')

    SELECT * FROM t2 WHERE my_dummy LIKE REVERSE('2378')+'%'

    DROP TABLE t2

    Far away is close at hand in the images of elsewhere.
    Anon.

  • In your case you might want to consider storing the number twice and use a bigint or decimal(16,0) datatype.

    The first field would be stored in left to right order, the second right to left order (this could also be a computed column using reverse to take the first column and reverse order the data and still be indexable). Indexing both seperately.

    Now when they are searching you can simply take there search criteria use revers to flip it right to left and search the right to left version. And because you are not doing like '%234' but instead are doing like '432%' it can take advantage of the index on that column.

    Or you could look into using full text indexes or a similar 3rd party indexing tool to see if you can get better performance that way.

  • Same here, don't have time to test it, but what about creating a view where you only have the last 4 digits of CC in a column?  Do search queries against that view with an "=" operator rather than a LIKE wildcard.

    Again, make sure you create an index on that column.

  • Althought _ is a great option to % especially for fixing positions I would still need to know exactly how they stored it. Personally I use decimal(16,0) and let the presentation layer handle the rest if it were up to me but he may have used all numbers or the pattern is not presnted.

    However that said I would have to suggest instead of _ use LIKE '[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-' + lastfour.

    But my understanding is it really isn't an issue of results but the preformance of the search.

    That is why I suggest that if he is going to do a partial he reverse in a computed column and reverse their criteria.

    In addition Only Mastercard, VISA and maybe Discover use the 16 digit method, but American Express does not last I looked and the pattern is different.

    The reality is that also on Mastercard, Visa, and Discover the first 6 Digits represent the type of card and institution backing it as assigned by the Federal Reserve I do believe, the rest is account specifc for the most part.

    My personal suggestion is forgo the pattern period especially if you accept multiple type of cards or may even add types in the future as that is entirel presentational type stuff.

    Then go back to my other suggestion and use the Reverse method with and index and becaue you are not leading with a wildcard for the search you should see a huge performance increase.

  • A variation of Steve's answer, plus adopting msahoo's suggestion to use '=' ... and I wonder about the performance of this option.

    SELECT CardNumber FROM CreditCardDetails

    WHERE RIGHT(CardNumber,LEN(srcharg)) = srcharg

    ... with "srcharg" being the variable containing the entered value.

     Surely the indexing suggested in posts above would improve performance, however, I wonder how much more space the db will grow.  Ah, yes, space is cheap ... relatively.  It is great to have options, though.

    Yall keep smilin'    

     

     

     

    Norm Johnson

    "Keep smiling ... it gives your face something happy to do
            ... and it makes people wonder what you're up to!"
       

  • this statement or the LIKE version thereof would both work fine. However, given the number of records, creating a new, calculated column based on the last 4 digits and then indexing that would give you a massive performance improvement.

    ie. new column is based on the formula "RIGHT(CardNumber,LEN(srcharg))", and index it!

    I do the equivalent with Text based columns all the time ... although in those cases, u use

    CHECKSUM(atextfield), and the subsequent queries u write are

    SELECT * FROM aTable WHERE chkField = CHECKSUM('search text') AND atextField = 'search text'.

    The only issues you get with this occur if you haven't got all ur ANSI settings correct at the time of creation of the SP or Table ... so having ...

    "SET

    ANSI_NULLS, QUOTED_IDENTIFIER, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL ON

    SET

    NUMERIC_ROUNDABORT OFF"

    ... before any DDL statements in ur DB creation scripts is essential. Thankfully, my DB build tool does all that for me!!

     

  • Would be really nice if MS offered other types of indexing methods and would allow you to have more custimization over indexing.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply