Customer matching using in-line function instead of stored procedure

  • I want to iterate over thousands of customers to find the CustomerNumber based on the values passed (SSN, LastName, FirstName, DOB, etc)

    Currently, I'm using the dreaded CURSOR because my "GetCustomerNumber" stored procedure uses temp tables and dynamic sql to generate the select statement based on the values passed (and whether or not they have a value).

    I want to be able to convert this stored procedure into a function so the customer matching process can be in-line/set-based.

    I'm going to try and dummy-up my stored procedure because it's over 250 lines long, and I probably don't need to post all of it to get my point across. Please don't correct the original stored procedure because I'm sure I'll miss some syntax trying to give you a short, condensed version.

    Current stored procedure:

    Create Procedure [dbo].[GetCustomerNumber]

    (

    @SSN varchar(9) = null,

    @LastName varchar(20) = null,

    @FirstName varchar(20) = null,

    @DOB DateTime = Null

    )

    As

    Declare @sql varchar(max)

    Set @sql = '

    Select CustomerNumber

    Into #tempTable

    From Table t

    Where 1 = 1

    '

    If (@SSN IS NOT NULL)

    Set @sql = @sql + 'And SSN = @SSN' --yes, I know the @SSN should be outside of quotes

    If (@LastName IS NOT NULL)

    Set @sql = @sql + 'And LastName = @LastName'

    -- etc...you see where I'm going here

    Exec (@sql)

    Select @MatchCount = Count(1) From #tempTable

    If (@MatchCount <= 0)

    Begin

    -- Do different matching logic here

    End

    Else

    Return (Select CustomerNumber From #tempTable)

    The point I'm trying to get across is the #tempTable and using dynamic sql. I know you can't use (#) temp tables, so how do I go about doing this?

    If I create a "real" table, then when running the customer matching logic as a set-based script, would ALL of the matching processes be inserting into the exact same table, thus creating incorrect results because that table may have hundreds of customer numbers, all being different because of being called from a different row of data?

    What about using a table variable (Declare @table Table)? How would that work with dynamic sql though?

    Thanks

  • Ah, after some more googling, apparently you can't run dynamic sql inside a function...darn.

  • You're dealing with a "catch-all"-query here. But that might be just because of the way the "outer logic" is implemented.

    How do you use this sproc (What code calls it)?

    It might be possible to change it into a left outer join together with some CASE functions.

    Or, if it can't be done that way there might be the option of a sproc dealing with all rows at once using a temp table.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Here's the portion of code that calls the stored procedure:

    USE [IARTS]

    GO

    Set NoCount On

    Declare pwdCursor Cursor Local Forward_Only For

    Select CustomerNumber,

    SSN,

    DLN,

    LastName,

    FirstName,

    MiddleName,

    Suffix,

    DOB

    From pwdList

    Where NewCustomerNumber IS NULL

    Declare @oldCustomerNumber Int,

    @SSN varchar(9),

    @DLN varchar(15),

    @LastName varchar(40),

    @FirstName varchar(40),

    @MiddleName varchar(40),

    @Suffix varchar(5),

    @DOB varchar(8),

    @newCustomerNumber Int,

    @output varchar(max)

    Open pwdCursor

    Fetch Next From pwdCursor Into @oldCustomerNumber, @SSN, @DLN, @LastName, @FirstName, @MiddleName, @Suffix, @DOB

    While (@@Fetch_Status = 0)

    Begin

    Exec @newCustomerNumber = IARTS..PWDMatchCustomer @SSN = @SSN, @DLN = @DLN, @LastName = @LastName, @FirstName = @FirstName, @MiddleName = @MiddleName, @DOB = @DOB

    Update pwdList

    Set NewCustomerNumber = @newCustomerNumber

    Where CustomerNumber = @oldCustomerNumber

    Fetch Next From pwdCursor Into @oldCustomerNumber, @SSN, @DLN, @LastName, @FirstName, @MiddleName, @Suffix, @DOB

    End

    Close pwdCursor

    Deallocate pwdCursor

    Here's the entire stored procedure:

    CREATE Procedure [dbo].[PWDMatchCustomer]

    (

    @SSN varchar(9) = Null,

    @DLN varchar(15) = Null,

    @LastName varchar(40) = Null,

    @FirstName varchar(40) = Null,

    @MiddleName varchar(40) = Null,

    @DOB varchar(8) = Null

    )

    As

    Set NOCOUNT On

    Declare @True Bit

    Declare @False Bit

    Declare @CustIdentType_SSN Int

    Declare @CustIdentType_DLN Int

    Set @True = 0x1

    Set @False = 0x0

    Set @CustIdentType_SSN = IARTS.dbo.CDV('CustIdentType', 'SSN')

    Set @CustIdentType_DLN = IARTS.dbo.CDV('CustIdentType', 'DriversLicense')

    Declare @CustomerNumber Int

    Declare @SQL varchar(max)

    Declare @MatchCount Int

    If ((@SSN IS NULL Or @SSN = '') And (@DLN IS NULL Or @DLN = '') And (@LastName IS NULL Or @LastName = '')

    And (@FirstName IS NULL Or @FirstName = '') And (@MiddleName IS NULL Or @MiddleName = ''))

    Begin

    Set @CustomerNumber = -1

    GOTO RETURNSECTION

    End

    ----------------------------------------------------------------------------------------------------------------

    -- First, try and match on all fields passed

    ----------------------------------------------------------------------------------------------------------------

    Set @SQL = 'Select c.Cust_No

    Into ##matchList

    From IARTS..Cust c

    Inner Join IARTS..Cust_Ident ci On ci.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_Name cn On cn.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_No co On co.Cust_No = c.Cust_No

    Where c.End_DT IS NULL

    And ci.End_DT IS NULL

    And cn.End_DT IS NULL

    And co.Active = ' + Convert(varchar, @True) + CHAR(13)

    If (@LastName IS NOT NULL And @LastName <> '')

    Set @SQL = @SQL + ' And cn.Last_Name = ''' + @LastName + '''' + CHAR(13)

    If (@FirstName IS NOT NULL And @FirstName <> '')

    Set @SQL = @SQL + ' And cn.First_Name = ''' + @FirstName + '''' + CHAR(13)

    If (@MiddleName IS NOT NULL And @MiddleName <> '')

    Set @SQL = @SQL + ' And cn.Middle_Name = ''' + @MiddleName + '''' + CHAR(13)

    If (@SSN IS NOT NULL And @SSN <> '' And LEN(@SSN) = 9)

    Set @SQL = @SQL + ' And ci.Ident_No = ''' + @SSN + ''' And ci.Cust_Ident_Type_ID = ' + Convert(varchar, @CustIdentType_SSN) + CHAR(13)

    If (@DLN IS NOT NULL And @DLN <> '' And LEN(@DLN) = 9)

    Set @SQL = @SQL + ' And ci.Ident_No = ''' + @DLN + ''' And ci.Cust_Ident_Type_ID = ' + Convert(varchar, @CustIdentType_DLN) + CHAR(13)

    If (@DOB IS NOT NULL And @DOB <> '' And LEN(@DOB) = 8 And ISDATE(@DOB) = 1)

    Set @SQL = @SQL + ' And c.Date_Of_Birth = Convert(DateTime, ''' + @DOB + ''')' + CHAR(13)

    --RAISERROR(@SQL, 0, 1) WITH NOWAIT

    Exec(@SQL)

    Select @MatchCount = Count(1) From ##matchList

    If (@MatchCount <= 0)

    Begin

    Set @CustomerNumber = -1

    --RAISERROR('No matches', 0, 1) WITH NOWAIT

    --GOTO RETURNSECTION

    End

    If (@MatchCount > 1)

    Begin

    --RAISERROR('Multiple matches', 0, 1) WITH NOWAIT

    GOTO MULTIPLEMATCHESSECTION

    End

    ----------------------------------------------------------------------------------------------------------------

    -- Now let's search just on the Name portion

    ----------------------------------------------------------------------------------------------------------------

    If (OBJECT_ID('tempdb..##matchList') IS NOT NULL)

    Begin

    --RAISERROR('Dropping temp table ##matchList', 0, 1) WITH NOWAIT

    Drop Table ##matchList

    End

    Set @SQL = 'Select c.Cust_No

    Into ##matchList

    From IARTS..Cust c

    Inner Join IARTS..Cust_Ident ci On ci.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_Name cn On cn.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_No co On co.Cust_No = c.Cust_No

    Where c.End_DT IS NULL

    And ci.End_DT IS NULL

    And cn.End_DT IS NULL

    And co.Active = ' + Convert(varchar, @True) + CHAR(13)

    If (@LastName IS NOT NULL And @LastName <> '')

    Set @SQL = @SQL + ' And cn.Last_Name = ''' + @LastName + '''' + CHAR(13)

    If (@FirstName IS NOT NULL And @FirstName <> '')

    Set @SQL = @SQL + ' And cn.First_Name = ''' + @FirstName + '''' + CHAR(13)

    If (@MiddleName IS NOT NULL And @MiddleName <> '')

    Set @SQL = @SQL + ' And cn.Middle_Name = ''' + @MiddleName + '''' + CHAR(13)

    If (@DOB IS NOT NULL And @DOB <> '' And LEN(@DOB) = 8 And ISDATE(@DOB) = 1)

    Set @SQL = @SQL + ' And c.Date_Of_Birth = Convert(DateTime, ''' + @DOB + ''')' + CHAR(13)

    --RAISERROR(@SQL, 0, 1) WITH NOWAIT

    Exec(@SQL)

    Select @MatchCount = Count(1) From ##matchList

    If (@MatchCount <= 0)

    Begin

    Set @CustomerNumber = -1

    --RAISERROR('No matches', 0, 1) WITH NOWAIT

    --GOTO RETURNSECTION

    End

    If (@MatchCount > 1)

    Begin

    --RAISERROR('Multiple matches', 0, 1) WITH NOWAIT

    GOTO MULTIPLEMATCHESSECTION

    End

    ----------------------------------------------------------------------------------------------------------------

    -- Now let's search on just the ident type portion

    ----------------------------------------------------------------------------------------------------------------

    If (OBJECT_ID('tempdb..##matchList') IS NOT NULL)

    Begin

    --RAISERROR('Dropping temp table ##matchList', 0, 1) WITH NOWAIT

    Drop Table ##matchList

    End

    Set @SQL = 'Select c.Cust_No

    Into ##matchList

    From IARTS..Cust c

    Inner Join IARTS..Cust_Ident ci On ci.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_Name cn On cn.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_No co On co.Cust_No = c.Cust_No

    Where c.End_DT IS NULL

    And ci.End_DT IS NULL

    And cn.End_DT IS NULL

    And co.Active = ' + Convert(varchar, @True) + CHAR(13)

    If (@SSN IS NOT NULL And @SSN <> '' And LEN(@SSN) = 9)

    Set @SQL = @SQL + ' And ci.Ident_No = ''' + @SSN + ''' And ci.Cust_Ident_Type_ID = ' + Convert(varchar, @CustIdentType_SSN) + '' + CHAR(13)

    If (@DLN IS NOT NULL And @DLN <> '' And LEN(@DLN) = 9)

    Set @SQL = @SQL + ' And ci.Ident_No = ''' + @DLN + ''' And ci.Cust_Ident_Type_ID = ' + Convert(varchar, @CustIdentType_DLN) + '' + CHAR(13)

    --RAISERROR(@SQL, 0, 1) WITH NOWAIT

    Exec(@SQL)

    Select @MatchCount = Count(1) From ##matchList

    If (@MatchCount <= 0)

    Begin

    Set @CustomerNumber = -1

    --RAISERROR('No matches', 0, 1) WITH NOWAIT

    GOTO RETURNSECTION

    End

    If (@MatchCount > 1)

    Begin

    --RAISERROR('Multiple matches', 0, 1) WITH NOWAIT

    GOTO MULTIPLEMATCHESSECTION

    End

    ----------------------------------------------------------------------------------------------------------------

    MULTIPLEMATCHESSECTION:

    ----------------------------------------------------------------------------------------------------------------

    -- If we are here, that means we had multiple matches (could be Merge Customer candidates) so we want to check to

    -- see if they are actually the same customer, if so, return the CustomerNumber based on the 'ConfidenceRating'

    Declare @MultipleMatchCount Int

    Select @MultipleMatchCount = Count(1)

    From IARTS..Cust c

    Inner Join IARTS..Cust_Ident ci On ci.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_Name cn On cn.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_No co On co.Cust_No = c.Cust_No

    Where c.End_DT IS NULL

    And ci.End_DT IS NULL

    And cn.End_DT IS NULL

    And co.Active = @True

    And cn.Last_Name = @LastName

    And cn.First_Name = @FirstName

    And cn.Middle_Name = @MiddleName

    If (@MultipleMatchCount = @MatchCount)

    Begin

    --RAISERROR('Multiple matches are the same customer', 0, 1) WITH NOWAIT

    -- Get the CustomerNumber based on the ConfidenceRating

    Select Top 1

    @CustomerNumber = c.Cust_No

    From ##matchList ml

    Inner Join IARTS..Cust c On c.Cust_No = ml.Cust_No

    Inner Join IARTS..Cust_Ident ci On ci.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_Name cn On cn.Cust_No = c.Cust_No

    Inner Join IARTS..Cust_No co On co.Cust_No = c.Cust_No

    Where c.End_DT IS NULL

    And ci.End_DT IS NULL

    And cn.End_DT IS NULL

    And co.Active = @True

    Order By

    Case When co.DriverAuthenticated = @True Then 2

    When co.DriverUnique = @True Then 1

    Else 0

    End Desc

    End

    Else

    Begin

    Set @CustomerNumber = -1

    --RAISERROR('No matches', 0, 1) WITH NOWAIT

    GOTO RETURNSECTION

    End

    ----------------------------------------------------------------------------------------------------------------

    RETURNSECTION:

    ----------------------------------------------------------------------------------------------------------------

    If (OBJECT_ID('tempdb..##matchList') IS NOT NULL)

    Begin

    --RAISERROR('Dropping temp table ##matchList', 0, 1) WITH NOWAIT

    Drop Table ##matchList

    End

    Return @CustomerNumber

  • Please also post the code for the IARTS.dbo.CDV() function.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Oh, it just gets the primary key value for the 'text' version of something.

    CustIdentType is the table name, get the primary key value for 'individual'. (Edit: woops, I mean 'DriversLicense' or 'SSN' as the value to lookup, not 'individual').

    All of our 'LookupTables' have their values in a 'CodeDep' table so we don't have to know what the primary key value is when joining tables, etc...

    Anyway, here's that code:

    ALTER FUNCTION [dbo].[CDV]

    (

    @tableName varchar(50),

    @valueName varchar(50)

    )

    RETURNS int

    AS

    BEGIN

    DECLARE @retVal int

    SELECT @retVal = ValueID FROM CodeDep

    WHERE TableName = @tableName AND ValueName = @valueName

    RETURN @retVal

    END

  • I'm wondering why the code is written in a way that even if the first search block (match on all fields passed) returns a single CustomerNumber, the code will proceed with the next step (search just on the Name portion) and so forth (even if single match, then search on just the ident type portion).

    Is it really required to work that way? Why?

    It would be interesting to see some (fake!!) sample data from the pwdList table (including tricky ones). Also, table def and sample data for all other tables invovled would help us to test our solutions.

    As an alternative, you could use the same concept and just change it to deal with all rows from your pwdCursor c.u.r.s.o.r. *cough*.

    The function IARTS.dbo.CDV() should also be rewritten as an iTvF and called using CROSS APPLY. The function should look like

    ALTER FUNCTION [dbo].[itvf_CDV]

    (

    @tableName varchar(50),

    @valueName varchar(50)

    )

    RETURNS table

    AS

    RETURN

    (

    SELECT ValueID FROM CodeDep

    WHERE TableName = @tableName AND ValueName = @valueName

    )



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (9/16/2011)


    I'm wondering why the code is written in a way that even if the first search block (match on all fields passed) returns a single CustomerNumber, the code will proceed with the next step (search just on the Name portion) and so forth (even if single match, then search on just the ident type portion).

    Is it really required to work that way? Why?

    It would be interesting to see some (fake!!) sample data from the pwdList table (including tricky ones). Also, table def and sample data for all other tables invovled would help us to test our solutions.

    As an alternative, you could use the same concept and just change it to deal with all rows from your pwdCursor c.u.r.s.o.r. *cough*.

    The function IARTS.dbo.CDV() should also be rewritten as an iTvF and called using CROSS APPLY. The function should look like

    ALTER FUNCTION [dbo].[itvf_CDV]

    (

    @tableName varchar(50),

    @valueName varchar(50)

    )

    RETURNS table

    AS

    RETURN

    (

    SELECT ValueID FROM CodeDep

    WHERE TableName = @tableName AND ValueName = @valueName

    )

    Wow! Gaping hole in my code...thanks for catching that (helps to have new eyes). If it returns a single match for CustomerNumber, it should definitely exit out of the proc. I'm wondering if I didn't notice this because this is just a 'cleanup' script for customers that had multiple records in our system that couldn't be matched...either way, that code needs to be there.

    As for the CDV function, I can't change that because we have thousands of scripts/stored procedures/functions that rely on that so...

    Here's the create table script for pwdList:

    Create Table dbo.pwdList

    (

    pwdListID Int Identity(1,1),

    CustomerNumber Int,

    SSN varchar(9) Null,

    DLN varchar(15) Null,

    LastName varchar(40) Null,

    FirstName varchar(40) Null,

    MiddleName varchar(40) Null,

    Suffix varchar(5) Null,

    DOB varchar(8) Null,

    NewCustomerNumber Int Null

    )

    Create NonClustered Index idxNC_pwdList_CustomerNumber On [dbo].[pwdList]

    (

    CustomerNumber Asc

    )

    Create NonClustered Index idxNC_pwdList_NewCustomerNumber On [dbo].[pwdList]

    (

    NewCustomerNumber Asc

    )

    I'll have to take some time on generating some sample data because the scenario is a little tricky (like you're requesting).

    Basically, the original matching logic was going strictly against SSN and/or DriversLicense, but found out after the conversion (you helped me alot on that conversion script...if you recall) that some of the SSN's/Drivers License numbers in the old system were 'fat-fingered', aka one or two numbers off, so it matched on the completely wrong customer. This is the new version as I had to take into account name, dob, etc.

    I'll try and get you that sample/test data on Monday since I need to leave for the day now.

    Thanks again Lutz for looking at my code!!!

  • Hi Anthony,

    here's a first draft describing the way how I would do it. Of course untested due to missing source tables and sample data.

    It might help to improve performance even more if the "Basic Join" of Cust on Cust_Ident, Cust_Name, and Cust_No woud be separated into another temp table. This would allow to add indexes as needed without affecting the main tables. "It Depends". 😀



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply