Performance tuning, alternative to cursors for my situation?

  • Hi everyone,

    I'm trying to come up with something that is better performance-wise with a stored procedure I've built. I'm not going to go into too much detail but here's a summary:

    I've got a Client_merge table containing two columns: Main_client, Dupe_client (both int type) and I want to loop through every tables in my database and update every values that contains the Dupe_client_id to the Main_client_id. It's basically for merging two client files into one.

    Pretty easy but what I've built is definitely not great performance-wise. I'm always looking for new stuff to learn so I would be very happy if someone would explain me a better way to do this.


    SET NOCOUNT ON;
    DECLARE @No_client_main as int;
    DECLARE @No_client_dupe as int;

    DECLARE @Code_store_main as nvarchar(3);
    DECLARE @Code_store_dupe as nvarchar(3);
    DECLARE @Code_store_ref as nvarchar(3);

    DECLARE @No_file_main as int;
    DECLARE @No_file_main_vc as int;

    DECLARE @No_file_dupe as int;
    DECLARE @No_file_dupe_vc as int;

    DECLARE @No_file_ref as int;
    DECLARE @No_file_vc_ref as int;

    DECLARE @Derniere_visite_main as datetime;
    DECLARE @Derniere_visite_dupe as datetime;

    DECLARE @Memo_merge as nvarchar(200);
    DECLARE @Memo_merge_dupe as nvarchar(200);

    DECLARE @reversed_main_client as bit;

    DECLARE @cursorclient as cursor;

    SET @cursorclient = CURSOR FOR
        SELECT top 1000    No_client_main, No_client_dupe
        FROM    client_import_merge
        WHERE    Transfered = 0
    OPEN @cursorclient

    FETCH NEXT FROM @cursorclient INTO
        @No_client_main, @No_client_dupe;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION
                SELECT TOP 1 @Code_store_main = Code_store, @No_file_main = No_file, @No_file_main_vc = No_file_VC FROM associe_a where no_client = @No_client_main
                SELECT TOP 1 @Code_store_dupe = Code_store, @No_file_dupe = No_file, @No_file_dupe_vc = No_file_VC FROM associe_a where no_client = @No_client_dupe

                SET @reversed_main_client = 0;
                SET @No_file_ref = @No_file_dupe;
                SET @No_file_vc_ref = @No_file_dupe_vc;
                SET @Code_store_ref = @Code_store_dupe;

                IF @Code_store_main = @Code_store_dupe AND @No_file_main > @No_file_dupe
                    BEGIN
                        SET @reversed_main_client = 1;
                        SET @No_file_ref = @No_file_main;
                        SET @No_file_vc_ref = @No_file_main_vc;
                        SET @Code_store_ref = @Code_store_main;

                        UPDATE    Associe_a
                        SET
                                No_file = @No_file_dupe,
                                No_file_vc = @No_file_dupe_vc
                        WHERE    No_client = @No_client_main
                        AND        Code_store = @Code_store_main

                        UPDATE    Associe_a
                        SET
                                No_file = @No_file_main,
                                No_file_vc = @No_file_main_vc
                        WHERE    No_client = @No_client_dupe
                        AND        Code_store = @Code_store_dupe
                    END

                -- Do a bunch of updates like the one below..
                UPDATE    Facture
                SET
                        No_client = @No_client_main
                WHERE    No_client = @No_client_dupe

                SELECT @Derniere_visite_main = Derniere_visite FROM Suivi WHERE No_client = @No_client_main
                SELECT @Derniere_visite_dupe = Derniere_visite FROM Suivi WHERE No_client = @No_client_dupe

                IF DATEDIFF(day, ISNULL(@Derniere_visite_main, '1900-01-01 00:00:00.000'), ISNULL(@Derniere_visite_dupe, '1900-01-01 00:00:00.000')) >= 0
                    BEGIN
                        UPDATE    Suivi
                        SET
                                Derniere_visite = @Derniere_visite_dupe
                        WHERE    No_client = @No_client_main
                    END

                SET @Memo_merge = 'Fusion avec file ' + RTRIM(@Code_store_ref) + '-' + CONVERT(nvarchar(12), @No_file_ref) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_vc_ref, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';
                SET @Memo_merge_dupe = 'Fusion avec file ' + RTRIM(@Code_store_main) + '-' + CONVERT(nvarchar(12), @No_file_main) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_main_vc, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';

                INSERT INTO Memo(No_client, Date_memo, Memo)
                VALUES (@No_client_main, getdate(), @Memo_merge)

                INSERT INTO Memo(No_client, Date_memo, Memo)
                VALUES (@No_client_dupe, getdate(), @Memo_merge_dupe)

                UPDATE    client
                SET
                        actif = 0
                WHERE    No_client = @No_client_dupe

                -- Log action
                UPDATE    client_import_merge
                SET
                        transfered = 1,
                        transfered_at = GETDATE(),
                        reversed = @reversed_main_client
                WHERE    no_client_main = @No_client_main
                AND        no_client_dupe = @No_client_dupe
            COMMIT TRANSACTION
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0
                ROLLBACK TRANSACTION
                PRINT ERROR_SEVERITY()
                PRINT ERROR_MESSAGE()
                PRINT ERROR_STATE()
        END CATCH
    FETCH NEXT FROM @cursorclient INTO
        @No_client_main, @No_client_dupe;
    END
    CLOSE @cursorclient;
    DEALLOCATE @cursorclient;

    Maybe including everything was not necessary but did it just in case. Right now I can merge about 7 clients per second, but I have over 50k clients to update so, if there is a better way, please let me know.

    Thanks

  • Start by declaring the cursor as fast_forward!
    😎

    For proper help, you'll need to provide schema, realistic test data and desired results

  • Eirikur Eiriksson - Tuesday, August 29, 2017 1:38 PM

    Start by declaring the cursor as fast_forward!
    😎

    For proper help, you'll need to provide schema, realistic test data and desired results

    I didn't even know that this existed so I'm gonna look at this, thanks.

  • You may want to use this combination <fast forward, local>

    Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options

  • SMNayak - Tuesday, August 29, 2017 11:27 PM

    You may want to use this combination <fast forward, local>

    Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options

    Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.

  • sibs132 - Wednesday, August 30, 2017 7:03 AM

    SMNayak - Tuesday, August 29, 2017 11:27 PM

    You may want to use this combination <fast forward, local>

    Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options

    Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.

    Don't forget NOT to use a procedural approach. SQL is best for set based operations. But in some situations, you can't avoid writing a loop.

  • Looking at the code, at least at first glance, it looks like this doesn't need a cursor at all. 
    You should be able to these updates in sets....

    IF OBJECT_ID('tempdb..#MergeData', 'U') IS NOT NULL
    DROP TABLE #MergeData;

    SELECT DISTINCT
        Code_store_main    = aam.Code_store,
        No_file_main    = aam.No_file,
        No_file_main_vc = aam.No_file_VC,
        Code_store_dupe    = aad.Code_store,
        No_file_dupe    = aad.No_file,
        No_file_dupe_vc = aad.No_file_VC
        INTO #MergeData
    FROM
        dbo.client_import_merge cim
        JOIN associe_a aam
            ON cim.No_client_main = aam.no_client
        JOIN associe_a aad
            ON cim.No_client_dupe = aad.no_client;

    UPDATE aa SET
        aa.No_file = md.No_file_dupe,
      aa.No_file_vc = md.No_file_dupe_vc
    FROM
        Associe_a aa
        JOIN #MergeData md
            ON aa.No_client = md.No_client_main
            AND aa.Code_store = md.Code_store_main
    WHERE
        aa.Code_store_main = aa.Code_store_dupe
        AND aa.No_file_main > aa.No_file_dupe;

    -- ... an so on...

  • Jason A. Long - Wednesday, August 30, 2017 11:12 AM

    Looking at the code, at least at first glance, it looks like this doesn't need a cursor at all. 
    You should be able to these updates in sets....

    [/quote]

    I have to agree.  I didn't see anything in there to merit a cursor.

    sibs132, I'd suggest creating a copy of the database on a test server and playing with it there.  You'll want to make sure you cover everything and a test database is the place to play with that.  Based on my experience, I think you'll find that the work is worth the time you'll save if you do this type of thing semi-frequently.  The performance comparison of RBAR (row by agonizing row) versus set-based processing in SQL Server isn't much of a comparison.

    Please post back if you have questions.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply