cursor elimination (performance improvement)

  • I am using a system to determine a country name from a telephone prefix. Right now I cursor through each of the prefixes in my codes table and update my other table with the country name. Like so:

    declare x cursor for select prefix, country from countrycodes order by prefix desc

    open x

    while 1=1 begin

    fetch next from x into @prefix, @country

    if (@@fetch_status<>0) break

    update calltable set country=@country where left(callnumber,len(@prefix))=@prefix and country is null

    end

    close x deallocate x

    This is a very slow process when you are dealing with millions of records. How can I eliminate the cursor from this scenario?

  • This should do it.

    UPDATE calltable

    SET ct.country = cc.country

    FROM calltable ct

    INNER JOIN

    countrycodes cc

    ON

    left(ct.callnumber,len(cc.prefix)) = cc.prefix

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • I am having a little trouble getting this to work nicely. It is actually running about twice as long as the cursor. Also it is defaulting to the proper countrycode. Here is a sample of the prefix table (sorted descending)

    998 Uzbekistan

    996 Kyrgyzstan

    995328 Georgia - Audiotext

    995325 Georgia - Audiotext

    995322 Georgia - Audiotext

    995321 Georgia - Audiotext

    995 Georgia

    994 Azerbaijan

    993 Turkmenistan

    992 Tajikistan

    989 Iran - Cellular

    98 Iran

    9779 Nepal - Cellular

    977 Nepal

    976 Mongolia

    975 Bhutan

    9745 Qatar - Cellular

    97422 Qatar - Cellular

    974 Qatar

    9739 Bahrain - Cellular

    973 Bahrain

    9729 Israel - Cellular

    97264 Israel - Cellular

    9725 Israel - Cellular

    9723 Israel - Tel Aviv

    972 Israel

    97179 United Arab Emirates - Cellular

    97150 United Arab Emirates - Cellular

    97144 United Arab Emirates - Cellular

    971 United Arab Emirates

    968 Oman

    96779 Yemen - Cellular

    967179 Yemen - Cellular

    967 Yemen

    966630 Saudi Arabia - Cellular

    9665 Saudi Arabia - Cellular

    966430 Saudi Arabia - Cellular

    9663 Saudi Arabia - Dahran

    9662 Saudi Arabia - Mecca

    Let's say the actual country is Qatar - Cellular, the system is automatically defaulting to Qatar. The way I controlled this in the cursor was by cursoring through the prefixes one at a time in descending order and only updating null values. By this I would 'check off' the breakouts so to speak.

  • Can you post the DDL of the tables so I can understand the structure. Also cursors can sometimes actually perform better than straight code but it is hard to tell what I am truely looking at.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Um, I am kinda a newby. I am not sure how to post the DDL.

  • Quickest way to get is open enterprise manager and drill to your db and the tables in question. Right click on each table an choose copy, then paste into notepad or something to old text while you get the other tabel as well, then just paste into forum reply screen.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • This is what I got. I scripted the tables out. I hope this is what you needed. They are in separate databases.

    if exists (select * from sysobjects where id = object_id(N'[dbo].[CallTable]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)

    drop table [dbo].[CallTable]

    GO

    CREATE TABLE [dbo].[CallTable] (

    [recid] [int] IDENTITY (1, 1) NOT NULL ,

    [CallDateTime] [datetime] NULL ,

    [StatsDate] [datetime] NULL ,

    [Inbound] [char] (10) NULL ,

    [CustomerRatePlan] [varchar] (50) NULL ,

    [Outbound] [char] (10) NULL ,

    [ProviderRatePlan] [varchar] (50) NULL ,

    [CallNumber] [nvarchar] (50) NULL ,

    [StatsCountry] [varchar] (100) NULL ,

    [CallANI] [nvarchar] (50) NULL ,

    [CallDuration] [numeric](10, 0) NULL ,

    [StatsMinutes] [numeric](18, 1) NULL ,

    [Importfile] [varchar] (20) NULL ,

    [CallClearingCause] [nvarchar] (50) NULL ,

    [Unused] [nvarchar] (50) NULL ,

    [ProviderMinutes] [numeric](18, 1) NULL ,

    [CustomerRate] [numeric](18, 4) NULL ,

    [ProviderRate] [numeric](18, 4) NULL ,

    [R1] [numeric](18, 0) NULL ,

    [R2] [numeric](18, 0) NULL ,

    [R3] [numeric](18, 0) NULL ,

    [R4] [numeric](18, 0) NULL ,

    [ProviderCountry] [varchar] (100) NULL

    )

    GO

    if exists (select * from sysobjects where id = object_id(N'[dbo].[CountryCodes]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)

    drop table [dbo].[CountryCodes]

    GO

    CREATE TABLE [dbo].[CountryCodes] (

    [Prefix] [char] (25) NOT NULL ,

    [Country] [varchar] (100) NULL

    )

    GO

  • This might be a dumb question, but on your table defs I can't see any indexes, you do have some don't you?

    Assuming you have indexes I notice you're using a LEFT function in the where clause. Im not too sure in SQLSERVER, but in ORACLE using any form of modification on a WHERE value means the query engine does a full table scan, rather than use an index.


    I have found it is possible to please all of the people all of the time if you do exactly what they want. Harold Macmillan 1961

  • Yes, I have a unique clustered index on recid in the calltable, and index on the callnumber. I created a temp table for the countrycodes table which I tried to use many different types of indexes on the prefix field. A clustered index will not allow me to sort descending. A unique nonclustered index allows this.

    I just did a test using a select instead of a update statement and I can see whre a lot of my problem lies:

    22505661512 Ivory Coast 225

    22505661512 Ivory Coast - Cellular 2250

    22505661700 Ivory Coast 225

    22505661700 Ivory Coast - Cellular 2250

    22505661700 Ivory Coast 225

    22505661700 Ivory Coast - Cellular 2250

    22505662952 Ivory Coast 225

    22505662952 Ivory Coast - Cellular 2250

    22505663149 Ivory Coast 225

    22505663149 Ivory Coast - Cellular 2250

    22505664393 Ivory Coast 225

    22505664393 Ivory Coast - Cellular 2250

    22505664750 Ivory Coast 225

    22505664750 Ivory Coast - Cellular 2250

    22505667742 Ivory Coast 225

    22505667742 Ivory Coast - Cellular 2250

    22505667742 Ivory Coast 225

    22505667742 Ivory Coast - Cellular 2250

    22505668198 Ivory Coast 225

    22505668198 Ivory Coast - Cellular 2250

    22505668949 Ivory Coast 225

    22505668949 Ivory Coast - Cellular 2250

    22505669560 Ivory Coast 225

    22505669560 Ivory Coast - Cellular 2250

    22505670574 Ivory Coast 225

    22505670574 Ivory Coast - Cellular 2250

    22505673137 Ivory Coast 225

    22505673137 Ivory Coast - Cellular 2250

    22505675454 Ivory Coast 225

    22505675454 Ivory Coast - Cellular 2250

    22505678321 Ivory Coast 225

    22505678321 Ivory Coast - Cellular 2250

    22505678622 Ivory Coast 225

    22505678622 Ivory Coast - Cellular 2250

    22505678800 Ivory Coast 225

    22505678800 Ivory Coast - Cellular 2250

    22505679056 Ivory Coast 225

    22505679056 Ivory Coast - Cellular 2250

    22505681469 Ivory Coast 225

    22505681469 Ivory Coast - Cellular 2250

    22505682320 Ivory Coast 225

    The join is causing it to find every instance which matches the phone number. If there is more than one prefix per country then I will get multiple instances of each. Any idea how to get around that?

  • Crosspatch is right as I see no index, also the other problem with having to do the left is the size will vary based on the other tables value. I also noted you have items like

    995321

    995325

    995328

    and

    995

    so the update I wrote may actually be causing more updates than needs if the left value matches 995321 it will also match 995 so I have to think around this. I have SQL do work fine with LEFT function on indexes but it does an index scan instead of seek (which can be faster that a table scan but not as efficient as seek). I will sugest that if you don't have indexes at least put one clustered on your CountryCodes table on the column Prefix and a Non-Clustered on CallTable on the column callnumber can be helpfull.

    Then here is a thought and a question with my answer. Does the column country in calltable start out null if does then you could put a Non-Clustered index on it as well then add the line

    WHERE ct.country IS NULL

    to speed up the update, then to make sure if a number changes you could have a trigger that sets the country = null on any record where column callnumber is changed so the update will fix.

    Now on the subject of trigger if you do that then you could create updates for on UPDATE to process these as they are entered and not have to worry about the update it any other time. Something like this for updates

    CREATE TRIGGER tr_SetCountry ON dbo.CallTable

    FOR UPDATE

    AS

    IF UPDATE(callnumber)

    BEGIN

    UPDATE calltable

    SET ct.country = cc.country

    FROM calltable ct

    INNER JOIN

    countrycodes cc

    ON

    left(ct.callnumber,len(cc.prefix)) = cc.prefix AND

    ct.recid in (SELECT recid FROM inserted)

    END

    Let me know what path helps or anything else that you question.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • You apparently posted at the same time I was, I am trying to think about that to make sure it is covered.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Wow, you guys are really working hard on this!! Thanks so much for your help. Yes, the the callcountry column starts out as null. I did try the WHERE ct.callnumber is null thing and saw no improvement. I thought it may fix the problem since this is how I fixed my cursor to work correctly.

  • Still thinking about the problem with the multiple problem which is the base issue and will slow you down. I will try a few things later and let you know what I think will work. Anyone else fell free to join in.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Ok this was really fun and here is how I figured to get around the duplicates

    UPDATE calltable

    SET ct.country = cc.country

    FROM

    calltable ct

    INNER JOIN

    countrycodes cc

    ON

    left(ct.callnumber,len(cc.prefix)) = cc.prefix

    WHERE

    cc.prefix = (SELECT TOP 1 pt.prefix FROM countrycodes pt WHERE pt.prefix = left(ct.callnumber,len(pt.prefix)) ORDER BY prefix DESC)

    Which you can test the difference with the SELECT version if you prefer. I did have a non-clustered index on callnumber and a clustered index on prefix but unfortunately did not try your way of non-clustered sorted descending (may improve performance).

    I am posting my execution plan to show that the indexes are being used (I'll leave it until I need to conserve space on my geocities account).

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

    Edited by - antares686 on 03/20/2002 7:28:13 PM

  • Hit this about 30 minutes after going to bed last night and see it much better than my previous query.

    UPDATE calltable SET ProviderCountry =

    (SELECT TOP 1 country FROM countrycodes WHERE prefix = left(callnumber,len(prefix)) ORDER BY prefix DESC)

    The I decided to add a non-clustered index to ProviderCountry which on an initial UPDATE when it was not set does not help. But on subsequent runs when I add more data doing

    WHERE ProviderCountry IS NULL

    really helped get it done faster than just doing the straight update on all records again. Run both as a select and look at the execution plan, the major difference is I get rid of several steps with the join method that I didn't need. To do as a select just do

    SELECT recid, callnumber,

    (SELECT TOP 1 country FROM countrycodes WHERE prefix = left(callnumber,len(prefix)) ORDER BY prefix DESC) AS Country

    FROM calltable

    I cannot see much more than can be done to this to improve and with it I recommend having an index as follows

    CountryCodes = Clustered PirmaryKey Index on Prefix which will be utilized when looking for where prefix = left(callnumber,len(prefix))

    CallTable = NonClustered NonUnique index on ProviderCountry which will help with added records later, an index on callnumber will not bennifit this query and does not have to exist if you do not need for any other query specifically, if so I suggest don't waste the space for a callnumber index.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply