Performance issue

  • Chim Kalunta (2/23/2010)


    It would appear that the clause:

    inner join WhitePage w on w.displayname = a.displayname or w.eid = a.eid

    is forcing a Clustered Index Scan on the Whitepage table. Thats where the time is spent.

    I have had success with splitting out OR clauses in WHERE/JOINs to be two queries (each for just one of the conditions) that are unioned together:

    select *

    from tableA

    INNER JOIN tableB ON tableA.ColA = tableB.ColA

    UNION

    select *

    from tableA

    INNER JOIN tableB ON tableA.ColB = tableB.ColB

    Though I do like Dave's suggestion below... I'll have to check out how that works on some code...

    Another thing I've noticed is your where clause:

    WHERE a.account LIKE '%' + @query + '%'

    By having the wildcard at the start of the expression, I don't think that you will ever be able to utilize an index on the account column.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • How do I know if the stats is up to date?

    Have you got auto update stats set on the database ?

    If not use UPDATE STATISTICS command

    I click on the Estimated Exec Plan icon to generate this one, how do I create an actual plan then?

    In SSMS before executing your query check the 'Include actual execution plan' option.

    See the links in my sig below for a better explanation of this.



    Clear Sky SQL
    My Blog[/url]

  • Does this perform better ?

    SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email

    FROM activedirectory a

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    WHERE a.account LIKE '%' + @query + '%'

    Thanks to all replies:

    Things seem to get more complex:

    Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'

    I believe this would worse the performance, right?

  • Does this perform better ?

    SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email

    FROM activedirectory a

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    WHERE a.account LIKE '%' + @query + '%'

    Thanks to all replies:

    Things seem to get more complex:

    Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'

    I believe this would worse the performance, right?

  • Does this perform better ?

    SELECT TOP 20 a.employeeid AS [ID], a.account, a.displayname AS [Name], w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email

    FROM activedirectory a

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    WHERE a.account LIKE '%' + @query + '%'

    Thanks to all replies:

    Things seem to get more complex:

    Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'

    I believe this would worse the performance, right?

  • Dave's query also can give you incorrect results because of the LEFT joins: you can get records where w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email ARE ALL NULLS.

    You can tweak the query to avoid that, but my point is - don't just follow people because they have spare time to earn points 🙂

    Anyway, from the query plan I can see that the problem is in scanning WhitePaper table for every qualifying record from ActiveDirectory table (you can verify that by running the query with SET STATISTICS IO ON). And the reason for that is: the query optimizer chooses NestedLoops join.

    Add to the end of the query: OPTION(HASH JOIN, MERGE JOIN)

    and you will scan WhitePaper only once. That should eliminate 90% of your reads.

    If you try and confirm that, then if you want to eliminate the need for the hint - play with indexes to invite the query optimizer to use HASH or MERGE join

    Technically it is better to use UNION instead of OR, but with your data it does not really make much of a difference. Your problem is NestedLoops join

  • Alex 469216 (2/25/2010)


    Dave's query also can give you incorrect results because of the LEFT joins: you can get records where w.workphonenumber AS phone, w.emailaddress AS email,w2.workphonenumber AS phone, w2.emailaddress AS email ARE ALL NULLS.

    You can tweak the query to avoid that, but my point is - don't just follow people because they have spare time to earn points 🙂

    Anyway, from the query plan I can see that the problem is in scanning WhitePaper table for every qualifying record from ActiveDirectory table (you can verify that by running the query with SET STATISTICS IO ON). And the reason for that is: the query optimizer chooses NestedLoops join.

    Add to the end of the query: OPTION(HASH JOIN, MERGE JOIN)

    and you will scan WhitePaper only once. That should eliminate 90% of your reads.

    If you try and confirm that, then if you want to eliminate the need for the hint - play with indexes to invite the query optimizer to use HASH or MERGE join

    Technically it is better to use UNION instead of OR, but with your data it does not really make much of a difference. Your problem is NestedLoops join

    Thank you very much for the suggestion, actually I do have problem with Dave's query, although in s simple case his solution greatly improve the performance, however, because of splitting same table into multiple tables (w, w2), that means I might have to include more OR in the WHERE clause --- which definitely brings down the performance.

    More over, what should I do if I need to consolidate more tables in the future?

  • More over, what should I do if I need to consolidate more tables in the future?

    You could insert your top 20 ActiveDirectory records into a temp table

    and UNION separate selects from joins between that temp table to every table you want to consolidate.

  • Try playing with a temp table. This keeps the original employeeid and filters from the activedirectory table first. It should let you use the indexes on WhitePaper:

    --------------------------------

    create table #ActiveDirectory

    (empnum varchar(20) NOT NULL PRIMARY KEY,

    account varchar (100) NOT NULL,

    displayname varchar (100) NOT NULL)

    CREATE NONCLUSTERED INDEX adName ON #ActiveDirectory (displayname)

    declare @query varchar(20)

    INSERT INTO #ActiveDirectory (employeeid, account, displayname)

    SELECT '0000' + a.employeeid AS empnum, a.account, a.displayname as [Name]

    FROM activedirectory a WHERE charindex(@query, a.account) > 0

    SELECT TOP 20 emp.empnum as [ID], emp.account, emp.displayname,

    w.workphonenumber as phone, w.emailaddress as email

    FROM emp

    INNER JOIN WhitePage w on

    w.displayname = a.displayname OR

    w.empnum = a.empnum

    ---------------------------------------------

  • Alex 469216 (2/26/2010)


    More over, what should I do if I need to consolidate more tables in the future?

    You could insert your top 20 ActiveDirectory records into a temp table

    and UNION separate selects from joins between that temp table to every table you want to consolidate.

    but my point is - don't just follow people because they have spare time to earn points

    😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Things seem to get more complex:

    Because I want to consolidate matches from different tables, in Dave's suggestion, splitting WhitePage to two tables means I need to add extra OR in the WHERE clause, like:

    left JOIN WhitePage w ON (w.displayname = a.displayname AND w.eid <> a.eid)

    left join WhitePage w2 on w2.eid = a.eid

    where a.account like '%' + @query + '%' or w.displayname like '%' + @query + '%' or w2.displayname like '%' + @query + '%' or w2.workphonenumber like '%' + @query + '%' or w.workphonenumber like '%' + @query + '%'

    I believe this would worse the performance, right?

    Possibly , but as with a lot of SQL , it depends.

    Have you established weather your stats are up to date or not ? Without good stats , sqlserver will not be able to make good choices.

    In terms of the data ,will there always be a row found within the whitepage table for each row in activedirectory ? Is it rare for a match to be found or common ?



    Clear Sky SQL
    My Blog[/url]

  • @halix,

    Out of all the solutions, simple and the one thats most likely to improve the performance is the solution

    based on UNION.

    Yet 'w.empnum = '00000' + a.employeeid' wouldnt use a index. If the query is really important to you then

    try adding a computed column, index that and use it perhaps. I realize its going bit too far to get rid of a scan, but if it matters to you then its worth giving it a try.

    Also, a quick way to find whether stats are updated is using STATS_DATE function

  • @alex,

    /*

    You could insert your top 20 ActiveDirectory records into a temp table

    and UNION separate selects from joins between that temp table to every table you want to consolidate.

    */

    Isnt that supposed to be 'You insert all qualifying ActiveDirectory records into temp table'?

  • Dave Ballantyne (2/27/2010)


    Have you established weather your stats are up to date or not ? Without good stats , sqlserver will not be able to make good choices.

    In terms of the data ,will there always be a row found within the whitepage table for each row in activedirectory ? Is it rare for a match to be found or common ?

    You actually reminded me this: I can do some auto nightly job to check the data and consolidate different tables, that should speed up everything. I would think one single table is the final solution, I mean, if possible. In the case I have to dynamically consolidate different tables, I don't know, I suspect any query would take long time to execute.

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply