SQL Query tuning

  • GilaMonster: For the grouping that makes sense (or well any aggregate function).

    For the use of multiple fields in an index I'm still a BIT fuzzy on that.

    Examples:

    1. CustomerID = y

    2. OrderID !=x and customerID = y

    3. OrderID > x and customerID = y

    Example 1 will not use the index because it references OrderID first, and must use OrderID AND CustomerID together, so just using CustomerID will disallow the use of the index?

    If that is true, then what do I do if I often search the order's table to find all the orders a customer made? Do I have to choose which search I do more often and create my index appropriately?

    Example 2 will not use the index because the first field (OrderID) just says "Everything EXCEPT this one ID" so that doesn't really benefit much as you have to scan the entire table anyway to find "everyone except one". However your first example where you said OrderID = X AND CustomerID != Y still uses the index because you CAN first pull ALL the orders with OrderID = X and then once you've pulled those, you go back into that sub set and toss out the one CustomerID you didn't want.

    Example 3 is similar to example two, except you are saying "I want everything EXCEPT any values less than X (you only want >X) so the index isn't much use there? Could it use the index to at least decide where to start it's search? If the ID is numeric, and you said >50 couldn't it quickly jump to "50" in the table and then only search "above" that point, or would that still be of little help because while the index may be ordered sequentially it doesn't necessarily mean the table is?

    ---

    So let's say that I have my tbl_CustomerOrders which is indexed on OrderID, CustomerID.

    tbl_CustomerOrders: PK, OrderID, CustomerID, SalesPersonID

    Now I want to join it with two other tables:

    tbl_SalesPeople (to see who sold the order)and tbl_Billing to (I don't know.. see if they paid the bill or something)

    tbl_Billing PK, OrderID, CustomerID (who knows, just an example)

    So what is the IDEAL way to join these tables?

    SELECT * -- a bad idea I know

    FROM tbl_CustomerOrders AS CO INNER JOIN tbl_Billing AS Bil ON

    CO.OrderID = Bil.OrderID AND CO.CustomerID = Bil.CustomerID

    INNER JOIN tbl_SalesPeople AS SalesP ON

    CO.SalesPersonID = SalesP.SalesPersonID

    WHERE OrderID = 12345 AND CustomerID = 777

    Now granted I'm going to pay the price for not indexing SalesPersonID in the CustomerOrders table, but other than that, is that all there is to it? Just make sure I use the indexes in the proper orders?

  • GilaMonster (10/12/2007)


    In my experience it has the most impact when the aggregated columns are also in the index.

    CREATE TABLE Payments (

    PaymentKey INT IDENTITY PRIMARY KEY,

    CustomerID int,

    InvoiceID int,

    PaymentDate datetime,

    Amount Numeric(17,5)

    )

    Assume that the clustered index (and hence the physical order of the rows) is on the identity column. Assume there are lots of rows.

    Take the following query.

    SELECT CustomerID, SUM(Amount) FROM Payments group by customerID

    Now add an index on CustomerID, Amount and run the same query. Now, the aggregate can be satisfied just with the non clustred index and, more importantly, the data is in order of customerID, so just a stream aggregate is required and the data does not have to be resorted or run through a hashing algorithm.

    One thing to add for this is that since you are on SQL 2005, consider making Amount an INCLUDE column instead of part of the actual index list of columns if Amount gets updated regularly. This will lead to less maintenance being required on the index since the Amount would just be at the base level of the NC index.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/12/2007)


    One thing to add for this is that since you are on SQL 2005, consider making Amount an INCLUDE column instead of part of the actual index list of columns if Amount gets updated regularly. This will lead to less maintenance being required on the index since the Amount would just be at the base level of the NC index.

    Oh, sure. Was just keeping things simple for the point of illustration.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Maxer (10/12/2007)


    Examples:

    1. CustomerID = y

    2. OrderID !=x and customerID = y

    3. OrderID > x and customerID = y

    Example 1 will not use the index because it references OrderID first, and must use OrderID AND CustomerID together, so just using CustomerID will disallow the use of the index?

    More accurate to say the query needs to reference at least OrderID to get an index seek. Imagine a telephone book, a list of most of the people in a city ordered by surname, firstname and listing address and phone number. The first column of the index is surname, the second is first name, so it's very easy to find the entries for Mr Black or Mr Matthew Black (index seek). However to find all the people with a first name of Matthew requires reading every entry (index scan)

    If that is true, then what do I do if I often search the order's table to find all the orders a customer made? Do I have to choose which search I do more often and create my index appropriately?

    You can have multiple indexes on a table. Work out what you search on, then create indexes appropriately.

    Example 2 will not use the index because the first field (OrderID) just says "Everything EXCEPT this one ID" so that doesn't really benefit much as you have to scan the entire table anyway to find "everyone except one". However your first example where you said OrderID = X AND CustomerID != Y still uses the index because you CAN first pull ALL the orders with OrderID = X and then once you've pulled those, you go back into that sub set and toss out the one CustomerID you didn't want.

    Yup. 😀

    Example 3 is similar to example two, except you are saying "I want everything EXCEPT any values less than X (you only want >X) so the index isn't much use there? Could it use the index to at least decide where to start it's search? If the ID is numeric, and you said >50 couldn't it quickly jump to "50" in the table and then only search "above" that point

    Certainly. I said it's can't be satisfied with an index seek. Will require a partial or complete index scan.

    So what is the IDEAL way to join these tables?

    Doesn't matter. The order of joins or order of where clause entries is irrelevant. The optimiser will pick the best based on the indexes available and the volume of data

    Now granted I'm going to pay the price for not indexing SalesPersonID in the CustomerOrders table, but other than that, is that all there is to it? Just make sure I use the indexes in the proper orders?

    Make sure you have indexes to support your common queries. You have little control over the order of operations that the optimiser picks (see my comments on that here[/url])

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks, that does help clear that up.

  • Okay, this indexing topic has been bugging me for a long time now and I guess this is the place to ask what's been on my mind:

    if you put an index on every SINGLE column that occurs in the WHERE clause of a View or Stored Procedure, and you're using a WHERE clause that contains multiple columns in a query, will the optimizer COMBINE the indexes or do you really need to create a separate index on multiple fields ?

    Can the old hands elaborate on this one ?

  • The optimiser will sometimes use a technique called index intersection, where it will seek multiple indexes to satisfy different conditions, then do the equivalent of inner joins on the resultsets involved to get the records that satisfy all conditions. It's not something I've seen often.

    Since it involves seeks on multiple indexes and inner joins, it is more efficient to have a single multi-column indes that requires a single seek to get the required records.

    In general (and there are exceptions) a smaller number of wider nonclustered indexes are better than multiple narrow indexes.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply