Optimize WHERE predicate

Question

Post reply

Optimize WHERE predicate

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

vsamantha35 SSChampion Points: 11253 More actions · Answer 1

Chris Harshman - Thursday, August 24, 2017 7:39 AM

Grant Fritchey - Thursday, August 24, 2017 6:07 AM

First option, look at what's going on in Cognos. Why is it making that calculation? Could you do something else in Cognos so that the calculation doesn't occur. I'm not a Cognos expert, so you'll have to go elsewhere for definitive answers there.

Second option, instead of having Cognos run calculations against the date time column to get the two values of last_update_date and one month from last_update_date (is that right? whatever it is), create two additional columns on your table that perform these functions on the data, computed columns. Because you're storing the calculations within the table and you can create indexes on computed columns, your query can take advantage of an index which should make it run faster.

Another option could be to create some kind of calendar table that has 1 row for each day and some kind of YearMonth integer column. You could then join to this calendar table and compare the 2 YearMonths to see if they match.

Chris, can you give me a small practical example for the above highlighted text. I didnt understand the message exactly on how it can help solving the problem plz.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 2

vsamantha35 - Thursday, August 24, 2017 11:58 AM

Hi Chris,
I tried to create those covering indexes but i didnt see much improvement in query performance . it was 5-10 sec difference. Sometimes the query without indices performed well. so decided not to create those 2 indexes. Thanks for the query re-write. I ll try to see if thats works well and returning the same result in terms of rowcount and data. Thank you for taking your valuable time out to help people like me who wanted to learn some tips in regards with query tuning and approach towards it. Thanks a lot.

No, not covering indexes - clustered indexes.

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Chris Harshman SSC-Forever Points: 42179 More actions · Answer 3

here is a most basic Calendar table that could help. With any of these techniques performance could vary depending on your system and data:
/* create and initialize Calendar Table */ CREATE TABLE Calendar ( CalendarDate Date NOT NULL, YearMonth int, CONSTRAINT PK_Calendar PRIMARY KEY (CalendarDate));

DECLARE @StartDate Date = '2016-12-31';

WITH
E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1),
E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c),
cte_Tally AS (SELECT TOP 365 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E4)
INSERT INTO Calendar
    (CalendarDate)
SELECT DATEADD(day, t.N, @StartDate) AS CalendarDate
FROM cte_Tally t;

UPDATE Calendar SET YearMonth = YEAR(CalendarDate) * 100 + MONTH(CalendarDate);

so instead of doing functions to compare t1.tran_dt to t2.updated_dt, you could do something like:
INNER JOIN Calendar c1 ON t1.tran_dt = c1.CalendarDate INNER JOIN Calendar c2 ON t2.updated_dt = c2.CalendarDate ... WHERE c1.YearMonth = c2.YearMonth

vliet SSCommitted Points: 1998 More actions · Answer 4

vsamantha35 - Wednesday, August 23, 2017 1:20 PM

SELECT

....

....

WHERE t1.tran_dt BETWEEN DATEADD(DAY,-DAY(t2.updated_dt) + 1,t2.updated_dt) AND DATEADD(DAY,-1, DATEADD(MONTH,1,DATEADD(DAY,-DAY(t2.updated_dt)+1,t2.updated_dt)))

Hello Sam,

I'm always glad to find women querying databases, since they seem to be quite rare. Assuming [tran_dt] and [update_dt] are both dates (meaning DATE columns or columns with 0:00:00 as time of day), the above WHERE clause is equivalent with WHERE DATEDIFF(MONTH, t1.tran_dt, t2.updated_dt) = 0, which lowers the complexity and thus maybe increase the speed of the original query using a full table scan. To estimate the validity of certain solutions, it would be nice if you could tell whether this condition is mostly true or is quite selective, and if this is the only condition for joining [t1] and [t2]. If the later is indeed the case this query does a full cross join scan resulting in very poor performance.

Could you please provide some additional information about the number of rows in [t1] and [t2], and also the JOIN-clause between these table in this query? That would help us a lot when we try to help you with this problem at hand ... Furthermore, could you tell us whether this query runs once a day to fill a cube, or more often as source of some report directly available to end users? Adding an indexed view might improve query performance but hampers load speed if these tables are refilled completely each night. Welcome to the world of the DBAs where the general answer to every question is: It depends 😉

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

ChrisM@home - Thursday, August 24, 2017 12:14 PM

vsamantha35 - Thursday, August 24, 2017 11:58 AM

Hi Chris,
I tried to create those covering indexes but i didnt see much improvement in query performance . it was 5-10 sec difference. Sometimes the query without indices performed well. so decided not to create those 2 indexes. Thanks for the query re-write. I ll try to see if thats works well and returning the same result in terms of rowcount and data. Thank you for taking your valuable time out to help people like me who wanted to learn some tips in regards with query tuning and approach towards it. Thanks a lot.

No, not covering indexes - clustered indexes.

The two indexes I suggested won't work as ordinary indexes, only as clustered indexes, because they aren't covering (don't contain all of the columns referenced by the query).
Since neither of the two tables have a clustered index, I figured it was the most natural approach. Your query will be much faster if you create them.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden