Query Performance

  • Hello All,

    Wish you a very happy new year 2019!!

    I have been asked to review the performance of a procedure, and I am reproducing a part of it here. There are two similar queries in this procedure, and combined they take up to 12-15 seconds on a big server (a DR of production with hardly any resource consumption).
    Since both queries are similar, I am posting one of them here.

    It is a nested query, but even when I tried to evaluate it independently using temp tables for better visualization, there was hardly any performance improvement. I looked into the query to see if I could modify it a bit, but did not find much success.

    Can one of the gurus see if there is a way to rewrite this so that the performance can be improved? I am including the table schema along with the indexes and the execution plan. Please let me know if I can frame my question in a better way or supply more information.

    Thanks in advance!!

             
    declare @maxResults INT = 25,
    @companyName varchar(100)=N'Northwestern Corp Discount Commercial Paper 4/2 Yrs 3&4',
    @issuerCode nvarchar(6)=''   
          
       
    DECLARE @searchText NVARCHAR(300)   
    DECLARE @issuerCodeType int
    DECLARE @searchTextType int

    SELECT @searchText = dbautil.dbo.stripAutoSuggestCompanyName_fn(@companyName)

    --SELECT @searchText = @companyName

          
    -- Get company by QGrams    
    DECLARE @minLen int    
    DECLARE @Q int    
    SET @Q = 3    
    SET @minLen = 4    
        
    IF LEN(@searchText) < 6    
    BEGIN    
        SET @minLen = 2    
    END    

    Create table #qgram
    (    
        gram NCHAR(3) NOT NULL PRIMARY KEY CLUSTERED
    ,    [count] INT NOT NULL    
    )    
     
    DECLARE @SearchTextLenght INT
    SET @SearchTextLenght = LEN(@searchText)
        
    INSERT INTO #qgram    
    SELECT    
        ISNULL(gram, '') as gram
    ,    ISNULL(COUNT(*), 1) AS [count]    
    FROM    
    (    
        SELECT CONVERT(NCHAR(3), SUBSTRING(SUBSTRING('###', 1, @Q-1) + @searchText + SUBSTRING('%%%', 1, @Q-1), n.n, @Q)) AS gram    
        FROM dbautil.dbo.numbers_tbl n (NOLOCK)    
        WHERE n.n <= @SearchTextLenght + @Q - 2    
    ) sub    
    GROUP BY gram  
    --option(recompile)

        
    DECLARE @maxMatch INT    
    SELECT @maxMatch = SUM(count) FROM #qgram    

        
        
    SELECT sub2.fundOrComp, sub2.nameCacheId, sub2.ratio ,row_number() over(partition by sub2.companyid order by ratio desc) as rn    
    FROM
    (    
    SELECT TOP(@maxResults) 1 as fundOrComp,sub.nameCacheId,
    ((sub.count * 1.0 /(qn.strippedNameLength + @Q - 2)) * (sub.count * 1.0/ @maxMatch)) AS ratio,qn.companyId 
        
           
        FROM    
        (
            SELECT TOP 1500 g.nameCacheId,SUM (CASE WHEN g.count > q.count THEN q.count ELSE g.count END ) AS count             
            FROM #qgram q INNER JOIN SearchSuggest_QGram_tbl g (NOLOCK) ON g.gram = q.gram GROUP BY g.nameCacheId HAVING    
            SUM (CASE WHEN g.count > q.count THEN q.count ELSE g.count END ) >= @minLen ORDER BY 2 DESC, 1 ASC         
        ) sub    
        INNER JOIN SearchSuggest_QGramNameCache_tbl qn (NOLOCK) ON qn.nameCacheId = sub.nameCacheId    
        ORDER BY ratio DESC    
    ) sub2 

     
    drop table #qgram

  • chandan_jha18 - Wednesday, January 2, 2019 6:49 AM

    Start by getting rid of the nolock hints, the implicit conversions, and the parallel execution; then run the query again and post the execution plan.
    😎
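
    For instance, @companyName is declared varchar(100) but assigned an N'' literal (and presumably compared against nvarchar columns), so declaring it nvarchar avoids one implicit conversion. A minimal sketch of those changes against the posted code (everything else stays the same):

    -- Match the parameter types to the Unicode literals / nvarchar columns
    DECLARE @maxResults  INT           = 25,
            @companyName NVARCHAR(100) = N'Northwestern Corp Discount Commercial Paper 4/2 Yrs 3&4',
            @issuerCode  NVARCHAR(6)   = N'';

    -- ...and drop the hints wherever the tables are referenced, e.g.:
    -- FROM dbautil.dbo.numbers_tbl n          (was: numbers_tbl n (NOLOCK))
    -- INNER JOIN SearchSuggest_QGram_tbl g    (was: ... g (NOLOCK))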

  • Cluster the [SearchSuggest_QGram_tbl] on [nameCacheId] rather than on gram.
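
    A minimal sketch of that change (the index name is hypothetical, and it assumes the existing clustered index on gram has been dropped first):

    -- Recluster so all rows for a given nameCacheId are physically adjacent
    CREATE CLUSTERED INDEX CIX_SearchSuggest_QGram_nameCacheId
        ON dbo.SearchSuggest_QGram_tbl (nameCacheId, gram);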

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Eirikur Eiriksson - Wednesday, January 2, 2019 8:21 AM

    Start by getting rid of the nolock hints, the implicit conversions, and the parallel execution; then run the query again and post the execution plan.
    😎

    The implicit conversions are happening in very small tables. I can try removing the nolock hints and adding MAXDOP 1 to remove parallelism, if that's what you mean by parallel execution.
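
    If so, a minimal sketch would be a per-statement hint on the posted SELECT rather than a server-wide setting, e.g. the tail of the query becomes:

        ) sub2
    OPTION (MAXDOP 1);  -- suppress parallelism for this statement only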

  • ScottPletcher - Wednesday, January 2, 2019 8:26 AM

    Cluster the [SearchSuggest_QGram_tbl] on [nameCacheId] rather than on gram.

    Let me get back to you by making this change and report the performance.

    Thanks

    And, calculations like this are best done once, into a variable:
    @SearchTextLenght + @Q - 2

    Having the calculation inline in the filter criteria leads to problems.
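
    A minimal sketch of that change against the posted batch (@maxN is a hypothetical name; everything else is as declared in the original code):

    DECLARE @maxN INT = @SearchTextLenght + @Q - 2;  -- computed once, up front

    INSERT INTO #qgram
    SELECT ISNULL(gram, '') AS gram, ISNULL(COUNT(*), 1) AS [count]
    FROM (
        SELECT CONVERT(NCHAR(3), SUBSTRING(SUBSTRING('###', 1, @Q-1) + @searchText + SUBSTRING('%%%', 1, @Q-1), n.n, @Q)) AS gram
        FROM dbautil.dbo.numbers_tbl n
        WHERE n.n <= @maxN  -- was: n.n <= @SearchTextLenght + @Q - 2
    ) sub
    GROUP BY gram;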

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • ScottPletcher - Wednesday, January 2, 2019 8:26 AM

    Cluster the [SearchSuggest_QGram_tbl] on [nameCacheId] rather than on gram.

    While I agree that will cure the problem of doing 1,500 individual seeks (possibly... no guarantee without knowing more about the data) by changing to a seek followed by a range scan (again, possibly), it will only solve about 5% or less of the problem.

    The real problem is that the table (apparently) has almost a quarter billion rows in it and the query at hand is returning more than 90 million of them in 51 seek/scans.  Apparently (after a quick glance at the code, the schema, and the execution plan) there are 51 items in the temp table to be checked, and trying to do this in a single query is absolutely the worst way to do it.  Since there is no early disqualification of data, you end up with what amounts to an accidental Cross Join.

    The design of the tables also has a problem... there are distinctly (pun intended) no unique indexes.

    Changing one index (or any index, for that matter) just isn't going to solve the performance issues with this query.
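
    On the unique-index point: if (gram, nameCacheId) is in fact unique in SearchSuggest_QGram_tbl (an assumption; the posted schema doesn't show it), enforcing that would look something like this:

    -- Hypothetical: one count row per (gram, nameCacheId) pair
    CREATE UNIQUE NONCLUSTERED INDEX UQ_SearchSuggest_QGram_gram_nameCacheId
        ON dbo.SearchSuggest_QGram_tbl (gram, nameCacheId)
        INCLUDE ([count]);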

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • chandan_jha18 - Wednesday, January 2, 2019 6:49 AM

    It would really have helped if you had identified what the actual purpose of the query is.  So, for better understanding by all, let me see if I get it right.

    In this particular bit of code, it would appear that you're retrieving 51 "key items" that uniquely identify a company and then storing that information in a temp table.

    After that, you're trying to figure out which rows in a much larger table provide a match for those 51 "grams" so you can return only those nameCacheId values that are actually a match for the given company.

    Is that correct?  If not, please expound.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden - Wednesday, January 2, 2019 9:52 AM

    It looks to me like a primitive attempt at fuzzy matching, with that 250 million row table containing n-grams of, say, company names. It might be more performant (and far less "codey") to implement Alan B's n-gram fuzzy matching.
    In the meantime, as you point out, Jeff, the most costly part of the batch is this bit:
    SELECT TOP(1500)
     g.nameCacheId,
     SUM(CASE WHEN g.count > q.count THEN q.count ELSE g.count END) AS count               
    FROM #qgram q
    INNER JOIN SearchSuggest_QGram_tbl g
     ON g.gram = q.gram
    GROUP BY g.nameCacheId
    HAVING SUM(CASE WHEN g.count > q.count THEN q.count ELSE g.count END) >= @minLen
    ORDER BY 2 DESC, 1 ASC    

             
    For a quick & temporary small improvement I'd try something like this:
    CREATE INDEX ix_JustTesting ON SearchSuggest_QGram_tbl (gram, nameCacheId) INCLUDE ([count])

    SELECT TOP(1500)
     x.nameCacheId,
     x.count                
    FROM #qgram q
    CROSS APPLY (
     SELECT
      g.gram, 
      g.nameCacheId,
      SUM(CASE WHEN g.count > q.count THEN q.count ELSE g.count END) AS count
     FROM SearchSuggest_QGram_tbl g
     WHERE g.gram = q.gram
     GROUP BY g.gram, g.nameCacheId
    ) x
    WHERE x.count >= @minLen
    ORDER BY 2 DESC      

    But it's never going to perform as well as a redesign.

    "Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms." - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Jeff Moden - Wednesday, January 2, 2019 9:46 AM

    Quite right.  I didn't look as fully at the query as I should have (I'm just too busy at work right now, but I look at SQL questions as kind of a quick break from work).

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

    The OP hasn't gotten back on my questions yet.  If the problem definition turns out to be what I posted in my questions, then it will be like what I've done on similar things in the past (one being just 2 weeks ago), and we should be able to have the code complete in about a quarter of a second.  Maybe less.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden - Wednesday, January 2, 2019 4:39 PM

    Sir, I apologize for the delay in response, which is mainly due to our different time zones; I only got a chance to talk to the dev today. I have the document explaining the logic and will post the information shortly.

  • ScottPletcher - Wednesday, January 2, 2019 8:26 AM

    Cluster the [SearchSuggest_QGram_tbl] on [nameCacheId] rather than on gram.

    Sir, no change in performance even after this.

    Thanks

  • Jeff Moden - Wednesday, January 2, 2019 9:52 AM

    Sir, here is what I got from the developer, as per his document:

    1)     When the search text is entered, the length of the search text is calculated, and then the search string is broken into multiple small strings of 3 characters and stored in a table called qgram.
    Example: If I pass the search text as 'Legg Mason Partners', we will get the following results:

    Results:
    gram     count
    ' Ma'    1
    ' Pa'    1
    '##L'    1
    '#Le'    1

    etc.

    Count indicates the number of times the split string is present in the search string.

    2)     The qgram table and SearchSuggest_QGram_tbl are combined based on the gram column.
    3)     After qgram and SearchSuggest_QGram_tbl are combined, the rows are grouped by nameCacheId, the count values are summed up, and the top 1500 rows are selected. Since we have a count column in both tables, the lesser of the two count values is used when summing the counts while grouping by nameCacheId.
    Example:
    nameCacheId    count
    5359401        9
    512010         6
    2545081        8
    2875285        11
    1934139        10
    4)     The result fetched in step 3 is combined with SearchSuggest_QGramNameCache_tbl based on nameCacheId.

    5)     From the result set obtained in step 4, the top 500 records are fetched, which contain the company id, the ratio, the nameCacheId, and strippedNameLength (the length of the company or fund name).
    6)     The ratio for each company id is calculated as below, where count is the summed count obtained in step 3 and @maxMatch is the sum of the count values obtained earlier:
    (count * 1.0 / (strippedNameLength + @Q - 2)) * (count * 1.0 / @maxMatch)
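
    As a worked example of that formula (hypothetical numbers: @Q = 3, a summed match count of 9 from step 3, a stripped name length of 19, and @maxMatch = 20):

    -- ratio = (9 / (19 + 3 - 2)) * (9 / 20) = 0.45 * 0.45
    SELECT (9 * 1.0 / (19 + 3 - 2)) * (9 * 1.0 / 20) AS ratio;  -- 0.2025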

    Can you see if this matches your observation?

    Regards,
    Chandan Jha
