Excessive stored procedure [COMPILE] lock

  • Hello!

    I am trying to investigate a strange problem with a particular stored
    procedure. It runs fine for several days, and then suddenly we start getting a
    lot of locks, the cause being a [COMPILE] lock placed on this procedure. As a
    result, we have 40-50 other connections waiting; then the next connection using
    this procedure takes the [COMPILE] lock, and so on. The client fully qualifies
    the stored procedure with the database/owner name, and the name doesn't start
    with sp_. I know that unqualified names and the sp_ prefix are the usual
    reasons for a [COMPILE] lock being taken. Is there something else that might
    trigger this lock? When troubleshooting this issue, I noticed there was no plan
    for this procedure in syscacheobjects. The stored procedure is very simple (I
    know it could be rewritten/optimized, but our developer wrote it):

    CREATE PROCEDURE [dbo].[vsp_mail_select]
        @user_id int,
        @folder_id int,
        @is_read bit = 1,               -- if 1, pull everything; else just pull unread mail
        @start_index int = null,        -- unused for now, we return everything
        @total_count int = null output, -- count of all mail in the specified folder
        @unread_count int = null output -- count of unread mail in the specified folder
    AS
    SET NOCOUNT ON

    select m1.* from mail m1 (nolock)
    where m1.user_id = @user_id and folder_id = @folder_id
      and ((@is_read = 0 and is_read = 0) or (@is_read = 1))
    order by date_sent desc

    select @total_count = count(mail_id) from mail m1 (nolock)
    where m1.user_id = @user_id and folder_id = @folder_id
      and ((is_read = 0 and @is_read = 0) or (@is_read = 1))

    select @unread_count = count(mail_id) from mail m1 (nolock)
    where m1.user_id = @user_id and folder_id = @folder_id and is_read = 0
    GO

    I was monitoring the server for a couple of days beforehand, and I am not sure
    why this happens only every 3-4 days! I would like to identify the reason why
    the [COMPILE] lock is placed on this procedure. Any help on this matter would
    be greatly appreciated!

    Thanks,

    Igor

  • The compile locks are taken whenever SQL Server needs to recompile a plan for a procedure, or when it thinks it needs to (due to clients not qualifying the name). You seem to have covered the reasons for SQL Server thinking it needs to recompile the proc, so that does not appear to be the issue. Also, since you say the compile locks only happen every 3-4 days, it seems that something forces the recompile at that time. You also said there was no plan in syscacheobjects, which further indicates that the plan really is being compiled.

    The question is of course why it happens only every 3-4 days. Is there something else happening at that time?
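    As a hypothetical way to verify this, the plan cache can be inspected directly. This is a sketch against SQL Server 2000's syscacheobjects; the LIKE filter is a rough match on the procedure name:

    ```sql
    -- Sketch: look for a cached plan for the procedure. No rows returned while
    -- [COMPILE] locks are queuing suggests the plan really is being compiled.
    select cacheobjtype, objtype, usecounts, refcounts
    from master.dbo.syscacheobjects
    where sql like '%vsp_mail_select%'
    ```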

  • Hi, there are a number of reasons the procedure could be getting recompiled.

    These are some of them, with reference to your code:

    1. The procedure plan is not in cache. Somebody might have issued DBCC commands to clear the procedure cache, or memory pressure could have caused the plan to be evicted.

    2. Procedure qualification also matters when calling the procedure (e.g. exec procname vs. exec dbo.procname), because of name resolution. Fully qualified procedure calls result in fewer recompilations.

    3. Use explicit column names instead of m1.*.

    4. If you have auto update statistics enabled, and 500 or more modifications are made to the mail table, the system might force the procedure to be recompiled because the statistics are out of sync.

    I may be wrong in saying this, but I usually set the output parameters before returning the result set, thinking it reduces round trips from the client. But check this out:

    Case number 4 is the most likely fit for your issue.

  • Hi, check this link:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;263889

    It may help you understand most of the cases.

  • Petmani,

     

    You seem to be right on number 4. According to our developer, there are changes happening in the base (mail) table: 'The table gets purged of rows belonging to inactive accounts on a regular basis, every hour. It is possible that this purge could cause the deletion of enough rows to trigger a recompile.'

    Assuming that massive modifications cause an auto-update of statistics and recompilation of all procedures referencing this table, what would be the right approach to handle it? What would you recommend to alleviate this problem:

    1. Disabling auto update statistics on this table and then manually updating it during off-peak hours?

    2. Using the KEEP PLAN hint inside the SELECT statement?

    3. Using sp_executesql inside the stored procedure to ensure better plan caching?

    4. Something else?

    Massive modifications still do not explain why we experience this issue only every 3-4 days. I would expect it to happen more frequently.

    Thanks,

    Igor

  • Chris,

       I am also puzzled why this is happening every 3-4 days. The base table is modified every hour. I think SQL Server forces recompilation of all procedures referencing the base (mail) table when a certain threshold of modifications is reached and it believes the statistics are out of date. Would you agree?

       When the [COMPILE] locks were being generated, there was no plan cached for this proc in syscacheobjects. I issued sp_recompile vsp_mail_select. This seemed to help (I am not 100% sure, though). After sp_recompile, I started seeing Compiled and Executable plans in syscacheobjects for this proc. Assuming changes in the base table are the reason for recompiling, what course of action would you recommend?

    Thanks,

    Igor

  • Well, the hourly modifications will not each trigger a recompile. Assuming your table has more than 500 rows, the stats will need to be updated once the number of modifications reaches 20% of the number of rows in the table + 500. So if it is a very large table, 3-4 days might be the time it takes for this to happen. You can watch the rowmodctr column in sysindexes to see how many modifications have been made to a table (or clustered index) at a specific time; this value is reset to 0 when stats are updated. As a side note, you might want to think about taking care of stats yourself with a table this big, since an autostats update might happen a) too seldom and b) at a bad time.

    Anyway, it still seems strange that you would see such a big effect from one procedure being recompiled once every 3-4 days. The problem with [COMPILE] locks blocking processes usually happens when lots of processes need to recompile the same proc over and over again, for instance because of not qualifying object names correctly. Or have I misunderstood the problem?
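    The rowmodctr check mentioned above can be sketched like this (sysindexes column names as in SQL Server 2000; the mail table name comes from this thread):

    ```sql
    -- Sketch: modifications since the last stats update vs. the rough
    -- autostats threshold (20% of rows + 500, for tables over 500 rows).
    select name, rowcnt, rowmodctr,
           (rowcnt * 0.20) + 500 as approx_threshold
    from sysindexes
    where id = object_id('dbo.mail')
      and indid in (0, 1)   -- heap or clustered index row carries the table counts
    ```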

  •    Chris, thanks for your hint regarding the sysindexes table. I am monitoring rowmodctr where indid=1 (the clustered index). You are right, the table is big (27 million rows). All procedure calls qualify it with an owner. I think that, since this table is updated every hour, we reach the threshold (number of modifications equal to 20% of the number of rows in the table + 500) in 3-4 days, and then the statistics update kicks in.

       When I was monitoring the problem yesterday, I saw around 40 processes blocked with [COMPILE] lock in waitresource. There were several levels of blocking. In other words, 30-35 processes were blocked by spid X; spid X was blocked by spid Y (the lead blocker). Once process Y completed, X became the lead blocker, and so on.
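    Such a blocking chain can be captured while it is happening with a sketch like this (sysprocesses as in SQL Server 2000):

    ```sql
    -- Sketch: list blocked spids and the spids blocking them. Rows whose spid
    -- appears in others' blocked column but have blocked = 0 are lead blockers.
    select spid, blocked, lastwaittype, waitresource, cmd
    from master.dbo.sysprocesses
    where blocked <> 0
       or spid in (select blocked from master.dbo.sysprocesses where blocked <> 0)
    ```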

        I am not sure what happens when several stored procedure calls are submitted while auto update statistics runs against the base table. Will the stored procedure have to be recompiled only once? It appears as if every user request caused the procedure to be recompiled.

        Please let me know if my description of the problem is clear.

    Thanks,

    Igor 

  • Yes, I got a good description of the problem now. A couple of further questions:

    Are all of the blocked processes trying to execute the same stored procedure?

    Does it take a long time to compile the procedure?

    Have you tried using the KEEP PLAN or even KEEPFIXED PLAN query hints in the procedure?
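    For reference, here is a sketch of how such a hint would attach to the procedure's first SELECT (statement taken from earlier in the thread; verify the effect on your build before relying on it):

    ```sql
    -- Sketch: KEEPFIXED PLAN prevents recompiles caused by statistics changes;
    -- KEEP PLAN merely relaxes the recompile threshold.
    select m1.* from mail m1 (nolock)
    where m1.user_id = @user_id and folder_id = @folder_id
      and ((@is_read = 0 and is_read = 0) or (@is_read = 1))
    order by date_sent desc
    option (keepfixed plan)
    ```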

  • Chris,

     

    Yes, all blocked processes were trying to execute the same stored procedure. The stored procedure is very simple and it doesn't take a long time to compile. I was thinking of using the hints you mentioned but was not sure whether it is a good idea. Would you recommend them?

    I have looked at the output of sp_autostats 'mail' (mail is the base table). There is one statistic, [_WA_Sys_folder_id_45F365D3], that was updated right before the problem started. Currently, I am monitoring all entries in sysindexes associated with the mail table. I noticed that the rowcnt values for statistics are always 0, while the rowmodctr field has non-zero values. In the case where rowcnt is 0, I am not sure when SQL Server decides to update statistics.

    For the clustered index, the number of modifications is bigger than the expected threshold (20% of table cardinality + 500), but SQL Server didn't trigger a statistics update:

    rowcnt value for indid=1 (clustered index) is 28250802

    rowmodctr = 8002019

    I was hoping to come up with an accurate number of modified rows using the formula above, but now I am somewhat puzzled as to when to expect the statistics update to fire for the clustered index.

    The last thing is that I noticed negative values in rowmodctr for indid=2. Do you know when this might happen?

     

    AUTOSTATS is enabled for all indexes/statistics on mail table.

     

    Thank you very much for your help on this.

     

    Igor

  • > I was thinking of using hints you mentioned but was not sure whether it is a good idea. Would you recommend them?

    If you are still having the problems every 3-4 days and can correlate them to statistics being updated at that time, then I would recommend you try them. Of course, always use caution when using hints in a production system.

    Unfortunately it is not always as easy as calculating 20% + 500 rows to predict when an update will happen, since the algorithm also applies some kind of modifier that further adjusts the exact number of modifications required.

    Note that the _WA_Sys... statistic indicates that it might be a good idea to create an index on that column (folder_id, I guess). If you do, you should also recompile the procedures afterwards, since a new index will not force a recompile.

    I think rowcnt and rowmodctr are only 'valid' (used in the way we discuss) for tables and clustered indexes.
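    A hypothetical way to keep an eye on the auto-created column statistics on the table (the bracketed LIKE pattern matches names starting with _WA_Sys):

    ```sql
    -- Sketch: list optimizer-created statistics on dbo.mail. These are the
    -- entries that can be aged out and dropped, unlike real indexes.
    select name, indid, rowmodctr
    from sysindexes
    where id = object_id('dbo.mail')
      and name like '[_]WA[_]Sys%'
    ```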

  • Chris,

     

    I have done some research on UPDATE STATISTICS. It appears that even if the threshold of modifications is reached, SQL Server issues the statistics update only when the first query/stored procedure referencing the table is executed. I have also found an article explaining the rowmodctr value for non-clustered indexes and statistics:

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/statquery.asp

     

    It says that rowmodctr is a relative number and can be negative under certain circumstances.

    Thanks for your tip regarding converting the statistics into an index. What would be the advantage of having an index instead of statistics? Is the Query Optimizer more inclined to use an index?

    Thanks,

    Igor

     

  • Well, the terms 'statistics' and 'index' are being used a bit loosely here. Statistics are gathered for all indexes and used by the optimizer when deciding how to respond to a request. The _WA_Sys... rows in sysindexes are automatically generated statistics that the optimizer has deemed helpful and created 'on the fly'. But they are not real indexes, so I think they cannot be used for index seeks. They only give the optimizer information about a certain column, so that it knows better how that column can be used.

    Good thing you pointed to the great article by Lubor. I have read it before, but while skimming it now I think I might have found the reason for your problem, and also figured out a possible solution. The interesting part of the article is this:

    "To avoid long term maintenance of unused statistics, SQL Server 2000 ages the automatically created statistics (only those that are not a byproduct of the index creation). After several automatic updates the column statistics are dropped rather than updated. If they are needed in the future, they may be created again. There is no substantial cost difference between statistics that are updated and created. The aging does not affect user-created statistics."

    As you can see, automatically created stats (_WA_SYS...) are after a while dropped instead of SQL Server keeping updating them. When a query that needs it is then executed they are created again. We can imagine that it takes a while to create the auto-generated stats for your large table. Since the stats are actually created to make it possible to use a certain execution plan, we can assume that the stats creation is part of the compile time for that plan.

    So, what I guess is happening in your system is that first, SQL Server auto-creates the _WA_SYS... stats for a column when the proc is first executed. A plan is compiled using this autostats. Then after a while SQL Server decides to drop the autostats. Since the execution plan used these stats the plan is now invalid and needs to be recompiled. But when recompiling the plan the optimizer notices that the autostats might be useful for this plan, so it auto-creates the stats again. Since this takes some time, you start seeing the [COMPILE] locks until the stats are created and the plan is compiled, then execution can continue.

    If this guessing is actually correct, then the obvious solution should be to create an index for the column(s) that are having autostats created for them. This way the stats will not be dropped and the optimizer will not need to auto-create them while compiling a plan for the query.
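    If this guess holds, the fix might be sketched like this. The index name is made up, and the column choice is an assumption (the procedure filters on user_id and folder_id, so a composite index is shown; adjust to the real workload):

    ```sql
    -- Sketch: replace the auto-created statistic with a real index so its
    -- statistics are never aged out and dropped.
    create index IX_mail_user_folder on dbo.mail (user_id, folder_id)
    go
    -- A new index does not invalidate the cached plan, so force a recompile:
    exec sp_recompile 'dbo.vsp_mail_select'
    go
    ```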

    If you can try this out it would be very interesting to hear about the results you get.

  • Chris,

    Unfortunately, I am leaving the company tomorrow. I will pass your recommendations on to the developer but won't be able to report the results back.

    Thanks a lot for your help,

    Igor
