Hidden Tricks To SQL Server Table Cleanup

  • David Walker-278941 (3/6/2013)


    But!!!!!

    ORDER IS IGNORED IN A VIEW, starting with SQL 2005. Even with Top 100 Percent specified, starting with (I believe) SQL 2008. See Books Online and many online posts. How does that affect this article?

    (From one blog entry: When you save a VIEW definition, the ORDER BY is ignored on the returnset. That is by SQL ANSI standards.)

    That's what I understand. If that's wrong, please let me know. Thanks.

    I'm curious to see if the author has any comment on this. thanks.

  • Edward.Polley 76944 (3/6/2013)


    Try this.

    CREATE VIEW dbo.test as (SELECT top (100) * from table ORDER BY primary_key_column). Look at the result they will be ordered.

    "ORDER BY guarantees a sorted result only for the outermost SELECT statement of a query." From http://msdn.microsoft.com/en-us/library/ms188723.aspx. Order By in a view is DOCUMENTED to be ignored. Records TYPICALLY come out in primary key order, so it may appear that they are ordered, but that is not a contractual behavior, especially if your primary key is different than the order by clause within the view -- and it's especially tricky to rely on something that appears to work when you look at one set of results, but is documented as being not guaranteed!

  • Just for fun:

    CREATE TABLE t1 (x INT PRIMARY KEY, y INT UNIQUE);

    GO

    INSERT INTO t1 (x,y) VALUES (1,3); INSERT INTO t1 (x,y) VALUES (2,2); INSERT INTO t1 (x,y) VALUES (3,1);

    GO

    CREATE VIEW v1 AS SELECT TOP 100 percent x FROM t1 ORDER BY x

    GO

    SELECT x FROM v1

    GO

    -- Returns 3 2 1 in SQL 2008 (10.50.277 and 10.0.4067) and probably in SQL 2005.

    The Order By clause in a View, even with Select Top 100 Percent, is ignored.

  • BTW - Ceiling works here because these are integer variables.

    Thanks,

  • Well your view was not defined as discussed in the article, for more fun try these statements and the data is returned in order - always.

    CREATE TABLE t1 (x INT PRIMARY KEY, y INT UNIQUE);

    GO

    INSERT INTO t1 (x,y) VALUES (1,3); INSERT INTO t1 (x,y) VALUES (2,2); INSERT INTO t1 (x,y) VALUES (3,1);

    GO

    CREATE VIEW v1 AS SELECT TOP 100 * FROM t1 ORDER BY x

    GO

    SELECT * FROM v1

    GO

    /* results

    x y

    1 3

    2 2

    3 1

    */

    🙂

  • Edward.Polley 76944 (3/7/2013)


    BTW - Ceiling works here because these are integer variables.

    Thanks,

    It doesn't work in your algorithm. I've pulled it apart to see what it does if there were 5001 rows that needed deleting and it was doing it in batches of 2000.

    DECLARE @N INT -- Row count for each transaction

    DECLARE @cnt INT -- Total row count for this run, can be hardcoded

    DECLARE @loops INT -- Number of transactions

    BEGIN

    SET @N = 2000 --must be same value as rows selected in view

    SELECT @cnt = 5001 -- Just pretend there are 5001 rows to delete

    SET @loops = CEILING(@cnt/@N) -- transactionHow many times to run

    WHILE @loops > 0

    BEGIN

    SET @cnt= @cnt - 2000

    SET @loops = @loops - 1 -- decrement @loop

    END

    END

    PRINT 'There are ' + CAST(@cnt as varchar(13)) + ' rows left that your algorithm doesn''t delete'

    try it

  • I see the issue here, yes the left over rows aren't deleted on purpose. My goal is to remove 50 million rows using 2000 row chunks so yes there will be up to 1999 rows left out of 50 million. Hope this clarifys and thanks for the follow up.

  • If you can stop the system for 15 mins then you can following either of the following

    First, this kind of table should have partitions. Simply using a switch partition would reduce the data in table.

    Second, if this is not the case, it will be faster to re-create an another table with Select Into (use parallelism and reduced logging) to create another table with the required 70 million rows. This will take 5 mins max.

    Then rename the tables as required and all is well.

    Run the table stats in both cases.

  • Thanks for all the comments. I add that my shop doesn't ever go offline - we are truly 7*24*365. The article discusses one technique to remove obsolete rows without a maintenance window - something the business owners really appreciate.:-)

  • I did this in a SQL 2000 db many years ago, 300 million rows and it had to be archived, used a loop and wait for delay, tricky when you don't have something as simple as DELETE TOP xxxxx.

    I also remembered that the optimal batch size to delete was 300,000.

    Good article, thanks.

    qh

    [font="Tahoma"]Who looks outside, dreams; who looks inside, awakes. – Carl Jung.[/font]
  • Cant we use TRUNCATE?

  • Depends on scenario, obsolete vs current rows count, you might also create a copy of table copy rows you need to preserve and then switch them by altering names.

  • I'm sure there's no need for the view, I'd just loop on the identity coulmn(or another suitable indexed column) and delete where less than min(ident column ) + 2000 depending on the seed.

    I guess I'm old school but I wouldn't like to rely on a top always bringing back the rows I wanted to delete, to be 100% and use a variable in the where clause, just in case ansi or the optimiser changes how the top statement works.

  • andy.m.guscott (9/4/2015)


    I'm sure there's no need for the view, I'd just loop on the identity coulmn(or another suitable indexed column) and delete where less than min(ident column ) + 2000 depending on the seed.

    I guess I'm old school but I wouldn't like to rely on a top always bringing back the rows I wanted to delete, to be 100% and use a variable in the where clause, just in case ansi or the optimiser changes how the top statement works.

    It might be that the date column you want to delete on in the where criteria is not ordered by the identity column. So you might end up deleting rows that shouldn't be deleted.

  • divyesh.khimasia 4679 (9/4/2015)


    Cant we use TRUNCATE?

    No we can't. You cannot specify WHERE clause in TRUNCATE, therefore you cannot specify which rows you want to remove - it just removes all of them.

Viewing 15 posts - 61 through 75 (of 95 total)

You must be logged in to reply to this topic. Login to reply