Multi Column Sort

  • 36376 ms? What are you running this test on, Ramesh - a twin-floppy 80's luggable? 😀

    876 ms is awesome:cool:

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Peso (4/8/2009)


    Using Ramesh's test cases, I get these timings:

    Ramesh' Version : 526 ms

    ChrisM' Version 3 ("condensed" version 2) : 23480 ms

    Peso : 236 ms

    236 ms! looks great! Can I see the corrected version, please?

    --Ramesh


  • Chris Morris (4/8/2009)


    36376 ms? What are you running this test on, Ramesh - a twin-floppy 80's luggable? 😀

    876 ms is awesome:cool:

    Not that old enough:-D...

    I am working on wretched servers with Windows 2003 SP2, SQL 2005 EE SP3, 1 GB RAM & 3.4 GHz Core 2 Duo Processor

    --Ramesh


  • Same as before.

    And I am little suspicous about your algorithms. For both you and Chris all 6250 records in the final resultset are ordered by original ID.

    My suggestion are unordered by ID, but follows the "divided by 4" path.

    1-2-5-3-4

    6-7-10-8-9

    11-12-15-13-14


    N 56°04'39.16"
    E 12°55'05.25"

  • Peso (4/8/2009)


    Using Ramesh's test cases, I get these timings:

    Ramesh' Version : 526 ms

    ChrisM' Version 3 ("condensed" version 2) : 23480 ms

    Peso : 236 ms

    Ooookaaaaayyyy proof of the pudding...run against randomised data:

    --- Expected order of output

    SELECT *

    INTO #test2

    FROM #test

    GO

    DROP TABLE #test

    GO

    SELECT *

    INTO #Test

    FROM #test2

    ORDER BY NEWID()

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • For Number < 40000 I get these results

    Ramesh' Version : 2683 ms

    Peso : 1546 ms


    N 56°04'39.16"
    E 12°55'05.25"

  • Amazing topic..it is still going...

  • Peso (4/8/2009)


    Same as before.

    And I am little suspicous about your algorithms. For both you and Chris all 6250 records in the final resultset are ordered by original ID.

    My suggestion are unordered by ID, but follows the "divided by 4" path.

    1-2-5-3-4

    6-7-10-8-9

    11-12-15-13-14

    See these results, the row 5 should be followed after row 4 as the col2 value of row 5 is higher than that of row 4.

    -- Actual Results

    ID col1 col2 col3 col4 col5 col6 col7 col8

    1 2.35 3.01 3.49 4.25 4.79 5.36 5.82 6.31

    2 NULL 3.59 4.32 NULL 5.21 NULL NULL 7.12

    5 5.12 5.66 6.31 6.92 NULL 7.87 8.34 NULL

    3 NULL NULL NULL 5.23 5.41 6.33 6.89 7.99

    4 NULL 5.31 5.46 5.92 NULL 6.87 7.34 NULL

    --Ramesh


  • Peso (4/8/2009)


    For Number < 40000 I get these results

    Ramesh' Version : 2683 ms

    Peso : 1546 ms

    Here's a randomised set of 15 rows, Peso - wanna give them both a try? They should order by ID, obviously!

    DROP TABLE #qtest

    DROP TABLE #test

    DROP TABLE #test2

    --

    CREATE TABLE #qtest (id int, col1 numeric(6,2), col2 numeric(6,2), col3 numeric(6,2), col4 numeric(6,2), col5 numeric(6,2), col6 numeric(6,2), col7 numeric(6,2), col8 numeric(6,2))

    INSERT INTO #qtest (id, col1, col2, col3, col4, col5, col6, col7, col8)

    SELECT 1, 2.35, 3.01, 3.49, 4.25, 4.79, 5.36, 5.82, 6.31

    UNION SELECT 2, null, 3.59, 4.32, null, 5.21, null, null, 7.12

    UNION SELECT 3, NULL, null, null, 5.23, 5.41, 6.33, 6.89, 7.99

    UNION SELECT 4, null, 5.31, 5.46, 5.92, null, 6.87, 7.34, null

    UNION SELECT 5, 5.12, 5.66, 6.31, 6.92, null, 7.87, 8.34, null

    ---- 15 rows

    SELECT ROW_NUMBER() OVER (ORDER BY Batch, id) AS [id],

    col1, col2, col3, col4, col5, col6, col7, col8

    INTO #test2

    FROM (SELECT CAST(1 AS INT) AS [Batch], *

    FROM #qtest

    UNION ALL

    SELECT 1+(number/4), id, col1+number, col2+number, col3+number, col4+number, col5+number, col6+number, col7+number, col8+number

    FROM #qtest, dbo.Numbers

    WHERE number%4 = 0 AND number < 12 ---<= (20000-4) -- 25,000

    ) d

    --

    SELECT *

    INTO #Test

    FROM #test2

    ORDER BY NEWID()

    --

    SELECT * FROM #test

    -- Now run query

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Ramesh (4/8/2009)


    See these results, the row 5 should be followed after row 4 as the col2 value of row 5 is higher than that of row 4.

    -- Actual Results

    ID col1 col2 col3 col4 col5 col6 col7 col8

    1 2.35 3.01 3.49 4.25 4.79 5.36 5.82 6.31

    2 NULL 3.59 4.32 NULL 5.21 NULL NULL 7.12

    5 5.12 5.66 6.31 6.92 NULL 7.87 8.34 NULL

    3 NULL NULL NULL 5.23 5.41 6.33 6.89 7.99

    4 NULL 5.31 5.46 5.92 NULL 6.87 7.34 NULL

    Yes, but Col1 value of record 5 is lesser than any value of record 4.

    I have interpreted the problem statement OP wants to sort by minumum ColX value of record, and if there are duplicate minimum ColX value, sort by second lowest value.


    N 56°04'39.16"
    E 12°55'05.25"

  • Chris Morris (4/8/2009)


    Peso (4/8/2009)


    Using Ramesh's test cases, I get these timings:

    Ramesh' Version : 526 ms

    ChrisM' Version 3 ("condensed" version 2) : 23480 ms

    Peso : 236 ms

    Ooookaaaaayyyy proof of the pudding...run against randomised data:

    --- Expected order of output

    SELECT *

    INTO #test2

    FROM #test

    GO

    DROP TABLE #test

    GO

    SELECT *

    INTO #Test

    FROM #test2

    ORDER BY NEWID()

    Well, Chris, I don't think so you would be happy to see the results, sorry about that.

    1 2 3 4 5

    Ramesh's Version 726 700 836 800 613

    Chris' Version 36096 36046 40010 40780 36166

    --Ramesh


  • Ramesh (4/8/2009)


    See these results, the row 5 should be followed after row 4 as the col2 value of row 5 is higher than that of row 4.

    -- Actual Results

    ID col1 col2 col3 col4 col5 col6 col7 col8

    1 2.35 3.01 3.49 4.25 4.79 5.36 5.82 6.31

    2 NULL 3.59 4.32 NULL 5.21 NULL NULL 7.12

    5 5.12 5.66 6.31 6.92 NULL 7.87 8.34 NULL

    3 NULL NULL NULL 5.23 5.41 6.33 6.89 7.99

    4 NULL 5.31 5.46 5.92 NULL 6.87 7.34 NULL

    With that logic, why isn't record 3 sorted after record 2 and before record 5?


    N 56°04'39.16"
    E 12°55'05.25"

  • Peso (4/8/2009)


    Ramesh (4/8/2009)


    See these results, the row 5 should be followed after row 4 as the col2 value of row 5 is higher than that of row 4.

    -- Actual Results

    ID col1 col2 col3 col4 col5 col6 col7 col8

    1 2.35 3.01 3.49 4.25 4.79 5.36 5.82 6.31

    2 NULL 3.59 4.32 NULL 5.21 NULL NULL 7.12

    5 5.12 5.66 6.31 6.92 NULL 7.87 8.34 NULL

    3 NULL NULL NULL 5.23 5.41 6.33 6.89 7.99

    4 NULL 5.31 5.46 5.92 NULL 6.87 7.34 NULL

    With that logic, why isn't record 3 sorted after record 2 and before record 5?

    Hey Peso, sorry for the confusion. I reckon ChrisB's definition of the problem is the clearest:

    (rule 1) compare rowx,coly against rowx,colz (each column against another column in the same row - horizontal comparison)

    (rule 2) compare rowx,coly against rowz,coly (each column against the same column in another row - vertical comparison)

    Horizontally you do not need to do anything anymore since the tuples are already sorted.

    But vertically you need to make sure each columm is sorted individually. If there is a null, you ignore it.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • One problem is that you guys are testing against Easy data. Now why don't you try some HARD data:

    CREATE TABLE SortedColumnsTestData

    (

    id INT NOT NULL PRIMARY KEY CLUSTERED,

    col1 Float NULL,

    col2 Float NULL,

    col3 Float NULL,

    col4 Float NULL,

    col5 Float NULL,

    col6 Float NULL,

    col7 Float NULL,

    col8 Float NULL,

    CorrectSequence INT NOT NULL

    )

    GO

    Alter proc spSortedColums_MakeData( @RowCnt as int = 40000)

    AS

    Delete From SortedColumnsTestData

    ;WITH cteTally as (

    Select TOP 40000

    ROW_NUMBER() Over(Order By c1.id) as N

    From master..syscolumns c1, master..syscolumns c2

    Order by c1.id

    ), cteNumX as (

    Select N

    , Cast(N as Float) as X

    , ABS(CHECKSUM(NEWID()))/Power(2.0,31) as R

    From cteTally

    ), cteData as (

    Select TOP (@RowCnt)

    N as CorrectSequence

    , ROW_NUMBER() Over(Order by R) as ID

    , Log(X + ABS(CHECKSUM(NEWID()))/Power(2.0,31)) as Col1

    , Log(X + 1 + ABS(CHECKSUM(NEWID()))/Power(2.0,31)) as Col2

    , X + ABS(CHECKSUM(NEWID()))/Power(2.0,31) as Col3

    , X + 1 + ABS(CHECKSUM(NEWID()))/Power(2.0,31) as Col4

    , Power(X + ABS(CHECKSUM(NEWID()))/Power(2.0,31),1.1) + 2 as Col5

    , Power(X + 1 + ABS(CHECKSUM(NEWID()))/Power(2.0,31),1.1) + 2 as Col6

    , Power(1.0016947,X + ABS(CHECKSUM(NEWID()))/Power(2.0,31)) + (x*2.19) + 3 as Col7

    , Power(1.0017834,X + ABS(CHECKSUM(NEWID()))/Power(2.0,31)) + (x*2.18) + 19 as Col8

    From cteNumX

    Order By N

    ), cteSorted as (

    Select TOP 40000 *

    , Power(CorrectSequence,1.5)%8 +1 as CurrCol

    , Power(CorrectSequence-1,1.5)%8 +1 as PrevCol

    From cteData

    Order By ID

    )

    INSERT into SortedColumnsTestData

    Select ID

    , Case When (CurrCol=1 or PrevCol=1) Then Col1 Else Null End

    , Case When (CurrCol=2 or PrevCol=2) Then Col2 Else Null End

    , Case When (CurrCol=3 or PrevCol=3) Then Col3 Else Null End

    , Case When (CurrCol=4 or PrevCol=4) Then Col4 Else Null End

    , Case When (CurrCol=5 or PrevCol=5) Then Col5 Else Null End

    , Case When (CurrCol=6 or PrevCol=6) Then Col6 Else Null End

    , Case When (CurrCol=7 or PrevCol=7) Then Col7 Else Null End

    , Case When (CurrCol=8 or PrevCol=8) Then Col8 Else Null End

    , CorrectSequence

    From cteSorted

    Select * From SortedColumnsTestData

    order by correctsequence

    Notes:

    1) Good only up to 40000

    2) I strongly recommend that you test validity at smaller number first

    3) A CorrectSequence column is included to facilitate validation.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Fair warning: this is some truly wicked data. Pretty close to worst case.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 15 posts - 106 through 120 (of 180 total)

You must be logged in to reply to this topic. Login to reply