Query a DateTime column in a SUPERFAST way!

  • (You can read this, but skip to the third reply to see how to improve datetime queries.)

    Hello,

    I wonder: when I query a table like this

    select * from customers where birthday = '2002-12-05'

    everything works fine. But is the birthday column actually converted before doing the comparison? I.e., if I have 5,000,000 rows, does this mean 5 million conversions?

    And if I do something like this

    select * from customers where birthday = convert(datetime,'2002-12-05')

    is the conversion done for all 5 million rows?

    Or should I use something like

    select * from customers where convert(datetime,'2002-12-05') = birthday

    I really don't know. Can anybody explain this to me?

    I am optimizing a query, and it is 100 times faster in every way except for dates. The stored proc accepts 10 parameters, many of them optional, and it is very fast, but not for dates. I don't understand it.

    I build a string dynamically and execute it with sp_executesql.

    Any help would be appreciated.
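
    One way to check this (just a sketch, reusing the customers/birthday names above) is to look at the textual plan: if the string literal shows up in the WHERE predicate already converted to a single datetime constant, the conversion happened once at compile time, not once per row.

    set showplan_text on
    go
    select * from customers where birthday = '2002-12-05'
    go
    set showplan_text off
    go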

  • Do you have an index on your birthday column?

    Regards,

    Andy Jones

    .

  • Tested it myself...

    And I was STUNNED!

    A query on a datetime field was around 15 seconds faster when it was written like this:

    select xxx from yyy where birthday = '2002-12-04'

    The other variant I tried was:

    declare @DATEVAR datetime
    set @DATEVAR = getdate()

    select xxx from yyy where birthday = @DATEVAR

    So the where clause was faster when I used a STRING. Tested it on a table with 5 million records: the first variant took 35 seconds, the second variant took 50 seconds.

    Does anybody have similar experiences?

    This would be a nice performance trick on any large database with a date in the where clause.

    Test it yourself and reply to this topic; I am very curious whether it is like this ALWAYS and on EVERY server.

    Please help me with this.

    And if you know why, please let me know!
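
    If anyone wants to reproduce this, a minimal harness along these lines should do it (a sketch; xxx/yyy/birthday are the placeholder names from above, and statistics io is there to confirm both runs touch the same number of pages):

    set statistics time on
    set statistics io on

    declare @DATEVAR datetime
    set @DATEVAR = '2002-12-04'

    select xxx from yyy where birthday = '2002-12-04'   -- string literal variant
    select xxx from yyy where birthday = @DATEVAR       -- datetime variable variant

    set statistics time off
    set statistics io off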

  • I ran a test on a simple table with 1 datetime column containing 5 million records. Both queries ran at less than 1 second (using a table scan - no index). Does your query do other filtering or joins?

    Regards,

    Andy Jones

    .

  • Yes, in my example this date field is the last field of a nonclustered index.

    The index consists of 4 fields.

    This is in reply to your first post.

  • Look at your execution plan for each and make sure they are the same. Also, consider other loads: a stored plan can affect overall performance on queries. I have never seen a difference that large on any query of that type before. Also, you state the column is the last in a nonclustered index. Keep in mind that the query plan is chosen based on the statistics of the first column, and even if you have the field in a combined index, that doesn't mean the index will be used, or that it was seeked; it may have been an index scan.

    Take a look at your multicolumn index and make sure the column with the most unique values is first, then the next most unique, and so on down to the least unique. This will improve the statistics for the index and increase its likelihood of being used.
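
    For example (a sketch; customers, birthday, col1-col3 and the index name are made-up stand-ins for the real four-column index), you could check the selectivity of each column and rebuild the index accordingly:

    -- see which column has the most distinct values
    select count(distinct col1)     as col1_distinct,
           count(distinct col2)     as col2_distinct,
           count(distinct col3)     as col3_distinct,
           count(distinct birthday) as birthday_distinct
    from customers

    -- rebuild the composite index with the most selective column first,
    -- e.g. if birthday turned out to have the most distinct values:
    drop index customers.IX_customers_composite
    create index IX_customers_composite on dbo.customers (birthday, col1, col2, col3)

    -- the statistics the optimizer keeps for the index are based on the leading column
    dbcc show_statistics (customers, IX_customers_composite)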

  • Well Antares689,

    Never seen a difference? Well, find a table with 5 million rows (and not ONLY one column).

    Then run a query on a datetime field, once with a datetime variable and once with a string, and just tell me if it happens on your databases.

    Because if you had asked me this three days ago, I would have given the same answer as you, but now I really don't know why it is so much faster.

    (Maybe because it just is...)

  • I'd have to agree with Andy and Antares; I've never seen this either. I just tested it on a table with several columns, returning most of them, and with the where clauses you suggest. The table has about 5.5 million rows and, just as Andy said, the queries run in under 1 second whether I have an index or not (of course, if the index had the date column as its only column, a seek would occur and times would be much quicker). The plans are equal except that the first one does a convert on the string to datetime, but this is not noticeable in the execution times. Have you tested it several times? Is there absolutely nothing else that might affect the query times?

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • I know it is hard to believe, but this is the real query:

    declare @Bdate datetime

    set @Bdate = convert(datetime,'2000-01-01')

    select count(*) from XXX where BtDate = '2000-01-01 00:00:00.000'

    select count(*) from XXX where BtDate = @Bdate

    Statistics (executed immediately after each other):

    SQL Server Execution Times:

    CPU time = 16424 ms, elapsed time = 46513 ms.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 16704 ms, elapsed time = 54228 ms.

    Remember, the second query should profit from some caching by SQL Server.

    I did a showplan; here is the result:

    StmtText

    --------------------------------------------------------------------------------------------------------------------------------

    |--Stream Aggregate(DEFINE:([Expr1002]=Count(*)))

    |--Index Scan(OBJECT:([YYY].[dbo].[XXX].[IX_G2C_ZZZ_PPP_Status_BegDat]), WHERE:([XXX].[begindatum]=Jan 1 2000 12:00AM))

    StmtText

    ---------------------------------------------------------------------------------------------------------------------

    |--Stream Aggregate(DEFINE:([Expr1002]=Count(*)))

    |--Index Scan(OBJECT:([YYY].[dbo].[XXX].[IX_G2C_ZZZ_PPP_Status_BegDat]), WHERE:([XXX].[begindatum]=[@datum]))

    I had to replace field and table names with XXX etc.

    But I think it is clear that it is executing exactly the same query.

    Still, there is a huge difference, and it is using the same index, which is a composite index of 4 fields.
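
    To take caching out of the picture between the two runs (a sketch, assuming SQL Server 2000; this needs sysadmin rights and should only be run on a test server), something like this could be run before each query:

    checkpoint                -- write dirty pages to disk so the clean buffers can be dropped
    dbcc dropcleanbuffers     -- empty the data cache, so the next query reads from disk
    dbcc freeproccache        -- throw away cached plans, so the next query compiles fresh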

  • Hi, I also did a test on my development PC (P3, 128 MB). The table has 2 million rows; response times for both queries were < 1 ms. There is an index on the date column.

  • Hi, I ran another test. Here is how I set up the test data:

    create table T1 (c1 int)

    go

    create table T2 (c1 int)

    go

    create table T3 (c1 int)

    go

    create table T4 (c1 varchar(10),c2 varchar(10), c3 int, c4 datetime)

    go

    declare @i int

    select @i = 0

    while @i < 250

    begin

    insert T1 values (@i)

    select @i = @i + 1

    end

    select @i = 0

    while @i < 250

    begin

    insert T2 values (@i)

    select @i = @i + 1

    end

    select @i = 0

    while @i < 80

    begin

    insert T3 values (@i)

    select @i = @i + 1

    end

    go

    insert T4 select char(T1.c1),char(T2.c1),T1.c1,T1.c1 from T1,T2,T3 -- Cross join to insert 5 million records

    create index [ixT4] on [dbo].[T4]([c1],[c2],[c3],[c4]) on [primary]

    go

    (T4 is the table to query, the others were just to set up the data)

    Then I ran the queries with the following results:

    declare @dt datetime

    select @dt = convert(datetime,'1900-01-01 00:00:00.000')

    select count(*) from T4 where c4 = @dt

    SQL Server Execution Times:

    CPU time = 2594 ms, elapsed time = 689 ms.

    StmtText

    --------------------------------------------------

    |--Stream Aggregate(DEFINE:([partialagg1003]=Count(*)))

    |--Table Scan(OBJECT:([DbTest].[dbo].[T4]), WHERE:([T4].[c4]=[@dt]))

    select count(*) from T4 where c4 = '1900-01-01 00:00:00.000'

    SQL Server Execution Times:

    CPU time = 2546 ms, elapsed time = 685 ms.

    StmtText

    --------------------------------------------------

    |--Stream Aggregate(DEFINE:([partialagg1003]=Count(*)))

    |--Table Scan(OBJECT:([DbTest].[dbo].[T4]), WHERE:([T4].[c4]='Jan 1 1900 12:00AM'))

    The elapsed times are virtually the same for both queries. The interesting difference in the showplan is that your query did an index scan whereas mine did a table scan, i.e. the index wasn't used. Try adding a new, separate index on just the datetime column, then run the query again, for example:
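
    Something along these lines (the index name is just an example):

    create index ixT4_c4 on dbo.T4 (c4)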

    Regards,

    Andy Jones

    .

  • I think I see the problem. You are assuming "elapsed time" is the amount of time the query took to execute. This is the connection's cumulative time, and if you look you will get a before and after "elapsed time", e.g.:

    SELECT COUNT(*) FROM tblX

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 147614 ms.

    -----------

    5423

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 147980 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 148008 ms.

    The first is the milliseconds that had passed in the cumulative time of the connection.

    Then the output from my query.

    Then the milliseconds of the connection's cumulative time when the query completed.

    The last is the end of the execution plan output, based on the cumulative time of the connection.

    So based on the first and second values, the total runtime of the query is figured like so:

    Second - First = runtime in milliseconds

    or

    147980 - 147614 = 366 ms actual run time for query.

    Make sure you are using the right values.
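
    If in doubt, a single statement can also be timed directly (a sketch, reusing the tblX name from above):

    declare @start datetime
    set @start = getdate()

    select count(*) from tblX              -- the query being timed

    select datediff(ms, @start, getdate()) as runtime_ms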

  • How can you get such good response times?

    When I run a select count(*) on a 7-million-row table with a where clause on a datetime column (no index on it), the CPU time is always 15000-16000 ms. The elapsed time varies from 5 s (we have 4 CPUs) to 120 s.

    (The server is SQL 7 on Windows NT.)

  • quote:


    How can you get such good response times?

    When I run a select count(*) on a 7-million-row table with a where clause on a datetime column (no index on it), the CPU time is always 15000-16000 ms. The elapsed time varies from 5 s (we have 4 CPUs) to 120 s.

    (The server is SQL 7 on Windows NT.)


    Take a look at your execution plan in QA. Most likely, because there is no index on the datetime column, you are seeing a Table Scan as it tries to find the dates you specify. I suggest, especially if the column is queried very often, adding an index to the datetime column. These are usually really good indexes, as they tend to have highly unique values, depending on how the datetime is inserted or controlled.
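
    For example (a sketch; the index, table and column names are placeholders for your 7-million-row table):

    create index IX_bigtbl_datecol on dbo.bigtbl (datecol)

    -- re-run the count; the plan should now show an Index Seek instead of a Table Scan
    select count(*) from bigtbl where datecol = '2002-01-01'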

  • Thanks Antares686. I know I should add an index on this column, but it was just a test to compare with what Andyj93 said:

    "I ran a test on a simple table with 1 datetime column containing 5 million records. Both queries ran at less than 1 second (using a table scan - no index)"

    I was very surprised by this result, so I did this test, and my results are very, very different.

    How would you explain this difference?
