Query a DateTime column in a SUPERFAST way!

  • (You can read this, but skip to the third reply to see how to improve datetime queries.)

    Hello,

    I wonder: when I query a table like this

    select * from customers where birthday = '2002-12-05'

    everything works fine. But is the birthday column actually converted before doing the comparison? I.e., if I have 5,000,000 rows, does this mean 5 million conversions?

    And if I do something like this

    select * from customers where birthday = convert(datetime,'2002-12-05')

    is the conversion done for all 5 million rows?

    Or should I use something like

    select * from customers where convert(datetime,'2002-12-05') = birthday

    I really don't know. Can anybody explain this to me?

    I am optimizing a query, and it is 100 times faster in every way except for dates. The stored proc accepts 10 parameters, many of them optional, and it is very fast, but not for dates. I don't understand it.

    I build a string dynamically and execute it with sp_executesql.

    Any help would be appreciated.
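
    One way to check this (just a sketch, reusing the customers/birthday names above) is to look at the textual plan: if the string literal shows up in the WHERE predicate already converted to a single datetime constant, the conversion happened once at compile time, not once per row.

    set showplan_text on
    go
    select * from customers where birthday = '2002-12-05'
    go
    set showplan_text off
    go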

  • Do you have an index on your birthday column?

    Regards,

    Andy Jones

    .

  • Tested it myself...

    And I was STUNNED!

    A query on a datetime field was around 15 seconds faster when it was written like this:

    select xxx from yyy where birthday = '2002-12-04'

    The other variant I tried was:

    declare @DATEVAR datetime
    set @DATEVAR = getdate()

    select xxx from yyy where birthday = @DATEVAR

    So the where clause was faster when I used a STRING. Tested it on a table with 5 million records: the first variant took 35 seconds, the second variant took 50 seconds.

    Does anybody have similar experiences?

    This would be a nice performance trick on any large database with a date in the where clause.

    Test it yourself and reply to this topic; I am very curious whether it is like this ALWAYS and on EVERY server.

    Please help me with this.

    And if you know why, please let me know!
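
    If anyone wants to reproduce this, a minimal harness along these lines should do it (a sketch; xxx/yyy/birthday are the placeholder names from above, and statistics io is there to confirm both runs touch the same number of pages):

    set statistics time on
    set statistics io on

    declare @DATEVAR datetime
    set @DATEVAR = '2002-12-04'

    select xxx from yyy where birthday = '2002-12-04'   -- string literal variant
    select xxx from yyy where birthday = @DATEVAR       -- datetime variable variant

    set statistics time off
    set statistics io off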

  • I ran a test on a simple table with 1 datetime column containing 5 million records. Both queries ran at less than 1 second (using a table scan - no index). Does your query do other filtering or joins?

    Regards,

    Andy Jones

    .

  • Yes, in my example this date field is the last field of a nonclustered index.

    The index consists of 4 fields.

    This is in reply to your first post.

  • Look at your execution plan for each and make sure they are the same. Also, consider other loads: a stored plan can affect overall performance on queries. I have never seen a difference that large on any query of that type before. Also, you state the column is the last in a nonclustered index. Keep in mind that the query plan is chosen based on the statistics of the first column, and even if you have the field in a combined index, that doesn't mean the index will be used, or that it was seeked; it may have been an index scan.

    Take a look at your multicolumn index and make sure the column with the most unique values is first, then the next most unique, and so on down to the least unique. This will improve the statistics for the index and increase its likelihood of being used.
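
    For example (a sketch; customers, birthday, col1-col3 and the index name are made-up stand-ins for the real four-column index), you could check the selectivity of each column and rebuild the index accordingly:

    -- see which column has the most distinct values
    select count(distinct col1)     as col1_distinct,
           count(distinct col2)     as col2_distinct,
           count(distinct col3)     as col3_distinct,
           count(distinct birthday) as birthday_distinct
    from customers

    -- rebuild the composite index with the most selective column first,
    -- e.g. if birthday turned out to have the most distinct values:
    drop index customers.IX_customers_composite
    create index IX_customers_composite on dbo.customers (birthday, col1, col2, col3)

    -- the statistics the optimizer keeps for the index are based on the leading column
    dbcc show_statistics (customers, IX_customers_composite)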

  • Well Antares689,

    Never seen a difference? Well, find a table with 5 million rows (and not ONLY one column).

    Then run a query on a datetime field, once with a datetime variable and once with a string, and just tell me if it happens on your databases.

    Because if you had asked me this three days ago, I would have given the same answer as you, but now I really don't know why it is so much faster.

    (Maybe because it just is...)

  • I'd have to agree with Andy and Antares; I've never seen this either. I just tested it on a table with several columns, returning most of them, and with the where clauses you suggest. The table has about 5.5 million rows and, just as Andy said, the queries run in under 1 second whether I have an index or not (of course, if the index had the date column as its only column, a seek would occur and times would be much quicker). The plans are equal except that the first one does a convert on the string to datetime, but this is not noticeable in the execution times. Have you tested it several times? Is there absolutely nothing else that might affect the query times?

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • I know it is hard to believe, but this is the real query:

    declare @Bdate datetime

    set @Bdate = convert(datetime,'2000-01-01')

    select count(*) from XXX where BtDate = '2000-01-01 00:00:00.000'

    select count(*) from XXX where BtDate = @Bdate

    Statistics (executed immediately after each other):

    SQL Server Execution Times:

    CPU time = 16424 ms, elapsed time = 46513 ms.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 16704 ms, elapsed time = 54228 ms.

    Remember, the second query should profit from some caching by SQL Server.

    I did a showplan; here is the result:

    StmtText

    --------------------------------------------------------------------------------------------------------------------------------

    |--Stream Aggregate(DEFINE:([Expr1002]=Count(*)))

    |--Index Scan(OBJECT:([YYY].[dbo].[XXX].[IX_G2C_ZZZ_PPP_Status_BegDat]), WHERE:([XXX].[begindatum]=Jan 1 2000 12:00AM))

    StmtText

    ---------------------------------------------------------------------------------------------------------------------

    |--Stream Aggregate(DEFINE:([Expr1002]=Count(*)))

    |--Index Scan(OBJECT:([YYY].[dbo].[XXX].[IX_G2C_ZZZ_PPP_Status_BegDat]), WHERE:([XXX].[begindatum]=[@datum]))

    I had to replace field and table names with XXX etc.

    But I think it is clear that it is executing exactly the same query.

    Still, there is a huge difference, and it is using the same index, which is a composite index of 4 fields.
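
    To take caching out of the picture between the two runs (a sketch, assuming SQL Server 2000; this needs sysadmin rights and should only be run on a test server), something like this could be run before each query:

    checkpoint                -- write dirty pages to disk so the clean buffers can be dropped
    dbcc dropcleanbuffers     -- empty the data cache, so the next query reads from disk
    dbcc freeproccache        -- throw away cached plans, so the next query compiles fresh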

  • Hi, I also did a test on my development PC (P3, 128 MB). The table has 2 million rows; response times for both queries were < 1 ms. There is an index on the date column.

  • Hi, I ran another test. Here is how I set up the test data:

    create table T1 (c1 int)

    go

    create table T2 (c1 int)

    go

    create table T3 (c1 int)

    go

    create table T4 (c1 varchar(10),c2 varchar(10), c3 int, c4 datetime)

    go

    declare @i int

    select @i = 0

    while @i < 250

    begin

    insert T1 values (@i)

    select @i = @i + 1

    end

    select @i = 0

    while @i < 250

    begin

    insert T2 values (@i)

    select @i = @i + 1

    end

    select @i = 0

    while @i < 80

    begin

    insert T3 values (@i)

    select @i = @i + 1

    end

    go

    insert T4 select char(T1.c1),char(T2.c1),T1.c1,T1.c1 from T1,T2,T3 -- Cross join to insert 5 million records

    create index [ixT4] on [dbo].[T4]([c1],[c2],[c3],[c4]) on [primary]

    go

    (T4 is the table to query, the others were just to set up the data)

    Then I ran the queries with the following results:

    declare @dt datetime

    select @dt = convert(datetime,'1900-01-01 00:00:00.000')

    select count(*) from T4 where c4 = @dt

    SQL Server Execution Times:

    CPU time = 2594 ms, elapsed time = 689 ms.

    StmtText

    --------------------------------------------------

    |--Stream Aggregate(DEFINE:([partialagg1003]=Count(*)))

    |--Table Scan(OBJECT:([DbTest].[dbo].[T4]), WHERE:([T4].[c4]=[@dt]))

    select count(*) from T4 where c4 = '1900-01-01 00:00:00.000'

    SQL Server Execution Times:

    CPU time = 2546 ms, elapsed time = 685 ms.

    StmtText

    --------------------------------------------------

    |--Stream Aggregate(DEFINE:([partialagg1003]=Count(*)))

    |--Table Scan(OBJECT:([DbTest].[dbo].[T4]), WHERE:([T4].[c4]='Jan 1 1900 12:00AM'))

    The elapsed times are virtually the same for both queries. The interesting difference in the showplan is that your query did an index scan whereas mine did a table scan, i.e. the index wasn't used. Try adding a new, separate index on just the datetime column, then run the query again, for example:
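
    Something along these lines (the index name is just an example):

    create index ixT4_c4 on dbo.T4 (c4)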

    Regards,

    Andy Jones

    .

  • I think I see the problem. You are assuming "elapsed time" is the amount of time the query took to execute. This is the connection's cumulative time, and if you look you will get a before and after "elapsed time", e.g.:

    SELECT COUNT(*) FROM tblX

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 147614 ms.

    -----------

    5423

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 147980 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 148008 ms.

    The first is the milliseconds that had passed in the cumulative time of the connection.

    Then the output from my query.

    Then the milliseconds of the connection's cumulative time when the query completed.

    The last is the end of the execution plan output, based on the cumulative time of the connection.

    So based on the first and second values, the total runtime of the query is figured like so:

    Second - First = runtime in milliseconds

    or

    147980 - 147614 = 366 ms actual run time for query.

    Make sure you are using the right values.
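
    If in doubt, a single statement can also be timed directly (a sketch, reusing the tblX name from above):

    declare @start datetime
    set @start = getdate()

    select count(*) from tblX              -- the query being timed

    select datediff(ms, @start, getdate()) as runtime_ms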

  • How can you get such good response times?

    When I run a select count(*) on a 7-million-row table with a where clause on a datetime column (no index on it), the CPU time is always 15000-16000 ms. The elapsed time varies from 5 s (we have 4 CPUs) to 120 s.

    (The server is SQL 7 on Windows NT.)

  • quote:


    How can you get such good response times?

    When I run a select count(*) on a 7-million-row table with a where clause on a datetime column (no index on it), the CPU time is always 15000-16000 ms. The elapsed time varies from 5 s (we have 4 CPUs) to 120 s.

    (The server is SQL 7 on Windows NT.)


    Take a look at your execution plan in QA. Most likely, because there is no index on the datetime column, you are seeing a Table Scan as it tries to find the dates you specify. I suggest, especially if the column is queried very often, adding an index to the datetime column. These are usually really good indexes, as they tend to have highly unique values, depending on how the datetime is inserted or controlled.
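
    For example (a sketch; the index, table and column names are placeholders for your 7-million-row table):

    create index IX_bigtbl_datecol on dbo.bigtbl (datecol)

    -- re-run the count; the plan should now show an Index Seek instead of a Table Scan
    select count(*) from bigtbl where datecol = '2002-01-01'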

  • Thanks Antares686. I know I should add an index on this column, but it was just a test to compare with what Andyj93 said:

    "I ran a test on a simple table with 1 datetime column containing 5 million records. Both queries ran at less than 1 second (using a table scan - no index)"

    I was very surprised by this result, so I did this test, and my results are very, very different.

    How would you explain this difference?
