row_num/rank over date dimension table

Question

Post reply

row_num/rank over date dimension table

mohammed.a

SSC Veteran

Points: 246
More actions
September 1, 2009 at 12:43 pm

#137851

I have a date dimension table with calendar date, year_num and month_num columns. This table is populated with calendar dates for 100 years.

I am executing below query and not getting what i wanted in fourth column.

select day_date

,year_num

,month_num

,row_number() over (PARTITION BY year_num, month_num order by month_num)

from date_dim

where day_date between '2009-11-1' and '2010-1-31'

IN FOURTH COLUMN I AM EXPECTING 1 FOR NOV 2009, 2 FOR DECEMBER 2009 ADN 3 FOR JAN 2010 RECORDS.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 1

Lynn Pettis

SSC Guru

Points: 442447

September 1, 2009 at 12:46 pm

#1047749

You tell us what you want, but you left out what you are getting currently.

The Dixie Flatline SSC Guru Points: 53353 More actions · Answer 2

PARTITION BY means to start counting all over again from 1 when this value changes. It isn't required.

Try this:

,row_number() over (ORDER BY day_date)

But, if you have one row per day, instead of one row per month, ROW_NUMBER() isn't going to do what you want at all. Let me know if that's the case, and I'll post a solution for you.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

mohammed.a SSC Veteran Points: 246 More actions · Answer 3

Here is what I am expecting. I am listing only 2-3 records per month, in real I get 30/31 records per month.

(day_date) (year_num) (month_num) (month_sequence)

---------------------------------------------------------

(2009-11-1) (2009) (11) (1)

.

(2009-11-30) (2009) (11) (1)

(2009-12-1) (2009) (12) (2)

.

(2009-12-31) (2009) (12) (2)

(2010-1-1) (2010) (1) (3)

.

(2010-1-31) (2010) (1) (3)

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 4

Lynn Pettis

SSC Guru

Points: 442447

September 1, 2009 at 1:10 pm

#1047768

Look at the following, and try using the dense_rank function.

select *, dense_rank() over (order by MyYear, MyMonth) from @TestTable;

justanewone Ten Centuries Points: 1242 More actions · Answer 5

justanewone

Ten Centuries

Points: 1242

September 1, 2009 at 1:15 pm

#1047772

If i've understood your requirements correctly you might want to try DENSE_RANK()



SELECT 

day_date,

year_num,

month_num,

DENSE_RANK() OVER(ORDER BY year_num, month_num)

FROM date_dim

WHERE day_date BETWEEN '2009-11-01' AND '2010-01-31'

Dave

The Dixie Flatline SSC Guru Points: 53353 More actions · Answer 6

Isn't DENSE_RANK going to force a sort?



declare @startDate datetime

set @startDate = '11/1/2009'

select DATEDIFF(month,0,@startdate)-1317  -- this datediff calculation is the solution

-- proof

;with tally (N) as (select row_number() over(order by id) from master..syscolumns)

,cte1 as (select N, dateadd(day,N-1,@startDate) as someDate from tally where N <= 120)

select N,somedate,DATEDIFF(month,0,somedate)-1317 as Column4

from cte1

Okay, Lynn: I admit it. I didn't even think of DENSE_RANK.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

mohammed.a SSC Veteran Points: 246 More actions · Answer 7

mohammed.a

SSC Veteran

Points: 246

September 1, 2009 at 1:21 pm

#1047781

Thanks for everyone who is invloved.

dense_rank did work.

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 8

Here is the code I wrote to test:



declare @TestTable table (

    MyDate datetime,

    MyYear int,

    MyMonth int

);

insert into @TestTable

select dateadd(mm, 2, dateadd(dd, datediff(dd, 0, getdate()),0)) + (N - 1), year(dateadd(mm, 2, dateadd(dd, datediff(dd, 0, getdate()),0)) + (N - 1)), month(dateadd(mm, 2, dateadd(dd, datediff(dd, 0, getdate()),0)) + (N - 1))

from dbo.Tally

where N < 93;

select *, dense_rank() over (order by MyYear, MyMonth) from @TestTable;

The Dixie Flatline SSC Guru Points: 53353 More actions · Answer 9

HAH! I was right!! (cruel chuckle)

The DENSE_RANK solution adds a sort and segment to the execution plan, so the DATEDIFF solution runs twice as fast in the test below. BUT, DENSE_RANK slows down a lot more as the rows get wider, so the difference in efficiency widens as well. Of course, it takes a million rows or so to really notice the difference.



declare @startDate datetime

set @startDate = '1/1/2009'

-- populate a million row test table from pregenerated tally table

if OBJECT_ID(N'TempDB..#test') is not null drop table #test

select N,DATEADD(DAY,N-1,@startDate) as someDate,

		DATEPART(Year,DATEADD(DAY,N-1,@startDate)) as someYear,

		DATEPART(Month,DATEADD(DAY,N-1,@startDate)) as someMonth

		--, 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' as filler

into #test

from tally

if OBJECT_ID(N'TempDB..#catch1') is not null drop table #catch1

if OBJECT_ID(N'TempDB..#catch2') is not null drop table #catch2

print '----- DENSE RANK SOLUTION'

set statistics time on;

select *, dense_rank() over (order by someYear, someMonth) as Column4

into #catch1

from #test;

set statistics time off;

print '----- DATEDIFF SOLUTION'

set statistics time on;

select *,DATEDIFF(month,0,somedate)-1317 as Column4

into #catch2

from #test

set statistics time off;

[Without Filler]

----- DENSE RANK SOLUTION

SQL Server Execution Times:

CPU time = 3198 ms, elapsed time = 3268 ms.

(1000000 row(s) affected)

----- DATEDIFF SOLUTION

SQL Server Execution Times:

CPU time = 1575 ms, elapsed time = 1568 ms.

(1000000 row(s) affected)

[WITH Filler]

----- DENSE RANK SOLUTION

SQL Server Execution Times:

CPU time = 4508 ms, elapsed time = 7628 ms.

(1000000 row(s) affected)

----- DATEDIFF SOLUTION

SQL Server Execution Times:

CPU time = 2106 ms, elapsed time = 2411 ms.

(1000000 row(s) affected)

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 10

Lynn Pettis

SSC Guru

Points: 442447

September 1, 2009 at 2:21 pm

#1047816

Cool!