Changing dates to Integers

  • Is there any value to change date and time fields into integers.

  • what do you mean by a value?? Are you looking for something similar to this SQL statement below in ssis.

    Select Convert(Varchar,Getdate(),109)

  • depends on what "kind" of integer you want:

    select convert(int,getdate())

    --40332 days since 01-01-1990, SQL's internal start date

    select YEAR(getdate()) * 10000

    + MONTH(getdate()) * 100

    + DAY(getdate())

    --an integer "20100604"

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • advantage. Is it easier to use? Take up less space? Create better indexes?

    so instead of storing '06/04/2010 14:06' as a date time or small date time or date and time.

    converting the 06/04/2010 to 20100604 and putting it into an int column and 14:06 to 1406 and placing it in a int column.

  • lowell the second one 20100604 as an integer over just a datetime column

  • there is absolutely no advantage to storing dates as anything except dates.

    never store datetime as any other datetype except datetime.

    what is the date of the integer 20105404?

    internally, SQL stores it as numbers anyway...how it is displayed is up to your regional settings.

    if you save it as anything except datetime, you'll have to convert back and forth, and it will not be easy to use the builtin tools like DATEADD DATEDIFF and the BETWEEN operator without converting stuff.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • You really don't need to turn you dates into integer "date ids". SQL Server is highly optimized to use dates as indexes. Internally, they're stored as integers anyways, so the comparison is very quick. Any conversion you add is just extra manipulation that will have to occur.

    --J

  • our bi team insists on converting all dates to integers and they say they are easier to use. I as the DBA thought what you just wrote but I wondered if I was missing something.

    Our bi is based on Kimball Method.

  • The Kimball method is great (and it's probably what most of us use), but it's general purpose and system independant. SQL Server has been optimized to the point where this is not necessary. Trust me, there's nothing easy about using date ids. The fact that you're on a forum trying to figure out how to do it supports this.

    Personally, I used date ids in a data warehouse I set up some time ago, and I regret it. Not only is it a hassle on the ETL side, it's also a hassle to use. I'm sure it can't possibly be more efficient to have to join just to get a date. Also, I do sometime query the warehouse directly. When I do, I have to use conversion functions on my dates before making comparisons (and pretty much every query against a data warehouse has a date in the WHERE clause). If I had it to do over again, I'd use smalldatetime.

    As we said, dates are stored as integers any ways. Furthermore, they're stored as sequential numbers; so tomorrow will be stored as a larger number than today. Internally, SQL Server can perform comparison operations and BETWEEN without converting the data in any way.

    I think you should make the case that they're not necessary. Do a little research and I think you'll find a lot of information out there that supports what we are saying. You and your team will be much happier if you just use the datetime or smalldatetime fields.

    --J

  • It's real simple.... a thousand professional opinions aren't worth a hoot. A single well written test will tell all. Have the BI "experts" prove their hypothesis using a small million row example. Only the code knows the truth. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • That was my first reaction - show me the benchmark.

    😎

  • jvanderberg (6/4/2010)


    The Kimball method is great (and it's probably what most of us use), but it's general purpose and system independant. SQL Server has been optimized to the point where this is not necessary. Trust me, there's nothing easy about using date ids. The fact that you're on a forum trying to figure out how to do it supports this.

    Personally, I used date ids in a data warehouse I set up some time ago, and I regret it. Not only is it a hassle on the ETL side, it's also a hassle to use. I'm sure it can't possibly be more efficient to have to join just to get a date. Also, I do sometime query the warehouse directly. When I do, I have to use conversion functions on my dates before making comparisons (and pretty much every query against a data warehouse has a date in the WHERE clause). If I had it to do over again, I'd use smalldatetime.

    As we said, dates are stored as integers any ways. Furthermore, they're stored as sequential numbers; so tomorrow will be stored as a larger number than today. Internally, SQL Server can perform comparison operations and BETWEEN without converting the data in any way.

    I think you should make the case that they're not necessary. Do a little research and I think you'll find a lot of information out there that supports what we are saying. You and your team will be much happier if you just use the datetime or smalldatetime fields.

    --J

    What if your date is actually a surrogate key (for example, a surrogate key in a fact table that links with the time dimension). Do you still use the datetime datatype? Or just a random integer? (Or a meaningful integer, which is the topic of this thread)

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • [font="Courier New"]Hello,

    I have found storing the date as an integer to be useful as an integer surrogate key to a date dimension table with attributes you can not easily get from tsql, for example business events tied to specific dates. (ex. a HolidayFlag or something descriptive text about the day) (Kimball Method as mentioned by others above)

    I concede that the date storage is as efficient as an integer, but propose that converting to an integer (Year(getDate())*100000+month(getDate()*100+day(getdate()) is not any more difficult than truncating the date to reflect a consistent date key surrogate value (dateadd(dd,0, datediff(dd,0, getDate())).

    The consistency of always using integer surrogate keys for dimension tables seems beneficial...

    One practice is the use of -1 as a key for an 'Unknown' date...

    There is additional value in using a smart surrogate key for a date dimension as documented by Kimball et al.

    Are there any opposing views on this usage?

    Thank you

    [/font]

  • Patrick Seegers (6/9/2010)


    [font="Courier New"]Are there any opposing views on this usage?

    Thank you

    [/font]

    Sure... read the rest of the posts on this thread.

    Hmmmm.... "unknown date".... sounds like the perfect use for NULL unless it's an EndDate in which case the "unknown date" should be 9999-12-31. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • There is additional value in using a smart surrogate key for a date dimension as documented by Kimball et al.

    Are there any opposing views on this usage?

    Patrick, on the internet there are always opposing views, but I'm with you on the integer surrogate key practice. All of the benefits that the native functions bestow are pretty much irrelevant when you're talking about a key field, and in a date dimension you'll also have date fields where those functions can be used.

    Cris E

    [font="Arial"]Are you lost daddy? I asked tenderly.
    Shut up he explained.
    [/font]
    - Ring Lardner

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply