The Number that shouldn't be a number

  • sparky-407434 (1/31/2015)


    On the plus side, storing telephone number as int uses less storage than a varchar. Ok you won't need to use any numeric functions on it, but you probably won't need to use any string functions on it either.

    Good luck phoning any South African number you store that way...

    eg South African number: 011 88x 8yyyy (masked so that I don't accidentally give a real number). Some have two leading zeros, in case you just want to add a 0 in front.

    And how many phone numbers are you storing before the ~10 bytes you save per number is significant?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • bdcoder (1/31/2015)


    Using MS SQL Server -- I'll be working on a project in the near future where we need to store latitude and longitude ( NO -- we do NOT need to use the geography data type, so please exclude from any suggestions ) -- but I've seen FLOAT, DECIMAL(9,6), DECIMAL(19,16) and many other used -- what it YOUR choice for storing latitude and longitude values; and for the sake of argument -- let's say to a maximum of 6 digits of precision ?

    Given that neither latitude nor longitude can exceed three digits on the left of the decimal place, with fixed 6 decimal places of precision that's a Numeric(9,6) or Decimal(9,6) (they're the same data type), or if you want to avoid any possibility of losing data in rounding, Numeric(11,8)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • jwiseh (1/31/2015)


    I remember the pain of working with a system where a developer had chosen the TIME data type to store the number of hours worked in a day. Anyone want to figure out how you add two TIME columns together? I'll wait…

    Okay, at risk of noob humiliation, why not store these as a TIME data type? Doesn't seem too difficult to calculate to me:

    [font="Courier New"]-- Set up a couple times to work with...

    DECLARE @Start AS TIME = '08:05:00 AM', @End AS TIME = '01:22:00 PM';

    -- Find the number of hours worked (CASTing DATEDIFF int output data type)...

    SELECT CAST(DATEDIFF(MINUTE,@Start,@End) AS NUMERIC)/60 AS 'HoursWorked';[/font]

    What's more, you'd want to convert this to a numeric format for future calculations, such as HoursWorked * PayRate = PaycheckAmount

    Not Start and End times.

    The developer used Time to store the number of hours worked in a day, so, simplified table structure (untested):

    CREATE TABLE TimeSheet (

    EmployeeID INT,

    EntryDate DATE,

    RegularHoursWorked TIME,

    OverTimeHoursWorked TIME

    )

    Insert into TimeSheet Values (1, '2015/02/02', '08:00', '02:30')

    Now, write me a query that returns the total number of hours I worked in a week.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • jarick 15608 (2/1/2015)


    How about storing all of the data in XML in a compressed binary string and using and using am image datatype to store it in. To read the data, they had to use a custom function to un-compress the XML string and then parse through that to look up data.

    I know of a large insurance company who stores data like that. Well, they used to, they're busy 're-architecting' 🙂

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (2/1/2015)


    sparky-407434 (1/31/2015)


    On the plus side, storing telephone number as int uses less storage than a varchar. Ok you won't need to use any numeric functions on it, but you probably won't need to use any string functions on it either.

    Good luck phoning any South African number you store that way...

    eg South African number: 011 88x 8yyyy (masked so that I don't accidentally give a real number). Some have two leading zeros, in case you just want to add a 0 in front.

    And how many phone numbers are you storing before the ~10 bytes you save per number is significant?

    Finnish numbers also have leading zero for example 040 7317644 and international way is +358407317644. So one actually needs to do string functions when showing them.

  • Well now I feel a little ashamed.

    I put a duration as a time on a recent project. I had no idea the issues it would cause! We did encounter another limitation which is that a TIME can't store more than 24 hours.

    Phone numbers are definitely strings, and I've always avoided floats and imprecise numerics like the plague.

    I'm sure you already have this, but one solution to the adding the times is as follows.

    CREATE TABLE #TimeSheet (

    EmployeeID INT,

    EntryDate DATE,

    RegularHoursWorked TIME,

    OverTimeHoursWorked TIME

    )

    Insert into #TimeSheet Values

    (1, '2015/02/02', '08:00:00', '02:30')

    , (1, '2015/02/03', '08:00:00', '01:00')

    , (1, '2015/02/04', '07:30:00', '00:00')

    , (1, '2015/02/05', '08:00:00', '01:30')

    , (1, '2015/02/06', '08:00:00', '00:30');

    WITH times AS (

    SELECT

    EmployeeID

    , EntryDate

    , DATEDIFF(MINUTE, '00:00', RegularHoursWorked) AS RegularMinutes

    , DATEDIFF(MINUTE, '00:00', OverTimeHoursWorked) AS OverTimeMinutes

    FROM #TimeSheet)

    SELECT

    EmployeeID

    , SUM(RegularMinutes)

    , SUM(OverTimeMinutes)

    FROM times

    GROUP BY EmployeeID

    DROP TABLE #TimeSheet

  • Back in the day, when learning about databases, a rule-of-thumb I was taught was that if you have a column that contains 'numeric' data then only set it's data type as 'numeric' - i.e. integer, decimal, float, etc - if you will be carrying out calculations on it, otherwise use a character data type. So for the telephone area codes and numbers the best data type would be char or varchar (or their unicode equivelants).

  • I once worked with a timekeeping system who shall remain anonymous (although I should name names as a warning to all). It was allegedly built for job and task tracking, as long as your job and task numbers were integers. Which ours weren't, of course. I had to write a program to convert our alphanumeric job and task numbers to integers so we could use the system.

    And did I mention that the job number was NOT a primary key on the job table? So, yes, it was possible to have the same job number in the table multiple times, which made the job tracking part a bit sketchy. Ditto with the task numbers - not a primary key as one would expect.

    Database 101 fail...

  • A bit off topic, but it's already been brought up in this thread. What is best practice for storing time intervals in a table? This is relevant to a system that I'm working on now.

    For example, consider a retry interval. The retry interval has to be less than 24 hours due to system design. The interval should be more than some minimal amount (say 15 seconds) or NULL if retry is not applicable. The retry interval is stored in a configuration table, and later added to the datetime of failure when it happens and stored in a different table.

    CREATE TABLE test_retry (

    retry_interval TIME(0) CHECK(retry_interval >= '00:00:15'),

    retry_interval_seconds INT CHECK (retry_interval_seconds >= 15 AND retry_interval_seconds < 86400),

    retry_interval_minutes SMALLINT CHECK (retry_interval_minutes > 0 AND retry_interval_minutes < 3600));

    GO

    INSERT test_retry VALUES ('00:02:00', 120, 2), (NULL, NULL, NULL);

    DECLARE @failure_date DATETIME = GETDATE();

    SELECT DATEADD(SECOND, DATEDIFF(SECOND, '', retry_interval), @failure_date) datetime_from_time,

    DATEADD(SECOND, retry_interval_seconds, @failure_date) datetime_from_seconds,

    DATEADD(MINUTE, retry_interval_minutes, @failure_date) datetime_from_minutes

    FROM test_retry;

    DROP TABLE test_retry;

    Looking at the SQL above, I definitely prefer retry_interval_seconds, but the system is already written with TIME(0) for most intervals. Is TIME a valid choice for intervals or not a good idea?

  • I recently was asked to concert some Oracle data to a SQL Server database and was directed to change all "quantities" into floats. These quantities could be currency values, but they also could be discount rates, tax rates, etc. I think the reason for doing this is that currency values are often multiplied by fractions, resulting in a value that has more than two decimal places. It was also easier to convert to float than to try to figure out how many decimals were appropriate.

  • dmbreth (2/2/2015)


    I recently was asked to concert some Oracle data to a SQL Server database and was directed to change all "quantities" into floats. These quantities could be currency values, but they also could be discount rates, tax rates, etc. I think the reason for doing this is that currency values are often multiplied by fractions, resulting in a value that has more than two decimal places. It was also easier to convert to float than to try to figure out how many decimals were appropriate.

    Aaarrrrrrrggggggggghhhhhh

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (2/2/2015)


    dmbreth (2/2/2015)


    I recently was asked to concert some Oracle data to a SQL Server database and was directed to change all "quantities" into floats. These quantities could be currency values, but they also could be discount rates, tax rates, etc. I think the reason for doing this is that currency values are often multiplied by fractions, resulting in a value that has more than two decimal places. It was also easier to convert to float than to try to figure out how many decimals were appropriate.

    Aaarrrrrrrggggggggghhhhhh

    Aargh ditto (I prefer your spelling ;-))

    An interesting post for two reasons:

    1) This really will SEEM to work. It will APPEAR to avoid some nasty little problems and INSTEAD allow them into the data where they can fester and multiply. When I was writing FORTRAN (!) one of the things we used to avoid was testing if one floating point number was equal to another. You could never be absolutely sure it would work as you expected.

    2) Code and Database migrations are two of the places when bad things can be done if we are not careful.

    Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]

  • How about using the dbtimestamp data-type as the Identity Column Value and Primary Key for Claim and Policy tables in an insurance database?

    Can you guess what happened when they tried to insert more than two records a second on SQL 2000?

    Still wonder if they had to re-write the database application and middleware when they migrated to SQL 2005.

    Why is Health Care screwed? Insurance is filled with people that know nothing about everything and that goes double for health care.

  • Oh I still see tons of developers who love to argue that that is a best practice to make your primary keys all uniqueidentifier data types.

    It's the same group that believes in the code first approach.

  • jarick 15608 (2/2/2015)


    Oh I still see tons of developers who love to argue that that is a best practice to make your primary keys all uniqueidentifier data types.

    It's the same group that believes in the code first approach.

    There are cases where using a GUID as the primary key is valid. The problem comes if you also make it your clustered index.

  • Viewing 15 posts - 31 through 45 (of 113 total)

    You must be logged in to reply to this topic. Login to reply