varchar(2000) or varchar(200)

  • If not knowing what the max size of a varchar field, should we use the size as large as possible (like varchar(2000)) or as accurate as possible (like varchar(200) if we guess this will be the max which maybe wrong in the future)?

    Is there any performance effect if using larger size?

  • If you don't know what the maximum size is, how would you know that varchar(2000) is large enough?

     

     

     

  • Disclaimer: I'm 95% certain that the following is correct, but there could be aspects of the issue I haven't hit yet.

    Assuming a situation where most of your text strings would be about 100 characters, with some around 150, and a very few getting close to (but not past) 200 characters, having a size of varchar(2000) should not have an impact on performance.

    A varchar variable of len X will occupy X + 2 bytes storage space (disk or memory--not entirely sure how NULLs are dealt with outside of table storage), where the extra two bytes indicate how many (other) characters are in the variable. Thus, a 100-byte string will occupy 102 bytes of storage (disk or memory). The fact that it could be larger is stored somewhere in metadata and should have minimal, if any, impact on the actual usage of the data.

    This would apply to varbinary and nvarchar datatypes as well.

    To mention, if I can't nail down precisely how long a text string might be (personal names, street names, etc.), and if it doesn't make sense to make it a text or varchar(max), I research what I can, make an estimate on average length, likely long length, and extremely long lengths. I then round this up to the nearest "reasonable" number, which for me means 10, 25, 50, 100, 200, 500, or 1000.

       Philip

     

  • The first thing that jumps to mind is that if your total row length is greater than 8K (eg if you have 5 VARCHAR(2000) fields = 10,000 bytes), you may have issues. In SQL Server 2000, this would create the table with a warning, but generate an error only when your total surpasses 8060 bytes.

    In SQL Server 2005 you won't get an error, but you'll start getting "row overflow", which will have a small performance hit. See article http://qa.sqlservercentral.com/columnists/awarren/2862.asp for more details, or look up "row-overflow data" in the index on Books Online 2005.

    HTH,

    Jim

  • In such cases when I am not sure about future growth of records length, I use varchar(max). It may have performance issues since SQL Server has to create another allocation unit for them, but by my opinion it is not as bad as to fix field length every time it hits the ceiling.

     

  • Don't forget that a lot of front end string variables have a length of 32K. So whatever is in the database as a varchar(max) more then 32K will be corrupted the next time the UI is used.

    Regards,Yelena Varsha

  • Yelena:

    I don't quite understand your point. So you are saying that if I insert a string with length of for example 35000 characters into varchar(max) data type column it will corrupt ? In this case (max) is not really max but rather 32000 ?

    I made some test where I generate a 40000 characters long string, insert it into a table and able to query that table:

    create table test_max

    (col1 varchar(max))

    go

    declare

    @var1 varchar(max),

    @STR varchar(max),

    @cnt int

    set @var1 = 'this is my very long string ............ '

    set @STR = ''

    set @cnt = 1

    while @cnt <1000

    begin

        set @STR = @STR + @var1

        set @cnt = @cnt + 1

    end

    select len(@str)

    insert into test_max

    values(@str)

    select * from test_max

    select len(col1)

    from test_max

     

     

  • It is not what I mean.

    You have 35000 characters as a value. You application reads this value into say, Visual Basic variable of the type String. This now can hold 2 billion characters, you will be OK. But if you have older VB or VB-Like languages it would be much less. For example one of the industry leader CRMs have String databtype for the applications much shorter: "Strings are used to hold sequences of characters, each character having a value between 0 and 255. Strings can be any length up to a maximum length of 32767 characters."

    Then you application reads these 32767 characters from the database, posts it to the textbox, the user does not update the textbox but saves the screen because he updated something else. Guess what: only 32767 characters are written back to the database. 

    In this case you would ask how these data can be more then 32767 for the first place, but it could be due to the back end data loads.

    This is just an example, but something I have to deal with once in a while.

    Regards,Yelena Varsha

  • There will be a performance hit if field length is not choosen properly. Because deletes, inserts and updates cause lot of page breaks and data/index pages will be scattered. The database engine has to do more work to retrieve data by switching extents (Scan Density [Best Count:Actual Count]).

     

     

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply