Overhead of implicit data conversions with Unicode

  • We are working on a large commercial site that will be supporting Unicode in the future. Part of the architecture is to have countries that require Unicode in one Unicode environment and countries that do not in a non-Unicode environment.

    But the issue is maintaining a common code base between the two. The .NET code is a non-issue, since the application will bind to the correct data types for the specific database at build time. The ASP code, however, either needs to be 100% Unicode or we need two code bases, one for each environment.

    This is a high-transaction web environment. Assume the ASP code is Unicode and the variables within it are declared as Unicode types. If that code calls a non-Unicode stored procedure in a SQL Server database, where does the implicit type conversion take place: in the ASP on the web/application server, or in the stored procedure on the database server? We have proven that when Unicode ASP calls a non-Unicode stored procedure there is no impact within the stored procedure itself, because the variables are converted to non-Unicode types at the point of the call.

    But being paranoid in a high-volume system, we want to identify specifically where this conversion takes place and then run tests to measure the overhead and see whether it could cause performance issues.
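    The scenario above can be sketched in T-SQL (table, procedure, and variable names here are illustrative, not from the original post). The key behavior: when an nvarchar argument is passed to a varchar parameter, the conversion happens once at parameter binding on the database server, before the procedure body runs; by contrast, comparing a varchar column directly to an nvarchar variable forces a per-row CONVERT_IMPLICIT in the plan, because nvarchar outranks varchar in SQL Server's data type precedence:

    ```sql
    -- Hypothetical non-Unicode table and procedure
    CREATE TABLE dbo.Customer (Name varchar(50));

    CREATE PROCEDURE dbo.GetCustomer
        @Name varchar(50)          -- non-Unicode parameter
    AS
        -- varchar = varchar: no conversion inside the procedure body
        SELECT Name FROM dbo.Customer WHERE Name = @Name;
    GO

    DECLARE @UnicodeName nvarchar(50);
    SET @UnicodeName = N'Smith';

    -- The nvarchar value is converted to varchar once, at parameter
    -- binding on the database server, before the body executes.
    EXEC dbo.GetCustomer @Name = @UnicodeName;

    -- Contrast: an ad hoc query comparing the varchar column to an
    -- nvarchar variable converts the COLUMN side (nvarchar has higher
    -- data type precedence), which appears as CONVERT_IMPLICIT in the
    -- execution plan and can prevent an index seek on Name.
    SELECT Name FROM dbo.Customer WHERE Name = @UnicodeName;
    ```

    Checking the actual execution plan for a CONVERT_IMPLICIT on the column would confirm which of the two cases applies.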

    Thanks

  • I am not sure where the conversion takes place, but IMHO the architecture may result in big problems in the future.

    1) It's the client-side locale, not the country, that determines the language a client is using;

    2) The same user can potentially use the same or different machines with different locales at different times. Under this design, that user's profiles end up saved in different environments/databases.

    3) Database and site maintenance will be a headache.

    For a globally used website or application, Unicode support in all tiers is the key point.

  • True... and that discussion of 'Unicode everything' was had... but unfortunately that is a decision that needs to be made early, and it can be (and is) difficult to implement after the fact.

    We are dealing with a very large web site where 10% or less of the traffic and data require Unicode, the volume is extremely high, much of the data is character/text, and the databases in the architecture are already over 1 TB. Sometimes the complexity of having separate Unicode and non-Unicode environments is easier to accept than the performance and storage costs of Unicode everywhere... especially when more than 90% of the current customers do not need Unicode.
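    The storage cost is easy to demonstrate: nvarchar stores two bytes per character (UCS-2), double what varchar uses for the same single-byte-codepage text, which is what drives the concern on a 1 TB, mostly-character database:

    ```sql
    -- Same eleven-character string, varchar vs nvarchar storage
    SELECT DATALENGTH('performance')  AS varchar_bytes,   -- 11 bytes
           DATALENGTH(N'performance') AS nvarchar_bytes;  -- 22 bytes
    ```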

    My belief is that the implicit conversion happens within the database, at the point the variables 'enter' the stored procedure. But we are talking about an environment where there could be 5,000 ASP calls a second, so the accumulation of this overhead might be measurable... even on larger servers.
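    One way to put a number on that accumulation is a brute-force timing sketch: call the same non-Unicode procedure in a tight loop, once with a matching varchar variable and once with an nvarchar variable, and compare elapsed times. This assumes a procedure like dbo.GetCustomer with a varchar(50) @Name parameter (names hypothetical); the loop count and the procedure are stand-ins for the real workload:

    ```sql
    DECLARE @i int, @v varchar(50), @n nvarchar(50), @t datetime;
    SET @v = 'Smith';
    SET @n = N'Smith';

    -- Baseline: parameter type matches, no conversion at binding
    SET @t = GETDATE();
    SET @i = 0;
    WHILE @i < 100000
    BEGIN
        EXEC dbo.GetCustomer @Name = @v;
        SET @i = @i + 1;
    END;
    PRINT 'varchar:  ' + CONVERT(varchar(20), DATEDIFF(ms, @t, GETDATE())) + ' ms';

    -- Mismatch: nvarchar converted to varchar at each parameter binding
    SET @t = GETDATE();
    SET @i = 0;
    WHILE @i < 100000
    BEGIN
        EXEC dbo.GetCustomer @Name = @n;
        SET @i = @i + 1;
    END;
    PRINT 'nvarchar: ' + CONVERT(varchar(20), DATEDIFF(ms, @t, GETDATE())) + ' ms';
    ```

    The difference between the two PRINTed times, divided by the iteration count, approximates the per-call conversion cost, which can then be scaled against the 5,000-calls-per-second load.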

  • Why not determine, when the ASP page loads, which type (Unicode or other) it uses, and make all conversions client side? Think scalability: what if instead of 5,000 calls a second you scale to 20,000?

  • We do 5,000 database requests per second. With the new .NET code added, we handle 8,000 database requests per second during the heaviest loads.

    We're fairly scalable. The question is about the trade-off between diverging a code base (the ASP code in particular) to support this architecture, and taking the performance hit in one of the environments (the higher-volume one) to keep a common code base regardless of which database server an environment uses... one 100% Unicode and one 99% non-Unicode. I'm just trying to do some due diligence on the actual performance impact to support a decision one way or the other.
