Loading French XML files with HTML tags

  • I am trying to load over 600 XML files into SQL Server 2005.

    Half of these files contain French text.

    I am using OPENROWSET(BULK ...) with either SINGLE_BLOB or SINGLE_CLOB.

    The problem is that with SINGLE_CLOB the French characters and accents get corrupted.

    With SINGLE_BLOB the French characters are fine, but the HTML embedded in the body gets translated from <STRONG> to &lt;STRONG&gt; etc. and does not display correctly on our web page.

    After reading the XML file I use sp_xml_preparedocument and select from OPENXML, but by then the data has already been modified one way or the other by the OPENROWSET load.

    My database collation is SQL_Latin1_General_CP1_CI_AI.

    Any suggestions?
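
    For reference, a minimal sketch of the SINGLE_BLOB load described above. The table name, file path, and element names are hypothetical. SINGLE_BLOB returns the file as varbinary(max), so the XML parser can honor the encoding declared in the file itself, whereas SINGLE_CLOB reads it as varchar(max) through a code page.

    ```sql
    -- Hypothetical staging table; names are for illustration only.
    CREATE TABLE dbo.XmlStaging (
        Id       INT IDENTITY(1,1) PRIMARY KEY,
        FileName NVARCHAR(260) NOT NULL,
        Doc      XML NOT NULL
    );

    -- OPENROWSET(BULK ...) exposes the file contents in a column named BulkColumn.
    -- Casting the binary value to XML lets the parser apply the file's own encoding.
    INSERT INTO dbo.XmlStaging (FileName, Doc)
    SELECT N'C:\import\article_fr.xml', CAST(x.BulkColumn AS XML)
    FROM OPENROWSET(BULK N'C:\import\article_fr.xml', SINGLE_BLOB) AS x;

    -- Hypothetical element path: value() decodes one level of entity escaping,
    -- so HTML that was escaped twice in the file still comes back as &lt;STRONG&gt;.
    SELECT s.Doc.value('(/article/body)[1]', 'NVARCHAR(MAX)') AS Body
    FROM dbo.XmlStaging AS s;
    ```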

  • In a web application this is a two-part problem. In the application layer, you need to use System.Text.Encoding to encode your XML as Unicode. In SQL Server, use two collations: keep your current collation for the database, add a French collation to the column that holds the French data, and rely on collation precedence; that should remove the character conversion you are seeing. Also note that in SQL Server 2005 SP2 and later you can apply a collation in XML DML, as described in the link below.

    http://msdn.microsoft.com/en-us/library/ms179886.aspx

    Kind regards,
    Gift Peddie
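
    A rough sketch of the column-level collation suggested above, assuming a hypothetical table. Note that for nvarchar columns the collation governs comparison and sorting only; for varchar columns it also determines the code page used to store the characters.

    ```sql
    -- Hypothetical table: the database default collation is left unchanged,
    -- and only the French text column gets an explicit collation.
    CREATE TABLE dbo.ArticleText (
        Id     INT IDENTITY(1,1) PRIMARY KEY,
        BodyEn NVARCHAR(MAX) NULL,                      -- database default collation
        BodyFr NVARCHAR(MAX) COLLATE French_CI_AS NULL  -- explicit French collation
    );

    -- Collation precedence: an explicit COLLATE clause in an expression
    -- overrides the collation defined on the column for that comparison.
    SELECT Id
    FROM dbo.ArticleText
    WHERE BodyFr = N'été' COLLATE French_CI_AI;
    ```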

  • OK, I figured this out myself.

    I went with the correct language and did a few REPLACE() statements, loading with SINGLE_BLOB into an XML field.

    I noticed all the HTML formatting had an extra &amp; in front of it. I removed the amp;, leaving the &, and that corrected most of the issues.

    I still had &lt; and &gt;, but those were simple replaces again.

    Thanks for your response.
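
    The replaces described above might look roughly like this. The sample value is made up for illustration, and the separate DECLARE and SET keep it compatible with SQL Server 2005.

    ```sql
    DECLARE @body NVARCHAR(MAX);
    -- Made-up sample of double-escaped HTML as it might appear after the load.
    SET @body = N'&amp;lt;STRONG&amp;gt;Café&amp;lt;/STRONG&amp;gt;';

    -- Step 1: drop the extra escaping level (&amp; becomes &).
    SET @body = REPLACE(@body, N'&amp;', N'&');

    -- Step 2: turn the remaining &lt; and &gt; entities back into markup.
    SET @body = REPLACE(REPLACE(@body, N'&lt;', N'<'), N'&gt;', N'>');

    SELECT @body AS CleanedBody;  -- <STRONG>Café</STRONG>
    ```

    The order matters here: replacing &amp; first exposes the &lt; and &gt; entities that the second step then converts.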

  • bwilliams-1049831 (12/9/2009)


    OK, I figured this out myself.

    I went with the correct language and did a few REPLACE() statements, loading with SINGLE_BLOB into an XML field.

    I noticed all the HTML formatting had an extra &amp; in front of it. I removed the amp;, leaving the &, and that corrected most of the issues.

    I still had &lt; and &gt;, but those were simple replaces again.

    Thanks for your response.

    The result of REPLACE is collation-dependent, and I am glad your issue is resolved.

    Kind regards,
    Gift Peddie
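
    To illustrate the collation point: REPLACE compares using the collation of its input, and an explicit COLLATE clause on the input changes what counts as a match. The sample string below is made up.

    ```sql
    DECLARE @s NVARCHAR(100);
    SET @s = N'&LT;strong&GT; été';

    -- Case- and accent-insensitive collation: the pattern N'&lt;' also
    -- matches the upper-case N'&LT;' in the input.
    SELECT REPLACE(@s COLLATE Latin1_General_CI_AI, N'&lt;', N'<') AS Insensitive;

    -- Binary collation: only exact code-point matches are replaced,
    -- so the upper-case entity is left alone.
    SELECT REPLACE(@s COLLATE Latin1_General_BIN, N'&lt;', N'<') AS ExactMatch;
    ```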
