Perry-300990

Forum Replies Created

Viewing 12 posts - 46 through 57 (of 57 total)

RE: Script identification

I'm probably stating the obvious, but some short phrases or words are going to be necessarily ambiguous -- that is, you can't really determine a language for such a small...

February 10, 2006 at 11:37 am

#620248
RE: Script identification

> So, yes I do need to be able to tell if the Cyrillic word is Bulgarian, Russian, Romanian and so on.

Ah, wait, this sounds different from (how I read)...

February 10, 2006 at 11:32 am

#620245
RE: Script identification

This may sound silly, but first I think you need to define your meaning of "script".

Is "American English" a script, distinct from the "Italian" script? Native Italian script lacks...

February 9, 2006 at 2:02 pm

#619937
RE: General Questions - Globalization and Unicode

(Post-scriptum caveat: I'm far from an expert. I'm just reporting some simple things of which I myself was not aware some years ago, in case they are of any help...

February 9, 2006 at 1:55 pm

#619934
RE: Collation

I think the default SQL Server 2000 collation will collate CJK characters based on Unicode codepoint, and that is not a stroke order collation.

So you'll want to collate in the...

February 9, 2006 at 1:50 pm

#619931
RE: Distinct languages in a database

Also, as you mentioned Chinese, be aware that the assumption that every Unicode character (codepoint) can be represented as two bytes is an old assumption from a ten-year-old Unicode...
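To make the two-byte point concrete, here is a minimal Python sketch. The choice of U+20000 (a CJK ideograph outside the Basic Multilingual Plane) is my own illustration, not something from the original thread; it only shows that such a character takes four bytes in UTF-16 as well as in UTF-8.

```python
# U+20000 is a CJK Unified Ideograph from Extension B, chosen here only as an
# illustration of a codepoint that lies outside the Basic Multilingual Plane.
ch = "\U00020000"

utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
utf8 = ch.encode("utf-8")

print(f"codepoint: U+{ord(ch):04X}")
print(f"UTF-16: {len(utf16)} bytes ({utf16.hex()})")  # a surrogate pair
print(f"UTF-8:  {len(utf8)} bytes ({utf8.hex()})")
```

Any column or buffer sized on the assumption of two bytes per character would come up short for text containing characters like this one.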

February 9, 2006 at 1:48 pm

#619930
RE: Validating Unicode Data in StoredProcedure

> Character , it gets stored as "Æ½" (Unicode Equivalent )

That looks like the UTF-8 equivalent, not a Unicode codepoint. Unicode is a character set, not an encoding, so there...
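As a rough check of that reading, the sketch below takes the two displayed characters, recovers the bytes they would stand for if the display were Latin-1, and decodes those bytes as UTF-8. The Latin-1 assumption is mine, and the sketch does not claim to know which character was originally being stored.

```python
# Assumption: the text "Æ½" is what two raw bytes look like when they are
# displayed as Latin-1 (Windows-1252 gives the same result for these two).
stored = "Æ½"

raw = stored.encode("latin-1")   # recover the underlying bytes
decoded = raw.decode("utf-8")    # reinterpret those bytes as UTF-8

print(raw.hex())                     # the two bytes
print(len(decoded))                  # how many characters they decode to
print(f"U+{ord(decoded):04X}")       # the single codepoint they represent
```

If the two bytes decode cleanly to one codepoint like this, the stored value is most likely UTF-8 being displayed with a single-byte codepage.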

February 9, 2006 at 1:42 pm

#619925
RE: is collate just the language alphabet?

Another point that I'm not sure was mentioned above is that collation affects what characters are considered equal. This includes case-sensitivity (are "a" and "A" equal?) and accent sensitivity (are...
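A small Python sketch of the idea, under stated assumptions: this is not how SQL Server implements collations, and the helper names `fold` and `equal` are hypothetical. It only imitates the effect of case and accent sensitivity on equality by building a comparison key.

```python
import unicodedata


def fold(text, case_sensitive, accent_sensitive):
    """Build a comparison key that loosely imitates collation sensitivity."""
    if not accent_sensitive:
        # Decompose, then drop combining marks such as the acute accent.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(c for c in decomposed if not unicodedata.combining(c))
    if not case_sensitive:
        text = text.casefold()
    return text


def equal(a, b, case_sensitive=True, accent_sensitive=True):
    """Compare two strings under the chosen sensitivity settings."""
    return fold(a, case_sensitive, accent_sensitive) == fold(
        b, case_sensitive, accent_sensitive
    )


print(equal("a", "A"))                          # case-sensitive comparison
print(equal("a", "A", case_sensitive=False))    # case-insensitive comparison
print(equal("e", "é", accent_sensitive=False))  # accent-insensitive comparison
```

The same pair of strings can compare as equal or unequal depending only on the settings, which is the practical consequence for things like unique indexes and WHERE clauses.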

February 9, 2006 at 1:34 pm

#619921
RE: is collate just the language alphabet?

Let me correct an earlier oversimplification.

Unicode is a set of codepoints, and can be encoded in a variety of ways. So, Unicode does not use 2 bytes per character, but there...
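A minimal Python sketch of the distinction between codepoints and encodings. The sample characters and the helper name `byte_lengths` are my own choices for illustration; the point is only that one codepoint has different byte lengths under different encodings.

```python
def byte_lengths(ch):
    """Return the encoded size of one character in three Unicode encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")  # explicit endianness, no BOM
    return {enc: len(ch.encode(enc)) for enc in encodings}


# Illustrative samples: ASCII, accented Latin, a CJK ideograph, and a
# character outside the Basic Multilingual Plane (U+1D11E).
for ch in ["A", "\u00e9", "\u4e2d", "\U0001D11E"]:
    print(f"U+{ord(ch):04X}", byte_lengths(ch))
```

Only UTF-32 uses a fixed width here; UTF-8 and UTF-16 both vary with the codepoint.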

February 9, 2006 at 1:30 pm

#619917
RE: Data Encoding Problem.. Need help from you guys!

If you write the raw data into a text file, you can use iconv to do a simple conversion.

Check that your iconv supports the proprietary Microsoft Arabic codepage, if you...
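If iconv is not available, the same kind of conversion can be sketched in Python. This assumes the codepage in question is Windows-1256 (Python's `cp1256` codec), which the thread excerpt does not confirm, and that the target is UTF-8; the function name `convert` and the sample word are hypothetical.

```python
def convert(raw, source="cp1256", target="utf-8"):
    """Re-encode raw bytes from one encoding to another.

    Assumption: the source data is Windows-1256; change `source` if not.
    """
    return raw.decode(source).encode(target)


# Illustrative sample: an Arabic word encoded as Windows-1256 bytes.
sample = "سلام".encode("cp1256")
converted = convert(sample)

print(sample.hex())                 # one byte per letter in cp1256
print(converted.decode("utf-8"))    # the same text, now stored as UTF-8
```

A decode error at this step is a useful signal that the data is not actually in the codepage you assumed.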

February 9, 2006 at 1:24 pm

#619916
RE: NULLS not behaving as expected

ANSI_NULLS can be set at the connection level -- perhaps it is not set as you expect in your connection?

For example, executing this in SQL Server 2000 Query Analyzer returns...

February 9, 2006 at 1:16 pm

#619914
RE: DTS Japanese text file to SQL

Microsoft has a technical article on a bug in their product which generates that message. You might check it:

http://support.microsoft.com/kb/q241761/

(or search for that q# -- Microsoft shuffles their links...

February 9, 2006 at 1:13 pm

#619912
