Re: [dsdl-discuss] Re: Does Part 7 need built-in definitions for IANA sets?

From: Rick Jelliffe <ricko@allette.com.au>
Date: Sun Nov 14 2004 - 16:08:50 UTC

> I might be biased, since I care about DBCS but not about SBCS. However,
> if IANA charsets are useful only for SBCS, I do not see big advantages
> in using them.

:-) Unfortunately, we need to make a good solution for everyone.
Westerners need to respect Asian requirements, and vice versa, where
they are legitimate.

> After all, SBCS can represent at most 256 characters. A block escape
> "IsBasicLatin" already covers 128 and another block escape
> "IsLatin-1Supplement" also covers 128. By combining these blocks, code
> points, code ranges, the union operator (i.e., posCharGroup) and the
> difference operator (i.e., charClassSub), it is very easy to define
> any SBCS.

Westerners are ignorant about character sets. Charsets are not part of
our usual daily experience, the way they are for many CJK workers.
256 characters does not sound like much to a Japanese user, but I have been
surprised to find that one is considered a great expert in the West
if one can handle even 196 characters.

As I mentioned, the advantage of built-in definitions is convenience and
initial coverage. The fact that the language Schema could do everything was
not enough to make it successful: its lack of standard libraries to
help normal users proved fatal. I am sure you see the analogy.

But I still believe this is useful for CJK work. I recently typeset some
Chinese laws. We needed to quickly check whether they contained any
characters that were in Unicode but not in GB 2312, because the
fonts covered only the GB 2312 repertoire. The built-in transcoders would
have been completely adequate for that: we needed a rough check, fast.
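
To make that kind of rough check concrete, here is a minimal sketch in
Python. It is only an illustration, not what we actually ran: it assumes
that Python's built-in "gb2312" codec is a good enough stand-in for the
GB 2312 repertoire, and the function name chars_outside_charset is made
up for this example.

```python
def chars_outside_charset(text, charset="gb2312"):
    """Return the distinct characters of `text` that the given codec
    cannot encode, in order of first appearance.

    Assumption: the platform's built-in transcoder for `charset` is an
    acceptable approximation of the repertoire the fonts support.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            # The transcoder rejected this character, so it is in
            # Unicode but not in the target charset.
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    sample = "中文 text with 龍"
    print(chars_outside_charset(sample))
```

Any character reported this way would need a fallback font or a
substitution before typesetting. The same function works for any other
charset name the transcoder library knows about.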

Cheers
Rick Jelliffe

Received on Sun Nov 14 17:06:13 2004
