[dsdl-discuss] Re: A revised draft for Part 7

From: Rick Jelliffe <rjelliffe@allette.com.au>
Date: Tue Nov 20 2007 - 02:30:33 UTC

On Sat, 2007-11-17 at 09:55 +0100, Keld Jørn Simonsen wrote:
 
> I think we should be general and be able to describe all aspects of an
> encoding, including the whole of C0, but only on a character level.
> Conceptually this does not change anything.

Again, I strongly urge that there should be no attempt to handle or
support in any way control characters (C0 and C1) except for the current
white-space related characters. Neither as code points in Unicode nor
with their control semantics.

This is because they belong to a completely different level from the one
that markup languages operate on.

X-ON, X-OFF, print-head backspace, BEL, and so on are irrelevant, if not
outright bogus, in document-related standards. NUL is the worst, because
it collides with the end-of-string marker used by standard C libraries.
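
To illustrate the NUL problem, here is a minimal sketch in Python, using
the standard ctypes module to mimic how a C library reads a string. The
sample bytes and the name buf are made up for illustration.

```python
import ctypes

# A buffer holding a NUL in the middle of the data (illustrative bytes only).
buf = ctypes.create_string_buffer(b"title\x00hidden")

# .value reads the buffer the way C string functions do: it stops at the first NUL.
as_c_string = buf.value

# .raw returns every byte in the buffer, including the part after the NUL.
all_bytes = buf.raw

print(as_c_string)
print(all_bytes)
```

Anything after the NUL is still in memory but invisible to code that
treats the buffer as a C string.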

Furthermore, allowing control characters uses up the redundancy provided
by unused code points, which is useful for detecting a wrong encoding.
XML 1.1 adopted this approach: although it allows numeric character
references (NCRs) to the C1 range, it does not allow C1 characters to
appear directly.
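
As a rough sketch of why the unused C1 range helps detection, assume
some text was written in Windows-1252 but labelled ISO-8859-1. The
curly quotes then decode to C1 code points, which a processor that
forbids C1 can flag. The helper name c1_controls and the sample text
are hypothetical.

```python
def c1_controls(text):
    """Return the C1 control characters (U+0080..U+009F) found in text."""
    return [ch for ch in text if 0x80 <= ord(ch) <= 0x9F]

# Curly quotes encoded as Windows-1252 become the bytes 0x93 and 0x94.
raw = "\u201cquoted\u201d".encode("cp1252")

# Decoded with the wrong label, those bytes land in the C1 range.
wrong = raw.decode("iso-8859-1")

# Decoded with the right label, no C1 characters appear.
right = raw.decode("cp1252")

suspicious = c1_controls(wrong)
clean = c1_controls(right)
```

If C1 characters were legal content, the wrong decoding would pass
without any sign of trouble.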

Furthermore, using the C0 (and C1) characters goes against the W3C I18n
WG recommendation on characters suitable for markup, which has also been
adopted by the Unicode Consortium, and which largely enshrines XML's take
on control characters.

We do not need to describe a whole encoding. We only need to describe
the graphical characters, because non-graphical characters (e.g. control
and function characters) have no place in markup-based documents: we use
markup for that. Control characters, where they have some function,
would be added and stripped by some underlying protocol and would not
appear to the XML processor.
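
A minimal sketch of such a stripping layer, assuming it simply drops
anything outside the XML 1.0 Char production (tab, line feed, carriage
return, and the non-control ranges). The function name
strip_non_xml_chars is hypothetical, and a real protocol layer would
likely act on the control codes rather than discard them silently.

```python
import re

# Anything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_non_xml_chars(text):
    """Remove characters that an XML 1.0 processor would reject."""
    return _NOT_XML_CHAR.sub("", text)

# BEL (0x07), NUL (0x00), X-ON (0x11) and X-OFF (0x13) are dropped;
# tab and line feed are kept as white space.
cleaned = strip_non_xml_chars("a\x07b\x00c\x11\x13\td\n")
```

Note that XML 1.0 itself still admits C1 characters directly, so this
particular filter only removes the C0 controls other than white space,
plus U+FFFE and U+FFFF.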

Cheers
Rick Jelliffe

Received on Tue Nov 20 03:25:24 2007