[dsdl-discuss] Re: First draft of Part 7

From: Martin Bryan <martin@is-thought.co.uk>
Date: Fri Dec 19 2003 - 08:13:19 UTC

Gerth

Many thanks for getting a draft of Part 7 out in time for the WG1
meeting. The following comments were made in Philadelphia regarding your
initial draft:

----
Part 7 should be restricted to creating named character sets using block
names, ranges, and properties as defined in the Unicode database, and should
not be concerned with things like sorting, UC/LC mapping, etc. For private
use characters the text should only refer to the code points, not define any
properties of the characters.

We need a mechanism in Part 10 to invoke character set normalization as a
transform prior to validation.

We need a mechanism to associate both datatype and character set names
with elements that are being validated. The role of the RestrictCharSet
element should be fully documented (especially DescendantRestriction) and
should have a form that is matched to that of associating datatypes with
elements.

We need to be able to invoke Part 7 on its own to check that a whole
document, or a part of a document selected by a simple XPath specification,
contains only characters in a named set. (RestrictCharSet does this, but its
use is not fully explained at present.)

A mechanism is to be provided to link Part 3 to Part 7, as an example of
using Query Language Bindings. An example of using bindings to invoke named
character sets is to be defined in Part 3.
----
In practice, only the first comment concerns what you currently have in
Part 7; the other comments refer to how we can invoke it from other parts,
and how it will be applied (the missing part of Part 7).
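
To make the first comment concrete, here is a minimal sketch in Python of a
named set built from code point ranges and Unicode general categories, with a
check that a text contains only characters in that set. The names
NamedCharSet, violations, and "latin-and-digits" are hypothetical and are not
Part 7 syntax. Block names are written as explicit ranges here because
Python's unicodedata module does not expose block names.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedCharSet:
    """Hypothetical named character set (an assumption, not Part 7 syntax).

    Membership is defined only by inclusive code point ranges and by
    Unicode general categories; no sorting or case mapping is involved.
    """
    name: str
    ranges: tuple = ()       # e.g. ((0x0000, 0x007F),)
    categories: tuple = ()   # e.g. ("Nd",)

    def contains(self, ch: str) -> bool:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in self.ranges):
            return True
        return unicodedata.category(ch) in self.categories


def violations(text: str, charset: NamedCharSet) -> list:
    """Return (offset, character) pairs for characters outside the set."""
    return [(i, ch) for i, ch in enumerate(text) if not charset.contains(ch)]


# Illustrative set: the range U+0000..U+007F plus decimal digits of any script.
LATIN_AND_DIGITS = NamedCharSet(
    name="latin-and-digits",
    ranges=((0x0000, 0x007F),),
    categories=("Nd",),
)
```

An empty result from violations(text, LATIN_AND_DIGITS) would mean the text
is restricted to the named set. Applying the same check to the string value
of an XPath-selected node, rather than to the whole document, is left out of
the sketch.
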
The basic criticism was that we do not need to define all the properties
of private characters, only what named set they are considered part of and
their code point. Mechanisms for defining private characters are outside the
remit of SC34 and must be defined by SC2, who are responsible for ISO 10646.
The only reason we should be referencing Unicode properties is to select a
set of characters deemed by Unicode to be related. If we have private
characters with similar characteristics they can be added to a set created
by selecting Unicode characters with that property. Things like the sorting
order and upper/lowercase mapping of private characters should not affect
validation (though I'm not convinced that a case could not be made for
wanting case-independent validation, and if you want to make this case and
explicitly state why it is included in terms of validation requirements then
I will be happy to see it retained).
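
As a rough illustration of adding private characters to a set selected by a
Unicode property, the Python sketch below selects characters by general
category and then adds private use characters by code point alone, without
assigning them any properties. The function make_set and its parameters are
hypothetical names chosen for this example.

```python
import unicodedata


def make_set(category, private_code_points=()):
    """Return a membership test for a hypothetical named set.

    The set holds every character whose Unicode general category equals
    `category`, plus the listed private use code points. The private
    characters are identified by code point only.
    """
    extra = frozenset(private_code_points)
    for cp in extra:
        # Private use code points report general category "Co".
        if unicodedata.category(chr(cp)) != "Co":
            raise ValueError(f"U+{cp:04X} is not a private use code point")

    def contains(ch):
        return ord(ch) in extra or unicodedata.category(ch) == category

    return contains


# Uppercase letters as selected by Unicode, plus one private character.
in_upper_set = make_set("Lu", [0xE000])
```

With this, in_upper_set("A") and in_upper_set("\ue000") are both true, while
in_upper_set("a") and in_upper_set("\ue001") are false.
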
I hope this gives you useful guidance for completing your work. The
next meeting of WG1 will hopefully be one you can make. It will be held in
Amsterdam (I think at RAI) on 17th and 18th April 2004. If you could get a
new draft out a month before that so that we can review it before the
meeting then we will have a good chance of moving the text forward for
formal voting by the National Standards bodies next May.
Martin Bryan
IS-Thought: Thinkers for the Information Society
29 Oldbury Orchard, Churchdown, Glos. GL3 2PU, UK
Phone: +44 1452 714029 Fax: +44 1452 859991
E-mail: martin@is-thought.co.uk
Received on Fri Dec 19 10:09:42 2003
