Martin,
> > Note: W3C assumes that everything is normalized in advance.
> > http://www.w3.org/TR/charmod-norm/
>
> I would suggest that Part 7 should require normalization using the standard
> rules in charmod-norm
charmod-norm was apparently written by smart people, but I am not sure
that it is practical. See C302 in Section 3.4 of charmod-norm, which is
quoted below:
A text-processing component that receives suspect text SHOULD NOT
perform any normalization-sensitive operations unless it has first
confirmed through inspection that the text is in normalized form, and
MUST NOT normalize the suspect text. Private agreements MAY, however,
be created within private systems which are not subject to these rules,
but any externally observable results SHOULD be the same as if the
rules had been obeyed.
This means two things.
1) Part 7 MUST NOT normalize the input.
2) Part 7 SHOULD confirm that the input is in Normalization Form C.
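
To make the two consequences concrete, here is a minimal sketch in Python of a check-only step: it inspects the input for NFC and rejects it otherwise, but never normalizes it. The function names is_nfc and require_nfc are hypothetical and are not taken from charmod-norm or from any Part 7 draft; unicodedata.is_normalized requires Python 3.8 or later.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C.

    This only inspects the text; it never rewrites it.
    """
    return unicodedata.is_normalized("NFC", text)


def require_nfc(text: str) -> str:
    """Hypothetical gate: pass NFC text through unchanged, reject anything else."""
    if not is_nfc(text):
        # Per the reading of C302 above, we report the problem
        # instead of silently normalizing the suspect text.
        raise ValueError("input is not in Normalization Form C")
    return text
```

For example, is_nfc("\u00e9") holds for the precomposed character, while "e" followed by U+0301 (COMBINING ACUTE ACCENT) is rejected by require_nfc.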
Is this practical? How do you feel?
> Rick
> > Note: Unicode Technical Standard #18 ("Unicode Regular Expressions")
> > introduces the second mode, but XML Schema Part 2 does not.
>
> It would seem natural to introduce Unicode Regular Expressions into the DSDL
> model if we are using normalization and graphemes. How much extension to the
> regular expression patterns defined for use with XSLT 2 would this require?
I agree that DSDL Part 7 should borrow some mechanisms from Unicode Regular
Expressions, although I am opposed to allowing the full set of Unicode Regular
Expressions.
Unicode Technical Standard #18 defines a notation for representing graphemes.
The notation is "\q{" CODE_POINT + "}". For each grapheme, we have to explicitly
specify the code point sequence. This is very simple and should not be difficult
to implement.
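
As a rough illustration of why this should not be difficult to implement, the sketch below rewrites each \q{...} into a non-capturing group so that an ordinary regular expression engine treats the code point sequence as one unit. It is written in Python, whose re module has no \q syntax of its own; the helper name expand_q is hypothetical, and the sketch only handles \q{...} outside of character classes.

```python
import re

# Matches the notation \q{ CODE_POINT+ } and captures the code point sequence.
_Q_NOTATION = re.compile(r"\\q\{([^}]+)\}")


def expand_q(pattern: str) -> str:
    """Hypothetical helper: turn each \\q{...} into a non-capturing group.

    The sequence is escaped so that it is matched literally, and grouping
    makes a following quantifier apply to the whole grapheme.
    """
    return _Q_NOTATION.sub(
        lambda m: "(?:" + re.escape(m.group(1)) + ")", pattern
    )


# One or more occurrences of the grapheme "e" + U+0301 (COMBINING ACUTE ACCENT).
grapheme_run = re.compile(expand_q("\\q{e\u0301}+"))
```

With this pattern, grapheme_run.fullmatch accepts a string made of repeated "e" + U+0301 pairs and rejects a bare "e", because the quantifier applies to the whole sequence rather than to the combining mark alone.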
Cheers,
--
MURATA Makoto <murata@hokkaido.email.ne.jp>

--
DSDL members discussion list
To unsubscribe, please send a message with the command "unsubscribe" to dsdl-discuss-request@dsdl.org
Received on Mon Oct 11 07:34:38 2004