[dsdl-discuss] Re: Draft of Part 7 Character Repertoire Validation

From: Rick Jelliffe <ricko@allette.com.au>
Date: Tue Apr 13 2004 - 08:04:50 UTC

MURATA Makoto wrote:
>>So I don't believe that the Schematron-based approach is any less
>>powerful.
>
>
> I think that your approach is very powerful.

I think it was your suggestion, or James'.

>>Oh, my wording needs to be better. My proposal also allows those.
>>Indeed, the first example
>>tries to show that this is the intent:
>
>
> How about local names, PI targets, CDATA sections, and comments? (I'm
> just curious.)

The more recent draft clears this up: I go through each node type in
XPath and say yes or no. So this allows names, targets and comments,
but not special treatment of CDATA sections.

>>>Third, in the future, I would like to extend RELAX NG so that its <text>
>>>and <mixed> can refer to descriptions of character repertoire constraints,
>>>which are described in Part 7. Are such applications in the scope of your draft?
>>>
>>>
>>
>>You tell me!
>
>
> I believe that such applications should be in the scope of Part 7. It should
> also be possible to reference character repertoire constraints from an extension
> of DTDs.
>
> I have one concern. When we extend RNG by referencing character repertoire constraints,
> I do not think implementors should be forced to implement the full set of Schematron.
> Which part of your current draft can be separated from Schematron? In other words,
> it should be possible to borrow the below mechanisms without implementing Schematron:

> - a mechanism for describing constraints such as "\p{IsBasicLatin}\p{IsLatin-1Supplement}
> &#x132;&#x133;\p{IsGeneralPunctuation}\p{IsCurrencySymbols}",
> - a mechanism for naming such constraints and referencing them from
> any part of DSDL (and other schema languages), and
> - some INCLUDE mechanism for describing a huge list of characters.

I have been thinking of putting in some text to allow <sch:assert>
statements in other vocabularies, such as RELAX NG. Then the rules
for supplying the strings to be tested would
be provided by the host language. E.g. imagine a document with

<sch:assert xmlns:sch="http://www.ascc.net/xml/schematron"
        id="dutch" test=
        "\p{IsBasicLatin}\p{IsLatin-1Supplement}
        &#x132;&#x133;\p{IsGeneralPunctuation}\p{IsCurrencySymbols}">
        The text should only contain typical Dutch characters </sch:assert>

Then this could be included in Schematron schemas by an
<sch:include> at the appropriate spot.

And <relaxng:include> could also do the same thing.

The ID could be used for more specific linking to elements
within a schema, rather than requiring bits to be split out.
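
To make the "dutch" assertion above concrete, here is a rough Python
sketch of the check it describes. It assumes the intent is that every
character of the tested string belongs to one of the listed blocks. The
function and table names are made up for illustration and are not part
of any draft; the ranges are the Unicode block boundaries for the named
blocks.

```python
# Hypothetical sketch: test a string against the "dutch" repertoire.
# Names are illustrative only; ranges are the Unicode block boundaries.
DUTCH_REPERTOIRE = [
    (0x0000, 0x007F),  # \p{IsBasicLatin}
    (0x0080, 0x00FF),  # \p{IsLatin-1Supplement}
    (0x0132, 0x0133),  # &#x132; and &#x133; (IJ and ij ligatures)
    (0x2000, 0x206F),  # \p{IsGeneralPunctuation}
    (0x20A0, 0x20CF),  # \p{IsCurrencySymbols}
]


def outside_repertoire(text, repertoire=DUTCH_REPERTOIRE):
    """Return the characters of text that fall outside every allowed range."""
    return [ch for ch in text
            if not any(lo <= ord(ch) <= hi for lo, hi in repertoire)]


def in_repertoire(text, repertoire=DUTCH_REPERTOIRE):
    """True when every character of text is inside the repertoire."""
    return not outside_repertoire(text, repertoire)
```

A host language would only need to decide which strings to pass in;
the repertoire table itself does not depend on Schematron.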

>>One way for RELAX NG would be that the first type is handled by
>>a non-recursive datatyping mechanism, applying to all immediate content
>>but not grandchild content. Then the second type would be handled
>>by a recursive test, declared on the grammar itself: a global constraint
>>on all (element?) content.
>
>
> I agree with the general direction. But I am inclined to introduce
> an inherited attribute (just like "ns") for referencing character
> repertoire constraints. This allows me to easily create a schema
> for multi-lingual documents.

That seems a little like a hack, in that I don't think it is very
general. For a start, the language of a document should usually be
unrelated to the usual schema of a document: whether a document
must be Japanese should be a parallel constraint in a parallel schema
rather than requiring a customized "derived" schema.

On the other hand, it is possible that

Why not just use Part 7? It is very easy to have an ns-like
attribute in the instance which constrains its children to
a certain repertoire.

<sch:pattern>
   <sch:rule context="*[ancestor-or-self::*[@magic-lang-att='ja_jp']]">
        <sch:include
        href="http://www.eg.co.jp/japanese-repertoire.sch"/>
   </sch:rule>
</sch:pattern>

That rule gives a branch-scoped constraint to all descendants: subsequent
constraints would be tested in parallel.

If you want the attribute to reset to the most recent value:

<sch:pattern>
   <sch:rule context=
        "*[ancestor-or-self::*[@magic-lang-att][1]
        /@magic-lang-att='ja_jp']">
        <sch:include
        href="http://www.eg.co.jp/japanese-repertoire.sch"/>
   </sch:rule>
</sch:pattern>
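
The difference between the two scoping behaviours (any enclosing
element says 'ja_jp', versus the nearest enclosing element that carries
the attribute says 'ja_jp') can be sketched in Python. This is only an
illustration of the intended semantics, not an XPath evaluator; the
sample instance, the 'en_us' value and the helper names are invented
for the example.

```python
import xml.etree.ElementTree as ET

ATTR = "magic-lang-att"  # attribute name taken from the rules above


def parent_map(root):
    """Map each element to its parent (ElementTree has no parent pointers)."""
    return {child: parent for parent in root.iter() for child in parent}


def ancestors_or_self(elem, parents):
    """Yield elem, then its parent, and so on up to the root."""
    while elem is not None:
        yield elem
        elem = parents.get(elem)


def any_scope(elem, parents, value="ja_jp"):
    """Branch-scoped: some ancestor-or-self carries the value."""
    return any(e.get(ATTR) == value for e in ancestors_or_self(elem, parents))


def nearest_scope(elem, parents, value="ja_jp"):
    """Resetting: the nearest ancestor-or-self with the attribute carries the value."""
    for e in ancestors_or_self(elem, parents):
        if e.get(ATTR) is not None:
            return e.get(ATTR) == value
    return False


# Invented sample instance: a Japanese document containing an English quote.
DOC = """<doc magic-lang-att="ja_jp">
  <p>text</p>
  <quote magic-lang-att="en_us"><p2>text</p2></quote>
</doc>"""

root = ET.fromstring(DOC)
parents = parent_map(root)
for e in root.iter():
    print(e.tag, any_scope(e, parents), nearest_scope(e, parents))
```

Under the first behaviour the Japanese repertoire still applies inside
the quote; under the second, the quote and its content are released
from it because the nearer attribute value wins.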

Cheers
Rick

Received on Tue Apr 13 10:05:20 2004
