[dsdl-discuss] Re: Rationale for Rick's draft Part 7 Character Repertoire Validation

From: Eric van der Vlist <vdv@dyomedea.com>
Date: Thu Apr 08 2004 - 10:01:52 UTC

On Thu, 2004-04-08 at 11:26, Rick Jelliffe wrote:
> Eric van der Vlist wrote:
>
> >Hi Rick,
> >
> >On Wed, 2004-04-07 at 10:56, Rick Jelliffe wrote:
> >
> >
> >>Please find attached a complete XML draft of Part 7 Character Repertoire
> >>Validation.
> >>
> >>It is a kind of Schematron, except that assertion tests expect a
> >>"Character Class" as used in XML Schemas, Perl, Java, etc.
> >>
> >>
> >
> >Very clever, this sounds like a neat idea.
> >
> >I am confused about the way to apply this to mixed content models,
> >though.
> >
> >In that case (mixed content models), how would you read "The assertion
> >test is interpreted according to production 13 of XML Schemas Datatypes"
> >knowing that XML Schemas Datatypes only applies to attribute and simple
> >contents?
> >
> In section 4:
>
> "Text nodes which are children of the nodes that match the contexts
> shall conform to the repertoire."

I guess I missed that piece!

> For example, given the document,
> <x>aaaa<y>bbbb</y>c</x>
> and the schema
> <rule context="x"><assert test="ac"/><assert test="^b"/></rule>
> <rule context="y"><assert test="b"/></rule>
> <rule context="*[ancestor-or-self::x]"><assert test="abc"/></rule>
> then the document is valid against that schema.
>
> In the first assertion, the subject node is x,
> its children text nodes are "aaaa" which tests OK and "c" which
> tests OK.
> In the second assertion, the subject node is y, which tests OK.
> In the third assertion, the subject nodes are x and y; all the text
> nodes of each of them test OK.
>
> I will put a note in to explicate this.
>
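
The walk-through above can be checked mechanically. What follows is only a minimal sketch, not the draft's algorithm: it assumes each test is the body of a character class that Python's re module can evaluate as "[" + test + "]" (so no XML Schema class subtraction or \p{...} blocks), it replaces the XPath contexts with plain Python selector functions because ElementTree cannot evaluate ancestor-or-self, and it applies every rule independently, as the explanation does. The names child_text_nodes, conforms, validate and RULES are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def child_text_nodes(elem):
    """Return the non-empty text nodes that are direct children of elem."""
    texts = []
    if elem.text:
        texts.append(elem.text)
    for child in elem:
        # ElementTree stores text that follows a child element as its "tail".
        if child.tail:
            texts.append(child.tail)
    return texts


def conforms(text, char_group):
    """True if every character of text belongs to the class [char_group]."""
    return re.fullmatch("[" + char_group + "]*", text) is not None


# Each rule pairs a selector (standing in for the XPath context) with its tests.
RULES = [
    (lambda root: list(root.iter("x")), ["ac", "^b"]),  # context="x"
    (lambda root: list(root.iter("y")), ["b"]),         # context="y"
    # context="*[ancestor-or-self::x]": every x plus all of its descendants
    (lambda root: [e for x in root.iter("x") for e in x.iter()], ["abc"]),
]


def validate(xml_text, rules):
    """Return a list of (element name, test, offending text) failures."""
    root = ET.fromstring(xml_text)
    failures = []
    for select, tests in rules:
        for elem in select(root):
            for text in child_text_nodes(elem):
                for test in tests:
                    if not conforms(text, test):
                        failures.append((elem.tag, test, text))
    return failures


if __name__ == "__main__":
    print(validate("<x>aaaa<y>bbbb</y>c</x>", RULES))
```

An empty list means the document is valid against the rules. Putting a "b" directly inside x, as in <x>aab<y>bbbb</y>c</x>, makes both assertions of the first rule fail while the other two rules still pass.
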
>
> >The only instance of the word "mixed" in your proposal is in the use
> >case "ensuring that a Dutch document contains characters only used in
> >typical Dutch documents; the constraint applies to mixed content and
> >element content;" and I think that the spec should clearly show how that
> >can be done.
> >
> >
> I put in the use-cases and examples specifically to show that mixed
> content was intended to be coped with.
>
> >In fact, I think that it's the semantics of test="pattern" when the
> >context node isn't a simple type element that should be clarified.
> >
> Yes. When a node has text node children (element, attribute, PI,
> comment) then those text nodes are tested. Else if the node is
> convertible to a string (in particular, for names of elements and
> attributes, PI targets) the string is tested. Anything else (i.e.
> documents?) would be an error, though it is a matter of negotiation.
>
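
The three cases in that answer can be read as a small dispatch. This is a sketch under stated assumptions only: the Node class and the strings_to_test function are hypothetical and not part of the draft, and the error case for document nodes is, as the message says, still a matter of negotiation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a node selected by a rule context."""
    kind: str  # "element", "attribute", "pi", "comment" or "document"
    text_children: list = field(default_factory=list)


# Node kinds that the message lists as having text node children.
TEXT_BEARING = {"element", "attribute", "pi", "comment"}


def strings_to_test(subject):
    """Return the strings a repertoire assertion would be checked against."""
    if isinstance(subject, Node) and subject.kind in TEXT_BEARING:
        # Case 1: test each child text node separately.
        return list(subject.text_children)
    if isinstance(subject, str):
        # Case 2: a value that is already a string (an element or
        # attribute name, a PI target) is tested as a whole.
        return [subject]
    # Case 3: anything else, for example a document node, is an error.
    raise TypeError("cannot apply a repertoire test to this subject")
```

For the element x of the earlier example, strings_to_test(Node("element", ["aaaa", "c"])) yields the two text nodes, while strings_to_test("x") yields the name itself.
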
> >
> >The other point is to know if we're happy to rely on XPath 1.0 (which
> >among other things isn't streamable) to qualify the binding between
> >nodes and character repertoires.
> >
> >
> We rely on it formally, for the spec, but you see in part 4 I also
> reserve the query language binding name "stx-charrep". The idea is that
> implementations could support a streamable XPath with this; if there
> has been a streamable XPath defined by someone that we can refer to in
> an ISO spec, we could also formally define it too. But since it seems
> there is no suitable streaming version of XPath (I don't think the XML
> Schema's subset is really useful) I think it is best to allow nature to
> take its course: provide a mechanism (@language) and reserve and
> promote the appropriate term, but don't formally define it.
>
> In other words, I just want to provide the bare minimum that
> implementers need, without restricting them from innovating. For
> example, there is no requirement that a conforming implementation even
> support "xslt-charrep": that is the starting point. Let's not
> prematurely standardize: let's allow implementers and users to work out
> what is convenient.

Yes, that makes sense.

Thanks,

Eric

Received on Thu Apr 8 12:01:53 2004
