[dsdl-comment] Conforming DSDL processor. Extending RELAX NG to traverse links. (was Re: Re: Allowing CREPDL from RELAX NG)

From: <rjelliffe@allette.com.au>
Date: Thu Jan 22 2009 - 04:11:13 UTC

> I'm wondering whether it would be possible to unify character repertoire
> constraints and datatype libraries.

I don't see any strong reason to unify them, just a strong need to make
sure that they all can be used together. The point of having parts is to
allow modularity, e.g. so that Schematron can use CREPDL without having to
implement RELAX NG (and so on for every permutation).

So perhaps the issue for RELAX NG is to make sure it can invoke and
combine the different datatype and reference-constraint little languages
satisfactorily, and that the allocation of tasks between RELAX NG and NVDL
is clear.

I agree that a conforming RELAX NG implementation should not require
CREPDL or datatype support. I wonder whether we should be looking at
defining a "conforming single-document streaming DSDL processor" which
requires NVDL, RELAX NG, RCS, CREPDL, DSRL, XSD primitive simple types
(NS-aware DTDs?) but not necessarily Schematron or complex referential
constraints. And also a "conforming multi-document in-memory DSDL
processor" which supports NVDL, Schematron, CREPDL, DTLL, XSD primitive
simple types, DSRL but not necessarily RELAX NG etc.

I think these reflect a real implementation boundary, of validators that
can be readily implemented on a SAX stream and validators that can be
readily implemented by translation to XSLT2.

And, as I have mentioned before, I think the pressing challenge for DSDL
etc is to cope with compound documents: the age of the single big XML file
is pretty much dead now, because of websites, SOA and the XML-in-ZIP
formats. This means that constraints that could previously be checked by
examining a single large document can no longer be validated using the
standard languages.

So I would say that there is an interesting question for RELAX NG: can it
be extended to cope with multiple documents? Can I have a grammar that
says (if you'll pardon the syntax extension):

external-person-reference =
  element person-ref {
      attribute href { xsd:anyURI;
         locates person-ns;
         uses "person-grammar.xsd";
      }
  }

I.e. that the person-ref element has an href attribute which should locate
an element in a document with a person namespace, and this should be
validated using the person-grammar.xsd grammar. I think some mechanism
like this would be intellectually cogent for RELAX NG, because it fits
into the grammar mold (the uses grammar would be treated as another
particle of the grammar, analogous to the way that attributes are treated,
with some kind of circular reference detection), and it would allow many
kinds of linking and reference structures to be coped with.
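
To make the traversal concrete, here is a minimal sketch in Python of the
behaviour described above. It is not RELAX NG and not a proposal for an
implementation. Everything named here is an assumption for illustration:
the function check_links, the namespace URI, the docs dictionary standing
in for a URI resolver, and the validate callback standing in for whatever
the "uses" grammar would do. It also assumes each href locates the root
element of a whole document rather than a fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI standing in for "person-ns".
PERSON_NS = "http://example.org/person"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def check_links(uri, docs, validate, seen=None):
    """Follow person-ref/@href links starting from the document at `uri`.

    docs     -- dict mapping URI to XML text (stands in for a resolver)
    validate -- callback(root_element) -> bool (stands in for the
                grammar named by "uses")
    Returns a list of error strings.
    """
    if seen is None:
        seen = set()
    if uri in seen:
        # Circular reference detection: this document was already visited.
        return []
    seen.add(uri)

    root = ET.fromstring(docs[uri])
    errors = []
    for el in root.iter():
        if local_name(el.tag) != "person-ref":
            continue
        href = el.get("href")
        if href not in docs:
            errors.append(f"{uri}: cannot locate {href}")
            continue
        target = ET.fromstring(docs[href])
        # "locates person-ns": the located element must be in that namespace.
        if not target.tag.startswith("{" + PERSON_NS + "}"):
            errors.append(f"{uri}: {href} is not in the person namespace")
            continue
        # "uses ...": hand the located element to the other grammar.
        if not validate(target):
            errors.append(f"{uri}: {href} fails the person grammar")
            continue
        # The located document may itself contain references.
        errors.extend(check_links(href, docs, validate, seen))
    return errors
```

The seen set is what keeps two person documents that point at each other
from being traversed forever. A real processor would also have to decide
how fragment identifiers and relative URIs are resolved, which this sketch
leaves out.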

NVDL also has this single-document-focus problem, which risks making it
obsolete before it starts: it is no use solving yesterday's problems!

Now obviously I am not saying that single-document focus schemas are
rubbish, or that standards that only allow single documents are rubbish.
But we are at the stage now of making sure that the parts of DSDL
integrate together to solve real problems: I don't think it is good enough
merely to say that "this is a problem for XProc" or whatever and avoid
reality.

In Schematron, I have long wondered what the missing parameter for
<pattern> was: I was convinced the pattern represented a psychologically
real category, but it was clearly missing something: adding the
pattern/@documents attribute I am proposing seems to hit the nail on the
head. If we can come up with something similar for RELAX NG, that fits in
with hedge grammars and the kinds of implementation, it would be a good
value-add for validating modern documents.

Cheers
Rick

Received on Thu Jan 22 05:11:24 2009