[dsdl-discuss] Re: Namespace Routing Language

From: Rick Jelliffe <ricko@topologi.com>
Date: Wed Jun 11 2003 - 05:37:55 UTC

From: "James Clark" <jjc@jclark.com>

> - Inline schemas. ... Typically, schema
> implementations not designed specifically for use with this type of
> language cannot start reading the schema from a subelement; they expect
> the schema to be a complete XML document. You can serialize the inline
> schema, but then the line numbers bear no relation to the real line numbers.

In my editor, you can select a range of text and attempt to validate it.
We renumber the error messages as appropriate (factoring in that
we may also have inserted a DOCTYPE declaration, and we do something
with namespaces too, IIRC).

To generalize this to work with VCSL and third-party file:line:column:message tools,
you would need to store, for each validation subject, a mapping to its start
location in the original file (and the start and end locations of any sections
stripped out), and then map error locations back.
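
A minimal sketch of that location mapping, in Python. It assumes the tool
emits one file:line:column:message per line, that any inserted prologue
(such as a DOCTYPE) occupies whole lines, and that stripped sections are
whole lines too. The names Fragment and remap are hypothetical, not part of
any existing tool.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Fragment:
    """Where a validation subject came from (hypothetical record)."""
    source_file: str     # original file the fragment was taken from
    start_line: int      # 1-based line in source_file where the fragment begins
    start_column: int    # 1-based column of the fragment's first character
    prologue_lines: int  # whole lines inserted before the fragment (e.g. DOCTYPE)
    # (first, last) original line numbers of sections stripped out, ascending
    stripped: list[tuple[int, int]] = field(default_factory=list)

MESSAGE = re.compile(r"^(?P<file>.*?):(?P<line>\d+):(?P<col>\d+):(?P<msg>.*)$")

def remap(message: str, fragment: Fragment) -> str:
    """Rewrite a file:line:column:message so it points into the original file."""
    m = MESSAGE.match(message)
    if m is None:
        return message  # not in the assumed format; pass through unchanged
    line = int(m["line"]) - fragment.prologue_lines
    col = int(m["col"])
    if line < 1:
        # The error is inside the inserted prologue: report the fragment start.
        line, col = 1, 1
    if line == 1:
        # Only the first fragment line is shifted horizontally.
        col += fragment.start_column - 1
    line += fragment.start_line - 1
    # Re-add the lines that were stripped before this position.
    for first, last in fragment.stripped:
        if first <= line:
            line += last - first + 1
    return f"{fragment.source_file}:{line}:{col}:{m['msg']}"
```

For example, with a fragment selected from line 40, column 5 of doc.xml and
one inserted DOCTYPE line, an error reported at line 2, column 3 of the
temporary file maps back to doc.xml:40:7.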

We do the simple renumbering for general XML, but we don't do
it for detecting schema errors in embedded schemas (i.e. Schematron
in RELAX NG and WXS). (The real reason that it is not needed is
that Schematron elements are pretty simple, they often have name
attributes, and in any case errors tend to occur in the XPaths, not the
elements, and XPath validation is a whole different area.)

So I think the same kind of mechanism required to support location
mapping of errors in validation candidates by third-party validating tools
might also handle remapping embedded schemas.

Of course, embedded schemas and validation candidates are not exactly
the same, in particular because the embedded schemas may have to be
extracted and combined in a different order and with various top-level
glue elements (for example, in the embedded Schematrons
we extract the namespace declarations from the schema and make
<ns> elements for them, and we only allow embedded pattern elements
so we have to generate the top-level element automatically).
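
The extraction step described above could be sketched roughly as follows,
using Python's standard xml.etree.ElementTree. This is an illustration under
assumptions, not the actual implementation: the function name
extract_schematron is hypothetical, the Schematron 1.5 namespace URI is
assumed, and skipping the default namespace and the Schematron namespace
itself when generating <ns> elements is a choice made for the sketch.

```python
import xml.etree.ElementTree as ET

SCH = "http://www.ascc.net/xml/schematron"  # assumed: Schematron 1.5 namespace

def extract_schematron(source):
    """Collect embedded <pattern> elements into a generated <schema> element.

    `source` is a file name or binary file object holding the host schema
    (for example a RELAX NG grammar with embedded Schematron patterns).
    """
    namespaces = {}  # prefix -> URI, for every prefixed declaration seen
    patterns = []    # embedded pattern elements, in document order
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload
            # Skip the default namespace and Schematron's own namespace.
            if prefix and uri != SCH:
                namespaces.setdefault(prefix, uri)
        elif payload.tag == f"{{{SCH}}}pattern":
            patterns.append(payload)

    # Generate the top-level element and the <ns> glue elements.
    schema = ET.Element(f"{{{SCH}}}schema")
    for prefix, uri in namespaces.items():
        ET.SubElement(schema, f"{{{SCH}}}ns", {"prefix": prefix, "uri": uri})
    schema.extend(patterns)
    return schema
```

Because the patterns are regrouped under a generated parent, their positions
in the resulting schema no longer correspond to positions in the host
document, which is why location remapping is harder here than for a single
contiguous fragment.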

So this is a potential use-case for validation that we cannot currently
handle: when one notional document has been fragmented throughout
another document and possibly reordered. I suggest that this can be
best solved just by allowing some more general transformation layer,
in particular XSLT, prior to a VCSL. We don't need to complicate
VCSL for it. If we allowed embedded, fragmented schemas,
then DSDL would need an XSLT stage to be able to validate itself.

> - Schema inclusion. ... Do we give each DSDL
> language an include mechanism and then force the user combine the
> modules separately for each language? This seems very clumsy.

For Schematron schemas, as far as I can tell there are two main constituencies:
  - the traditional SGML industrial publishers, who work with SGML and validate
   with SGML DTDs, but normalize to XML for validation and output processing
  - the XML web-based sites who often embed Schematron.

In both cases, Schematron is used as an adjunct. This suggests to me that people
will soon start to write Schematron as part of a VCSL system: so they will adjust
their use of phases or multiple schemas to fit in with whatever capabilities VCSL
has: if they know that dummy elements are being used sometimes, they will
have a phase that doesn't report dummy elements as errors.

I don't like the WXS <appinfo> kind of embedded-schemas approach, even though
Schematron is probably the main user of it. The main driving factor for it is
that people using schema-management applications have nowhere else to
put their extra constraints.

I wonder whether this is best solved at the language level or merely by suggesting
(non-normative annex?) a useful directory (sorry, "relative URI") convention, such as:
  - all ISO DSDL schemas should support the same include mechanism
    (the ISO RELAX NG one of including all the subschemas then validating)
    in their own namespaces
  - every module has a relative URL, a directory
  - every module contains various schemas in different schema languages
  - the base schemas combine the schemas, each schema type in parallel

e.g.
    /xhtml.dsdl
    /xhtml.vcsl
    /xhtml-base.rnc
    /xhtml-base.sch
    /xhtml-base.dtd
    /rdf.rnc
    /xhtml-inline/xhtml-inline-01.rnc
    /xhtml-inline/xhtml-inline-01.sch

> With dummy elements, users may get error
> messages (e.g. saw element "dummy" but expected "foo" or "bar" or "baz")
> that are incomprehensible unless users understand the internal workings
> of NRL, which shouldn't be necessary for merely using a validator.

At worst, a filter on error messages to get rid of spurious complaints?
But, as I said, I think people will write the schemas to fit VCSLs, so I
don't see this as a problem for Schematron at least.
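
Such a filter could be as small as the following Python sketch. It assumes
messages name the offending element in double quotes after the word
"element", as in the example quoted above; real validators word their
messages differently, so the pattern is an assumption, and the function
name filter_dummy_errors is hypothetical.

```python
import re

def filter_dummy_errors(messages, dummy_names=("dummy",)):
    """Drop messages that complain about a known placeholder element."""
    names = "|".join(re.escape(name) for name in dummy_names)
    # Assumed message shape: ... element "dummy" ...
    pattern = re.compile(r'element\s+"(?:%s)"' % names)
    return [m for m in messages if not pattern.search(m)]
```

A filter like this hides the symptom only; an error that is genuinely caused
by the missing content behind the dummy element would be hidden along with
the spurious ones.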

Cheers
Rick Jelliffe
 

Received on Wed Jun 11 07:32:06 2003
