[dsdl-discuss] Re: Concurrent (overlapping) structures

From: Rick Jelliffe <ricko@topologi.com>
Date: Wed Jun 04 2003 - 14:34:43 UTC

From: "Alex Brown" <alexb@griffinbrown.co.uk>
 
> Is it not a valid contribution to
> articulate to this list what I think their concerns/requirements might be,
> for discussion?
 
Sure, and you are very welcome.

DSDL has been following a definite architecture, which is primarily designed
to keep us focussed and to get the particular technologies of the various contributors
developed and out the door as fast as possible, within the strictures of the ISO
review process which is of course necessary for quality. Here is my understanding.

The primary architectural feature is that the design should be able to be modelled as
a set of processes which are XML-in, XML-out (either real XML or XML infosets).
The reasons are threefold:
  - First, because we want to avoid W3C's PSVI approach, which some of us think
    is a trap: the issue is with addition of new kinds of information items which cannot
    be serialized out without disrupting the original document. (Now, actually, anything
    could be serialized out by making a PSVI as a second document which links to the
    primary document, but that is beside the point.)
  - Second, because we want to avoid SGML-isms, in particular to avoid information
    items not available in XML, such as inline <!USEMAPs etc. This is not because
    SGML is dead (I am working on a project at the moment with a quarter of a million
    pages that will become SGML *and* XML) but because we see other SGML
    syntaxes as being data capture formats that notionally will be resolved into XML.
  - Third, to allow pipeline power, where small modules (validators, transformers) could
    be combined in different ways as required.
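
As an illustration only (this is not DSDL syntax, and every name here is made up for
the example), the XML-in, XML-out idea can be sketched in Python. Each stage takes a
parsed document and returns a document, so validators and transformers can be chained
in whatever order is required:

```python
import copy
import xml.etree.ElementTree as ET


def rename_stage(old, new):
    """Build a transformer stage that renames elements (hypothetical example)."""
    def stage(root):
        out = copy.deepcopy(root)  # leave the input document untouched
        for el in out.iter(old):
            el.tag = new
        return out
    return stage


def require_child_stage(parent, child):
    """Build a validator stage: returns the document unchanged, or raises."""
    def stage(root):
        for el in root.iter(parent):
            if el.find(child) is None:
                raise ValueError(f"<{parent}> is missing required <{child}>")
        return root
    return stage


def run_pipeline(xml_text, stages):
    """Parse XML text, pass it through each stage in order, serialize the result."""
    root = ET.fromstring(xml_text)
    for stage in stages:
        root = stage(root)
    return ET.tostring(root, encoding="unicode")


result = run_pipeline(
    "<doc><para><t>hi</t></para></doc>",
    [rename_stage("para", "p"), require_child_stage("p", "t")],
)
print(result)
```

Because every stage has the same shape, swapping the order or dropping a stage needs
no change to the stages themselves.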

The second architectural feature is that we decided that we would allow various kinds
of (XML-in, XML-out) transformations on documents, but we would not
support transformations of schemas. In other words, a separation of the data channel
and the schema channel, so that schemas are not schemas in one place and data in
another (in the same DSDL pipeline.)

Now we do allow something like taking an XML file in, extracting and parsing a data
value into XML, then passing that XML to another validator or component. (This is
the "regular fragmentations" idea, which we want to allow but we are not standardizing
at this stage AFAIK. Note that Schematron can validate substrings.)
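
A rough sketch of that idea in Python, assuming a date value as the thing being
fragmented (the element names and the `fragment_date` helper are invented for the
example, not taken from any DSDL part): the string is pulled out of the document,
parsed with a regular expression, and rebuilt as XML that a later component could
validate.

```python
import re
import xml.etree.ElementTree as ET

# Assumed format for the example: an ISO-style YYYY-MM-DD date.
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def fragment_date(text):
    """Parse a date string into a small XML fragment (hypothetical helper)."""
    m = DATE_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a date: {text!r}")
    date = ET.Element("date")
    for name in ("year", "month", "day"):
        ET.SubElement(date, name).text = m.group(name)
    return date


doc = ET.fromstring("<report><issued>2003-06-04</issued></report>")
fragment = fragment_date(doc.findtext("issued"))
fragment_xml = ET.tostring(fragment, encoding="unicode")
print(fragment_xml)
```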

The third feature is that unless there is some good reason (e.g. for Schematron)
the various components should be streamable: for example to fit over a SAX stream.
The reasons for this will be obvious to anyone working with large documents: we
don't want to prevent one major implementation approach (i.e. streaming).
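
To make the streaming point concrete, here is a minimal sketch using Python's
standard `xml.sax` module. The rule being checked (a maximum nesting depth) and the
class name are assumptions chosen for the example; the point is only that the check
keeps a counter and never builds a tree, so memory use does not grow with the size of
the document.

```python
import xml.sax


class DepthLimitValidator(xml.sax.ContentHandler):
    """Streaming check: record elements nested deeper than max_depth."""

    def __init__(self, max_depth):
        super().__init__()
        self.max_depth = max_depth
        self.depth = 0
        self.locator = None
        self.errors = []

    def setDocumentLocator(self, locator):
        # The parser hands us a locator so errors can point back at the source.
        self.locator = locator

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth > self.max_depth:
            line = self.locator.getLineNumber() if self.locator else None
            self.errors.append((name, line))

    def endElement(self, name):
        self.depth -= 1


def validate(xml_bytes, max_depth):
    """Run the streaming validator over a document held as bytes."""
    handler = DepthLimitValidator(max_depth)
    xml.sax.parseString(xml_bytes, handler)
    return handler.errors


errors = validate(b"<a>\n<b>\n<c/>\n</b>\n</a>", max_depth=2)
for name, line in errors:
    print(f"<{name}> is nested too deeply (line {line})")
```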

So where would alternate syntaxes fit in? The same place that other SGML
syntaxes fit in: at the front door, to be notionally transformed into XML.
Note, of course, that even though we speak of pipelines and transformations,
an efficient implementation might be modular and work using pointers, to tie errors
to the original locations. (Other people involved in DSDL may see other, better possibilities.)

That is my understanding of how ISO DSDL is proceeding. These are very different
architectural constraints than the W3C XML Schemas WG used, for example.

Of course, we are keen to make ISO DSDL useful, but we should avoid standardizing
anything (except the glue) based on "this would be nice". I think many stakeholders put
a high premium on standardizing technologies which have active user communities and
code already out there. CONCUR is one thing which caused a lot of mistaken ridicule
for SGML (i.e. people couldn't understand that an optional feature doesn't *have* to be
implemented by a developer).

Finally, don't forget the option that perhaps what you are asking for is not DSDL, but
an improvement to SGML. That is a different issue, and not as strange as it may sound.

Cheers
Rick Jelliffe

Received on Wed Jun 4 16:29:01 2003