[dsdl-discuss] Framework: transformer/selector using APEX architectural forms processor

From: Rick Jelliffe <ricko@topologi.com>
Date: Wed Jun 12 2002 - 07:17:04 UTC

For the Framework, we have agreed that (under whatever terminology)
there are selectors (e.g. relax namespace meets xpath + wrapper) and terminal validators
(i.e. the things in the separate specs), and perhaps transformers (e.g. tokenizer,
regular fragmentations, xslt?)

I think it is looking increasingly like primitive datatyping is something that should
be specified inside a schema language (e.g. following XSD and RELAX NG)
rather than outside (i.e. by defining an element-to-datatype mapping language
independent of an existing schema language.)

And I think it is getting more apparent that there is value in providing some
transformers to allow complex value validation, e.g. for un-munging dates.

Another possibility in this area is to bring in a simple architectural forms processor.
There are three justifications for this. First, to allow SGML applications that used
AF to move over to XML+DSDL if they wish. Second, because there has been
some more experimental code which has reduced the complexity: Dave Megginson's
XAF and Josh Lubell's APEX, which is under active maintenance. But most importantly,
we will need a mechanism for providing default values: if APEX can be implemented
by XSLT, that is a big bonus for ready implementation.
 
I emailed Josh to ask him his current opinion, and forward his answer here. I think
he is pretty nonpartisan in his approach to schema languages.

Cheers
Rick Jelliffe

----- Original Message -----
From: "Rick Jelliffe" <ricko@allette.com.au>
To: "Josh Lubell" <lubell@cme.nist.gov>
Sent: Tuesday, June 11, 2002 12:27 AM
Subject: Re: [xml-dev] ANN: update to XSLToolbox

> Hi Josh
>
> I think we met at the Toronto XSE conference last year.

We did indeed meet at XSE last year. I liked your schema languages
tutorial very much.

> I am interested in APEX as a possible component for ISO DSDL.
>
> Speaking realistically, how useful do you think it is? Is the
> Architectural Form mechanism just too abstract?

In order to answer your question about the usefulness of the
architecture mechanism, I think it's important first to realize that the
architecture mechanism as defined in the ISO/IEC 10744:1997
Architectural Form Definition Requirements (AFDR) has two basic ideas to
it.

First there is the idea of using metadata attributes to specify inline
how data is processed. I believe this idea is intuitive,
easy-to-implement, and useful (many applications do this, but in an
ad-hoc manner - the architecture mechanism provides a standard
semantics for these metadata attributes so that any data using such
attributes can be processed by a generic "architecture engine").

The second idea is the use of SGML notation attributes (or for XML,
processing instruction syntax) to identify which attributes in the data
are to be interpreted as metadata for the purpose of architecture
processing (this is called an architecture use declaration in the AFDR). I
think this idea is ill-conceived. The syntax is confusing, XML tools
don't support it, and it flies in the face of what developers expect.

The approach APEX takes is to run with the first idea but reject the
second idea. Rather than using some arcane syntax for identifying the
metadata attributes, APEX uses XSLT stylesheet parameters. If APEX were
written in a programming language (such as Java, Perl, Python) instead
of in XSLT, I think I would provide an API for identifying the metadata
attributes. I think a typical XML developer would be more comfortable
with an architecture usage API than with some strange-looking PI syntax.
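
As a rough illustration of what such an architecture usage API might look
like, here is a minimal Python sketch. It is not APEX and does not follow
the AFDR in detail: the function name derive_architectural_document and the
"doc" metadata attribute are hypothetical, and the sketch handles only
element renaming. Elements without the metadata attribute are skipped while
their mapped descendants are kept; attribute renaming and text that follows
a child element are ignored.

```python
import xml.etree.ElementTree as ET


def derive_architectural_document(elem, form_attr):
    """Return the list of architectural elements derived from elem.

    form_attr is the name of the metadata attribute whose value gives
    the architectural form (the new element name) for each element.
    """
    # Derive the children first so they can be attached to the nearest
    # ancestor that takes part in the architecture.
    derived_children = []
    for child in elem:
        derived_children.extend(derive_architectural_document(child, form_attr))

    form = elem.get(form_attr)
    if form is None:
        # This element is not part of the architecture: pass its mapped
        # descendants up to the caller and drop the element itself.
        return derived_children

    out = ET.Element(form)
    # Copy ordinary attributes, leaving out the metadata attribute.
    for name, value in elem.attrib.items():
        if name != form_attr:
            out.set(name, value)
    out.text = elem.text  # simplification: only leading text is kept
    out.extend(derived_children)
    return [out]


source = ET.fromstring(
    '<memo doc="article">'
    '<heading doc="title">Hi</heading>'
    '<note><line doc="para">x</line></note>'
    '</memo>'
)
architectural = derive_architectural_document(source, "doc")
print(ET.tostring(architectural[0], encoding="unicode"))
```

The caller names the metadata attribute as an ordinary argument, which plays
the role that a stylesheet parameter plays in an XSLT implementation.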

Another problem with the architecture mechanism is that it needs
rebranding. The word "architecture" is too overloaded. It needs a better
name for distinguishing it from all the other interoperability
approaches out there. Any ideas?

Another issue is that the architecture mechanism needs to be more
schema-language-agnostic. By that I mean that it should be equally
useful for data whether that data is valid w.r.t. a DTD, XSD schema,
RELAX NG schema, or Schematron schema. David Megginson tweaked the AFDR
a little to help make this happen, and APEX uses those "tweaks". The
tweaks basically make it possible for an architecture engine to do
useful things without having to read the architecture's schema.

Along the same lines, the architecture mechanism needs a handy way to
specify default attributes. Because most of the metadata values are
invariant w.r.t. a schema, it makes sense for them to be specified as
defaults in the schema using the architecture rather than explicitly in
the input data corresponding to that schema. This can be easily done
with DTDs. It can be done for RELAX NG using the DTD compatibility
facilities. But how would you do this in Schematron? This was my
original motivation for creating the ATTS stylesheet.
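
To make the idea of schema-independent defaults concrete, here is a small
Python sketch. It is not the ATTS stylesheet and makes no claim about how
ATTS works: the function name apply_default_attributes and the shape of the
defaults table (element name mapped to attribute name/value pairs) are
assumptions chosen for illustration. A default is added only where the
input does not already specify the attribute.

```python
import xml.etree.ElementTree as ET


def apply_default_attributes(root, defaults):
    """Add default attribute values to a parsed document, in place.

    defaults maps an element name to a dict of attribute defaults,
    e.g. {"heading": {"doc": "title"}}. Values given explicitly in
    the input always win over the defaults.
    """
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
    return root


# Hypothetical defaults table, kept outside any particular schema language.
defaults = {
    "memo": {"doc": "article"},
    "heading": {"doc": "title"},
}
document = ET.fromstring('<memo><heading/><heading doc="subtitle"/></memo>')
apply_default_attributes(document, defaults)
print(ET.tostring(document, encoding="unicode"))
```

Run as a step before architecture processing, a pass like this lets the
input data omit the invariant metadata attributes whichever schema language
is used for validation.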

Having said all this, I do think the architecture mechanism is useful. I
use it myself for practical applications, such as my "pythonpoint"
example in the XSLToolbox distribution. At least from my personal
experience, a small subset of the metadata attribute types defined in
the AFDR is sufficient for most applications. You don't really need the
more esoteric stuff like bridge forms, #MAPTOKEN, etc.

I think the architecture mechanism fits in well with the modular,
pipelined approach of DSDL, assuming the issues I raised are properly
addressed.

>
> Cheers
> Rick

Received on Wed Jun 12 03:05:13 2002
