Hi,
Thinking about how the framework could help to define datatypes, I came
to the idea of bringing the framework inside schemas, using framework
elements as extensions.
The attached strawman is really only rough unpolished ideas serialized
as XML to illustrate what I mean, but I think that this might be an
interesting direction to dig into and I am eager to hear your comments!
I would also like to know how we could bring external people into this
discussion (I am thinking of people like KAWAGUCHI Kohsuke, for his
experience implementing W3C XML Schema, and Simon St.Laurent, whose
regular expressions I have borrowed). Is it possible to invite them to
this list, or should we wait until things are more polished and bring
the discussion to dsdl-comment?
Thanks
Eric
--
See you in San Diego.
http://conferences.oreillynet.com/os2002/
------------------------------------------------------------------------
Eric van der Vlist http://xmlfr.org http://dyomedea.com
http://xsltunit.org http://4xt.org http://examplotron.org
------------------------------------------------------------------------
Eric van der Vlist
May 25, 2002
The "natural" way to define our interoperability framework seems to be "outside" schemas and pre-validation transformations, using a push mechanism such as defined by XPipe. Assuming that we use the namespace prefix "if", this could lead to constructs such as:
<if:define name="canonicalValidation">
<if:process type="http://www.w3.org/1999/XSL/Transform" href="myC14n.xsl">
<if:process type="http://relaxng.org/ns/structure/1.0" href="mySchema.rng"/>
</if:process>
</if:define>
to define a Relax NG validation performed after an XSLT canonicalization.
The big benefit of such an external framework is that it is compatible with
any existing tool. However, being non-intrusive, it may become heavy when
the transformations and the validation get mixed together, as is the case
for the transformation between the parsed and the lexical space.
If we wanted to define the "pre-lexical" transformation (which may depend
on the text node or attribute under validation) using an external framework,
we would need to split the validation in two phases: a first phase, stopped
before the pre-lexical transformation, producing an annotated document; then the
pre-lexical transformation, using these annotations to do its job; and the final
validation, performed on the result of the pre-lexical transformation. This whole
process seems messy and intrusive.
The other solution, which is the subject of this strawman, would be to include
framework elements within the schemas to define transformations to be performed
on the nodes during the validation.
The use cases presented in this strawman are:
Note: all the examples are presented assuming the default namespace
is Relax NG. The XSLT and XPath snippets have not been tested but I
hope that there are few enough errors to give a good understanding of what
I mean!
A specific transformation could be defined for whitespace processing, unless
XPath is used. The following pattern could then define that
whitespace should be normalized before doing further tests:
<element name="foo">
<data>
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="normalize-space()">
<data type="bar"/>
</if:process>
</data>
</element>
Alternative syntax using a value element:
<element name="foo">
<data>
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116">
<value>normalize-space()</value>
<data type="bar"/>
</if:process>
</data>
</element>
The meaning of such a construct would be "apply the processing defined
here before doing further tests" to the current node.
Note that in this first case, there is a "built in" fallback mechanism for Relax NG processors which do not support the framework: per the Relax NG specification, such processors should just ignore the "if:process" element and validate any text node.
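To make the intended semantics concrete, here is a minimal Python sketch of "apply the processing, then run the datatype test". It is only an illustration: the function names and the is_bar check are hypothetical stand-ins (the datatype "bar" is not defined in this strawman), and normalize_space follows the XPath 1.0 definition of normalize-space().

```python
import re

# XML whitespace only: space, tab, carriage return, line feed.
XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(value: str) -> str:
    """Mimic XPath 1.0 normalize-space(): collapse runs, trim both ends."""
    return XML_WS.sub(" ", value).strip(" ")


def is_bar(lexical: str) -> bool:
    """Hypothetical stand-in for datatype "bar": integers separated by single spaces."""
    return re.fullmatch(r"[0-9]+( [0-9]+)*", lexical) is not None


def process_then_validate(value: str, check) -> bool:
    """Apply the processing step first, then test the result against the datatype."""
    return check(normalize_space(value))


raw = "  1 \n 2\t3 "
print(normalize_space(raw))                 # the value the datatype would see
print(is_bar(raw))                          # raw text fails the stand-in check
print(process_then_validate(raw, is_bar))   # processed text passes it
```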
This other pattern:
<element name="foo">
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="normalize-space()">
<data type="bar"/>
</if:process>
</element>
is also valid but would apply "normalize-space()" to the current element
(instead of its first text node as in the previous example). Applying the
datatype "bar" to the normalized value of the current element would mean converting
any embedded elements into a single normalized text value. A
Relax NG processor not supporting the framework would expect empty "foo"
elements in this second pattern, and there might be a need in this case to
provide a fallback mechanism, for instance:
<element name="foo">
<choice>
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="normalize-space()">
<data type="bar"/>
</if:process>
<zeroOrMore if:process="ignore">
<ref name="anyElement"/>
<text/>
</zeroOrMore>
</choice>
</element>
The first alternative of the choice would be ignored by Relax NG processors that are not framework-compliant, while the if:process="ignore" attribute of the second alternative could be used to instruct framework-compliant Relax NG processors to ignore it.
Could be done as:
<element name="foo">
<data>
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="normalize-space(translate(., ',', ' '))">
<data type="bar"/>
</if:process>
</data>
</element>
<element name="foo">
<data>
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="normalize-space(translate(., ',', '.'))">
<data type="bar"/>
</if:process>
</data>
</element>
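As a rough illustration of what the two "value" expressions above compute, the following Python sketch mimics the XPath 1.0 translate() and normalize-space() functions. The helper names are mine and the sample inputs are made up; nothing here is part of the proposed framework.

```python
import re


def xpath_translate(value: str, src: str, dst: str) -> str:
    """Mimic XPath 1.0 translate(): map src chars to dst, drop unmatched ones."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) in table:
            continue  # XPath keeps the first occurrence of a repeated character
        table[ord(ch)] = dst[i] if i < len(dst) else None
    return value.translate(table)


def normalize_space(value: str) -> str:
    """Mimic XPath 1.0 normalize-space()."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


# normalize-space(translate(., ',', ' ')): commas become separators
print(normalize_space(xpath_translate(" 1,2 ,3 ", ",", " ")))

# normalize-space(translate(., ',', '.')): comma becomes a decimal point
print(normalize_space(xpath_translate(" 1,5 ", ",", ".")))
```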
This one would probably deserve its own library. Although verbose, it
could nevertheless be done using XSLT:
<define name="foo">
<if:process type="http://www.w3.org/1999/XSL/Transform">
<if:value>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="">
<xsl:template match="/foo">
<xsl:choose>
<xsl:when test="contains(., 'janvier')">
<xsl:value-of select="concat(normalize-space(substring-after(., 'janvier')), '-01-', normalize-space(substring-before(., 'janvier')))"/>
</xsl:when>
.../...
<xsl:when test="contains(., 'décembre')">
<xsl:value-of select="concat(normalize-space(substring-after(., 'décembre')), '-12-', normalize-space(substring-before(., 'décembre')))"/>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:transform>
</if:value>
<data type="xs:date"/>
</if:process>
</define>
Or using a regexp library:
<define name="foo">
<element name="foo">
<if:process type="http://www.perldoc.com/perl5.6.1/pod/perlre.html" value="s/([0-9]*) janvier ([0-9]*)/$2-01-$1/">
.../...
<if:process type="http://www.perldoc.com/perl5.6.1/pod/perlre.html" value="s/([0-9]*) décembre ([0-9]*)/$2-12-$1/">
<data type="xs:date"/>
</if:process>
.../...
</if:process>
</element>
</define>
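The chain of substitutions above can be tried out with a small Python sketch. This is an assumption-laden illustration: it uses Python's re module rather than Perl, the function name is hypothetical, and only the two months spelled out above are listed (the others are elided in the strawman and are left out here too).

```python
import re

# Only the two months shown in the strawman; the rest are elided there.
MONTHS = {"janvier": "01", "décembre": "12"}


def french_date_to_iso(value: str) -> str:
    """Apply each s/(day) month (year)/year-MM-day/ substitution in turn."""
    for name, number in MONTHS.items():
        value = re.sub(rf"([0-9]*) {name} ([0-9]*)", rf"\2-{number}-\1", value)
    return value


print(french_date_to_iso("25 janvier 2002"))
print(french_date_to_iso("5 décembre 2001"))
```

Note that a single-digit day comes out unpadded ("2001-12-5"), which is not a valid xs:date lexical form, so a real rule set would probably need to pad the day as well.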
<define name="foo">
<if:process type="http://simonstl.com/ns/fragments/">
<if:value>
<fragmentRules xmlns="http://simonstl.com/ns/fragments/">
<fragmentRule pattern="([0-9]*)-([0-9]*)-([0-9]*)">
<applyTo>
<element nsURI="" localName="foo"/>
</applyTo>
<produce>
<element nsURI="" localName="year" prefix="" />
<element nsURI="" localName="month" prefix="" />
<element nsURI="" localName="day" prefix="" />
</produce>
</fragmentRule>
</fragmentRules>
</if:value>
<element name="foo">
<element name="year">
<data type="xs:integer"/>
</element>
<element name="month">
<data type="xs:integer"/>
</element>
<element name="day">
<data type="xs:integer"/>
</element>
</element>
</if:process>
</define>
<define name="foo">
<if:process type="http://www.w3.org/1999/XSL/Transform">
<if:value>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="">
<xsl:template match="/foo">
<xsl:element name="year">
<xsl:value-of select="substring-before(.,'-')"/>
</xsl:element>
<xsl:element name="month">
<xsl:value-of select="substring-before(substring-after(., '-'),'-')"/>
</xsl:element>
<xsl:element name="day">
<xsl:value-of select="substring-after(substring-after(., '-'),'-')"/>
</xsl:element>
</xsl:template>
</xsl:transform>
</if:value>
<element name="year">
<data type="xs:integer"/>
</element>
<element name="month">
<data type="xs:integer"/>
</element>
<element name="day">
<data type="xs:integer"/>
</element>
</if:process>
</define>
Alternatively, the transformation may be kept external:
<define name="foo">
<if:process type="http://www.w3.org/1999/XSL/Transform">
<if:value href="splitDate.xsl"/>
<element name="foo">
<element name="year">
<data type="xs:integer"/>
</element>
<element name="month">
<data type="xs:integer"/>
</element>
<element name="day">
<data type="xs:integer"/>
</element>
</element>
</if:process>
</define>
<element name="foo">
<if:process type="http://www.w3.org/TR/1999/REC-xpath-19991116" value="concat(year, '-', month, '-', day)">
<data type="xs:date"/>
</if:process>
</element>
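For the reverse direction shown just above (gathering year, month and day back into one value before testing it as a date), a minimal Python sketch could look like this. The sample document is made up, concat_date is a hypothetical helper, and date.fromisoformat is only a loose stand-in for an xs:date check.

```python
import xml.etree.ElementTree as ET
from datetime import date


def concat_date(foo: ET.Element) -> str:
    """Mimic concat(year, '-', month, '-', day) on the children of foo."""
    parts = (foo.findtext(name, default="") for name in ("year", "month", "day"))
    return "-".join(parts)


foo = ET.fromstring("<foo><year>2002</year><month>05</month><day>25</day></foo>")
lexical = concat_date(foo)

# Stand-in for validating the processed value as a date; raises ValueError if malformed.
parsed = date.fromisoformat(lexical)
print(lexical, parsed)
```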
<element name="foo">
<if:process type="http://www.w3.org/1999/XSL/Transform">
<if:value>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="">
<xsl:template match="*[@xlink:type]" priority="1">
<xsl:element name="{@xlink:type}">
<xsl:copy-of select="@xlink:*"/>
<xsl:apply-templates select="*"/>
</xsl:element>
</xsl:template>
<xsl:template match="*[@xlink:*]">
<xsl:element name="undefined">
<xsl:copy-of select="@xlink:*"/>
<xsl:apply-templates select="*"/>
</xsl:element>
</xsl:template>
<xsl:template match="*">
<xsl:apply-templates select="*"/>
</xsl:template>
</xsl:transform>
</if:value>
<ref name="xlinkElements"/>
</if:process>
</element>
The transformation would turn the content of the foo element into elements named after their xlink:type attribute and carrying their XLink attributes. A generic Relax NG pattern capturing the constraints of the XLink vocabulary (such as the mandatory xlink:type attribute, xlink:href, xlink:role and xlink:arcrole being URI references, the structure of complex links, ...) would then be quite easy to write.
Received on Sat May 25 11:25:09 2002