[dsdl-discuss] Re: Part 10 Scenario

From: Martin Bryan <martin@is-thought.co.uk>
Date: Wed Feb 16 2005 - 20:07:14 UTC

For those of you with Real Player installed, you might find the attached
version, with an XML extension, easier to open in your favourite XML
editor.

Martin

----- Original Message -----
From: "Erik Bruchez" <ebruchez@orbeon.com>
To: <dsdl-discuss@dsdl.org>
Sent: Tuesday, February 08, 2005 11:43 PM
Subject: [dsdl-discuss] Re: Part 10 Scenario

> Eric van der Vlist wrote:
>
> >>Let's subscribe Erik to dsdl-discuss (and hope he remains an active
> >>participant!)
> >
> > Done.
> >
> > Erik, you are very welcome to post your proposal to the list!
>
> Thanks for welcoming me on this list!
>
> I will start by saying that I am not quite up to date regarding DSDL
> and the related languages, except Relax NG, but Eric forwarded to me
> the use case discussed on this list a few weeks ago, and I set out to
> propose a simple example implementing the use case with XPL.
>
> XPL stands for "XML Pipeline Language". It has been developed by my
> company, Orbeon, since 2002. An implementation of the language is
> available in the open source Orbeon PresentationServer project.
>
> By the way, we announced today that we are joining ObjectWeb, and
> PresentationServer is in the process of being moved from
> SourceForge.net to the ObjectWeb Forge:
>
> http://www.orbeon.com/company/pr-objectweb
>
> Back to XPL, we recently wrote a fairly formal draft specification of
> XPL which we hope will lead to an XPL 1.0 specification. We are
> looking to build interest in the specification and in XML pipelines in
> general, because as far as we can tell there is nothing quite like it
> at this point out there.
>
> So I guess I'll just go ahead and attach an implementation of the use
> case using XPL pre-1.0, that is XPL as it runs today. It uses a
> "Validation" processor which is half hypothetical, half real
> (PresentationServer has a Validation processor that supports Relax NG
> and W3C XML Schema). It also uses some hypothetical processors for
> some of the DSDL languages, and then uses an XSLT processor to build
> the final document. What's really important is that all those
> processors are connected together thanks to XPL.
>
> Some comments by myself and Eric are at the top of the document. The
> syntax of XPL is, I hope, more or less self-explanatory. The draft spec
> is not online yet, but there is some information here:
>
> http://www.orbeon.com/ois/doc/reference-xpl-pipelines
>
> Let's take it from there, and please let me know if you have any
> questions!
>
> -Erik
>
>
>

--------------------------------------------------------------------------------
> <p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
>           xmlns:oxf="http://www.orbeon.com/oxf/processors">
>
>     <p:param name="source-document" type="input"/>
>     <p:param name="result-document" type="output"/>
>
>     <!--
>         First pass at writing a pipeline to implement the DSDL Test Scenario.
>
>         Questions / issues:
>
>         o = questions by Erik Bruchez
>         v = answers by Eric van der Vlist
>
>         o What is expected of the output of validators? Is the flow supposed to be interrupted when
>           a validation error occurs?
>
>             (v) Both questions are controversial :-) ...
>                 An overall principle for DSDL is that DSDL is only about validation and does not carry
>                 any kind of PSVI information. Following this principle, the result of a DSDL validation
>                 should be "valid" or "invalid".
>
>                 Now, this scenario seems to prove that this might not be the case for part 10
>                 (Validation Management), and this is also why I now think that XPL may be interesting,
>                 whereas if it were only about "valid"/"invalid", XPL wouldn't have been such a good fit,
>                 IMO.
>
>                 My answer to your first question thus seems to be: a validation report containing at
>                 least a "yes/no" answer plus ad hoc content.
>
>                 My personal answer to the second question would be "that depends". In the XMLfr publication
>                 process, for instance, I have two kinds of validation: an RNG schema that returns errors and a
>                 Schematron schema that validates good practices and returns warnings. While the first one
>                 could interrupt the flow, the second one shouldn't.
>
>                 If we wanted to bring that notion into Validation Management, that could mean that instead of
>                 "yes/no" we could have an error level, and that when invoking a validator we could define
>                 the level of the error to raise in case the validation fails.
>
>                 The default behaviour could be to stop at the first error (as you've implied in the pipeline),
>                 but an optional "config" input could be added that would allow specifying an error level. If the
>                 error level is positive, no exception would be raised.
>
>                 Now, a validator might have several outputs. What about defining several outputs for a validator:
>
>                     - the data output (useful only for schema languages that, like DTDs or WXS, augment the infoset
>                        with stuff such as default values).
>                     - the report output with yes/no (or level) information, error messages or (for Schematron) the
>                        validation report.
>                     - to that, we could add a PSVI output in the case of W3C XML Schema (assuming we had an
>                        XML format for the PSVI).
>
>                 When a validator is configured with a positive error level, error detection could be done
>                 by checking the report output.
>
>                 All these answers are personal and should be checked with the DSDL Working Group.
>
>         o For simplicity, I assumed that the NVDL processor here would produce outputs with those
>           particular names. This would be possible only if the NVDL processor could be configured to
>           map those output names to namespaces. Practically, this processor could either:
>
>           o Have predefined output names, like document-1, document-2, etc.
>           o Produce a single XML document with all the streams aggregated
>
>           I do not know NVDL well enough to see what would be natural here.
>
>                 (v) None of them are natural :-) ...
>
>                     Right now, NVDL is for validation only and takes care of invoking the different validators
>                     to return a single "yes/no" answer.
>
>                     Using it to split a document as mentioned in that scenario is thus an extrapolation
>                     of what the current NVDL implementation does.
>
>                     However, given the fact that NVDL splits documents according to their namespaces, I wonder
>                     if the aggregated streams would be different enough from the original document ;-) ...
>
>                     Thus, I am wondering if predefined names wouldn't be the best solution. Maybe, instead of using
>                     document-i, we could map namespace URIs to names (like they are mapped to namespace prefixes).
>
>         o I have used a single validation processor that supports W3C XML Schema, Relax NG,
>           Schematron, and DTDs (here DTDs would either have to be encapsulated into a root element,
>           or referred to externally). You could of course propose one processor per schema type. The
>           PresentationServer validation processor currently supports W3C Schema and
>           Relax NG transparently.
>
>         o I proposed using XSLT to recombine the final document in the end.
>
>         o Otherwise, the pipeline is very simple. Nothing in XPL prevents parallel execution.
>           Without exception support, the processing would just stop if there is a validation error.
>           With exception support, it could resume, locally (per branch) if needed, or just propose a
>           global fallback. Everything that is possible with exceptions.
>
>     -->
>
>     <!--
>         1. Use NVDL to split out the parts of the document that are encoded using HTML, SVG and
>         MathML from the bulk of the document, whose tags are defined using a user-defined set of
>         markup tags.
>      -->
>     <p:processor name="oxf:nvdl">
>         <p:input name="document" href="#source-document"/>
>         <p:input name="rules">
>             <rules>
>                 NVDL rules
>             </rules>
>         </p:input>
>         <p:output name="html-stream" id="html-stream"/>
>         <p:output name="svg-stream" id="svg-stream"/>
>         <p:output name="mathml-stream" id="mathml-stream"/>
>         <!--
>
>         (v) the ids must be "svg-stream" and "mathml-stream" to match the references in steps 4 and 6.
>
>         -->
>         <p:output name="other-stream" id="other-stream"/>
>     </p:processor>
>
>     <!--
>         2. Validate the HTML elements and attributes using the HTML 4.0 DTD (W3C XML DTD).
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#html-stream"/>
>         <p:input name="schema">
>             <!-- Reference to DTD for HTML -->
>             <dtd href="..."/>
>         </p:input>
>         <p:output name="data" id="html-stream-validated"/>
>     </p:processor>
>
>     <!--
>         3. Use a set of Schematron rules stored in check-metadata.xml to ensure that the metadata
>         of the HTML elements defined using Dublin Core semantics conform to the information in the
>         document about the document's title and subtitle, author, encoding type, etc.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#html-stream-validated"/>
>         <!-- Reference to Schematron schema for HTML metadata -->
>         <p:input name="schema" href="check-metadata.xml"/>
>         <p:output name="data" id="html-stream-schematronized"/>
>        <!-- 
>
>         (v) Note that in the case of Schematron, the data output is identical to the data input.
>
>         -->
>     </p:processor>
>
>     <!--
>         4. Validate the SVG components of the file using the standard W3C schema provided in the
>         SVG 1.2 specification.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#svg-stream"/>
>         <!-- Reference to W3C Schema for SVG -->
>         <p:input name="schema" href="svg-1.2.xsd"/>
>         <p:output name="data" id="svg-stream-validated"/>
>     </p:processor>
>
>     <!--
>         5. Use the Schematron rules defined in SVG-subset.xml to ensure that the SVG file only uses
>         those features of SVG that are valid for the particular SVG viewer available to the system.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#svg-stream-validated"/>
>         <!-- Reference to Schematron schema for SVG subset -->
>         <p:input name="schema" href="SVG-subset.xml"/>
>         <p:output name="data" id="svg-stream-schmatronized"/>
>     </p:processor>
>
>     <!--
>         6. Validate the MathML components using the latest version of the MathML schema (defined
>         in RELAX-NG) to ensure that all maths fragments are valid. The schema will make use of the
>         datatype definitions in check-maths.xml to validate the contents of specific elements.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#mathml-stream"/>
>         <!-- Reference to Relax NG schema for MathML -->
>         <p:input name="schema" href="mathml-1.0.rng"/>
>         <p:output name="data" id="mathml-stream-validated"/>
>     </p:processor>
>
>     <!--
>         7. Use MathML-SVG.xslt to transform the MathML segments to displayable SVG and replace each
>         MathML fragment with its SVG equivalent.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#mathml-stream-validated"/>
>         <p:input name="config" href="MathML-SVG.xslt"/>
>         <p:output name="data" id="mathml-as-svg"/>
>     </p:processor>
>
>     <!--
>         8. Use the DSRL definitions in convert-mynames.xml to convert the tags in the local nameset
>         to the form that can be used to validate the remaining part of the document using
>         docbook.dtd.
>     -->
>     <p:processor name="oxf:dsrl">
>         <p:input name="data" href="#other-stream"/>
>         <p:input name="config" href="convert-mynames.xml"/>
>         <p:output name="data" id="docbook-stream"/>
>     </p:processor>
>
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#docbook-stream"/>
>         <!-- Reference to DTD Docbook -->
>         <p:input name="schema">
>             <dtd href="..."/><!-- Reference to W3C DTD -->
>         </p:input>
>         <p:output name="data" id="docbook-stream-validated"/>
>     </p:processor>
>
>     <!--
>         9. Use the CRDL rules defined in mycharacter-checks.xml to validate that the correct
>         character sets have been used for text identified as being Greek and Cyrillic.
>     -->
>     <p:processor name="oxf:crdl">
>         <p:input name="data" href="#docbook-stream-validated"/>
>         <p:input name="config" href="mycharacter-checks.xml"/>
>         <p:output name="data" id="docbook-stream-validated-2"/>
>     </p:processor>
>
>     <!--
>         10. Convert the Docbook tags to HTML so that they can be displayed in a web browser using
>         the docbook-html.xslt transformation rules.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#docbook-stream-validated-2"/>
>         <p:input name="config" href="docbook-html.xslt"/>
>         <p:output name="data" id="docbook-as-html"/>
>     </p:processor>
>
>     <!--
>         After completion of step 10 the HTML (both streams), and SVG (both streams) should be
>         recombined to produce a single stream that can be fed to a web browser.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#html-stream-schematronized"/>
>         <p:input name="html-2" href="#docbook-as-html"/>
>         <p:input name="svg-1" href="#svg-stream-schmatronized"/>
>         <p:input name="svg-2" href="#mathml-as-svg"/>
>         <p:input name="config" href="stylesheet-to-aggregate-everything.xsl"/>
>         <p:output name="data" ref="result-document"/>
>     </p:processor>
>
> </p:config>
>
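Each href="#..." in the pipeline above has to match an id declared on some
p:output (or the name of an input p:param), which is easy to get wrong by
hand. A minimal sketch in Python of a wiring check, assuming only plain
"#id" references are used (aggregate() and XPointer forms are not handled).
The function name check_wiring and the SAMPLE fragment are illustrative
assumptions, not part of XPL or PresentationServer:

```python
import xml.etree.ElementTree as ET

# Namespace of the XPL pipeline elements, as used in the attachment.
P = "{http://www.orbeon.com/oxf/pipeline}"


def check_wiring(xpl_text):
    """Return a list of problems: duplicate output ids and unresolved hrefs."""
    root = ET.fromstring(xpl_text)
    problems = []
    declared = set()

    # Pipeline-level input parameters can be referenced as "#name".
    for param in root.iter(P + "param"):
        if param.get("type") == "input":
            declared.add(param.get("name"))

    # Every p:output with an id declares a stream other processors can read.
    for out in root.iter(P + "output"):
        out_id = out.get("id")
        if out_id is None:
            continue  # e.g. an output bound with ref="..." instead of id
        if out_id in declared:
            problems.append("duplicate id: " + out_id)
        declared.add(out_id)

    # Every p:input whose href starts with "#" must point at a declared id.
    for inp in root.iter(P + "input"):
        href = inp.get("href", "")
        if href.startswith("#") and href[1:] not in declared:
            problems.append("unresolved reference: " + href)

    return problems


# Hypothetical cut-down pipeline with one reused id and one dangling href.
SAMPLE = """
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:param name="source-document" type="input"/>
    <p:processor name="oxf:nvdl">
        <p:input name="document" href="#source-document"/>
        <p:output name="html-stream" id="html-stream"/>
        <p:output name="svg-stream" id="html-stream"/>
    </p:processor>
    <p:processor name="oxf:validation">
        <p:input name="data" href="#svg-stream"/>
        <p:output name="data" id="svg-stream-validated"/>
    </p:processor>
</p:config>
"""

print(check_wiring(SAMPLE))
```

On this sample the check reports the reused "html-stream" id and the
"#svg-stream" reference that no output declares.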

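The error-level idea from the comments at the top of the pipeline can also
be sketched outside XPL. A minimal Python illustration, assuming that level 0
means "raise and stop the flow" and that any positive level means "record the
failure in the report and carry on". The names run_validator, Report,
ValidationError and the two toy checks are hypothetical stand-ins, not DSDL
or PresentationServer APIs:

```python
from dataclasses import dataclass, field


class ValidationError(Exception):
    """Raised when a validator configured with level 0 finds a problem."""


@dataclass
class Report:
    valid: bool                      # the "yes/no" answer
    level: int                       # configured error level (0 = fatal)
    messages: list = field(default_factory=list)


def run_validator(check, document, error_level=0):
    """Run check(document) and return (data output, report output).

    check is any callable returning a list of error messages.
    With error_level 0 a failure raises; with a positive level the
    failure is only recorded in the report.
    """
    messages = check(document)
    report = Report(valid=not messages, level=error_level, messages=messages)
    if messages and error_level == 0:
        raise ValidationError("; ".join(messages))
    # The data output is the input unchanged, as for Schematron.
    return document, report


# Toy checks standing in for a schema (errors) and a Schematron (warnings).
def has_title(doc):
    return [] if "title" in doc else ["missing title"]


def short_title(doc):
    if len(doc.get("title", "")) <= 20:
        return []
    return ["title longer than 20 characters"]


doc = {"title": "A deliberately verbose title"}
data, report = run_validator(has_title, doc)                     # passes
data, report = run_validator(short_title, data, error_level=1)   # warns only
print(report)
```

A later step can then inspect report.valid instead of relying on an
exception, which is the "checking the report output" case described above.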
Received on Wed Feb 16 21:07:24 2005
