Erik
I finally found time to read your excellent submission this evening and have
some comments to make on it, but before I do so I want to be able to justify
my concept of NVDL being able to separate out validatable segments of the
document that can be "passed" to subsequent processes, which is key to my
proposed scenario. To do this I need to re-read Part 4 in detail, which I am
too tired to do tonight so will leave until tomorrow evening.
Does anyone else want to comment on the validity of my proposed scenario? Is
it valid to expect different segments identified by NVDL to be used as the
start point for processing chains that need to be realigned after different
sets of validations followed by transformations? (The question really boils
down to whether or not we are to allow transformations to be included in the
validation process.)
Martin
----- Original Message -----
From: "Erik Bruchez" <ebruchez@orbeon.com>
To: <dsdl-discuss@dsdl.org>
Sent: Tuesday, February 08, 2005 11:43 PM
Subject: [dsdl-discuss] Re: Part 10 Scenario
> Eric van der Vlist wrote:
>
> >>Let's subscribe Erik to dsdl-discuss (and hope he remains an active
> >>participant!)
> >
> > Done.
> >
> > Erik, you are very welcome to post your proposal to the list!
>
> Thanks for welcoming me on this list!
>
> I will start by saying that I am not quite up to date regarding DSDL
> and the related languages, except Relax NG, but Eric forwarded to me
> the use case discussed on this list a few weeks ago, and I set out to
> propose a simple example implementing the use case with XPL.
>
> XPL stands for "XML Pipeline Language". It has been developed by my
> company, Orbeon, since 2002. An implementation of the language is
> available in the open source Orbeon PresentationServer project.
>
> By the way, we announced today that we are joining ObjectWeb, and
> PresentationServer is in the process of being moved from
> SourceForge.net to the ObjectWeb Forge:
>
> http://www.orbeon.com/company/pr-objectweb
>
> Back to XPL, we recently wrote a fairly formal draft specification of
> XPL which we hope will lead to an XPL 1.0 specification. We are
> looking to build interest in the specification and in XML pipelines in
> general, because as far as we can tell there is nothing quite like it
> at this point out there.
>
> So I guess I'll just go ahead and attach an implementation of the use
> case using XPL pre-1.0, that is XPL as it runs today. It uses a
> "Validation" processor which is half hypothetical, half real
> (PresentationServer has a Validation processor that supports Relax NG
> and W3C XML Schema). It also uses some hypothetical processors for
> some of the DSDL languages, and then uses an XSLT processor to build
> the final document. What's really important is that all those
> processors are connected together thanks to XPL.
>
> Some comments by myself and Eric are at the top of the document. The
> syntax of XPL is, I hope, more or less self-explanatory. The draft spec
> is not online yet, but there is some information here:
>
> http://www.orbeon.com/ois/doc/reference-xpl-pipelines
>
> Let's take it from there, and please let me know if you have any
> questions!
>
> -Erik
>
>
>
--------------------------------------------------------------------------------
> <p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
>           xmlns:oxf="http://www.orbeon.com/oxf/processors">
>
>     <p:param name="source-document" type="input"/>
>     <p:param name="result-document" type="output"/>
>
>     <!--
>         First pass at writing a pipeline to implement the DSDL Test Scenario.
>
>         Questions / issues:
>
>         o = questions by Erik Bruchez
>         v = answers by Eric van der Vlist
>
>         o What is expected of the output of validators? Is the flow supposed to be interrupted when
>           a validation error occurs?
>
>         (v) Both questions are controversial :-) ...
>             An overall principle for DSDL is that DSDL is only about validation and does not carry
>             any kind of PSVI information. Following this principle, the result of a DSDL validation
>             should be "valid" or "invalid".
>
>             Now, this scenario seems to prove that this might not be the case for part 10
>             (Validation Management) and this is also why I am now thinking that XPL may be interesting
>             while if that was only about "valid"/"invalid" XPL wouldn't have been such a good fit
>             IMO.
>
>             My answer to your first question seems thus to be "a validation report containing at least a "yes/no"
>             answer plus ad hoc content".
>
>             My personal answer to the second question would be "that depends". On the XMLfr publication
>             process for instance, I have two kinds of validations: a RNG schema that returns errors and a
>             Schematron that validates good practices and returns warnings. If the first one could interrupt
>             the flow, the second one shouldn't do it.
>
>             If we wanted to bring that notion into Validation Management, that could mean that instead of
>             "yes/no" we could have an error level, and that when invoking a validator we could define
>             the level of the error to raise in case the validation fails.
>
>             The default behaviour could be to stop at the first error (as you've implied in the pipeline),
>             but an optional "config" input could be added that would allow specifying an error level. If the
>             error level is positive, no exception would be raised.
>
>             Now, a validator might have several outputs. What about defining several outputs for a validator:
>
>             - the data output (useful only for schema languages that, like DTDs or WXS, augment the infoset
>               with stuff such as default values).
>             - the report output with a yes/no (or level) information, error messages or (for Schematron) the
>               validation report.
>             - to that, we could add a PSVI output in the case of W3C XML Schema (assuming we had an
>               XML format for the PSVI).
>
>             When a validator is configured with a positive error level, error detection could be done
>             by checking the report output.
>
>             All these answers are personal and should be checked with the DSDL Working group.
>
>         o For simplicity, I assumed that the NVDL processor here would produce outputs with those
>           particular names. This would be possible only if the NVDL processor could be configured to
>           map those output names to namespaces. Practically, this processor could either:
>
>             o Have predefined output names, like document-1, document-2, etc.
>             o Produce a single XML document with all the streams aggregated
>
>           I do not know NVDL well enough to see what would be natural here.
>
>         (v) None of them are natural :-) ...
>
>             Right now, NVDL is for validation only and takes care of invoking the different validators
>             to return a single "yes/no" answer.
>
>             Using it to split a document as mentioned in that scenario is thus an extrapolation
>             of what the current NVDL implementation does.
>
>             However, given the fact that NVDL splits documents according to their namespaces, I wonder
>             if the aggregating streams would be different enough from the original document ;-) ...
>
>             Thus, I am wondering if predefined names wouldn't be the best solution. Maybe, instead of using
>             document-i, we could map namespace URIs to names (like they are mapped to namespace prefixes).
>
>         o I have used a single validation processor that supports W3C XML Schema, Relax NG,
>           Schematron, and DTDs (here DTDs would either have to be encapsulated into a root element,
>           or referred to externally). You could of course propose one processor per schema type. The
>           PresentationServer validation processor currently supports transparently W3C Schema and
>           Relax NG.
>
>         o I proposed using XSLT to recombine the final document in the end.
>
>         o Otherwise, the pipeline is very simple. Nothing prevents parallel execution in XPL.
>           Without exception support, the processing would just stop if there is a validation error.
>           With exception support, it could resume, locally (per branch) if needed, or just propose a
>           global fallback. Everything that is possible with exceptions.
>
>     -->
>
>     <!--
>         1. Use NVDL to split out the parts of the document that are encoded using HTML, SVG and
>         MathML from the bulk of the document, whose tags are defined using a user-defined set of
>         markup tags.
>     -->
>     <p:processor name="oxf:nvdl">
>         <p:input name="document" href="#source-document"/>
>         <p:input name="rules">
>             <rules>
>                 NVDL rules
>             </rules>
>         </p:input>
>         <p:output name="html-stream" id="html-stream"/>
>         <p:output name="svg-stream" id="html-stream"/>
>         <p:output name="mathml-stream" id="html-stream"/>
>         <!--
>
>             (v) typo: the ids should be "svg-stream" & "mathml-stream"...
>
>         -->
>         <p:output name="other-stream" id="other-stream"/>
>     </p:processor>
>
>     <!--
>         2. Validate the HTML elements and attributes using the HTML 4.0 DTD (W3C XML DTD).
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#html-stream"/>
>         <p:input name="schema">
>             <!-- Reference to DTD for HTML -->
>             <dtd href="..."/>
>         </p:input>
>         <p:output name="data" id="html-stream-validated"/>
>     </p:processor>
>
>     <!--
>         3. Use a set of Schematron rules stored in check-metadata.xml to ensure that the metadata
>         of the HTML elements defined using Dublin Core semantics conform to the information in the
>         document about the document's title and subtitle, author, encoding type, etc.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#html-stream-validated"/>
>         <!-- Reference to Schematron schema for HTML metadata -->
>         <p:input name="schema" href="check-metadata.xml"/>
>         <p:output name="data" id="html-stream-schematronized"/>
>         <!--
>
>             (v) Note that in the case of Schematron, the data output is identical to the data input.
>
>         -->
>     </p:processor>
>
>     <!--
>         4. Validate the SVG components of the file using the standard W3C schema provided in the
>         SVG 1.2 specification.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#svg-stream"/>
>         <!-- Reference to W3C Schema for SVG -->
>         <p:input name="schema" href="svg-1.2.xsd"/>
>         <p:output name="data" id="svg-stream-validated"/>
>     </p:processor>
>
>     <!--
>         5. Use the Schematron rules defined in SVG-subset.xml to ensure that the SVG file only uses
>         those features of SVG that are valid for the particular SVG viewer available to the system.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#svg-stream-validated"/>
>         <!-- Reference to Schematron schema for SVG subset -->
>         <p:input name="schema" href="SVG-subset.xml"/>
>         <p:output name="data" id="svg-stream-schmatronized"/>
>     </p:processor>
>
>     <!--
>         6. Validate the MathML components using the latest version of the MathML schema (defined
>         in RELAX-NG) to ensure that all maths fragments are valid. The schema will make use of the
>         datatype definitions in check-maths.xml to validate the contents of specific elements.
>     -->
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#mathml-stream"/>
>         <!-- Reference to Relax NG schema for MathML -->
>         <p:input name="schema" href="mathml-1.0.rng"/>
>         <p:output name="data" id="mathml-stream-validated"/>
>     </p:processor>
>
>     <!--
>         7. Use MathML-SVG.xslt to transform the MathML segments to displayable SVG and replace each
>         MathML fragment with its SVG equivalent.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#mathml-stream-validated"/>
>         <p:input name="config" href="MathML-SVG.xslt"/>
>         <p:output name="data" id="mathml-as-svg"/>
>     </p:processor>
>
>     <!--
>         8. Use the DSRL definitions in convert-mynames.xml to convert the tags in the local nameset
>         to the form that can be used to validate the remaining part of the document using
>         docbook.dtd.
>     -->
>     <p:processor name="oxf:dsrl">
>         <p:input name="data" href="#other-stream"/>
>         <p:input name="config" href="convert-mynames.xml"/>
>         <p:output name="data" id="docbook-stream"/>
>     </p:processor>
>
>     <p:processor name="oxf:validation">
>         <p:input name="data" href="#docbook-stream"/>
>         <!-- Reference to DTD Docbook -->
>         <p:input name="schema">
>             <dtd href="..."/><!-- Reference to W3C DTD -->
>         </p:input>
>         <p:output name="data" id="docbook-stream-validated"/>
>     </p:processor>
>
>     <!--
>         9. Use the CRDL rules defined in mycharacter-checks.xml to validate that the correct
>         character sets have been used for text identified as being Greek and Cyrillic.
>     -->
>     <p:processor name="oxf:crdl">
>         <p:input name="data" href="#docbook-stream-validated"/>
>         <p:input name="config" href="mycharacter-checks.xml"/>
>         <p:output name="data" id="docbook-stream-validated-2"/>
>     </p:processor>
>
>     <!--
>         10. Convert the Docbook tags to HTML so that they can be displayed in a web browser using
>         the docbook-html.xslt transformation rules.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#docbook-stream-validated-2"/>
>         <p:input name="config" href="docbook-html.xslt"/>
>         <p:output name="data" id="docbook-as-html"/>
>     </p:processor>
>
>     <!--
>         After completion of step 10 the HTML (both streams), and SVG (both streams) should be
>         recombined to produce a single stream that can be fed to a web browser.
>     -->
>     <p:processor name="oxf:xslt">
>         <p:input name="data" href="#html-stream-schematronized"/>
>         <p:input name="html-2" href="#docbook-as-html"/>
>         <p:input name="svg-1" href="#svg-stream-schmatronized"/>
>         <p:input name="svg-2" href="#mathml-as-svg"/>
>         <p:input name="config" href="stylesheet-to-aggregate-everything.xsl"/>
>         <p:output name="data" ref="result-document"/>
>     </p:processor>
>
> </p:config>
>
--
DSDL members discussion list
To unsubscribe, please send a message with the command "unsubscribe" to
dsdl-discuss-request@dsdl.org
(mailto:dsdl-discuss-request@dsdl.org?Subject=unsubscribe)
Received on Tue Feb 22 23:11:45 2005
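Eric's suggestion in the pipeline comments (an optional "config" input carrying an error level, plus a separate report output next to the data output) might look roughly like the fragment below, applied to step 3. This is only a sketch of the idea as discussed in this thread: the `config` input on `oxf:validation`, the `error-level` element and its value, the `report` output, and the id `html-metadata-report` are all assumptions made for illustration, not features of the PresentationServer validation processor described above.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

    <!-- Step 3 again, with the ASSUMED "config" input and "report" output.
         #html-stream-validated is produced by step 2 in the full pipeline. -->
    <p:processor name="oxf:validation">
        <p:input name="data" href="#html-stream-validated"/>
        <p:input name="schema" href="check-metadata.xml"/>
        <!-- Hypothetical: a positive level means "report only, raise no exception" -->
        <p:input name="config">
            <config>
                <error-level>1</error-level>
            </config>
        </p:input>
        <!-- Data output: identical to the data input in the case of Schematron -->
        <p:output name="data" id="html-stream-schematronized"/>
        <!-- Hypothetical: validation report that a later step could inspect -->
        <p:output name="report" id="html-metadata-report"/>
    </p:processor>

</p:config>
```

With something along these lines, the RNG or DTD validations could keep the default behaviour of stopping at the first error, while Schematron "good practice" checks would only feed warnings into the report output.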