[dsdl-discuss] Re: Emailing: dsdl-8.pdf

From: MURATA Makoto <murata@hokkaido.email.ne.jp>
Date: Mon Apr 30 2007 - 09:28:24 UTC

Martin,

> >XML has two types of entities: parsed and unparsed. It should be made
> clear that unparsed entities are outside the scope of DSRL.
>
> The DOM Level 2 spec specifically states:
> "Interface Entity This interface represents an entity, either parsed or
> unparsed, in an XML document. "
>
> The DSRL processing model states as its first step:
> "Parse the source document to create a Document Object Model (DOM Level 2)
> stream that contains entity reference nodes"
>
> This to me clearly implies that references to unparsed entities will be
> available for processing within DOM (though not for renaming directly
> unfortunately!)

I am very confused. First, can entity renaming as specified in Part 8
handle renaming of unparsed entities? This should be very clearly
stated. If it is covered, the definition in 3.2 is confusing, since
it says "general".

> The XDM model specifically states:
> Because the data model requires that all general entities be expanded, there
> will never be unexpanded entity reference information item children.
>
> Therefore XDM cannot be used as the data model for DSRL processing any more
> than the XML Information Set can.

Please very clearly state this in the beginning of this part!

> The problem I have is that the most popular XSLT processor, Saxon, supports
> DOM Level 2. I used Saxon 8 to implement the options in Annex B. I will
> state this specifically in the text

Yes, this is highly confusing to me. Since Annex B relies on XSLT2, I
immediately started to look for the definition of entities in XDM.

> The DOM Level 2 spec published by ISO goes on to state, as recorded in the
> note associated with the processing model:
>
> >XML does not mandate that a non-validating XML processor read and process
> entity declarations made in the external subset or declared in external
> parameter entities. This means that parsed entities declared in the external
> subset need not be expanded by some classes of applications, and that the
> replacement value of the entity may not be available. When the replacement
> value is available, the corresponding Entity node's child list represents
> the structure of that replacement text. Otherwise, the child list is empty.

I do not believe that XML parsers are obliged to provide replacement
texts for all parsed entities.

> >Furthermore, parsed entities are either internal or external. In the
> definition of the XML recommendation, references to internal parsed
> entities are always recognized whenever they are syntactically recognized
> and the entity replacement texts are included. It should be made
> clear that internal parsed entities are also outside the scope of DSRL.
>
> Why? If the DOM records them why cannot they be referred to and processed?

You restrict the scope of this part to those XML parsers which handle DOM Level 3
streams. DSRL as defined in this part per se cannot be implemented on those XML
parsers which handle the other data models such as XDM or the XML
Information Set. This limitation should be very clearly stated in the
beginning of this part.

However, even when we impose this very severe limitation, the DOM does
not always record entity references. DOM Level 3 Core clearly says:

        "Moreover, the XML processor may completely expand references to
        entities while building the Document, instead of providing
        EntityReference nodes"

Thus, DSRL is implementable on only those XML parsers which
handle DOM Level 3 streams, preserve entity references, and provide
replacement text for entities. Which XML parser satisfies these
requirements??
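
As one data point, here is a minimal sketch in Python. It is not taken
from the draft: the sample document, the entity name "greeting", and the
helper node_types are made up for illustration. It parses a document that
references an internal parsed entity with the standard-library
xml.dom.minidom and collects the node types found in the resulting tree.
With this particular parser, the reference is expanded into text and no
EntityReference node is produced; other DOM implementations may behave
differently.

```python
from xml.dom import Node, minidom

# Hypothetical sample: one internal parsed entity, referenced once.
SOURCE = (
    '<!DOCTYPE doc [<!ENTITY greeting "Hello">]>'
    "<doc>&greeting;, world</doc>"
)


def node_types(node):
    """Return the nodeType of every node in the subtree rooted at node."""
    found = [node.nodeType]
    for child in node.childNodes:
        found.extend(node_types(child))
    return found


doc = minidom.parseString(SOURCE)
types = node_types(doc.documentElement)

# Concatenate the character data directly under the document element.
text = "".join(
    child.data
    for child in doc.documentElement.childNodes
    if child.nodeType == Node.TEXT_NODE
)

print("entity reference nodes present:", Node.ENTITY_REFERENCE_NODE in types)
print("text content:", text)
```

A DSRL implementation layered on such a parser has no reference left to
rename by the time it sees the tree.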

> >I do not understand "the corresponding Entity node's child list within
> the DOM stream" in the first note in Section 4.
>
> This text is copied directly from the W3C DOM Level 2 text.

You are talking about the following paragraph.

        XML does not mandate that a non-validating XML processor read
        and process entity declarations made in the external subset or
        declared in external parameter entities. This means
        that parsed entities declared in the external subset need not be
        expanded by some classes of applications, and that the
        replacement text of the entity may not be available. When the
        replacement text is available, the corresponding Entity node's
        child list represents the structure of that replacement text.
        Otherwise, the child list is empty.

First, it should be clearly stated that "Entity" in your text refers to
the interface "Entity" as defined in DOM Level 2 and 3. Without this
clarification, this text is meaningless.

I would still argue that all XML parsers expand references to internal
parsed entities and that no DOM Level 3 streams preserve such references.
In my opinion, the first sentence in the above quotation justifies my
interpretation.

> >Who creates the file "MappedEntities.ent" shown in the Annex B? Is this
> created by some program that implements the second step in the
> processing model in Section 4?
>
> The contents of MappedEntities.ent are created as part of the process of
> converting the DSRL map into an XSLT for converting the instance. See the
> three xsl:output declarations and the definition for the parameter
> MappedEntities on page 15 and subsequent references to the mapped-entities
> files on p25.

The contents of MappedEntities.ent created from the DSRL example on page
29 should be shown in Appendix B.

>
> >I believe that the only way to implement the second step in the
> processing model is to modify existing XML processors. This step is not
> impossible, but I strongly doubt that it has been implemented. For this reason,
> Japan has always cast a negative vote on DSRL. At this stage of the
> game, however, I would like to suggest a compromise. The second step
> should be optional: conformant implementations are allowed to ignore the
> second step.
>
> All steps are optional. This is why the text explicitly states immediately
> after the processing model:
> "This processing model in no way constrains how a particular application
> should implement DSRL."

I am afraid that I was not clear. I would like to make entity renaming
and entity definitions of DSRL optional. In other words, a DSDL
processor should be allowed to ignore all dsrl:entity-name-map and
dsrl:define-entity elements.
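
To illustrate what I mean by "ignore", here is a rough sketch in Python.
It is only an illustration under stated assumptions: the namespace URI,
the wrapper element dsrl:maps, the sibling dsrl:element-map, and the
function name drop_entity_features are all my assumptions, not text from
the draft. It removes every dsrl:entity-name-map and dsrl:define-entity
element from a map before any further processing, so the rest of the map
is handled as usual.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for DSRL; check it against the draft before use.
DSRL_NS = "http://purl.oclc.org/dsdl/dsrl"

# The two element types a processor would be allowed to ignore.
IGNORED = {
    f"{{{DSRL_NS}}}entity-name-map",
    f"{{{DSRL_NS}}}define-entity",
}

# Hypothetical, heavily simplified map used only to exercise the function.
SAMPLE_MAP = (
    f'<dsrl:maps xmlns:dsrl="{DSRL_NS}">'
    "<dsrl:entity-name-map/>"
    "<dsrl:element-map/>"
    "<dsrl:define-entity/>"
    "</dsrl:maps>"
)


def drop_entity_features(map_xml):
    """Parse a DSRL map and remove the entity-related elements from it."""
    root = ET.fromstring(map_xml)
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in IGNORED:
                parent.remove(child)
    return root


remaining = [child.tag for child in drop_entity_features(SAMPLE_MAP)]
print(remaining)
```

A processor built this way never needs access to entity references in the
instance, so it can sit on top of XDM or the XML Information Set.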

> An application could equally well undertake part of this process, the
> renaming of existing entities, by directly modifying the DOM stream as . But
> if they do so there is no way for them to process any entities defined in
> the DSRL map. For these I see no option other than to create an entity set
> that can be referenced as an external entity set from the document instance.
> I can't see how I can safely make the second step optional in the sense that
> it does not need to be complied with at all.

You can say that the processing model works only when the underlying XML
processors preserve entity references and provide replacement text for
all parsed entities.

> >I do not understand how the XSLT 2.0 stylesheet shown in Appendix B
> recognizes entity references. Section 4 of XSLT 2.0 clearly says:
>
> >Features of a source XML document that are not represented in the XDM tree
> will have no effect on the operation of an XSLT stylesheet.
> Examples of such features are entity references, CDATA sections,
> character references, whitespace within element tags, and the
> choice of single or double quotes around attribute values.
>
> See my notes above. What is important to me is what Saxon recognizes, not
> what XSLT 2.0 spec gratuitously insists on by forcing a specific processing
> model on XSLT 2.0 developers. I must admit I have not tried the application
> using Saxon 8.9, which may be XDM compliant. It still provides a DOM
> interface, which as far as I can tell is still Level 2 compliant. As XSLT
> 2.0 is only referenced, at your insistence, in the informative annex, I see
> no reason why I have to be restricted to using XDM as the processing model
> when using it!

As long as this complex situation is very clearly explained, I will not
argue against the use of XSLT2 in this non-normative appendix.

Cheers,

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>
Received on Mon Apr 30 11:28:29 2007