Martin,
Martin Bryan wrote:
>Eric
>
>>I'd say that, more generally, any technique such as JITTs
>>(http://sbl-site2.org/Extreme2002/) which can build several XML infosets
>>through transformation of a source document should be easy to integrate
>>using our Validation Management (ex framework).
>>Should we add this as a use case?
>
>While I think this is a useful case to be added to the list, I don't think
>it's the same case as Alex is talking about. Remember that for applications
>such as loose-leaf publication you need to keep two concurrent structures,
>one for the logical data and one for the presented data. In this case what
>we need is to be able to take two subsets of information out of a single
>dataset, without any transformation. OK, so we can do this by assigning
>separate namespaces to the different datamodels, but what happens if these
>two datamodels do not overlap cleanly? SGML had an answer to this problem.
>XML does not have an answer, and neither does RELAX NG. What we really need
>is a mechanism that says "ignore any markup in namespace X and parse against
>my first model, then ignore any markup in namespace Y and parse against my
>second model".
>
Actually, I think JITTs not only answers Alex's case but also avoids
having to change XML syntax (something James and, I am sure, others would
be reluctant to do) or even use namespaces as originally envisioned by 8879.
Rather than change the syntax, JITTs proposes changing the way we
process that syntax. The file, using standard XML syntax, presents what
are known as overlapping hierarchies. In parsing the file, a schema or
DTD that represents only one of the hierarchies in the document is
passed to the parser, which then recognizes as markup only that markup
specified in the schema or DTD. During the parsing process, unrecognized
markup could be discarded, or even retained in the file, if you wanted
to build a DOM Lite tree that only fired element events to a certain
depth in the file. In the latter case, upon retrieval of a portion of
the document, the previously unrecognized markup could be recognized for
rendering purposes. (In simulation, processing multiple copies of
Bosak's encoding of Romeo and Juliet by such a method made processing
of large XML files 25-35 times quicker.)
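
To make the processing model above concrete, here is a minimal sketch in Python. It is not the JITTs implementation: the names jitt_events, build_tree and SAMPLE are made up for illustration, the "schema" is reduced to a bare set of element names, and the regex tokenizer ignores comments, CDATA sections, PIs and ">" inside attribute values. It only shows the idea that markup named in the schema is recognized, while everything else is either discarded or kept as character data.

```python
import re

# Hypothetical tokenizer: matches start, end and empty-element tags only.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")

# Two overlapping hierarchies in one file: pages (presentation) and paras (logical).
SAMPLE = "<page><para>To be, or not</page><page> to be</para><para>That is</para></page>"


def jitt_events(text, recognized, keep_unrecognized=False):
    """Yield ('start' | 'end' | 'text', value) events for tags named in `recognized`.

    Tags not in `recognized` are dropped, or kept verbatim inside the
    surrounding text when keep_unrecognized is True.
    """
    pos = 0
    buf = []
    for m in TAG.finditer(text):
        buf.append(text[pos:m.start()])
        pos = m.end()
        closing, name, empty = m.groups()
        if name in recognized:
            chunk = "".join(buf)
            buf = []
            if chunk:
                yield ("text", chunk)
            if closing:
                yield ("end", name)
            else:
                yield ("start", name)
                if empty:
                    yield ("end", name)
        elif keep_unrecognized:
            buf.append(m.group(0))  # retain the tag as plain character data
    buf.append(text[pos:])
    chunk = "".join(buf)
    if chunk:
        yield ("text", chunk)


def build_tree(events):
    """Build nested (name, children) tuples; raise ValueError on bad nesting."""
    root = ("#root", [])
    stack = [root]
    for kind, value in events:
        if kind == "start":
            node = (value, [])
            stack[-1][1].append(node)
            stack.append(node)
        elif kind == "end":
            if len(stack) == 1 or stack[-1][0] != value:
                raise ValueError("unexpected end tag: " + value)
            stack.pop()
        else:
            stack[-1][1].append(value)
    if len(stack) != 1:
        raise ValueError("unclosed element: " + stack[-1][0])
    return root[1]


if __name__ == "__main__":
    print(build_tree(jitt_events(SAMPLE, {"para"})))   # logical view
    print(build_tree(jitt_events(SAMPLE, {"page"})))   # presentation view
    print(build_tree(jitt_events(SAMPLE, {"page"}, keep_unrecognized=True)))
```

Each single-hierarchy view builds a clean tree from the same file. Passing both names at once, as a conventional parser in effect does, fails with a nesting error.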

This approach does not require the use of namespaces, although there are
cases where that would be helpful. Consider the case where the file
presents nested <div> elements, for example. If they were in different
namespaces, as understood in 8879, that would be trivial to process with
a JITTs parser. If they are in the same namespace, the parser would need
to track its location in the unrecognized portion of the tree.
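
One possible reading of "tracking its location" for same-name <div> elements is a simple depth counter, sketched below in Python. This is an assumption on my part rather than a description of any existing JITTs parser: the function name shallow_div_events and the max_depth parameter are hypothetical, the <div> elements are assumed to nest properly among themselves, and empty-element tags such as <div/> are not handled. Divs deeper than max_depth are left in the text as unrecognized markup, but the counter still follows them so that each end tag is paired with the right start tag.

```python
import re

# Hypothetical matcher for <div ...> and </div> tags only.
DIV = re.compile(r"<(/?)div\b[^<>]*>")


def shallow_div_events(text, max_depth=1):
    """Yield events for <div> elements no deeper than max_depth.

    Deeper <div> tags stay in the text untouched, but the depth counter
    still tracks them, so the parser always knows which </div> closes a
    recognized element and which belongs to the unrecognized portion.
    """
    depth = 0
    pos = 0
    buf = []
    for m in DIV.finditer(text):
        buf.append(text[pos:m.start()])
        pos = m.end()
        closing = bool(m.group(1))
        if closing:
            recognized = depth <= max_depth  # depth of the element being closed
            depth -= 1
        else:
            depth += 1
            recognized = depth <= max_depth  # depth of the element being opened
        if recognized:
            chunk = "".join(buf)
            buf = []
            if chunk:
                yield ("text", chunk)
            yield ("end" if closing else "start", "div")
        else:
            buf.append(m.group(0))  # keep the unrecognized tag as character data
    buf.append(text[pos:])
    chunk = "".join(buf)
    if chunk:
        yield ("text", chunk)


if __name__ == "__main__":
    doc = "<div>a<div>b</div>c</div>"
    for event in shallow_div_events(doc, max_depth=1):
        print(event)
```

With the divs in different namespaces none of this bookkeeping would be needed, since the two sets of tags could be told apart by name alone.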

It should be noted that a JITTs parsing model may also hold the
answer to robust versioning of XML documents.

I think that the JITTs parsing model would offer substantial advantages
over the standard XML parsing that is in vogue at the W3C. A robust
solution to the versioning problem alone, to say nothing of being able
to partially validate files (as opposed to stopping at every error),
would be of substantial commercial interest.

At most, it would require a different PI for files to indicate that a
JITTs parser and a DTD/schema are required to parse the file.

Hope you are having a great day!

Patrick

--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau@sbl-site.org
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!

Received on Wed Jun 4 15:24:53 2003