A little bird told me that people are interested in Bob Lyon's paper
at Extreme:
http://www.extrememarkup.com/extreme/2003/thursday.asp#Thursday430-1
It is indeed a worthwhile issue to consider. I have been aware of Robert's
interest in this area for more than six months (he published his Schematron
analysis quite a while ago and I worked through it then), so no one should
panic and regard it as some new thing that has not been considered. Any
formal analysis is very welcome: it helps us understand the nature of the
beast.
Please note that, on a related issue, probably any grammar-based schema
language that allows fixed keys (such as ID #REQUIRED) is also NP-hard. This
means DTDs and W3C XML Schemas. See
http://lists.xml.org/archives/xml-dev/200303/msg00839.html
In the case of Schematron, my response is:
1) Schematron is being defined as a framework: if the query language used
is hard or NP then the Schematron schemas using that language will be too.
So the general statement "it is not possible to write a general-purpose
program which can decide, for any Schematron schema, whether a conforming
instance can be created." is factually not true, because a Schematron schema
may use another query language. The analysis only holds for the default use
of XSLT: I wish Bob would make this distinction.
2) Schematron's default of XSLT is a catch-all system. It is the specific
intent of Schematron to provide a way to express constraints that other
systems cannot; for example, the ZVON site for Schematron quotes me:
"Schematron is a feather duster that reaches areas other schema languages
cannot." I have always marketed Schematron in this fashion, and I believe
that is the very thing that makes it popular among its users. Hence it is
appropriate that a general-purpose query language be used. It is a property
of low-level general-purpose languages that they cannot be proved, but it is
a cost that is entirely appropriate.
3) Bob's analysis needs to be carefully understood. Let us look at the
statement "it is not possible to write a general-purpose program which can
decide, for any Schematron[sic] schema, whether a conforming instance can be
created." The "any" there is *not* used in the plain English sense, but in
the narrow technical sense (i.e. "all"). It is entirely possible to look at
any particular Schematron schema (using XSLT) and say "a document can be
generated that satisfies this": it is the reverse that is difficult. Let us
look at this schema:
<schema ...>
  <pattern>
    <rule context="x">
      <assert test="y">An X should have a y</assert>
    </rule>
  </pattern>
</schema>
To say that one cannot generate a document to satisfy this is complete crap.
But it is *not* what Bob is saying.
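To make the point concrete, here is a minimal Python sketch (mine, not part
of Bob's analysis) that checks a candidate document against the one rule in
the schema above. The function name conforms is hypothetical, and the check
is hand-written for this single rule rather than produced by any Schematron
implementation.

```python
import xml.etree.ElementTree as ET


def conforms(document: str) -> bool:
    """True if every x element in the document has at least one y child."""
    root = ET.fromstring(document)
    # iter("x") walks the whole tree, including the root itself,
    # mirroring how a rule context of "x" matches x at any depth.
    return all(x.find("y") is not None for x in root.iter("x"))


if __name__ == "__main__":
    print(conforms("<x><y/></x>"))  # a conforming instance
    print(conforms("<x/>"))         # the assertion fails here
```

Writing down one conforming instance for one given schema is trivial; the
hard question is deciding it for all schemas.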
It is completely possible to derive a subset of XPath expressions and then
to test whether these are feasible or not. For example, if only the child
axis were used with no functions, then the schema can be represented using a
simple logic language and solved (e.g. using Prolog). As in the case of XML
Schemas, it is certain features that are pathological: it is certainly
possible that someone can create a schema checker that first checks whether
there are pathological features or provably difficult constructs, then tests
the schema if there are none or warns the user otherwise.
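The first stage of such a checker might look like the following Python
sketch. It is only an illustration under stated assumptions: the "safe"
subset is taken to be unprefixed element names joined by the child axis, the
names SIMPLE_PATH and classify are hypothetical, and a regular expression
stands in for a real XPath parser.

```python
import re

# Assumed "safe" subset: plain element names separated by "/", i.e. the
# child axis only, with no functions, predicates, operators or other axes.
SIMPLE_PATH = re.compile(r"[A-Za-z_][\w.-]*(?:/[A-Za-z_][\w.-]*)*")


def classify(expressions):
    """Split XPath strings into (inside the subset, needs a warning)."""
    inside, warn = [], []
    for expr in expressions:
        if SIMPLE_PATH.fullmatch(expr.strip()):
            inside.append(expr)
        else:
            warn.append(expr)
    return inside, warn


if __name__ == "__main__":
    inside, warn = classify(["y", "a/b", "count(y) > 1", "ancestor::x"])
    print("can be tested:", inside)
    print("warn the user:", warn)
```

Only the expressions in the first list would then be handed on to the logic
solver; anything else gets a warning rather than a verdict.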
I don't see this as being in any way different from RELAX NG and data
binding: it is possible to define a testable subset of RELAX NG that can be
used for data binding. Similarly, it is like the furphy (irrelevant story)
that any schema language needs to be able to generate user interfaces: the
presence of ANY complicates that to the point that the requirement becomes
irrelevant.
Schema languages have many properties that are nice, and it is more
important that we support a range of languages with different properties and
get a wide coverage than that we fix on any one property in particular to
the exclusion of others.
That being said, I do hope that Bob's work will have an impact on the
development of new schema languages with various desirable properties. (But
that is far off. Certainly it is irrelevant for RELAX NG and Schematron, but
it is something that we should discuss for VCSL and any new languages that
come up, IMHO.)
Cheers
Rick Jelliffe
P.S. I apologise to anyone who received a virus-infected message from me.
The virus was only active for 2 seconds but sent a lot of email. I have
moved over to Linux now; enough is enough.
Received on Fri Aug 8 12:55:02 2003