Eric van der Vlist wrote:
>On Wed, 2004-04-07 at 13:39, Eric van der Vlist wrote:
>
>
>
>>In fact, I think that it's the semantic of test="pattern" when the
>>context node isn't a simple type element that should be clarified.
>>
>>
>
>Still thinking loudly about that, with Schematron/xpath, when we write:
>
><sch:rule context="foo">
> <sch:assert test="@bar > 1">
> Attribute bar should be greater than one
> </sch:assert>
></sch:rule>
>
>We use XPath to set the context node (foo) and XPath again to set the
>context on which to evaluate a test (@bar) and the test itself.
>
>With this new proposal, when we write:
>
><sch:rule context="foo">
> <sch:assert test="\p{IsBasicLatin}">
> Should be BasicLatin
> </sch:assert>
></sch:rule>
>
>We use XPath to set the context node but miss a way to set the context
>of the test "\p{IsBasicLatin}" (does it apply to attribute bar, to all
>its text children, to all its descendants, to its text value, ... ?).
>
>
Yes, it is the job of the Query Language Binding to specify how the
assertion test is applied to the subjects. So we don't need the mechanism
of your alternative, because it is taken care of by definition, in the
language binding.
In my suggested Part 7's case:
1) whitespace is stripped from the @test attribute
2) for each text node that is an immediate child of the element,
   or the value of the attribute or PI, or the contents of a comment,
   or the string value of a name {
       for each character in the string {
           check that it conforms to the pattern.
       }
   }
A way of implementing this is to wrap the stripped value of the @test in
"[" and "]*" and bingo it becomes a normal regular expression.
What is most important to understand is that we are not using regular
expressions but character classes. We don't test strings; we test each
character in a string without reference to the others in the same string.
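The wrapping trick above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name conforms_to_class is hypothetical, and because Python's standard re module does not support block escapes such as \p{IsBasicLatin}, the range \x00-\x7F is used as a stand-in for Basic Latin.

```python
import re

def conforms_to_class(test: str, value: str) -> bool:
    """Check that every character of `value` belongs to the character
    class given in `test` (hypothetical helper, not part of any spec)."""
    # Step 1: strip all whitespace from the @test value.
    stripped = "".join(test.split())
    # Wrap the stripped value in "[" and "]*" to get an ordinary regex.
    pattern = "[" + stripped + "]*"
    # fullmatch is needed: "[...]*" would otherwise match an empty prefix.
    return re.fullmatch(pattern, value) is not None

# Stand-in for \p{IsBasicLatin}, which the stdlib `re` module lacks.
BASIC_LATIN = r"\x00-\x7F"

print(conforms_to_class(BASIC_LATIN, "plain ASCII text"))
print(conforms_to_class(BASIC_LATIN, "h\u00e9llo"))
```

Because the wrapped pattern is a single repeated character class, each character is accepted or rejected on its own, which is the point made above about not testing strings as a whole.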
>In fact, we would need an additional XPath function "match(nodeset,
>pattern)" or "match(string, pattern)" rather than a pattern only.
>
>
I originally worked on the idea that the test would work recursively on
all the text nodes of its descendants, but I rejected this because, as you
point out, we would then need all sorts of extra complicated infrastructure
to special-case things. Worse, we would either have to make up our own
syntaxes for tests, or rely on the features of particular query languages
(XQuery).
Instead, Schematron opts for a three-level model:
1) Rules select subjects (information items, nodes, whatever)
2) Language bindings specify how to go from subjects to the
   objects that the assertions can test (here, from node to string)
3) Assertion tests check against those objects.
And the idiom *[ancestor-or-self::x] is perfectly good enough to match
unknown elements without laboriously enumerating them. And because a node
only matches once per pattern, you can simply put special cases in an
earlier rule.
I originally had a more complicated syntax for tests, but I have not yet
found a single case where it is needed.
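To make the three-level model and the "earlier rule wins" point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: plain Python predicates stand in for the XPath context expressions, the names subject_strings and check are hypothetical, only immediate text children of elements are handled, and \x00-\x7F again stands in for \p{IsBasicLatin}.

```python
import re
import xml.etree.ElementTree as ET

def subject_strings(elem):
    """Level 2 (binding): go from an element to the strings to test,
    here only the text nodes that are immediate children."""
    strings = []
    if elem.text:
        strings.append(elem.text)
    for child in elem:
        if child.tail:  # text following a child is still our text node
            strings.append(child.tail)
    return strings

def check(root, rules):
    """rules is a list of (selects, char_class, message) tuples.
    Level 1: `selects` picks the subjects. Level 3: the class is tested."""
    failures = []
    for elem in root.iter():
        for selects, char_class, message in rules:
            if selects(elem):
                regex = re.compile("[" + char_class + "]*")
                for s in subject_strings(elem):
                    if not regex.fullmatch(s):
                        failures.append((elem.tag, message))
                break  # a node matches only once; later rules are skipped
    return failures

# The special case goes first, the catch-all rule last.
rules = [
    (lambda e: e.tag == "code", "a-z", "code should be lowercase a-z"),
    (lambda e: True, r"\x00-\x7F", "Should be BasicLatin"),
]
doc = ET.fromstring("<doc><code>ABC</code><p>h\u00e9llo</p><p>ok</p></doc>")
failures = check(doc, rules)
print(failures)
```

The code element is caught by the first rule and never reaches the catch-all, so the special case needs no extra syntax in the test itself.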
>That being said, both XPath 2.0 and STX have such functions, see
> * http://www.w3.org/TR/xquery-operators/#func-matches
> * http://stx.sourceforge.net/documents/#string-functions
>
>In other words, I think that you've proved that we can do character
>repertoire validation using Schematron hosting either XPath 2.0 or STX.
>
>
>Now, Schematron hosting either XPath 2.0 or STX can do much more than
>character repertoire validation and it might be considered as overly
>complex (and costly) to do so when a simple language that could be
>implemented as a trivial SAX filter could meet most of the use cases.
>
>Eric
>
>
I think the point is that we want to allow implementers to piggyback on
existing libraries as much as possible: the thing I think we need to avoid
is NIH (not-invented-here) when there is so much low-hanging fruit around.
Cheers
Rick Jelliffe
Received on Thu Apr 8 11:50:14 2004