[dsdl-discuss] Re: Potential use cases for complex value validation from ebXML

From: Martin Bryan <mtbryan@sgml.u-net.com>
Date: Tue Jun 18 2002 - 09:52:18 UTC

Rick

>The issue of standard controlled vocabulary is interesting, because if the
controlled vocabulary is dynamic, there is more chance that some schemas
will not be updated and therefore fail spuriously. This suggests to me
that there might be good value in adding some notion of "date of issue"
to controlled vocabularies, so that when validation reports that
there is no such currency as the Euro, the user/developer can immediately
see that it is because of using an old list.
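
A minimal sketch of the "date of issue" idea in Python. The class name, the
sample codes, and the 1998-01-01 issue date are all hypothetical, chosen only
to illustrate how a failure message could carry the age of the list:

```python
from datetime import date


class DatedCodeList:
    """A controlled vocabulary that remembers when it was issued."""

    def __init__(self, name, issued, codes):
        self.name = name
        self.issued = issued          # date of issue of this copy of the list
        self.codes = frozenset(codes)

    def check(self, code):
        """Return None if `code` is valid, else a message naming the issue date."""
        if code in self.codes:
            return None
        return (f"'{code}' is not in {self.name} "
                f"(list issued {self.issued.isoformat()})")


# Hypothetical stale currency list: a small sample without the Euro.
old_currencies = DatedCodeList("currency codes", date(1998, 1, 1),
                               {"DEM", "FRF", "USD"})

print(old_currencies.check("EUR"))
```

With the issue date in the message, a reader can tell a stale list from a
genuinely invalid value.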

The question arises as to whether an enumeration is actually a datatype, or
whether it sits in a different validation phase. Do enumeration schemes need
to be datatype validated, or can we presume that validation of entries is
the responsibility of the list manager? One of the problems we encountered
in ebXML was the fact that you might need to add new values to a listing
while waiting for updates to be approved. So you need a mechanism where a
local enumeration list includes an externally defined one. You also need to
be able to subset enumeration lists. The example I am forever quoting here
is the need to identify the current list of members in the European Union
using ISO 3166 codes. At present we have some additional countries that have
some or all of the EU rights while they bid for accession. Therefore for
certain applications at the European Commission they need to validate
whether the country code is a) a valid EU country b) a currently
acknowledged accession country c) a country, like Switzerland, that has
neither status but has special arrangements within the field under study
(e.g. within IST research projects). So we need to include a subset of ISO
3166 and add two lists that are maintained independently but derived from
ISO 3166 to this subset to create our enumeration, without having to define
any entries per se as all are "standardized country identifiers".
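
As a rough sketch of that composition in Python: every list below is derived
from a base list and may not introduce codes of its own. The function names are
hypothetical and the country sets are small illustrative samples, not the real
ISO 3166 or EU lists.

```python
# Illustrative sample only, not the full ISO 3166 two-letter list.
ISO_3166_SAMPLE = frozenset({"CH", "DE", "FR", "HU", "PL", "US"})


def derive(base, codes):
    """Return a list derived from `base`; reject codes that `base` does not define."""
    codes = frozenset(codes)
    unknown = codes - base
    if unknown:
        raise ValueError(f"not defined in base list: {sorted(unknown)}")
    return codes


# Three independently maintained lists, each a subset of the base list.
eu_members = derive(ISO_3166_SAMPLE, {"DE", "FR"})
accession = derive(ISO_3166_SAMPLE, {"HU", "PL"})
special = derive(ISO_3166_SAMPLE, {"CH"})

# The application's enumeration is the combination of the three lists.
ELIGIBLE = {"member": eu_members, "accession": accession, "special": special}


def classify(code):
    """Return the status of `code`, or None if no included list contains it."""
    for status, codes in ELIGIBLE.items():
        if code in codes:
            return status
    return None


print(classify("FR"), classify("PL"), classify("CH"), classify("US"))
```

Because `derive` refuses anything the base list lacks, none of the three lists
defines an entry per se; each only selects from the standardized identifiers.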

>ISO 3166 is causing headache because of the 2 letter and 3 letter forms.
If we can cope with that (two lexical forms, one value space!) we would be
ahead.

I don't see this as a problem. They are two separately maintained lists.
What we need, as in the example above, is the ability to create a list that
includes the two lists.
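
One way to picture a list that includes both lists, sketched in Python. The
three countries are a sample only, and mapping both forms onto a country name
is an assumption made here to show the shared value space, not something ISO
3166 prescribes:

```python
# Two separately maintained sample lists (not the full ISO 3166 tables).
ALPHA2 = {"CH": "Switzerland", "DE": "Germany", "FR": "France"}
ALPHA3 = {"CHE": "Switzerland", "DEU": "Germany", "FRA": "France"}

# The combined enumeration simply includes both lists.
COMBINED = {**ALPHA2, **ALPHA3}


def is_valid_country(code):
    """Accept either the two-letter or the three-letter form."""
    return code in COMBINED


def same_country(a, b):
    """True when two valid codes, in either form, name the same country."""
    return a in COMBINED and b in COMBINED and COMBINED[a] == COMBINED[b]


print(is_valid_country("FR"), is_valid_country("FRA"), same_country("FR", "FRA"))
```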

>It would be great if we could have some way to validate that
IDs are unique within a collection of documents.

Is this a real world problem, or should we simply namespace IDs with the URL
of the document containing them to create a UUID of the form
http://www.mysite.org/mydoc.xml?id[abc] which, after all, is the XPath
needed to identify the ID?
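
A small Python sketch of that qualification, using the standard library's
xml.etree.ElementTree. It assumes the ID attribute is literally named "id"
(no DTD is consulted), and the second document URL is hypothetical:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def qualified_ids(doc_url, xml_text, id_attr="id"):
    """Prefix every local ID with the URL of the document that contains it."""
    root = ET.fromstring(xml_text)
    return [f"{doc_url}?id[{el.get(id_attr)}]"
            for el in root.iter()
            if el.get(id_attr) is not None]


def duplicates(docs):
    """Return qualified IDs that occur more than once across `docs` (URL -> XML text)."""
    counts = Counter(q for url, text in docs.items()
                     for q in qualified_ids(url, text))
    return sorted(q for q, n in counts.items() if n > 1)


docs = {
    "http://www.mysite.org/mydoc.xml": '<a id="abc"><b id="x"/></a>',
    "http://www.mysite.org/other.xml": '<a id="abc"/>',  # hypothetical document
}
print(qualified_ids("http://www.mysite.org/mydoc.xml",
                    docs["http://www.mysite.org/mydoc.xml"]))
print(duplicates(docs))
```

Both documents use the local ID "abc", yet the qualified forms differ, so the
only clashes left to detect are repeats inside a single document.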

>ebXML also has some notion of resource bundles for i18n:
the value used is a key to a locale-specific list (i.e. of strings).
This is a kind of one-to-many link that can appear in
multiple values (so is a many-to-one-to-many link). It might
be useful to provide something to support this kind of thing.

ebXML have not thought this out clearly. XML already has xml:lang, which
should be used to qualify strings with language info. Should we really be
separating out language and locale? We will need to watch how OWL does this,
and compare it with the topic map approach of multiple labels for nodes.

Martin Bryan

Received on Tue Jun 18 07:43:37 2002
