[dsdl-discuss] Re: Papers for Philadelphia meeting

From: Martin Bryan <martin@is-thought.co.uk>
Date: Thu Oct 30 2003 - 09:03:05 UTC

The UK would like to introduce Jeni Tennison's paper entitled Datatype
Library Language (http://www.jenitennison.com/datatypes/) for discussion as
input for Part 5.

Jeni has had a number of suggestions for improvements to this paper, but has
not had time to incorporate them in her document yet. Things she thinks
should be changed include:

1. As I had it, when you parse a string into a value, the value is
represented in an XML structure. I now think a simpler approach
would be better, in which structured datatypes are defined as
consisting of one or more components, each with a name and a type. The
values of the components might themselves be of a structured datatype,
so you'd get a kind of nested structure that way.

Components could be arranged in order to determine how to do
comparisons between values. Defining a datatype in terms of the
components from which it's made up gives a more obvious division between
a value and a lexical representation. In effect, populating the
components of a value (by parsing a string) is a process of
normalisation. All comparisons and so on are done on the (components
of the) value. Applications that deal with values would have to keep
around both the original lexical representation and the value
(components).
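
A minimal sketch of this idea in Python, not taken from Jeni's paper. The
names DateValue, ParsedDate and parse_date are hypothetical, and a date is
used only as a convenient example of a structured datatype. The order of
the components determines comparison, and the parsed result keeps both the
original lexical representation and the value:

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class DateValue:
    # Hypothetical structured value: named, typed components.
    # With order=True, comparison follows the field order year, month, day.
    year: int
    month: int
    day: int


@dataclass(frozen=True)
class ParsedDate:
    # Keeps the original string alongside the normalised value.
    lexical: str
    value: DateValue


def parse_date(lexical: str) -> ParsedDate:
    # Populating the components is the normalisation step:
    # surrounding whitespace and leading zeros do not survive in the value.
    year, month, day = (int(part) for part in lexical.strip().split("-"))
    return ParsedDate(lexical, DateValue(year, month, day))
```

Here parse_date("2003-1-5") and parse_date("2003-01-05") carry different
lexical representations but equal values, and all comparisons are made on
the value only.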

2. As I had it, the regular expression used to parse the string was
represented in an XML structure. I now think it should use a more
normal regular expression syntax, extended to support naming the
subexpressions within the regex so that they can be referred to in
order to work out the values (of the components in structured
datatypes). Also, I don't think that the reuse of named patterns is a
real requirement when using a normal string-based regular expression
syntax, so that part of the language would go away.
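
As an illustration only, Python's re module already offers a string-based
regular expression syntax with named subexpressions, which is roughly the
kind of extension described here. The pattern and the parse_components
function below are assumptions made for the example, not the syntax
proposed for the datatype library language:

```python
import re
from typing import Dict

# Each named group corresponds to one component of the structured value.
DATE_PATTERN = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_components(lexical: str) -> Dict[str, int]:
    # fullmatch rejects strings with anything before or after the pattern.
    match = DATE_PATTERN.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not a valid lexical representation: {lexical!r}")
    # The group names are used to work out the component values.
    return {name: int(text) for name, text in match.groupdict().items()}
```

With this sketch, parse_components("2003-10-30") gives the components
year, month and day by name, without any separate XML description of the
pattern.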

3. I did have two kinds of 'inheritance' between datatypes: a 1:many
supertype:subtype relationship and the possibility of 'using'
validation rules and so on from other datatypes despite not sharing
the same lexical representation. I now think that these should be
combined into a many:many inheritance mechanism in which by default a
subtype inherits the same components and lexical representation as its
supertype, but that technically all a subtype has to do is to define
how values are cast to/from its supertype.

I've also been thinking about the requirement for abstract types,
which don't have lexical representations themselves, but do have a
value specification. Their subtypes will usually specify different
ways to get from a lexical representation to a particular value.
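
The abstract-type idea might be sketched as follows in Python. This is a
loose analogy, not the proposed mechanism: the Length, Millimetres and
Centimetres types are invented for the example, and Python class
inheritance stands in for the casting relationship between a subtype and
its supertype. The abstract type specifies the value but has no lexical
representation of its own; each subtype supplies a different way to get
from a string to that value:

```python
from abc import ABC, abstractmethod


class Length(ABC):
    """Hypothetical abstract type: a value specification only."""

    def __init__(self, millimetres: int) -> None:
        self.millimetres = millimetres

    @classmethod
    @abstractmethod
    def parse(cls, lexical: str) -> "Length":
        """Each subtype defines its own lexical representation."""

    def __eq__(self, other: object) -> bool:
        # Comparison is defined once, on the shared value.
        return isinstance(other, Length) and self.millimetres == other.millimetres

    def __hash__(self) -> int:
        return hash(self.millimetres)


class Millimetres(Length):
    @classmethod
    def parse(cls, lexical: str) -> "Millimetres":
        if not lexical.endswith("mm"):
            raise ValueError(f"expected a value such as '1500mm': {lexical!r}")
        return cls(int(lexical[:-2]))


class Centimetres(Length):
    @classmethod
    def parse(cls, lexical: str) -> "Centimetres":
        if not lexical.endswith("cm"):
            raise ValueError(f"expected a value such as '150cm': {lexical!r}")
        # The conversion here plays the role of the cast to the supertype's value.
        return cls(int(lexical[:-2]) * 10)
```

In this sketch Centimetres.parse("150cm") and Millimetres.parse("1500mm")
compare equal, because comparison happens on the value defined by the
abstract type rather than on either lexical representation.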

4. The parameterising mechanism that I'd put together was pretty
limited, and subsequent discussion showed that there are lots of other
ways of viewing what parameters might do. See
http://lists.usefulinc.com/pipermail/lextypes/2003-July/000018.html
for an analysis of the parameters that John Cowan described for his
"Holus Bolus" datatype library. I'm still not sure about how to
integrate all these different kinds of parameters into a datatype
library language, but at least it shows what the requirements are.

5. I'd used XSLT to do casting and so on, and XPath to assert
validation constraints. With the simplification of using flat
components as suggested in (1) above, I think it would be feasible to
use MathML (or something based on MathML) instead, which would be
better because it would be based on declarative assertions rather than
procedural construction of values (if you see what I mean).
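
To make the declarative/procedural distinction concrete, here is a small
Python stand-in. It is not MathML, and the constraint table and the
violated function are hypothetical. The point is only that, once values
are flat named components, validation rules can be stated as assertions
over those components rather than as code that constructs values:

```python
from typing import Callable, Dict, List, Tuple

Components = Dict[str, int]
Constraint = Tuple[str, Callable[[Components], bool]]

# Each constraint is a statement that must hold of the components.
DATE_CONSTRAINTS: List[Constraint] = [
    ("1 <= month <= 12", lambda c: 1 <= c["month"] <= 12),
    ("1 <= day <= 31", lambda c: 1 <= c["day"] <= 31),
]


def violated(components: Components, constraints: List[Constraint]) -> List[str]:
    # Report the label of every assertion that does not hold.
    return [label for label, holds in constraints if not holds(components)]
```

A value with month 13 would be reported as violating "1 <= month <= 12",
while the rules themselves say nothing about how the value was built.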

Martin Bryan
IS-Thought: Thinkers for the Information Society
29 Oldbury Orchard, Churchdown, Glos. GL3 2PU, UK
Phone: +44 1452 714029 Fax: +44 1452 859991
E-mail: martin@is-thought.co.uk

Received on Thu Oct 30 10:03:51 2003