[dsdl-discuss] More comments on datatypes

From: Rick Jelliffe <ricko@topologi.com>
Date: Sat Feb 01 2003 - 04:24:40 UTC

 (Forwarded, edited, from another direct email. The writer would be
happy with one level of derivation.
-ricko)

> * whether ISO DSDL should adopt WXS datatypes holus bolus, just
> the primitives or builtins, some restricted set, or develop its own;

I think the answer to this question depends on resources. With substantial
resources, develop a better set; with medium resources, develop a subset;
with little to no resources, holus-bolus.

I think I could do a better job than the Schema WG did; probably so do
most people operating independently. Whether it could then be sold is
another matter. The dreaded second-system effect is also hovering nearby.
...
("Second-system effect" refers to a chapter of Fred Brooks's
_The Mythical Man-Month_. The general idea is that the second
large system someone designs is the most likely to lack conceptual
integrity. On the first system, you know you don't know what you are
doing, so you keep it lean and simple. All the wacky stuff gets
saved for the next version, which often becomes baroque beyond
belief. By the third system, you've worked your way through it.)

MVS and OS/2 are classic examples of the second-system effect;
in some ways, so is Windows NT (following VMS).

> * whether ISO DSDL should adopt the apparatus of facets;

Yes.

> * whether ISO DSDL should also adopt the apparatus of type derivation
> used with the XML Schemas datatypes;

Not necessary, IMHO.

> * whether ISO DSDL should specify some additional primitives
> to augment where XML Schemas is weak;

Again, resource-dependent.

> > What are references?
>
> ID/IDREF
> KEY/KEYREF
> XPOINTER

Ah. I think they should be left out. Identity constraints are
logically distinct from datatypes: there shouldn't be datatypes that
magically imply identity constraints.

> If there is a way to provide multiple lexical forms, what things are left wrong
> with XML Schema's times?

Okay, you want me to play James Clark? Here's a design sketch:
There should be only three datatypes, Duration, TimeInterval, and
RecurrentTimeInterval. The size of TimeInterval and RecurrentTimeInterval
should be a facet; so should the recurrence period of RecurrentTimeInterval.
(I thought about TimeInstant, but an instant is only an interval with a
sufficiently small size.)
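
A minimal sketch of the three-type design above, in Python. The class
and field names (TimeInterval.size, RecurrentTimeInterval.period,
occurrence) are my own assumptions for illustration, not anything DSDL
or WXS defines; Duration is simply played by timedelta.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeInterval:
    """Hypothetical model: a zoned start plus a size (the 'size' facet)."""
    start: datetime   # must carry a time zone
    size: timedelta   # one day for a "date", one second for an "instant"

    def __post_init__(self):
        if self.start.tzinfo is None:
            raise ValueError("start must carry a time zone")

    @property
    def end(self) -> datetime:
        return self.start + self.size

    def contains(self, other: "TimeInterval") -> bool:
        # True when `other` lies entirely inside this interval.
        return self.start <= other.start and other.end <= self.end


@dataclass(frozen=True)
class RecurrentTimeInterval:
    """Hypothetical model: a first interval repeated every `period`."""
    first: TimeInterval
    period: timedelta  # the recurrence-period facet

    def occurrence(self, n: int) -> TimeInterval:
        return TimeInterval(self.first.start + n * self.period, self.first.size)


day = TimeInterval(datetime(2003, 2, 1, tzinfo=timezone.utc), timedelta(days=1))
instant = TimeInterval(
    datetime(2003, 2, 1, 4, 24, 40, tzinfo=timezone.utc), timedelta(seconds=1)
)
weekly = RecurrentTimeInterval(day, timedelta(weeks=1))
```

Here a "date" and an "instant" are the same type with different sizes,
so day.contains(instant) is an ordinary comparison rather than a
cross-type conversion.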

In addition, the fuckwitted botch of times without zones should be removed
utterly, so that any two TIs or RTIs can be usefully compared. The
reason this was done, AFAICT, is so that SQL "dates" and "times" can be
spewed directly into XML by low-level routines without taking into account
that they cannot have global meaning unless some time zone is applied.
Having default time zone as a lexical facet seems reasonable.
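
To illustrate the default-time-zone idea, here is a rough Python sketch.
The function parse_time and its default_zone parameter are hypothetical
names of mine; the fixed +11:00 offset is only an example value.

```python
from datetime import datetime, timedelta, timezone


def parse_time(lexical: str, default_zone: timezone) -> datetime:
    """Parse an ISO 8601 string; apply default_zone if no offset is given."""
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        # The "default time zone" facet fills in what the lexical form omits.
        value = value.replace(tzinfo=default_zone)
    return value


plus_eleven = timezone(timedelta(hours=11))  # assumed offset, illustration only
a = parse_time("2003-02-01T15:24:40", default_zone=plus_eleven)
b = parse_time("2003-02-01T04:24:40+00:00", default_zone=plus_eleven)
same_moment = a == b  # both values are zoned, so the comparison is meaningful
```

Without the facet, Python itself refuses to order a zoneless value
against a zoned one (it raises TypeError), which is the same problem in
miniature.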
...
The current botch whereby ints and floats are incommensurable should be
replaced by the well-thought-out Scheme numeric tower, which looks like
this:

                   Number
         Exact              Inexact
     ExactComplex       InexactComplex
     ExactReal          InexactReal
     ExactRational      InexactRational
     ExactInteger       InexactInteger

Of course the existing range-control facets would remain. Note that
the tower is a strict subsetting relationship: an exact 5 is all of
ExactInteger, ExactRational, ExactReal, ExactComplex, Exact, Number.
This is utterly independent of representation, so 5/3 is an ExactRational
and so is 7.2.
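
Python's standard numbers module is modelled on the Scheme tower, so it
can illustrate the subsetting relationship. Note the assumption: Python
has no Exact/Inexact split, so int and Fraction stand in for exact
values here and float for an inexact one. The helper levels() is mine.

```python
import numbers
from fractions import Fraction

# Python's abstract tower, narrowest level first.
TOWER = [numbers.Integral, numbers.Rational, numbers.Real,
         numbers.Complex, numbers.Number]


def levels(value):
    """Return the name of every tower level the value belongs to."""
    return [t.__name__ for t in TOWER if isinstance(value, t)]


five = levels(5)                            # an integer is in every level
five_thirds = levels(Fraction(5, 3))        # rational, but not integral
seven_point_two = levels(Fraction("7.2"))   # decimal notation, still an exact rational
inexact = levels(7.2)                       # a float is only Real and above
```

The decimal lexical form "7.2" and the fraction 36/5 denote the same
exact rational; only the float, which is inexact, drops out of the
Rational level.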

There should also be a NumericInterval type.

I've changed my mind about not having user-definable subtyping.
There should, indeed, be a way to name a subtype, defined as a type
with specific facets set. This is not really a new type, but rather
a new name for an existing type, as in C's typedef. It's one of the
few irritations of RNG that there's no way to keep a library of types
defined at the RNG level and then specialize them: you have to keep
them consistent by hand.
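
A sketch of what "a new name for an existing type with facets set"
might look like, in Python. FacetedType and its fields are hypothetical;
only two range facets are modelled.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class FacetedType:
    """Hypothetical datatype: a lexical parser plus optional range facets."""
    name: str
    parse: Callable[[str], int]
    min_inclusive: Optional[int] = None
    max_inclusive: Optional[int] = None

    def validate(self, lexical: str) -> bool:
        try:
            value = self.parse(lexical)
        except ValueError:
            return False
        if self.min_inclusive is not None and value < self.min_inclusive:
            return False
        if self.max_inclusive is not None and value > self.max_inclusive:
            return False
        return True


integer = FacetedType("integer", int)

# The "typedef": same underlying type, new name, specific facets set.
percentage = replace(integer, name="percentage", min_inclusive=0, max_inclusive=100)
```

Because percentage is built from integer rather than copied, a library
of such names stays consistent with its base types without hand
maintenance.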

There should be a type OctetSequence, and the difference between hex
and base64 should be a lexical representation facet.
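
For instance (a Python sketch; parse_octets and its representation
argument are names I made up), one value space with two lexical
representations:

```python
import base64


def parse_octets(lexical: str, representation: str) -> bytes:
    """Map a lexical form to an octet sequence, per the representation facet."""
    if representation == "hex":
        return bytes.fromhex(lexical)
    if representation == "base64":
        return base64.b64decode(lexical, validate=True)
    raise ValueError(f"unknown representation: {representation!r}")


from_hex = parse_octets("4453444c", "hex")
from_b64 = parse_octets("RFNETA==", "base64")
same_value = from_hex == from_b64  # compares octets, not lexical forms
```

Both lexical forms denote the same four octets, so equality and any
length facet apply to the value, not to the spelling.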

QName is a botch, but I don't quite see how to replace it.

Most of the subtypes of String are just applications of the pattern facet.

There's no reason in this day and age not to use full Perl5-ish regexes
in the pattern facet.
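
As a rough illustration in Python, whose re module is Perl-ish: the
table and helper names below are hypothetical, and the patterns are
ASCII-only approximations, not the normative WXS definitions.

```python
import re

# String "subtypes" expressed as nothing more than a pattern facet.
STRING_SUBTYPES = {
    "language": re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*"),
    "nmtoken_ascii": re.compile(r"[A-Za-z0-9._:-]+"),
}


def is_valid(subtype: str, lexical: str) -> bool:
    """A string is valid if the whole of it matches the subtype's pattern."""
    return STRING_SUBTYPES[subtype].fullmatch(lexical) is not None


# A Perl5-style negative lookahead, which the WXS regex dialect lacks:
# lowercase letters and hyphens, but never two hyphens in a row.
no_double_hyphen = re.compile(r"(?!.*--)[a-z-]+")

ok = is_valid("language", "en-US")
bad = is_valid("language", "en_US")
```

The lookahead case is the sort of constraint that is awkward to state
without full Perl5-ish syntax.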

Received on Sat Feb 1 05:22:13 2003
