[dsdl-discuss] Re: Datatypes

From: Eric van der Vlist <vdv@dyomedea.com>
Date: Fri May 24 2002 - 17:04:07 UTC

On Fri, 2002-05-24 at 18:06, Martin Bryan wrote:
>
> Eric
> >
> > * String after whitespace normalization (normalizedString)
> > * String tokenized after whitespace normalization
> > (tokenizedString)
> > * String with no whitespace that conforms to DSDL naming rules
> > (Name)
> > --------
> >
> > I think that W3C XML Schema is especially messy here and that we should
> > not follow it :-) ...
>
> While I agree that the W3C versions of these are messy I feel the first two
> entries have a valid point. There are times when you want strings to be
> checked after they have been normalized rather than in their "native"
> format, and there are some strings that have to be tokenized into individual
> components that need to be checked individually. Hence my first two entries.
> >
> > To me, whitespace processing should not be a property of a datatype but
> > could rather be expressed in term of a transformation between the
> > "parsed space" (ie what is sent by the parser and includes line feed
> > processing and some whitespace processing for attributes) and the
> > lexical space.
>
> I'm not sure this is the problem. The problem is whether or not validation
> of a string should take place in the parsed space or the lexical space. For
> comparisons it may not be practical to use the parsed space.

Yes. I do not dispute that this is a valid requirement; what I dispute
is that it should be built into the datatype itself.

I have seen people, for instance, who want to define integers without
whitespace normalization, i.e. where

<foo>1</foo>

would be valid and

<foo>
        1
</foo>

would be invalid. If they want to design their application with this
requirement, I don't see why they shouldn't be able to.

To me, the primitive datatypes should be neutral in this respect
(otherwise we would have to duplicate each of them) and the users should
be given the ability to define a transformation between the parsed and
the lexical space.

This transformation may be the identity transformation if they want to
validate the value from the parsed space and could include whitespace
normalization if they want it.
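
A minimal sketch of this idea in Python, assuming a whitespace-neutral
integer datatype and a user-supplied transformation from the parsed
space to the lexical space. The names `identity`, `collapse_whitespace`
and `is_valid_integer` are hypothetical and do not come from any DSDL
draft:

```python
import re
from typing import Callable

# Hypothetical type: a transformation from parsed space to lexical space.
Transform = Callable[[str], str]


def identity(parsed: str) -> str:
    """Validate the value exactly as the parser delivered it."""
    return parsed


def collapse_whitespace(parsed: str) -> str:
    """Trim the value and collapse internal whitespace runs to one space."""
    return " ".join(parsed.split())


# Lexical space of a whitespace-neutral integer primitive (an assumption).
INTEGER = re.compile(r"[+-]?[0-9]+")


def is_valid_integer(parsed: str, transform: Transform = identity) -> bool:
    """Map the parsed value to the lexical space, then check its form."""
    return INTEGER.fullmatch(transform(parsed)) is not None


print(is_valid_integer("1"))                                   # True
print(is_valid_integer("\n        1\n"))                       # False
print(is_valid_integer("\n        1\n", collapse_whitespace))  # True
```

The datatype itself never mentions whitespace; the strict and the
lenient behaviour differ only in the transformation the user selects.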

Additionally, a date expressed as "24 mai 2002" could be transformed
into the corresponding ISO format ("2002-05-24") and a number expressed
as "1.200,50" could also be transformed into "1200.5".
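
As an illustration only, the two localized examples could be handled by
transformations like the following Python sketch. The month table and
the function names are assumptions made for the example; a real system
would rely on proper locale data rather than a hand-written table:

```python
from datetime import date
from decimal import Decimal

# Assumption: a small hand-written table of French month names.
FRENCH_MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10,
    "novembre": 11, "décembre": 12,
}


def french_date_to_iso(parsed: str) -> str:
    """Turn 'day month-name year' (French) into ISO 8601 'YYYY-MM-DD'."""
    day, month_name, year = parsed.split()
    month = FRENCH_MONTHS[month_name.lower()]
    # date() rejects impossible dates such as day 31 in a 30-day month.
    return date(int(year), month, int(day)).isoformat()


def continental_number_to_decimal(parsed: str) -> str:
    """Turn '1.200,50' ('.' groups thousands, ',' marks decimals) into '1200.5'."""
    plain = parsed.strip().replace(".", "").replace(",", ".")
    text = format(Decimal(plain), "f")
    if "." in text:
        # Drop trailing zeros of the fraction, and a dangling decimal point.
        text = text.rstrip("0").rstrip(".")
    return text


print(french_date_to_iso("24 mai 2002"))          # 2002-05-24
print(continental_number_to_decimal("1.200,50"))  # 1200.5
```

Once such a transformation has run, the result can be checked against a
locale-neutral date or decimal datatype.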

Making the definition of the transformation available to the user would
really solve lots of issues IMO.

Another thing we could consider is whether this kind of transformation
could also change the structure of the document (like Simon
St.Laurent's regular fragmentations), which would allow, for instance,
splitting "2002-05-25" into
"<year>2002</year><month>05</month><day>25</day>".

http://www.simonstl.com/projects/fragment/
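
A rough Python sketch of such a structural transformation follows. It
does not use the syntax of the regular fragmentations project linked
above; it only assumes that a regular expression with named groups
describes the fragments, and the `fragment` helper is hypothetical:

```python
import re
import xml.etree.ElementTree as ET

# Each named group becomes one child element.
ISO_DATE = re.compile(r"(?P<year>[0-9]{4})-(?P<month>[0-9]{2})-(?P<day>[0-9]{2})")


def fragment(element: ET.Element, pattern: re.Pattern) -> ET.Element:
    """Replace the element's text with one child element per named group."""
    match = pattern.fullmatch(element.text or "")
    if match is None:
        raise ValueError(f"cannot fragment {element.text!r}")
    element.text = None
    # Walk the groups in the order in which they appear in the pattern.
    for name, _ in sorted(pattern.groupindex.items(), key=lambda kv: kv[1]):
        ET.SubElement(element, name).text = match.group(name)
    return element


doc = ET.fromstring("<date>2002-05-25</date>")
print(ET.tostring(fragment(doc, ISO_DATE), encoding="unicode"))
# <date><year>2002</year><month>05</month><day>25</day></date>
```

Each resulting child element could then be validated against its own
datatype.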

>
> > The ability to define such a transformation is key IMO for localization
> > (for instance of date types) and could also take care of the whitespace
> > processing.
>
> String localization should be a key differentiator in DSDL. I know of the
> work on European localization of Unicode and hope to put key facts from that
> into the spec, but have yet to work out where it fits in our framework. Do
> you have any ideas on this?

No, I must confess I am quite ignorant when it comes to Unicode!
 
> > Also, ASN.1 [1] includes a datatype systems. I have *not* looked at it
> > yet, but this might be something to consider as an input.
> >
> > [1] http://www.itu.int/ITU-T/studygroups/com17/languages/
>
> Thanks for the pointer
>
> > See you in San Diego.
> > http://conferences.oreillynet.com/os2002/
>
> Too many conferences fry the brain - I have three lined up already, which is
> more than enough for this year!

Yes. I am trying to reduce the number of conferences which I am
attending as well!

Eric

> Martin
>
> --
> DSDL members discussion list
>
> To unsubscribe, please send a message with the
> command "unsubscribe" to dsdl-discuss-request@dsdl.org
> (mailto:dsdl-discuss-request@dsdl.org?Subject=unsubscribe)
>
>

-- 
See you in San Diego.
                               http://conferences.oreillynet.com/os2002/
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
http://xsltunit.org      http://4xt.org           http://examplotron.org
------------------------------------------------------------------------
Received on Fri May 24 13:04:13 2002

This archive was generated by hypermail 2.1.8 : Fri Dec 03 2004 - 14:00:27 UTC