Sandro Hawke | 1 Aug 03:07 2007

Re: RDF's curious literals


I'm not quite sure where you're going.  I think you're arguing about
ways the RDF model could be different, and changes which would make it
better.  That's okay, and sometimes interesting.  At the same time, you
seem to be arguing about what URIs and strings *are*, in the abstract
(not related to RDF), and that's kind of confusing.  That's philosophy,
not engineering, I think.

> So let me state this another way: to say that non-literal resources use 
> URIs as identifiers and literal resources use strings as identifiers is 
> a false dichotomy. RDF uses strings for all its identifiers. It's just 
> that for non-literals, these strings conform to a format called URI so 
> as to reduce clashes. There's no reason why we can't have the strings 
> identifying literals conform to this same format as well by 
> pre/postpending the appropriate information--- perhaps a "rdfliteral" 
> prefix and a ";datatype" postfix. Then we use URI-conforming strings for 
> everything.

Conceptually, outside of every URI or String is a single bit flag which
says whether this is to be treated as a string or as a URI.  In N3, it's
the delimiters.  URIs are written like this <...> and strings like this
"..."  This bit flag is important.    

In object oriented terms, you can think of it as two classes, URI and
String, each of which has one data field whose value is a sequence of
characters.  So they are very similar structurally, but the operations
defined for them are different, and you'll get lots of type errors if
you try to use one where the other belongs.  (Of course you can convert
between them, copying the character sequence from one to the other.)
With strings you concatenate them, take substrings, find substrings,
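The two-classes picture can be sketched in Java. This is only an illustration under assumptions: the class names UriTerm and StringTerm and their methods are hypothetical and come from no RDF library. Each class holds one character sequence, offers different operations, and converts to the other only by an explicit copy.

```java
import java.net.URI;

/** Hypothetical URI term: one character sequence, URI-style operations only. */
final class UriTerm {
    private final String chars;

    UriTerm(String chars) { this.chars = chars; }

    /** Resolve a relative reference against this URI (an operation strings lack). */
    UriTerm resolve(String relative) {
        return new UriTerm(URI.create(chars).resolve(relative).toString());
    }

    /** Explicit conversion: copy the character sequence into a string term. */
    StringTerm toStringTerm() { return new StringTerm(chars); }

    String chars() { return chars; }
}

/** Hypothetical string term: same payload, string-style operations only. */
final class StringTerm {
    private final String chars;

    StringTerm(String chars) { this.chars = chars; }

    StringTerm concat(StringTerm other) { return new StringTerm(chars + other.chars); }

    StringTerm substring(int begin, int end) { return new StringTerm(chars.substring(begin, end)); }

    int indexOf(String needle) { return chars.indexOf(needle); }

    /** Explicit conversion in the other direction. */
    UriTerm toUriTerm() { return new UriTerm(chars); }

    String chars() { return chars; }
}
```

Passing a StringTerm where a UriTerm is expected fails at compile time, which is the "type error" in question; the wrapper class plays the role of the single bit flag.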

Garret Wilson | 1 Aug 04:01 2007

Re: RDF's curious literals


Sandro Hawke wrote:
> I'm not quite sure where you're going.  I think you're arguing about
> ways the RDF model could be different, and changes which would make it
> better.  That's okay, and sometimes interesting.  At the same time, you
> seem to be arguing about what URIs and strings *are*, in the abstract
> (not related to RDF), and that's kind of confusing.  That's philosophy,
> not engineering, I think.
>   

I'm sorry to have confused you. I assure you that the philosophical 
discussion was squarely in the "improving the RDF model" camp.

There seems to be a notion that things like the number 123 and the 
boolean value true are some sort of different kind of resource, merely 
because we have become accustomed to identifying these two particular 
resources with strings rather than URIs. I find that distinction to be 
completely arbitrary and unwarranted.

My discussion of URIs and strings was to point out that if URIs were 
invented earlier in the history of humans, we might all be accustomed to 
identifying 123 as the sequence of characters 
"http://example.org/numbers/123" instead of just "123". And I can 
guarantee you that, had this been the case, RDF would not have evolved a 
separate concept of "literals". But just because the sequence of 
characters with which we identify numbers is different doesn't mean that 
the concept of the value 123 is any different.

123 is a resource, just like anything else. If you want to settle on a 
common identifying URI, fine. But the concept of an RDF literal as a 

Sandro Hawke | 1 Aug 06:23 2007

Re: RDF's curious literals


Garret Wilson <garret <at> globalmentor.com> writes:
> Sandro Hawke wrote:
> > I'm not quite sure where you're going.  I think you're arguing about
> > ways the RDF model could be different, and changes which would make it
> > better.  That's okay, and sometimes interesting.  At the same time, you
> > seem to be arguing about what URIs and strings *are*, in the abstract
> > (not related to RDF), and that's kind of confusing.  That's philosophy,
> > not engineering, I think.
> >   
> 
> I'm sorry to have confused you. I assure you that the philosophical 
> discussion was squarely in the "improving the RDF model" camp.
> 
> There seems to be a notion that things like the number 123 and the 
> boolean value true are some sort of different kind of resource, merely 
> because we have become accustomed to identifying these two particular 
> resources with strings rather than URIs. I find that distinction to be 
> completely arbitrary and unwarranted.

Absolutely.  It's like the distinction between electronic toys I can
afford and those I can't: it's quite important to me, today, but it says
nothing about the world, and it could change tomorrow.  It could have
changed already without me knowing it. 

Here's an example.  Let's define a new datatype, "eg:usmuni".  The value
space of this datatype is municipalities (cities and towns) in the US.
The lexical space is their names, in a form acceptable to the US Postal
Service.  So the RDF literal "San Francisco, CA"^^eg:usmuni denotes the
City of San Francisco.  By defining this datatype, I have made San

Tim Berners-Lee | 1 Aug 17:56 2007

Re: RDF's curious literals


On 2007-07-31, at 22:01, Garret Wilson wrote:

> There seems to be a notion that things like the number 123 and the boolean value true are some sort of different kind of resource, merely because we have become accustomed to identifying these two particular resources with strings rather than URIs. I find that distinction to be completely arbitrary and unwarranted.

Things and Terms

I think there is a confusion between the types of Things in the universe, and the types of terms in the RDF language. This is one I learned not to make not that long ago, but it is important.  A lot of my earlier coding got muddled.  Now I try to get it right.

First, Things in the universe (aka rdf:Resource). Thing is the 'top' class, the one everything is a member of. An infinite subclass of those things are Numbers and so on. Numbers are abstract things we find very useful. They are very commonly referred to in a lot of the data out there.

Now, Terms in the language. I'll use N3, but other RDF-based languages go similarly.
Things are identified in the language by symbols. RDF languages use URIs as symbols.
Yes, we could have used specific URIs to identify the numbers, and we could indeed have had
a common shared space like  http://numers.info/int/123 as you suggest.
But, given that (a) it would have meant a lot of consensus-building to pick the URI suffixes
and that (b) very many computer languages used a specific syntax to refer to those Things which
are Numbers, including perl and python and SQL and XQ and so on, we went with the flow and 
put in a syntax for numbers - ints, decimals, floats, and strings. (Weren't you the one earlier saying how important it was to use syntax people were used to?)

Now, in the SYNTAX, the object of a statement can be a symbol using a URI <http://...$foo>, or a literal 123, or one of a number
of shortcuts for URIs, such as prefixed names cc:license and so on.
When you look at the SYNTAX productions, literals and symbols will be quite distinct, as they are distinct productions, just as URIs and prefixed names are. That's syntax. The classes we could call N3URISymbol, N3PrefixedSymbol, N3Literal do not overlap. For example their members begin with different characters.
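The point that the term classes begin with different characters can be sketched as a tiny classifier. This is a deliberately simplified sketch, not the N3 grammar: the class N3TermKind is hypothetical, and blank nodes, variables, keywords such as true and false, and formulae are all ignored.

```java
/** Hypothetical classifier: decides the syntactic class of an N3 term from its first character. */
final class N3TermKind {
    enum Kind { URI_SYMBOL, LITERAL, PREFIXED_NAME }

    static Kind classify(String term) {
        if (term.isEmpty()) {
            throw new IllegalArgumentException("empty term");
        }
        char first = term.charAt(0);
        if (first == '<') {
            return Kind.URI_SYMBOL;          // <http://...>
        }
        boolean asciiDigit = first >= '0' && first <= '9';
        if (first == '"' || first == '+' || first == '-' || asciiDigit) {
            return Kind.LITERAL;             // "abc", 123, -4.5
        }
        return Kind.PREFIXED_NAME;           // cc:license (assumed: anything else)
    }
}
```

The classification says nothing about what the term denotes; two terms of different syntactic kinds may still identify the same Thing.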

This does not, though, affect the RDF model. Different terms in the language can actually identify the same thing in the universe. This includes numbers.

  ex:n owl:sameAs 123.

So are Numbers (say, or DateTimes, or Strings) a "different kind of Resource"?  Well, they do have certain properties which are particularly convenient.  Most of the information out there involves lots of them.  They end up occurring in different topologies in real data.  It is often more interesting to ask about all statements about a person ex:Joe than all statements about the number 2, as 2 gets reused so often.  This affects how a store might index them.  But they are Things.  (In OWL DL there is a need for the sake of DL reasoners to separate datatype properties and object properties, but that is an artifact of those reasoners and a limitation of OWL DL).
Numbers have the interesting property, for example, that, when you use the conventional notation terms for them, you can tell that two terms (like '123' and '124') identify different things by just operating on those terms.  Typically, people often read and write numbers, but in a good UI shouldn't have to read and write URIs, doing drag and drop of symbol-icons instead. There are lots of ways in which these Numbers, DateTimes, etc are special in practice, compared with arbitrary other Things in the universe.

URIs and Strings

I think there was a similar confusion when you complained that  "a URI is just a string".
Well, a URI is a string in the universe.   But a symbol is not a literal in the language.
A symbol is the use of a URI string to stand for that which it identifies. In N3 this is denoted by <>.
A literal is a use in the language to stand for a member of the class of Strings.


"http://dbpedia.org/resource/Tim_Berners-Lee"   ex:length   "43 chars".


The first uses the string 'http://dbpedia.org/resource/Tim_Berners-Lee' to identify me. The third says that if you take that string and do a GET on it you will get back a response code of 303.  Note it applies to that string.  It does not apply to me.  I have other URIs.

A lot of the confusion in the recent threads on the semantic-web <at> w3.org list seems to have been connected
to the confusion between terms in the language and things in the universe.  I know we do tend to use the word 'literal' to refer to both the classes of numbers, etc etc, and also the production in the syntax.  We should probably find another term for one of them.

Hope this helps (and I didn't get it muddled!)

Tim


Garret Wilson | 1 Aug 18:32 2007

Re: RDF's curious literals


Tim,

Thanks for the explanation of the distinction between "things" and 
"terms", and "URIs" and "strings", which are relevant to the issue at 
hand. Now, going back to directly attack the issue at hand, I have a 
couple of responses:

Tim Berners-Lee wrote:
>
> Now, Terms in the language. I'll use N3, but other RDF-based 
> languages go similarly.  
> Things are identified in the language by symbols. RDF languages use 
> URIs as symbols.
> Yes, we could have used specific URIs to identify the numbers, and we 
> could indeed have had
> a common shared space like  http://numers.info/int/123 as you suggest.
> But, given that (a) it would have meant a lot of consensus-building to 
> pick the URI suffixes
> and that (b) very many computer languages used a specific syntax to 
> refer to those Things which
> are Numbers, including perl and python and SQL and XQ and so on, we 
> went with the flow and 
> put in a syntax for numbers - ints, decimals, floats, and 
> strings. ( Weren't you the one earlier saying how important it was to 
> use syntax people were used to?).

Indeed I was. In fact, I'm advocating this even more than in your 
paragraph above. I don't want to write "123"^^xsd:integer and 
"123"^^xsd:string---I want to use 123 and "123", respectively.

But just as we shouldn't confuse "things" and "terms", we shouldn't 
confuse "syntax in the serialization or query language" and 
"representation in the model". We both agree that, even if I write 123 
in N3, it gets turned into "123"^^xsd:integer and creates something 
called an rdfs:Literal. But if I were to use 
<http://numers.info/int/123>, I'd get a normal resource that wasn't an 
rdfs:Literal.

Assuming I write 123 and "123" in N3, how is the RDF model any better 
with "123"^^xsd:integer than with <http://numers.info/int/123> ? What 
value does rdfs:Literal bring? Didn't just as much consensus-building go 
into building the atrocious string "123"^^xsd:integer as would have 
gone into creating <rdfdata://xsd/integer/123> ?

> So are Numbers (say, or DateTimes, or Strings) a "different kind of 
> Resource"?  Well, they do have certain properties which are 
> particularly convenient.  Most of the information out there involves 
> lots of them.   They end up occurring in different topologies in real 
> data.    It is often more interesting to ask about all statements 
> about  a person ex:Joe than all statements about the number 2, as 2 
> gets reused so often.  This affects how a store might index them.  But 
> they are Things.   (In OWL DL there is a need for the sake of DL 
> reasoners to separate datatype properties and object properties, but 
> that is an artifact of those reasoners and a limitation of OWL DL).
> Numbers have the interesting property, for example, that, when you use 
> the conventional notation terms for them, you can tell that two terms 
> (like '123' and '124') identify different things by just operating on 
> those terms.   Typically, people often read and write numbers, but in 
> a good UI shouldn't have to read and write URIs, doing drag and drop 
> of symbol-icons instead. There are lots of ways in which these 
> Numbers, DateTimes, etc are special in practice, from arbitrary other 
> Things in the universe.

So I agree (I agree! I agree!) that it's convenient to use the strings 
123 and "123" to identify an integer and a string, respectively. Please 
believe that I agree with this. Please! I'm begging of you!

But this begs the same questions that no one seems to want to answer 
(other than to say simply, "they are needed"):

   1. Even if we prefer to write 123 and "123", why is identifying a
      resource in the RDF model by "123"^^xsd:integer any better than
      identifying it by <rdfdata://xsd/integer/123> ?
   2. Even if we prefer to write 123 and "123", why should that generate
      an rdfs:Literal in the model just because we used a shortcut for
      writing the resource in N3? What value does rdfs:Literal bring us?
      How would we miss it if it were gone?
   3. Even if we prefer to write 123 and "123", why do we need
      rdfs:datatype when we can simply use rdf:type set to xsd:Integer?

Garret

Bruce D'Arcus | 1 Aug 20:02 2007

Re: RDF's curious literals


Garret Wilson wrote:

> But this begs the same questions that no one seems to want to answer 
> (other than to say simply, "they are needed"):

It seems to me a number of people in this thread have suggested the 
reasons are not entirely technical in nature, but also social (and 
perhaps even political).

Not having been involved in any of this, I wouldn't be surprised if 
there wasn't a fair bit of pressure from the W3C community to have a 
notion of datatype in RDF that could be more-or-less consistent with 
datatypes used in other W3C efforts (say, XML Schema and XQuery). Surely 
that would be a good enough reason, even if it might not be the best 
technical reason.

Standards work is really hard, and it is so precisely because different 
people bring different assessments of quality and priority to the table.

Bruce

Garret Wilson | 1 Aug 21:38 2007

Re: RDF's curious literals


Bruce D'Arcus wrote:
> Not having been involved in any of this, I wouldn't be surprised if 
> there wasn't a fair bit of pressure from the W3C community to have a 
> notion of datatype in RDF that could be more-or-less consistent with 
> datatypes used in other W3C efforts (say, XML Schema and XQuery). 
> Surely that would be a good enough reason, even if it might not be the 
> best technical reason.

That's probably a good idea---I want to use the way XML Schema models 
integers and booleans and such.

That doesn't mean we have to create some new thing called rdfs:Literal 
in the RDF model. We could just as easily construct a URI identifier 
from the XML Schema datatype URI, combined with the lexical form in 
question. This would allow integers and such to be used just like any 
other RDF resource, but could leverage all the semantics provided by XML 
Schema.

Reusing work done by XML Schema does not mean we can't have a consistent 
RDF data model---we could easily do both. If the hardest problem we're 
having here is coming to consensus on how to combine an XML Schema 
datatype URI and a lexical form into a combined URI, that's a small 
problem indeed. *Any* such URI would be better than making the RDF data 
model inconsistent, as has been done.
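One way such a combined URI could be built is sketched below. Everything about the URI layout is an assumption made for illustration: the base http://example.org/literal and the query parameter names are hypothetical, since the thread never settles on a scheme, and URLEncoder applies form-style encoding (a space becomes "+") rather than strict URI percent-encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that folds a datatype URI and a lexical form into one URI. */
final class LiteralUris {
    /** Assumed base; not a real or proposed namespace. */
    static final String BASE = "http://example.org/literal";

    static String literalUri(String datatypeUri, String lexicalForm) {
        return BASE + "?datatype=" + encode(datatypeUri) + "&lexical=" + encode(lexicalForm);
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Note that this construction is purely lexical: "1" and "0001" with the same datatype yield different URIs, so deciding whether two such URIs denote the same value still needs the datatype's own rules.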

Garret

cr | 1 Aug 18:06 2007

Re: URIs for huge literals


agree that treating literals as first class resources simplifies many things in API / implementation in
engine as well as UI,
but there are some tradeoffs...

i'm pretty sure converting <http://data.info/integer/12> to 12 is more expensive since it involves some
string parsing  - unless you store the plain value too in the backend - at which point you have a URI _and_ a
literal anyways.. - overall i favor is_literal? and to/from_string  methods on a resource class than have
extra literal classes and branched code everywhere, but its something to think about

we'd need a way to refer to strings too large to fit into URIs:

http://data.info/UTF-8/SHA1/e639ea194606d69fb6e0451b1d0ab552dc5ca398 ? 
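A hash-based URI of that shape could be computed roughly as follows. This is a sketch under assumptions: the class HashedLiteralUri is hypothetical, and the http://data.info/UTF-8/SHA1/ prefix is simply the one floated above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: names a (possibly huge) string by the SHA-1 of its UTF-8 bytes. */
final class HashedLiteralUri {
    static final String PREFIX = "http://data.info/UTF-8/SHA1/";

    static String sha1Uri(String literal) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(literal.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff)); // two lowercase hex digits per byte
            }
            return PREFIX + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm name that Java platforms must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

The URI has a fixed length whatever the size of the string, but the hash is one-way: the string itself still has to be stored somewhere, which is the same tradeoff as keeping a URI and a literal side by side.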

RDF should be renamed RLDF..

c

Richard Cyganiak | 2 Aug 00:41 2007

Re: RDF's curious literals


Garret,

On 1 Aug 2007, at 21:38, Garret Wilson wrote:
> Bruce D'Arcus wrote:
>> Not having been involved in any of this, I wouldn't be surprised  
>> if there wasn't a fair bit of pressure from the W3C community to  
>> have a notion of datatype in RDF that could be more-or-less  
>> consistent with datatypes used in other W3C efforts (say, XML  
>> Schema and XQuery). Surely that would be a good enough reason,  
>> even if it might not be the best technical reason.
>
> That's probably a good idea---I want to use the way XML Schema  
> models integers and booleans and such.
>
> That doesn't mean we have to create some new thing called  
> rdfs:Literal in the RDF model. We could just as easily construct a  
> URI identifier from the XML Schema datatype URI, combined with the  
> lexical form in question. This would allow integers and such to be  
> used just like any other RDF resource, but could leverage all the  
> semantics provided by XML Schema.

rdfs:Literal is not what you think. From your writing, it sounds as  
if you believe that rdfs:Literal is the class of all things that are  
represented as "foo"^^ex:bar in the RDF abstract syntax (what you  
call the “RDF model”). But in fact, rdfs:Literal is the value space  
of the lexical-to-value mapping functions of all known datatypes. In  
other words, "1"^^xsd:int is an rdfs:Literal, but "Bob"^^xsd:int is  
not, because it's not well-formed. The lexical-to-value mapping of  
xsd:int doesn't map "Bob" to anything.

In the semantics for your proposed alternative abstract syntax, where  
typed literals are represented like <http://example.com/int/1>, you  
probably still would want to use L2V mappings because they could tell  
you that <http://example.com/int/1> and <http://example.com/int/0001>  
denote the same thing. So there might still be some benefit in having  
an rdfs:Literal class. <http://example.com/int/1> would be an  
rdfs:Literal, but <http://example.com/int/Bob> would not.
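A minimal sketch of such an L2V mapping over the URI form, assuming the hypothetical http://example.com/int/ prefix used above and a simplified reading of the xsd:int lexical space (optional sign, ASCII digits, 32-bit range). The class name IntUriMapping is made up for the example.

```java
import java.util.Optional;

/** Hypothetical lexical-to-value mapping for URIs of the form http://example.com/int/<lexical>. */
final class IntUriMapping {
    static final String PREFIX = "http://example.com/int/";

    /** Returns the denoted integer, or empty if the URI is not a well-formed int URI. */
    static Optional<Integer> value(String uri) {
        if (!uri.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String lexical = uri.substring(PREFIX.length());
        if (!lexical.matches("[+-]?[0-9]+")) {
            return Optional.empty();          // e.g. "Bob" is outside the lexical space
        }
        try {
            return Optional.of(Integer.parseInt(lexical));
        } catch (NumberFormatException outOfRange) {
            return Optional.empty();          // digits only, but beyond the 32-bit range
        }
    }

    /** True only if both URIs are well-formed and map to the same value. */
    static boolean denoteSameValue(String a, String b) {
        Optional<Integer> va = value(a);
        return va.isPresent() && va.equals(value(b));
    }
}
```

With this, denoteSameValue("http://example.com/int/1", "http://example.com/int/0001") holds, while the Bob URI maps to nothing and so would fall outside rdfs:Literal.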

To summarize: rdfs:Literal does *not* exist because literals are  
represented differently from other resources in the RDF abstract  
syntax. It does exist because lexical-to-value mappings are used in  
the RDF semantics. (I speculate that the motivation for having L2V  
mappings in RDF semantics was a desire to re-use XSD datatypes, which  
already existed with well-defined L2V mapping functions.)

This doesn't invalidate your main point though, which is about how  
literals ought to be represented in the RDF abstract syntax.

As another interesting aside: Are you aware that all RDF datatypes,  
such as xsd:int, xsd:string and those that you might define yourself, are  
also RDFS classes that contain exactly the members of the datatype's  
value space? In other words, the only thing that stops you from saying

     "1"^^xsd:int rdf:type xsd:int .

is the fact that RDF doesn't allow literals in the subject position.  
If we use the workaround and define that

     <http://example.com/int/1> = "1"^^xsd:int .

then it follows that

     <http://example.com/int/1> rdf:type xsd:int .

is true.

This entailment is in RDF Semantics to make rdfs:range work not just  
with “ordinary” classes but also with RDF datatypes, which therefore  
must be defined as classes as well.

> Reusing work done by XML Schema does not mean we can't have a  
> consistent RDF data model---we could easily do both. If the hardest  
> problem we're having here is coming to consensus on how to combine  
> an XML Schema datatype URI and a lexical form into a combined URI,  
> that's a small problem indeed. *Any* such URI would be better than  
> making the RDF data model inconsistent, as has been done.

I think you have made a decent case that the RDF abstract syntax does  
not necessarily need the distinction between URIs and literals. I  
don't think you have demonstrated that the RDF abstract syntax is  
*inconsistent*.

Best,
Richard

>
> Garret
>
>

Richard Cyganiak | 2 Aug 00:46 2007

Re: RDF's curious literals


On 1 Aug 2007, at 18:32, Garret Wilson wrote:
>   3. Even if we prefer to write 123 and "123", why do we need
>      rdfs:datatype when we can simply use rdf:type set to xsd:Integer?

Why do you keep railing against rdf:datatype? It is merely an  
artifact of the RDF/XML syntax. It does not exist in the RDF abstract  
syntax (which you call the “RDF model”).

And we wouldn't want anyone to mix up surface serialization syntax  
and abstract model in this thread, would we? ;-)

(Just kidding -- I think I understood the point you are trying to make.)

Cheers,
Richard

>
>
> Garret
>
>

Garret Wilson | 2 Aug 16:14 2007

Re: RDF's curious literals


Richard,

I'm still trying to put the larger RDF literal conversation on hold for 
a day or two so I can be productive on other things, but one thing I 
want to clear up, for my benefit as well:

Richard Cyganiak wrote:
>
>
> On 1 Aug 2007, at 18:32, Garret Wilson wrote:
>>   3. Even if we prefer to write 123 and "123", why do we need
>>      rdfs:datatype when we can simply use rdf:type set to xsd:Integer?
>
> Why do you keep railing against rdf:datatype? It is merely an artifact 
> of the RDF/XML syntax. It does not exist in the RDF abstract syntax 
> (which you call the “RDF model”).

What? If that were true, there would be no such things as typed literals 
in the model, because once you suck RDF/XML or N3 into the model and 
then re-serialize it, you'd just have plain literals again. (Sort of 
like "erasure" in Java generics.) And there would be no use for typed 
literals in general, because you couldn't query or otherwise use the 
type information. (You query the model, not the serialization, after all.)

Citing "RDF: Concepts and Abstract Syntax" 
<http://www.w3.org/TR/rdf-concepts/#section-Graph-Literal> , I note 
that, "Typed literals have a lexical form and a datatype URI being an 
RDF URI reference."

So datatypes do make it to the model, although they might not appear as 
normal resource properties (which I don't think I ever claimed, and if I 
did it was without thinking and beside the point). They seem to just be 
values related to the literal, sort of like a resource's URI.

So that means, in an API, if I want to see if an object is a US 
president or an integer, I would have to do the following:

Resource resource=getResourceSomehow();
if(resource instanceof Literal)  //we can't look at the datatype unless this is a literal
{
  if(XSD_INTEGER.equals(((Literal)resource).getDataType()))
  {
    //this is an integer
  }
}
else  //if this is not a literal, we can check rdf:type
{
  if(US_PRESIDENT.equals(resource.getProperty(RDF_TYPE)))
  {
    //this is a US president
  }
}

Do you see why I still claim that the RDF abstract syntax is 
inconsistent, which you disputed in a separate email to this thread? Why 
must I use two separate ways to check the types of literal resources and 
non-literal resources?

>
> And we wouldn't want anyone to mix up surface serialization syntax and 
> abstract model in this thread, would we? ;-)

We wouldn't indeed. ;)

>
> (Just kidding -- I think I understood the point you are trying to make.)

Thanks! It's nice to hear that once in a while. :)

Best,

Garret

P.S. Does this mean I can keep railing against rdf:datatype? ;)

Richard Cyganiak | 2 Aug 19:23 2007

Re: RDF's curious literals


On 2 Aug 2007, at 16:14, Garret Wilson wrote:
>> On 1 Aug 2007, at 18:32, Garret Wilson wrote:
>>>   3. Even if we prefer to write 123 and "123", why do we need
>>>      rdfs:datatype when we can simply use rdf:type set to  
>>> xsd:Integer?
>>
>> Why do you keep railing against rdf:datatype? It is merely an  
>> artifact of the RDF/XML syntax. It does not exist in the RDF  
>> abstract syntax (which you call the “RDF model”).
>
> What? If that were true, there would be no such things as typed  
> literals in the model, because once you suck RDF/XML or N3 into the  
> model and then re-serialize it, you'd just have plain literals again.

Garret, you say things like “why do we need rdf:datatype” and “death  
to rdf:datatype”. rdf:datatype is an XML attribute in the RDF/XML  
syntax. Nothing else.

You quote RDF Concepts and Abstract Syntax. You will find that the  
document does not mention rdf:datatype. The only specification that  
mentions rdf:datatype is the RDF/XML spec.

Now, of course, there *is* a distinction between typed literals and  
other resources in the RDF abstract syntax.

So, when you say, “death to rdf:datatype”, do you mean “death to  
rdf:datatype” or do you mean “death to the distinction between typed  
literals and URIs in the RDF abstract syntax”?

You know, it's kind of hard to follow your arguments when you mix up  
terms.

[snip]
> So that means, in an API, if I want to see if an object is a US  
> president or an integer, I would have to do the following:

Do you? No. An API is just another surface syntax, and you can do  
whatever pleases you in an RDF API. The abstract syntax doesn't in  
any way limit or prescribe API design decisions. All it gives you is  
some well-defined terminology that you *can* use in your API.

You can do all kinds of convenient shortcuts and use whatever  
terminology you like in your API, just as you can do all kinds of  
convenient shortcuts in your RDF serialization of choice.

Discussions of APIs are not very relevant to questions about the RDF  
abstract syntax. No one is trying to stop you from unifying URIs and  
literals *in your API*. Which I think would be a good idea. But we  
are talking about the RDF abstract model here, aren't we?

To use an XML metaphor, the RDF abstract syntax isn't like the DOM  
(an API for XML), it's like the XML Infoset (an abstract data model  
for XML).
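As an illustration of the freedom an API has here, a unified type check might look like the sketch below. All names (RdfResource, LiteralResource, UriResource, TypeCheck) are hypothetical and belong to no existing RDF library. The sketch leans on the observation made earlier in the thread that every RDF datatype is also a class, and it assumes every literal is well-typed.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical unified view: every resource answers the same question about its types. */
interface RdfResource {
    boolean isLiteral();
    Set<String> types();
}

/** A typed literal reports its datatype URI as its only type (assumes it is well-typed). */
final class LiteralResource implements RdfResource {
    private final String lexicalForm;
    private final String datatypeUri;

    LiteralResource(String lexicalForm, String datatypeUri) {
        this.lexicalForm = lexicalForm;
        this.datatypeUri = datatypeUri;
    }

    String lexicalForm() { return lexicalForm; }
    public boolean isLiteral() { return true; }
    public Set<String> types() { return Collections.singleton(datatypeUri); }
}

/** A URI-identified resource reports the objects of its rdf:type statements. */
final class UriResource implements RdfResource {
    private final String uri;
    private final Set<String> rdfTypes = new LinkedHashSet<>();

    UriResource(String uri) { this.uri = uri; }

    UriResource addType(String typeUri) { rdfTypes.add(typeUri); return this; }
    String uri() { return uri; }
    public boolean isLiteral() { return false; }
    public Set<String> types() { return Collections.unmodifiableSet(rdfTypes); }
}

final class TypeCheck {
    /** One code path for integers and presidents alike. */
    static boolean hasType(RdfResource r, String typeUri) {
        return r.types().contains(typeUri);
    }
}
```

The instanceof branch disappears from calling code, while the abstract syntax underneath still distinguishes typed literals from URI references.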

> Resource resource=getResourceSomehow();
> if(resource instanceof Literal)  //we can't look at the datatype unless this is a literal
> {
>  if(XSD_INTEGER.equals(((Literal)resource).getDataType()))
>  {
>    //this is an integer
>  }
> }
> else  //if this is not a literal, we can check rdf:type
> {
>  if(US_PRESIDENT.equals(resource.getProperty(RDF_TYPE)))
>  {
>    //this is a US president
>  }
> }
>
> Do you see why I still claim that the RDF abstract syntax is  
> inconsistent,

“Inconsistent” means “containing a contradiction”. The example above  
might demonstrate poor API design or a certain redundancy or  
awkwardness in the RDF abstract syntax. But I don't see a  
contradiction anywhere.

Now you'll certainly say that this is all totally beside the point,  
and that you didn't really mean to say “inconsistent” but something  
else, and that we should consider your argument based on what you  
*mean*, not on what you *say*.

Go ahead.

Best,
Richard

> which you disputed in a separate email to this thread? Why must I  
> use two separate ways to check the types of literal resources and  
> non-literal resources?
>
>>
>> And we wouldn't want anyone to mix up surface serialization syntax  
>> and abstract model in this thread, would we? ;-)
>
> We wouldn't indeed. ;)
>
>>
>> (Just kidding -- I think I understood the point you are trying to  
>> make.)
>
> Thanks! It's nice to hear that once in a while. :)
>
> Best,
>
> Garret
>
> P.S. Does this mean I can keep railing against rdf:datatype? ;)
>

Lee Feigenbaum | 2 Aug 17:37 2007

Re: RDF's curious literals


Richard Cyganiak wrote:
> 
> 
> On 2 Aug 2007, at 16:14, Garret Wilson wrote:
>>> On 1 Aug 2007, at 18:32, Garret Wilson wrote:
>>>>   3. Even if we prefer to write 123 and "123", why do we need
>>>>      rdfs:datatype when we can simply use rdf:type set to xsd:Integer?
>>>
>>> Why do you keep railing against rdf:datatype? It is merely an 
>>> artifact of the RDF/XML syntax. It does not exist in the RDF abstract 
>>> syntax (which you call the “RDF model”).
>>
>> What? If that were true, there would be no such things as typed 
>> literals in the model, because once you suck RDF/XML or N3 into the 
>> model and then re-serialize it, you'd just have plain literals again.
> 
> Garret, you say things like “why do we need rdf:datatype” and “death to 
> rdf:datatype”. rdf:datatype is an XML attribute in the RDF/XML syntax. 
> Nothing else.

Richard, Garret has been railing against rdf**s**:datatype, not 
rdf:datatype. Of course, there is no such thing as rdfs:datatype, so 
I've assumed all along he means rdfs:Datatype, analogous to his other 
comments on rdfs:Literal. See http://www.w3.org/TR/rdf-schema/#ch_datatyp .

rdfs:Datatype is, of course, quite distinct from the RDF/XML datatype 
attribute.

Lee

Story Henry | 2 Aug 17:55 2007

Re: RDF's curious literals


On 2 Aug 2007, at 17:37, Lee Feigenbaum wrote:
> Richard Cyganiak wrote:
>> On 2 Aug 2007, at 16:14, Garret Wilson wrote:
>>>> On 1 Aug 2007, at 18:32, Garret Wilson wrote:
>>>>>   3. Even if we prefer to write 123 and "123", why do we need
>>>>>      rdfs:datatype when we can simply use rdf:type set to  
>>>>> xsd:Integer?
>>>>
>>>> Why do you keep railing against rdf:datatype? It is merely an  
>>>> artifact of the RDF/XML syntax. It does not exist in the RDF  
>>>> abstract syntax (which you call the “RDF model”).
>>>
>>> What? If that were true, there would be no such things as typed  
>>> literals in the model, because once you suck RDF/XML or N3 into  
>>> the model and then re-serialize it, you'd just have plain  
>>> literals again.
>> Garret, you say things like “why do we need rdf:datatype” and  
>> “death to rdf:datatype”. rdf:datatype is an XML attribute in the  
>> RDF/XML syntax. Nothing else.
>
> Richard, Garret has been railing against rdf**s**:datatype, not  
> rdf:datatype. Of course, there is no such thing as rdfs:datatype,  
> so I've assumed all along he means rdfs:Datatype, analogous to his  
> other comments on rdfs:Literal. See  
> http://www.w3.org/TR/rdf-schema/#ch_datatyp .
>
> rdfs:Datatype is, of course, quite distinct from the RDF/XML  
> datatype attribute.
>
> Lee

Thanks Lee, precision really is important and it helps move the  
conversation along. The rdfs:Datatype link you point to above links  
to the following Datatypes section:

http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Datatypes

[[
A datatype consists of a lexical space, a value space and a lexical- 
to-value mapping.

The lexical space of a datatype is a set of Unicode [UNICODE] strings.

The lexical-to-value mapping of a datatype is a set of pairs whose  
first element belongs to the lexical space of the datatype, and the  
second element belongs to the value space of the datatype:

Each member of the lexical space is paired with (maps to) exactly one  
member of the value space.
Each member of the value space may be paired with any number  
(including zero) of members of the lexical space (lexical  
representations for that value).
A datatype is identified by one or more URI references.

RDF may be used with any datatype definition that conforms to this  
abstraction, even if not defined in terms of XML Schema.

]]

The important point there is that there has to be a lexical to value  
mapping, and there has to be a one to one mapping. This can only work  
of course if all the information is contained in the String, ie,  
there is not more information to be got from anywhere else, or else  
there would be no way to create a decision procedure for it. So  
"George Bush"^^xxx:presidents would not work. For one, George Bush  
may never have been president.

Now if you think about say ints in Java you have exactly the same  
thing going on. You have some space allocated to the int which is  
equivalent to the lexical space, and some flag somewhere that tells  
the computer to interpret it as an int (and not as a char say). Think  
of the datatype URI as the same thing as that flag.

All that RDF is doing here is identifying the type using a URI,  
whereas other programming languages would do it with some local  
convention.
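
A minimal Python sketch of the flag analogy, for illustration only: the same four stored bytes are read as a number or as a character depending on a type tag. The helper name `interpret` and the tag names "int" and "char" are assumptions made for this example, not anything defined by RDF or Java.

```python
import struct

# The same stored bits, standing in for the "lexical space".
raw = b"\x00\x00\x00\x41"

def interpret(bits: bytes, flag: str):
    """Interpret 4 big-endian bytes according to a type flag (hypothetical helper)."""
    (number,) = struct.unpack(">i", bits)  # read as a signed 32-bit integer
    if flag == "int":
        return number
    if flag == "char":
        return chr(number)  # treat the same number as a code point
    raise ValueError(f"unknown flag: {flag}")

as_int = interpret(raw, "int")
as_char = interpret(raw, "char")
```

The datatype URI of a typed literal plays roughly the role of `flag` here, except that it is a global name rather than a local convention.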

Henry

Sandro Hawke | 2 Aug 18:24 2007

Re: RDF's curious literals


> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Datatypes
> 
> [[
> A datatype consists of a lexical space, a value space and a
> lexical-to-value mapping.
> 
> The lexical space of a datatype is a set of Unicode [UNICODE] strings.
> 
> The lexical-to-value mapping of a datatype is a set of pairs whose
> first element belongs to the lexical space of the datatype, and the
> second element belongs to the value space of the datatype:
> 
> Each member of the lexical space is paired with (maps to) exactly one
> member of the value space.
> Each member of the value space may be paired with any number
> (including zero) of members of the lexical space (lexical
> representations for that value).
> A datatype is identified by one or more URI references.
> 
> RDF may be used with any datatype definition that conforms to this
> abstraction, even if not defined in terms of XML Schema.
> 
> ]]

Is there some other distinguishing criterion on datatypes?   

I can't think of one.

Therefore, X is a datatype if and only if X is an owl:FunctionalProperty
and the rdfs:domain of X contains only Unicode strings.

It's kind of a shame this logic isn't used more in the design of RDF to
simplify it.   There might be another re-articulation of the RDF design
which would use this fact.    

(Again, this is not a new idea [1]).

> The important point there is that there has to be a lexical to value 
> mapping, and there has to be a one to one mapping.

No, it's a one-to-many mapping.    For instance, the value one can be
mapped to from "1", "01", "001", etc.
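
To make the shape of that mapping concrete, here is a small Python sketch under simplifying assumptions: the function `integer_l2v` is a hypothetical stand-in for an integer datatype's lexical-to-value mapping and accepts unsigned ASCII digit strings only, which is narrower than the real xsd:integer lexical space.

```python
from collections import defaultdict

def integer_l2v(lexical: str) -> int:
    """Lexical-to-value mapping for an integer-like datatype (simplified sketch)."""
    if not (lexical.isascii() and lexical.isdigit()):
        raise ValueError(f"not in the lexical space: {lexical!r}")
    return int(lexical)

# Group lexical forms by the value they map to.
forms_by_value = defaultdict(list)
for form in ["1", "01", "001", "2"]:
    forms_by_value[integer_l2v(form)].append(form)
```

Each string maps to exactly one value, while the value one collects three lexical forms.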

> This can only work 
> of course if all the information is contained in the String, ie, 
> there is not more information to be got from anywhere else, or else 
> there would be no way to create a decision procedure for it. So 
> "George Bush"^^xxx:presidents would not work. For one, George Bush 
> may never have been president.

But it's easy for me to construct the mapping you quoted above.  So I
can't understand what you mean.  A datatype like eg:uspresidents makes
every bit as much sense as a datatype like xs:date.

This all brings me back to:

   A simpler abstract syntax for RDF would be N-triples where you only
   have unicode-string-literals (plainLiterals), b-nodes, and a "URI"
   predicate. (You'd have to allow b-nodes in the predicate position.)
   The current design of RDF is basically syntactic sugar on that
   foundation.

   [Personally, I'd probably use bit-strings or byte-strings instead of
   unicode-character-strings, too.    Tough design tradeoffs, there.]
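
   One possible reading of that foundation, sketched in Python. Everything here is an assumption made for illustration: the tuple encoding of terms, the function name `desugar`, and the choice to write a typed literal as a statement with the string as subject and the datatype as predicate.

```python
import itertools

URI = "URI"  # the single built-in predicate assumed by this sketch

def desugar(triples):
    """Rewrite URI nodes and typed literals into b-nodes plus plain strings."""
    counter = itertools.count()
    bnode_for = {}  # one b-node per distinct URI or typed literal
    out = []

    def node(term):
        kind = term[0]
        if kind in ("lit", "bnode"):
            return term  # already primitive
        if term not in bnode_for:
            b = ("bnode", f"b{next(counter)}")
            bnode_for[term] = b
            if kind == "uri":
                # _:b URI "http://..."
                out.append((b, URI, ("lit", term[1])))
            elif kind == "typed":
                # "10" <datatype> _:b, with the datatype itself desugared
                out.append((("lit", term[1]), node(("uri", term[2])), b))
            else:
                raise ValueError(f"unknown term kind: {kind}")
        return bnode_for[term]

    for s, p, o in triples:
        out.append((node(s), node(p), node(o)))
    return out

sugared = [(
    ("uri", "http://example.org/alice"),
    ("uri", "http://example.org/age"),
    ("typed", "10", "http://www.w3.org/2001/XMLSchema#int"),
)]
result = desugar(sugared)
```

   The one sugared triple becomes five primitive ones, which gives a feel for why the current design can be seen as shorthand.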

   -- Sandro

[1] http://esw.w3.org/topic/InterpretationProperties

Story Henry | 2 Aug 18:41 2007

Re: RDF's curious literals


On 2 Aug 2007, at 18:24, Sandro Hawke wrote:

>> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Datatypes
>>
>> [[
>> A datatype consists of a lexical space, a value space and a
>> lexical-to-value mapping.
>>
>> The lexical space of a datatype is a set of Unicode [UNICODE] strings.
>>
>> The lexical-to-value mapping of a datatype is a set of pairs whose
>> first element belongs to the lexical space of the datatype, and the
>> second element belongs to the value space of the datatype:
>>
>> Each member of the lexical space is paired with (maps to) exactly one
>> member of the value space.
>> Each member of the value space may be paired with any number
>> (including zero) of members of the lexical space (lexical
>> representations for that value).
>> A datatype is identified by one or more URI references.
>>
>> RDF may be used with any datatype definition that conforms to this
>> abstraction, even if not defined in terms of XML Schema.
>>
>> ]]
>
>
> Is there some other distinguishing criterion on datatypes?
>
> I can't think of one.
>
> Therefore, X is a datatype if and only if X is an owl:FunctionalProperty
> and the rdfs:domain of X contains only Unicode strings.

well I was proposing in addition that it be a necessary truth, ie it  
necessarily identify the thing that way,  which will bring up some  
fun problems, but I think that the intuition behind datatypes is  
something along those lines.

> It's kind of a shame this logic isn't used more in the design of RDF to
> simplify it.   There might be another re-articulation of the RDF design
> which would use this fact.
>
> (Again, this is not a new idea [1]).
>
>> The important point there is that there has to be a lexical to value
>> mapping, and there has to be a one to one mapping.
>
> No, it's a one-to-many mapping.    For instance, the value one can be
> mapped to from "1", "01", "001", etc.

quite right, I wrote too fast.

>> This can only work
>> of course if all the information is contained in the String, ie,
>> there is not more information to be got from anywhere else, or else
>> there would be no way to create a decision procedure for it. So
>> "George Bush"^^xxx:presidents would not work. For one, George Bush
>> may never have been president.
>
> But it's easy for me to construct the mapping you quoted above.  So I
> can't understand what you mean.  A datatype like eg:uspresidents makes
> every bit as much sense as a datatype like xs:date.

That would not work with eg:uspresidents, because Bush could have had  
a different name, and  could also not have been president.

With numbers on the other hand it is different. Even though 10 could  
be written differently than "10"

10 xsd:int "10".

is necessarily true.

> This all brings me back to:
>
>    A simpler abstract syntax for RDF would be N-triples where you only
>    have unicode-string-literals (plainLiterals), b-nodes, and a "URI"
>    predicate. (You'd have to allow b-nodes in the predicate position.)
>    The current design of RDF is basically syntactic sugar on that
>    foundation.
>
>    [Personally, I'd probably use bit-strings or byte-strings instead of
>    unicode-character-strings, too.    Tough design tradeoffs, there.]
>

Ok. I think what you are saying is that there is no need for the ^^  
symbol, in which case you are probably right. One can say the same  
using the ^ symbol.

1 xsd:int "1";
   = "1"^xsd:int;
   = "1"^^xsd:int .

On the other hand I think the class of rdfs:Datatypes remains a  
useful one.

Henry

>    -- Sandro
>
> [1] http://esw.w3.org/topic/InterpretationProperties

Sandro Hawke | 2 Aug 19:32 2007

Re: RDF's curious literals


Story Henry <henry.story <at> bblfish.net> writes:

> > Therefore, X is a datatype if and only if X is an owl:FunctionalProperty
> > and the rdfs:domain of X contains only Unicode strings.
> 
> well I was proposing in addition that it be a necessary truth, ie it  
> necessarily identify the thing that way,  which will bring up some  
> fun problems, but I think that the intuition behind datatypes is  
> something along those lines.
...
> >> This can only work
> >> of course if all the information is contained in the String, ie,
> >> there is not more information to be got from anywhere else, or else
> >> there would be no way to create a decision procedure for it. So
> >> "George Bush"^^xxx:presidents would not work. For one, George Bush
> >> may never have been president.
> >
> > But it's easy for me to construct the mapping you quoted above.  So I
> > can't understand what you mean.  A datatype like eg:uspresidents makes
> > every bit as much sense as a datatype like xs:date.
> 
> That would not work with eg:uspresidents, because Bush could have had  
> a different name, and  could also not have been president.
> 
> With numbers on the other hand it is different. Even though 10 could  
> be written differently than "10"
> 
> 10 xsd:int "10".
> 
> is necessarily true.

So you're saying the distinction is that datatype lexical representation
strings are rigid designators [1] ?  Hmmm.  Let me think this
through....  

Again, I'll return to my example of dates, rather than integers, since
their being socially constructed is more obvious.  There was an instant
where I paused, in writing this sentence, to record the time.  That
instant is named in RDF as "2007-08-02T17:14:39"^^xs:dateTime.  Is that
a rigid designator?  In a world where the Gregorian calendar corrections
were never adopted, that instant would be named something more like
``"2007-08-16T17:14:39"^^xs:dateTime''.  (etc, etc, with all the
different ways we could do calendars and clocks.)

I think I would argue that URIs are rigid designators, so the example
above fails because I used "xs:dateTime" to name a different mapping.
In the first case it's the Gregorian calendar dateTime; in the second
it's some pseudo-Julian calendar dateTime.  But the string itself,
"2007-08-16T17:14:39" is meaningless/useless without being paired like
that.    It's not the rigid designator itself.

Similarly, my term eg:uspresidents rigidly designates the pairs of names
and presidents in my world.   So, yes, "George W. Bush"^^eg:uspresidents
is a rigid designator -- it necessarily refers to the current president
of the US in my world.

Isn't that right?

   -- Sandro

[1] http://en.wikipedia.org/wiki/Rigid_designator

Story Henry | 2 Aug 20:02 2007

Re: RDF's curious literals


On 2 Aug 2007, at 19:32, Sandro Hawke wrote:
> Story Henry <henry.story <at> bblfish.net> writes:
>
>>> Therefore, X is a datatype if and only if X is an
>>> owl:FunctionalProperty
>>> and the rdfs:domain of X contains only Unicode strings.
>>
>> well I was proposing in addition that it be a necessary truth, ie it
>> necessarily identify the thing that way,  which will bring up some
>> fun problems, but I think that the intuition behind datatypes is
>> something along those lines.
> ...
>>>> This can only work
>>>> of course if all the information is contained in the String, ie,
>>>> there is not more information to be got from anywhere else, or else
>>>> there would be no way to create a decision procedure for it. So
>>>> "George Bush"^^xxx:presidents would not work. For one, George Bush
>>>> may never have been president.
>>>
>>> But it's easy for me to construct the mapping you quoted above.  So I
>>> can't understand what you mean.  A datatype like eg:uspresidents makes
>>> every bit as much sense as a datatype like xs:date.
>>
>> That would not work with eg:uspresidents, because Bush could have had
>> a different name, and  could also not have been president.
>>
>> With numbers on the other hand it is different. Even though 10 could
>> be written differently than "10"
>>
>> 10 xsd:int "10".
>>
>> is necessarily true.
>
> So you're saying the distinction is that datatype lexical representation
> strings are rigid designators [1] ?  Hmmm.  Let me think this
> through....
>

Oh dear :-/
I did say this would bring up some fun problems....

> Again, I'll return to my example of dates, rather than integers, since
> their being socially constructed is more obvious.  There was an instant
> where I paused, in writing this sentence, to record the time.  That
> instant is named in RDF as "2007-08-02T17:14:39"^^xs:dateTime.  Is that
> a rigid designator?  In a world where the Gregorian calendar corrections
> were never adopted, that instant would be named something more like
> ``"2007-08-16T17:14:39"^^xs:dateTime''.  (etc, etc, with all the
> different ways we could do calendars and clocks.)

OK, so let me bite on the rigid designator. Of course the rigid designator
here is xs:dateTime. It refers rigidly to a mathematical function that is
implemented by, say, java.util.GregorianCalendar [2]. The string, being a
string, rigidly refers to itself.

So we have a mathematical function from strings to Unix time (seconds since
1970), which can then be mapped to seconds since the beginning of the
universe, which is just an index in space-time.
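
A rough Python sketch of such a function, under stated assumptions: the name `datetime_l2v` is hypothetical, and a lexical form without a timezone is treated as UTC, which the thread does not specify. Python's `datetime` uses the proleptic Gregorian calendar, so the calendar choice is baked into the function rather than into the string.

```python
from datetime import datetime, timezone

def datetime_l2v(lexical: str) -> int:
    """Map a dateTime-style lexical form to whole seconds since 1970-01-01T00:00:00Z."""
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        # Assumption for this sketch: no timezone means UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

seconds = datetime_l2v("2007-08-02T17:14:39")
```

Nothing beyond the string and the function itself is consulted, which is the property being claimed for datatypes here.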

I think that solves your problem below.

> I think I would argue that URIs are rigid designators, so the example
> above fails because I used "xs:dateTime" to name a different mapping.
> In the first case it's the Gregorian calendar dateTime; in the second
> it's some pseudo-Julian calendar dateTime.  But the string itself,
> "2007-08-16T17:14:39" is meaningless/useless without being paired like
> that.    It's not the rigid designator itself.
>
> Similarly, my term eg:uspresidents rigidly designates the pairs of names
> and presidents in my world.   So, yes, "George W. Bush"^^eg:uspresidents
> is a rigid designator -- it necessarily refers to the current president
> of the US in my world.

Now if we look here at eg:uspresidents, then I think it would be impossible
to build such a function in code that would only require the input from the
string to get its value. The code would have to have a database of all the
names of presidents for it to work. And what if we have two presidents with
the same name?
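
For comparison, a table-backed mapping might look like the following Python sketch. The table `USPRESIDENTS`, the function name, and the use of presidency numbers as stand-ins for the people are all assumptions made for illustration.

```python
# Hypothetical lexical-to-value table for an eg:uspresidents datatype.
# Values are presidency numbers, standing in for the people themselves.
USPRESIDENTS = {
    "George H. W. Bush": 41,
    "George W. Bush": 43,
}

def uspresidents_l2v(lexical: str) -> int:
    """Look the lexical form up; the string alone is not enough without the table."""
    try:
        return USPRESIDENTS[lexical]
    except KeyError:
        raise ValueError(f"not in the lexical space: {lexical!r}") from None
```

Because each lexical form must map to exactly one value, an ambiguous form such as "George Bush" is simply left out of the lexical space in this sketch, and the function depends on data that the string itself does not carry.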

You are speaking of the subject here as rigidly designating Bush. And
that may be. But I was thinking of the class of literals as being
those things that can be designated by a rigidly designated
transformation from a string.

It does seem like a slippery position to have.

:-)

>
> Isn't that right?
>
>    -- Sandro
>
> [1] http://en.wikipedia.org/wiki/Rigid_designator
[2] http://java.sun.com/j2se/1.4.2/docs/api/java/util/GregorianCalendar.html

Garret Wilson | 2 Aug 21:24 2007

Re: RDF's curious literals


Story Henry wrote:
> Now if we look here at the eg:uspresidents then I think it would be
> impossible to build such a function in code that would only require the
> input from the string to get its value. Well the code would have to have
> a database of all the names of presidents for it to work. And what if we
> have two presidents with the same name?

I think this discussion is getting confusing because of a few too many 
philosophical terms, but that could be just because I don't fully (yet) 
understand those terms. ;)

The name of the president and whether it is unique is a red herring. The 
main question is whether a US president could be a literal in the RDF 
abstract syntax. The answer is yes; use the president's social security 
number or some other identifying number, if names confuse the issue: 
"123-45-6789"^^eg:uspresident.

In fact, any resource could be an instance of rdfs:Literal in the RDF 
abstract syntax, if you decide to give it an identifying string. Those 
same resources would not be literals if you used URIs to identify them
rather than strings. Having a concept of rdfs:Literal brings no
additional value to the RDF abstract syntax.

Another way of summarizing things: the string "10" only relates to the 
value 10 in the context of datatype xsd:integer (which is a subclass of 
xsd:decimal). The string "10" would relate to the value 2 in the context 
of the datatype eg:binary. The string "10" doesn't relate to any number 
by itself---only in the context of some datatype. That means you cannot 
talk about an RDF integer literal using only the string "10"; you must 
use "10"^^<http://www.w3.org/2001/XMLSchema#integer>.
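
As a quick illustration in Python, with a made-up URI for eg:binary and a deliberately simplified notion of each lexical space:

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EG_BINARY = "http://example.org/binary"  # hypothetical datatype URI

# One lexical-to-value function per datatype URI (simplified sketch).
L2V = {
    XSD_INTEGER: lambda s: int(s, 10),
    EG_BINARY: lambda s: int(s, 2),
}

def value_of(lexical: str, datatype_uri: str) -> int:
    """Resolve a (lexical form, datatype URI) pair to its value."""
    return L2V[datatype_uri](lexical)

as_integer = value_of("10", XSD_INTEGER)
as_binary = value_of("10", EG_BINARY)
```

The string "10" on its own selects no entry in the table; only the pair does.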

Some people are telling me that there is some huge, unbridgeable divide 
between the following two representations:

"10"^^<http://www.w3.org/2001/XMLSchema#integer>
<http://www.w3.org/2001/XMLSchema#integer:10>

They look pretty similar to me, except that the latter allows me to 
treat so-called "literals" just like any other resource.
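
A sketch of how the two forms could be converted into each other, in Python. The function names are hypothetical, and percent-encoding the lexical form is an assumption added here so that a lexical form containing ":" can still be split off again; whether the result is a sensible URI is a separate question.

```python
from urllib.parse import quote, unquote

def literal_to_uri(lexical: str, datatype_uri: str) -> str:
    """Fold a typed literal into one URI-like string (datatype as prefix)."""
    # Percent-encode the lexical form so it cannot contain ":" itself.
    return f"{datatype_uri}:{quote(lexical, safe='')}"

def uri_to_literal(uri: str):
    """Split the string back into (lexical form, datatype URI)."""
    datatype_uri, _, encoded = uri.rpartition(":")
    return unquote(encoded), datatype_uri

XSD = "http://www.w3.org/2001/XMLSchema#"
ten = literal_to_uri("10", XSD + "integer")
moment = literal_to_uri("2007-08-02T17:14:39", XSD + "dateTime")
```

Note that the decoder has to look inside the string to recover the two parts.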

Garret

Jeremy Carroll | 2 Aug 21:43 2007

Re: RDF's curious literals


Garret Wilson wrote:
> The name of the president and whether it is unique is a red herring. The 
> main question is whether a US president could be a literal in the RDF 
> abstract syntax. The answer is yes; use the president's social security 
> number or some other identifying number, if names confuse the issue: 
> "123-45-6789"^^eg:uspresident.

Correct.
However, your datatype eg:uspresident is unlikely to be widely supported 
so this would be a poor engineering decision.

> 
> In fact, any resource could be an instance of rdfs:Literal in the RDF 
> abstract syntax, if you decide to give it an identifying string. Those 
> same resources would not be literals if you used URIs to identify them
> rather than strings. Having a concept of rdfs:Literal brings no
> additional value to the RDF abstract syntax.

Again correct - but to represent an arbitrary resource one would need a 
user defined datatype, and a private agreement between implementors as 
to its lexical-to-value mapping.

> 
> Another way of summarizing things: the string "10" only relates to the 
> value 10 in the context of datatype xsd:integer (which is a subclass of 
> xsd:decimal). The string "10" would relate to the value 2 in the context 
> of the datatype eg:binary. The string "10" doesn't relate to any number 
> by itself---only in the context of some datatype.

Correct
> That means you cannot 
> talk about an RDF integer literal using only the string "10"; 
> you must 
> use "10"^^<http://www.w3.org/2001/XMLSchema#integer>.

Correct in the abstract syntax, but then that is intended for maximum 
clarity, and not intended for usability. A better surface syntax would 
allow you to simply write 10. You are free to design such a surface syntax.

> 
> Some people are telling me that there is some huge, unbridgeable divide 
> between the following two representations:
> 
> "10"^^<http://www.w3.org/2001/XMLSchema#integer>
> <http://www.w3.org/2001/XMLSchema#integer:10>
> 
> They look pretty similar to me, except that the latter allows me to 
> treat so-called "literals" just like any other resource.

No, incorrect. In the RDF Semantics some other resource e.g.
http://example.org/10 has an unknown denotation, whereas
"10"^^<http://www.w3.org/2001/XMLSchema#integer> has a known
denotation, the number 10.

If you wanted to modify RDF so that
http://www.w3.org/2001/XMLSchema#integer:10
denoted the number 10, then that would be a fairly large change, in that 
it would be necessary to specify which URIs did datatype magic, and 
which do not.

Jeremy

-- 
Hewlett-Packard Limited
registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Jeremy Carroll | 2 Aug 21:36 2007

Re: RDF's curious literals


At some level this thread is rather futile.
The RDF design includes a design for representing numbers, amongst other 
things.
This is now fairly well deployed with interoperable implementations.

Garret doesn't like this aspect of the design.

Well, that's life.

All aspects of agreements between numerous people involve aspects that 
some people dislike. It is particularly irksome when, for some reason, we 
end up participating in an aspect of the world which other people agreed 
on, and we are too late to the party to argue against something we don't 
like.

I think it may be less futile to give some sort of design rationale.

There are two approaches:
- give an historical account of how we got to where we are
- give a more abstract account of the problem space, and see which 
aspects of the current design are essentially inevitable.

I'll try the latter - the former is available in the mail archives of 
the RDF Core WG.

=====

RDF is intended as a way of describing things.

Most of the things being described, and the means to describe them, are 
identified by URIs. However, URIs are non-rigid designators, i.e. it is 
not always clear what a URI is intended to represent. The RDF Semantics 
is written with the weakest possible assumption that each URI represents 
something, but we don't know what.

It is also helpful to have some aspects of the descriptions using rigid 
designators, where what they represent is known in advance. In RDF these 
things are called literals. Initially the only sort of literals were 
strings. This was fairly limiting, and there was a desire to include 
other datatypes, such as those defined by XML Schema.

Given that we wanted to have an open framework, which wasn't limited to 
just the XML Schema datatypes, we decided that the author of an RDF 
document could use whatever datatype they wanted; although we did not 
define a means by which they could declare new datatypes, but require 
private agreement for new datatypes. If there was a call to fix this, it 
could be done.

To allow anyone to introduce their own datatypes we used the notion of a 
datatype URI to identify the datatype being used. I think this is a highly 
defensible design decision.

Since the point of having literals is to have things whose 
interpretation is known, the datatype acts as the means by which that 
interpretation is defined. Hence a datatype has a lexical-to-value mapping.

To provide a useful set of datatypes, we use the XML Schema datatypes, 
identified by the URIs given by the XML Schema WG.

As many people have pointed out the abstract syntax is an abstract 
syntax. It is not intended to limit the way that RDF is written down, 
nor is it intended as the meaning of an RDF document. Thus in the 
abstract syntax a typed literal is represented as a pair: the datatype 
URI and a string. In RDF Semantics this is then mapped to the specific 
value as given by the datatype. Having such predefined designators is a 
fundamental requirement for being able to use known values in 
descriptions of resources, which was one of the goals of the literal design.

Moreover any design which allows arbitrary user defined datatypes ends 
up needing something like a URI to represent the datatype, and something 
like the lexical form to represent the string representation of that 
value: at least at the abstract syntax level. You are free to write that 
pair however you like, including omitting the datatype URI and the 
quotes around the string, as long as in the syntax you are using they 
are superfluous, and then they can be (logically) put back into the 
abstract syntax.
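
As a rough illustration of "putting back" the pair, here is a Python sketch of a toy surface syntax. The token rules and the function name `to_abstract` are assumptions made for this example and do not correspond to any particular RDF serialization.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

def to_abstract(token: str):
    """Return (lexical form, datatype URI or None) for a surface token."""
    if re.fullmatch(r"[+-]?[0-9]+", token):
        return token, XSD + "integer"   # bare digits: datatype is implied
    if token in ("true", "false"):
        return token, XSD + "boolean"   # bare keywords: datatype is implied
    if len(token) >= 2 and token[0] == token[-1] == '"':
        return token[1:-1], None        # quoted: plain literal, no datatype
    raise ValueError(f"unrecognised token: {token!r}")

bare = to_abstract("10")
quoted = to_abstract('"10"')
```

The surface form drops the URI and the quotes wherever they are superfluous, and the abstract pair can still be recovered mechanically.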

====

There were other design options we considered, but they all included the 
notion of a datatype URI and the notion of a lexical form, the notion of 
a lexical to value mapping, and the value space.

Garret's proposed design also seems to include these - except that the 
datatype URI is used as URI prefix, and the lexical form is used as a 
suffix. This seems to require analysis of the internals of a URI in 
order to identify what it means, and I prefer the designs where these 
components are separated.

Jeremy


Garret Wilson | 2 Aug 22:38 2007

Re: RDF's curious literals


Jeremy Carroll wrote:
> At some level this thread is rather futile.

At some level you're right. ;)

It's probably appropriate to wind down the discussion now, so I'm bowing 
out. I've pretty much made my points, although I have one last rather 
clever point I haven't made---but it would nevertheless probably be 
redundant. Just remember that this discussion was less of a proposal for 
RDF 2.0 (I wasn't ready to do that yet) than an attempt to establish 
that literals are superfluous in the RDF abstract syntax---whether or not we 
want to change that in future RDF versions. Some people got my point; 
others didn't. Out of the ones that did get it, a large number of them 
agreed---although a large number of those that agreed with my point 
thought that making the point was futile. :)

Before I stop participating in this thread, let me say one thing: It 
would be possible for an RDF 1.5 to throw out the concept of a literal 
and create a datatype+lexical form->URI mapping. RDF 1.5 abstract syntax 
would identify all literals by URIs, and use the rdf:type property 
rather than rdfs:Datatype to identify the types of these resources. This 
move could be backwards compatible: an RDF 1.5 processor would simply 
turn existing serializations using "10"^^xsd:integer into a resource 
with the URI some form of <xsd:integer:10> with an rdf:type of 
xsd:Integer. RDF 1.5 processors would therefore work with RDF 1.x 
serializations with no problem. Existing SPARQL queries would work just 
fine. It's just that new serializations wouldn't use all the ^^ and 
rdf:datatype="" stuff, and new models (and APIs) wouldn't even think 
of literals as a separate type of resource. Literal-related APIs would 
be deprecated, but would still function (they would just get converted 
to URI and rdf:type queries in the background).
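
A minimal Python sketch of the kind of conversion described above. The data layout (typed literals as `(lexical, datatype URI)` tuples, everything else as strings), the function name `upgrade`, and the exact URI form are assumptions for illustration; no such RDF 1.5 processor exists.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

def upgrade(triples):
    """Replace typed-literal objects with URI-named resources plus an rdf:type triple."""
    out = []
    for s, p, o in triples:
        if isinstance(o, tuple):  # (lexical form, datatype URI)
            lexical, datatype_uri = o
            resource = f"{datatype_uri}:{quote(lexical, safe='')}"
            out.append((s, p, resource))
            out.append((resource, RDF_TYPE, datatype_uri))
        else:
            out.append((s, p, o))
    return out

old = [("http://example.org/alice", "http://example.org/age", ("10", XSD_INTEGER))]
new = upgrade(old)
```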

But you're right; it's never going to happen.

Thanks for the discussion, everyone.

Best,

Garret

Story Henry | 2 Aug 21:42 2007

Re: RDF's curious literals


On 2 Aug 2007, at 18:24, Sandro Hawke wrote:
> Therefore, X is a datatype if and only if X is an owl:FunctionalProperty
> and the rdfs:domain of X contains only Unicode strings.

That would be an owl:InverseFunctionalProperty I think, and it would
be the range.
I think you were trying to say something like this:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

{ ?r a owl:InverseFunctionalProperty;
      rdfs:range xsd:string .
    ?lit ?r ?string . } => { ?lit a rdfs:Literal . }
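
The rule above could be applied by hand roughly as in this Python sketch. Triples are plain tuples of prefixed names, and the property eg:decimalForm and the node _:ten are invented for the example; this is a single pass over the data, not a general N3 reasoner.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
OWL_IFP = "owl:InverseFunctionalProperty"
XSD_STRING = "xsd:string"
RDFS_LITERAL = "rdfs:Literal"

def infer_literals(triples):
    """Type the subjects of string-ranged inverse functional properties as rdfs:Literal."""
    triples = set(triples)
    ifps = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_IFP}
    string_ranged = {s for s, p, o in triples if p == RDFS_RANGE and o == XSD_STRING}
    matching = ifps & string_ranged
    return {(s, RDF_TYPE, RDFS_LITERAL) for s, p, o in triples if p in matching}

facts = [
    ("eg:decimalForm", RDF_TYPE, OWL_IFP),
    ("eg:decimalForm", RDFS_RANGE, XSD_STRING),
    ("_:ten", "eg:decimalForm", "10"),
]
inferred = infer_literals(facts)
```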

How to then add what I was trying to say about necessity is a bit  
more difficult.

Henry

Garret Wilson | 2 Aug 17:45 2007

Re: RDF's curious literals


Richard Cyganiak wrote:
>> What? If that were true, there would be no such things as typed 
>> literals in the model, because once you suck RDF/XML or N3 into the 
>> model and then re-serialize it, you'd just have plain literals again.
>
> Garret, you say things like “why do we need rdf:datatype” and “death 
> to rdf:datatype”. rdf:datatype is an XML attribute in the RDF/XML 
> syntax. Nothing else.

But it produces something in the model. Forgive me if I said 
"rdf:datatype" when I meant, "the datatype URI of a typed literal, which 
is represented by rdf:datatype in the RDF/XML syntax".

>
> You quote RDF Concepts and Abstract Syntax. You will find that the 
> document does not mention rdf:datatype. The only specification that 
> mentions rdf:datatype is the RDF/XML spec.

But it mentions a typed literal's datatype URI, which is what I'm 
referring to. I'm sorry for putting the "rdf:" prefix on the 
term---you're correct that by doing so I incorrectly referred to 
something in the RDF/XML syntax.

>
> Now, of course, there *is* a distinction between typed literals and 
> other resources in the RDF abstract syntax.
>
> So, when you say, “death to rdf:datatype”, do you mean “death to 
> rdf:datatype” or do you mean “death to the distinction between typed 
> literals and URIs in the RDF abstract syntax”?

When I say "death to rdf:datatype", I mean "death to the datatype URI".

>
> You know, it's kind of hard to follow your arguments when you mix up 
> terms.

Um... well, I technically did mix up terms, but... well, now you know.

Death to the datatype URI.

>
>> So that means, in an API, if I want to see if an object is a US 
>> president or an integer, I would have to do the following:
>
> Do you? No. An API is just another surface syntax, and you can do 
> whatever pleases you in an RDF API. The abstract syntax doesn't in any 
> way limit or prescribe API design decisions. All it gives you is some 
> well-defined terminology that you *can* use in your API.
>
> You can do all kinds of convenient shortcuts and use whatever 
> terminology you like in your API, just as you can do all kinds of 
> convenient shortcuts in your RDF serialization of choice.
>
> Discussions of APIs are not very relevant to questions about the RDF 
> abstract syntax. No one is trying to stop you from unifying URIs and 
> literals *in your API*. Which I think would be a good idea. But we are 
> talking about the RDF abstract model here, aren't we?

Because some people were finding discussions of the RDF abstract model a 
little too abstract, I was making it easier for them to understand by 
creating a fictitious Java API that very closely mirrors the RDF 
abstract model to show them how the resulting code uses inconsistent 
ways of doing the same thing. In this regard, discussion of an API very 
true to the RDF abstract syntax is very relevant---it is a pedagogical 
tool for those who have trouble imagining an abstract model in the, 
well, in the abstract.

You and I can talk about the abstract model, so feel free to ignore my 
pedagogical API.

>
> To use an XML metaphor, the RDF abstract syntax isn't like the DOM (an 
> API for XML), it's like the XML Infoset (an abstract data model for XML).

Agreed.

>
> “Inconsistent” means “containing a contradiction”. The example above 
> might demonstrate poor API design or a certain redundancy or 
> awkwardness in the RDF abstract syntax. But I don't see a 
> contradiction anywhere.
>
> Now you'll certainly say that this is all totally beside the point, 
> and that you didn't really mean to say “inconsistent” but something 
> else, and that we should consider your argument based on what you 
> *mean*, not on what you *say*.

"Inconsistent" has several senses, including "containing a 
contradiction," as you pointed out. I was and am using it in its other 
sense of "not regular or predictable" or "following no predictable 
pattern", as in, "does things different ways when there is no necessity 
for doing so." (See <http://www.answers.com/inconsistent> .)

>
> Go ahead.

So now that that's all cleared up, maybe we can go back to my original 
question, with clarification noted by asterisks:

Even if we prefer to write 123 and "123", why do we need *the datatype 
of a typed literal in the RDF abstract syntax* when we can simply use 
rdf:type set to xsd:Integer?

Garret

P.S. If you start saying that you don't understand rdf:type because I 
really mean the property resource identified by the URI 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, I will cry...

Story Henry | 2 Aug 17:58 2007

Re: RDF's curious literals


Please write this question out in N3, using the right namespaces. You  
may find the solution form itself as you type it out, or you will  
make it very clear to us what you are looking for.

Henry

On 2 Aug 2007, at 17:45, Garret Wilson wrote:

>
> Even if we prefer to write 123 and "123", why do we need *the  
> datatype of a typed literal in the RDF abstract syntax* when we can  
> simply use rdf:type set to xsd:Integer?
>

cr | 2 Aug 19:18 2007

Re: RDF's curious literals


>  Even if we prefer to write 123 and "123", why do we need *the datatype of a 
>  typed literal in the RDF abstract syntax* when we can simply use rdf:type 
>  set to xsd:Integer?

why do we need rdf:Datatype or rdf:type for literals when we can simply not
use them?

JSON for serialization - the types are implicit in the encoding - and native
language types when in memory - i added typed-literals to an app this morning
and there's certainly no visible type tags anywhere - they're somewhere deep
in the parsing and implementation layers of the tools used..
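
a small python illustration of that, using only the standard `json` module (the field names are made up for the example):

```python
import json

# The JSON text carries no explicit type tags; the encoding implies them.
doc = json.loads('{"count": 123, "label": "123", "ratio": 1.5, "ok": true}')

# After parsing, each value is already a native type in memory.
kinds = {key: type(value).__name__ for key, value in doc.items()}
```

here 123 and "123" come out as an int and a str purely from how they were written.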

> 
>  Garret

