Bryan's Blog 2008/05/09
From ORE to DublinCore
Standards really are like buses: there's another along every minute, so exactly which one should you choose? I'm deep in a little "standards review" as part of our MOLES upgrade. I plan to muse on the role of standards another day; this post is really about Dublin Core!
You've seen me investigate Atom. You know I've been delving into ISO19115. You know I'm deep into the OGC framework of GML and application schema and all that. You know I think that Observations and Measurements is a good thing.
Today's task was to investigate ORE a little more, and the first thing I did was try to chase down the ORE vocabulary, which, surprisingly, isn't in the data model per se; it lives in its own document. Anyway, in doing so, I discovered something that I must have known once and forgotten: Dublin Core is itself an ISO standard (ISO15836:2003). Of course no one refers to DC via its ISO roots, because they're toll barred (i.e. the ISO version costs money), whereas the public Dublin Core site stands proud.
What amazes me of course is that Dublin Core and ISO19115 use different vocabularies for the same things, even though Dublin Core preceded ISO19115. What was TC211 thinking? Of course ISO19115 covers a lot more, but why wasn't ISO15836 explicitly in the core of ISO19115? The situation is stupid beyond belief: someone even had to convene a working group to address mapping between them. I've extracted the key mapping here.
Mind you, Dublin Core is evolving, unlike ISO15836, which by definition is static. We might come back to that issue. Anyway, the current Dublin Core fifteen, which describe a Resource, look like this:
|term||what it is||type|
|contributor||a contributor's name||A|
|coverage||spatial and/or temporal jurisdiction, range, or topic||B|
|creator||the primary author's name||A|
|date||of an event applicable to the resource||C|
|description||of the Resource||D or E|
|format||format, physical medium or dimensions (!)||F|
|identifier||reference to the resource||G|
|language||a language of the resource||B (best is RFC4646)|
|publisher||name of an entity making the resource available||A|
|relation||a related resource||B|
|rights||rights information||D (G)|
|source||a related resource from which the described resource is derived||G|
|subject||describes the resource with keywords||B|
|title||the name of the resource||D|
|type||nature or genre of the resource||H|
We can see the "types" of the Dublin Core elements have some semantics which reduce to
|A||free text (names)|
|B||free text (best to use arbitrary controlled vocab)|
|C||free text (dates)|
|D||really free text|
|F||free text (best to use MIME types)|
|G||free text (best to use a URI)|
|H||free text, as for B, but best to use dcmi-types|
The last vocabulary consists of Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text. (Note that StillImage differs from Image in that the former includes digital objects as well as "artifacts").
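As a rough sketch of what the two tables amount to in practice, here is one way to hold a Dublin Core description in Python, check it against the fifteen element names and the dcmi-types list above, and write it out as XML. The function names, the dict-of-lists record layout and the `record` wrapper element are my own assumptions for illustration, not anything Dublin Core prescribes; only the element names, the type vocabulary and the `http://purl.org/dc/elements/1.1/` namespace come from Dublin Core itself. Since nearly everything is "free text", the type vocabulary is about the only thing that can be checked mechanically.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen element names from the first table.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# The dcmi-types vocabulary recommended for the "type" element.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def check_record(record):
    """Return a list of problems in a {element name: [values]} record.

    Only the element names and the "type" values are checked; every
    other value is treated as free text.
    """
    problems = []
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            problems.append(f"unknown element: {name}")
        elif name == "type":
            for value in values:
                if value not in DCMI_TYPES:
                    problems.append(f"type not in dcmi-types: {value}")
    return problems


def to_xml(record):
    """Serialise a record as dc:* children of a wrapper element."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper, not part of DC
    for name in sorted(record):
        for value in record[name]:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative record describing this post.
example = {
    "title": ["From ORE to DublinCore"],
    "creator": ["Bryan"],
    "date": ["2008-05-09"],
    "type": ["Text"],
}

if __name__ == "__main__":
    print(check_record(example))
    print(to_xml(example))
```

Values are kept in lists because any of the fifteen elements may be repeated (or omitted altogether) in a description.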