Bryan Lawrence

... personal wiki, blog and notes

More on citation - part two, MST

Yesterday I started talking about how I think one should cite data we hold on behalf of someone else. That discussion isn't yet finished: issues of citing within datasets, and other "standards" for citation, still need to be discussed (including how we parse and store them electronically), and I still haven't gotten to addressing any of my points from the original blog entry. Before we go there, it's helpful to discuss our MST radar data set.

The MST radar dataset consists of data from 1990 to the present. Over time the data format and the methodology of data collection (i.e. how the raw radar returns are converted to, for example, wind) have changed. Nonetheless, one can imagine someone wanting to cite a timeseries extending back to the beginning, or a particular day's data, or perhaps the whole thing.

Do we publish the data? We will hold the data in perpetuity, and make it available for scientific use, so in some senses yes [1] (in the same sense as we publish the ASHOE data). I'm going to get back to this issue of what I think publication should be for data.

Meanwhile, how should one cite this? Still using the U.S. National Library of Medicine recommendations (pdf), we should probably consider this as an online database.

Things to think about:

The whole thing would then be, for example:

Natural Environment Research Council, Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth, [Internet], British Atmospheric Data Centre (BADC), 1990-, urn http://badc.nerc.ac.uk/data/mst/, [updated Mar 15 2006, downloaded Sep 21 2006, available from http://badc.nerc.ac.uk/getdata/data_browser/badc/mst/data/mst-products-v2/cartesian]

Well, that's obviously horrible. Let's just for a moment imagine our retrieval system was a bit more citation friendly.

In which case the following would be legitimate:

Natural Environment Research Council, Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth, [Internet], British Atmospheric Data Centre (BADC), 1990-, urn badc.nerc.ac.uk/data/mst/v3/upd15032006, [downloaded Sep 21 2006, available from http://badc.nerc.ac.uk/data/mst/v3/]
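To make the idea concrete, here is a minimal sketch of how such a citation could be assembled from a handful of stored fields. Everything about it is an assumption for illustration: the class name, the field names, and the rule that the urn is the dataset path plus a version label plus an update stamp are hypothetical, and nothing like this exists in our retrieval system.

```python
from dataclasses import dataclass
from datetime import date

# English month abbreviations, spelled out so the output does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


@dataclass(frozen=True)
class DatasetCitation:
    """Hypothetical record of the fields needed to cite a versioned dataset."""
    author: str
    title: str
    publisher: str
    start_year: int
    base: str        # host and path of the dataset, no scheme, no trailing slash
    version: str     # assumed version label, e.g. "v3"
    updated: date    # date this version was last updated

    def urn(self) -> str:
        # Version plus update stamp (ddmmyyyy) identifies what was actually cited.
        return f"{self.base}/{self.version}/upd{self.updated:%d%m%Y}"

    def available_from(self) -> str:
        return f"http://{self.base}/{self.version}/"

    def cite(self, downloaded: date) -> str:
        stamp = f"{MONTHS[downloaded.month - 1]} {downloaded.day} {downloaded.year}"
        return (
            f"{self.author}, {self.title}, [Internet], {self.publisher}, "
            f"{self.start_year}-, urn {self.urn()}, "
            f"[downloaded {stamp}, available from {self.available_from()}]"
        )


mst = DatasetCitation(
    author="Natural Environment Research Council",
    title="Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth",
    publisher="British Atmospheric Data Centre (BADC)",
    start_year=1990,
    base="badc.nerc.ac.uk/data/mst",
    version="v3",
    updated=date(2006, 3, 15),
)
print(mst.cite(date(2006, 9, 21)))
```

With the field values above, this reproduces the citation just given. The only thing the reader supplies is the download date; the update date moves into the urn, so the retrieval address stays short.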

(Note that in all this, today and yesterday, I have completely removed any reference to the physical location of the database (e.g. London, or in our case, Chilton). The physical location is simply irrelevant. It made sense to include it when one could physically go to an archive and get a copy of a document, but you can't do that with our data. Coming here will get you nothing. Further, we could move the BADC from Chilton to Oxford tomorrow, and it would make NO difference to the accuracy of the citation. Why include it then? I think the location convention has to die when applied to electronic retrievals.)

Well, that's enough for now. Next we'll consider citing into that archive, and some of the other issues ...

1: Wikipedia thinks so: "Publishing is the activity of putting information into the public arena. ..." (ret).

Categories: claddier curation metadata

Trackbacks (1)


Citation, Hosting and Publication (from "Bryan's Blog" on Friday 20 October, 2006)

Returning to my series on citation (parts one, two, and three). My last example was an MST data set held at the BADC, and I was suggesting something like this (for a citation) ...

Comments (3)

Chris Rusbridge on Tuesday 07 November, 2006:

'Our first issue is: "Who is the author?". I think this is a case where this is a NERC facility, so it is NERC.'

I've been very interested by your series of blogs on data citations (including later ones; I wasn't sure where best to place this comment). If we are to change behaviours, we need to look at the motivating factors for scientists to do difficult things with data, regarding it as important as the words they write in their articles. I think data citations are a critical part of this. But all your examples, as in the quote above, imagine only a corporate author. This might be appropriate for MST data (although shouldn't the scientific leadership get some credit for the work they do in this particular form of endeavour rather than others?), but we know there are cases where at least parts of (or contributions to) the data are made by individuals that are worth crediting.

Would you consider adding examples with personal authorship, and also finer granularity, in your later thoughts on this?

Bryan on Tuesday 07 November, 2006:

Certainly personal authorship is where we want to go ... but regrettably nearly all our datasets are not organised so that the actual authors pop out (for example, individuals' datasets are generally aggregated into thematic programme holding datasets).
That's one of the key changes we in the archive need to achieve ...
but yes, I will eventually get personal author examples up!

This page last modified Friday 22 September, 2006
DISCLAIMER: This is a personal blog. Nothing written here reflects an official opinion of my employer or any funding agency.