Bryan's Blog 2006/03/07
One of the problems with producing standard ways of encoding time is that in meteorology we have a lot of times to choose from. This leads to a lot of confusion in the meteorological community as to which time(s) to use in existing metadata standards, and even claims that existing standards cannot cope with meteorological time.
I think this is mainly about confusing storage, description and querying.
Firstly, let's introduce some time use cases and vocabulary:
I run a simulation model to make some predictions about the future (or past). In either case, I have model time which is related to real world time, and I may have used an unusual calendar (for example, a 360 day year). We have three concepts to deal with here: the Simulation Time axis (T), the Simulation Calendar, and the Simulation Period, which runs from T0 to Te. We also have a second axis to deal with: real time, which we'll denote with a lower case t.
Using a numerical weather prediction model.
Normally such a model will use a "real" calendar, and the intention is that T corresponds directly to t.
It will have used observational data in setting up the initial conditions, and the last time for which observations were allowed into the setup is the Datum Time (td - note that datum time is a real time, I'll stop making that point now, you can tell from whether it is T or t whether it's simulation or real).
The time at which the simulation is actually created is also useful metadata, so we'll give that a name: Creation Time (tc).
The time at which the forecast is actually issued is also useful, call it Issue Time (ti).
A weather prediction might be run for longer, but it might only be intended to be valid for a specific period called the ValidUsagePeriod. This period runs from ti until the VerificationTime (tv).
During the ValidUsagePeriod (and particularly at the end) the forecast data (on time axis T) may be directly compared with real world observations (on time axis t), i.e., they share the same calendar. So now we have the following ordering: T0 < tc, but td can be either before or after T0!
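To make the bookkeeping concrete, here is a minimal sketch (in Python; the class and field names are my own invention, not drawn from any metadata standard) of the named times for a single forecast run, with the orderings we can actually rely on:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ForecastTimes:
    """Named times for one forecast run; all are real-world (t) values
    except sim_start/sim_end, which bound the Simulation Period (T)."""
    datum_time: datetime         # td: last time observations entered the setup
    creation_time: datetime      # tc: when the simulation was produced
    issue_time: datetime         # ti: when the forecast was issued
    verification_time: datetime  # tv: end of the ValidUsagePeriod
    sim_start: datetime          # T0: start of the Simulation Period
    sim_end: datetime            # Te: end of the Simulation Period

    def check(self):
        # T0 precedes creation time; note that td is deliberately left
        # unconstrained relative to T0 -- it can fall on either side.
        assert self.sim_start < self.creation_time
        assert self.creation_time <= self.issue_time
        assert self.issue_time < self.verification_time
```

The point of the `check` method is that only some of the orderings are invariants; td versus T0 is a property of how the forecast was set up, not a rule.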
Note also that the VerificationTime is simply a special labelled time, this doesn't imply that verification can't be done for any time or times during the ValidUsagePeriod.
Note that some are confused by variables which are accumulations, averages, or maxima/minima over some Interval. Such intervals have no special relationship with the times named above: when I do my comparisons, I just need to ensure that the intervals are the same on both axes.
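A hypothetical sketch of that rule: index accumulated (or averaged) values by their interval bounds, and only difference values whose intervals coincide on both axes (the function name and data layout are mine, purely for illustration):

```python
def interval_bias(forecast, observed):
    """forecast and observed map interval bounds (start, end) to an
    accumulated or averaged value. Only intervals present on BOTH
    axes are comparable; everything else is silently skipped."""
    shared = forecast.keys() & observed.keys()
    return {iv: forecast[iv] - observed[iv] for iv in shared}
```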
We might have an ensemble of simulations, which share the same time properties, and only differ by an EnsembleID - we treat this as a time problem because each of these is essentially running with a different instance of T, even though each instance maps directly onto t. But for now we'll ignore these.
In the specific case of four-dimensional data assimilation we have the same picture, except that observations from within the simulation period are folded into the analysis, so td falls after T0.
Confused? You shouldn't be now. The key point here is that there is only one time axis and one calendar, both described by T! All the things which are on the t axis are metadata about the data (the prediction).
If we consider observations (here defined as including objective analyses) as well, we might want some new time names; it might be helpful to talk about:
the ObservationTime (to, the time at which the observation was made, sometimes called the EventTime).
the IssueTime, which is also relevant here, because the observation may be revised by better calibration or whatever, so we may have two versions of the same observation sharing one to but with different ti.
a CollectionPeriod, which might be helpful for bounding a period over which observations were collected (it might start before the first observation and finish after the last one - it need not be bounded by them!).
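A sketch of how the (to, ti) pair resolves such revisions — the record layout and function name are hypothetical — keeping, for each observation time, only the most recently issued value:

```python
def latest_issues(records):
    """records: iterable of (observation_time, issue_time, value).
    A recalibrated observation shares its to with the original but
    carries a later ti; keep the latest issue for each to."""
    best = {}
    for to, ti, value in records:
        if to not in best or ti > best[to][0]:
            best[to] = (ti, value)
    return {to: value for to, (ti, value) in best.items()}
```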
Finally, we have the hybrid series. In this case we might have observations interspersed with forecasts. However, again, there is one common time axis; we'd have to identify how the hybrid was composed in the metadata.
I would argue that all of this is easily accommodated in the existing metadata standards: nearly all these times are properties of the data, not intrinsic to the data coverages (in the OGC sense of coverage). Where people mostly get confused is in working out how to store a sequence of, say, five day forecasts, initialised one day apart, from which you might want to extract a timesequence of data valid for a specific series of times but using only the 48 hour forecasts for those times. That, I would argue, is a problem for your query schema, not your storage schema - for storage you simply have a sequence of forecast instances, and I think that's straightforward.
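To show how little the storage schema has to do, here is a minimal sketch (names are mine; the store is just a mapping from initialisation time to each forecast's data) of the fixed-lead-time query over a plain sequence of forecast instances:

```python
from datetime import datetime, timedelta

def fixed_lead_series(runs, lead):
    """runs maps each forecast's initialisation time T0 to its data,
    itself a mapping from valid time to value. Selecting, say, the
    48-hour forecast for each valid time is purely a query over the
    stored forecast instances, not a property of how they are stored."""
    series = {}
    for t0, fields in sorted(runs.items()):
        valid = t0 + lead
        if valid in fields:
            series[valid] = fields[valid]
    return series
```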
I guess I'll have to return to the query schema issue.
Climate Sensitivity and Politics
James Annan has a post about his recent paper with J. C. Hargreaves, in which they combine three available climate sensitivity estimates using Bayesian probability to get a better constrained estimate of the sensitivity of global mean climate to doubling CO2.
For the record, what they've done is used estimates of sensitivity based on studies which were
trying to recreate 20th century warming, which they characterise as (1,3,10) - most likely 3C, but with the 95% limits lying at 1C and 10C,
evaluating the cooling response to volcanic eruptions - characterised as (1.5,3,6), and
recreating the temperature and CO2 conditions associated with the last glacial maximum - (-0.6,2.7,6.1).
The functional shapes are described in the paper; they use Bayes' theorem to come up with a constrained estimate of (1.7,2.9,4.9), and go on to state that they are confident the true upper limit is probably lower still. (A later post uses even more data to drop the upper limit of the 95% confidence interval down to 3.9C.)
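The machinery can be sketched numerically. The shapes below are toy Gaussian likelihoods, not the functional forms used in the paper; the point is only that multiplying independent constraints and renormalising (Bayes' theorem with a uniform prior) yields a narrower interval than any single constraint:

```python
import math

def gauss(x, mu, sigma):
    # Unnormalised Gaussian likelihood (toy stand-in for the paper's shapes).
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

grid = [i * 0.01 for i in range(1, 1501)]  # sensitivity grid, 0.01..15.0 C

# Three toy constraints (mu, sigma), loosely echoing the three lines of
# evidence: 20th century warming, volcanic cooling, last glacial maximum.
constraints = [(3.0, 2.0), (3.0, 1.2), (2.7, 1.7)]

def posterior(likelihoods):
    # Uniform prior: multiply the likelihoods pointwise, then normalise.
    w = [math.prod(L(s) for L in likelihoods) for s in grid]
    total = sum(w)
    return [v / total for v in w]

def interval_95(p):
    # Central 95% interval read off the discretised posterior.
    cum, lo, hi = 0.0, None, None
    for s, v in zip(grid, p):
        cum += v
        if lo is None and cum >= 0.025:
            lo = s
        if hi is None and cum >= 0.975:
            hi = s
    return lo, hi

likes = [(lambda s, m=m, sd=sd: gauss(s, m, sd)) for m, sd in constraints]
combined = interval_95(posterior(likes))
```

The combined 95% interval is narrower than the interval from any one toy constraint alone, which is the qualitative result the paper formalises.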
In the comments to the first post, Steve Bloom asks the question:
Here's the key question policy-wise: Can we start to ignore the consequences of exceeding 4.5C for a doubling? What percentage should we be looking for to make such a decision? And all of this begs the question of exactly what negative effects we might get at 3C or even lower. My impression is that the science in all sorts of areas seems to be tending toward more harm with smaller temp increases. Then there's the other complicating question of how likely it is we will reach doubling and if so when.
as a general guide one should take action when the probability of an event exceeds the ratio of protective costs to losses (C/L) ... it's a simple betting argument.
So, rather than directly answer Steve's question, for me the key issue is: what is the response to a 1.7C climate sensitivity? We're pretty confident (95% sure) that we have to deal with at least that, and so we already know we have to do something. What to do next boils down to evaluating protective costs against losses.
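That betting argument is essentially one line of code (a hypothetical helper, assuming C and L can actually be quantified, which is exactly the hard part):

```python
def should_act(p_event, protective_cost, loss):
    """Cost-loss rule: protect when the expected loss p*L exceeds the
    cost C of protecting, i.e. when p > C/L."""
    return p_event > protective_cost / loss
```

With roughly 95% confidence that sensitivity is at least 1.7C, the rule says act for any plausible cost-to-loss ratio well below 0.95.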
Unfortunately it seems easier to quantify the costs of doing something than to quantify the losses, so people use that as an excuse for doing nothing. The situation is exacerbated by the fact that both are hard to evaluate without accurate regional predictions of climate change (and concomitant uncertainty estimates), and the current state of play is that we don't yet have enough confidence in our regional simulations. Models need better physics, higher resolution, more ensembles, and more analysis. So while it sounds like more special pleading ("give us some more money and we'll tell you more"), that's where we're at ...
... but that's not an excuse to do nothing, it just means we need to parallelise (computer geek) doing something (adaptation and mitigation) with refining our predictions and estimates of costs and potential losses.