What this does

"ncensemble" is a Unix command-line utility that computes point-wise ensemble statistics on sets of fields stored in netCDF files.

You supply one input netCDF file per ensemble member, all with identical structure, and you get one output netCDF file (again with identical structure) per type of statistic that you request, e.g. a file of means, a file of standard deviations, etc.

It can be used on a single ensemble to calculate the mean, standard deviation, maximum and minimum. If you give it two ensembles (usually representing a perturbation and a control experiment), then in addition to these statistics for each ensemble it can also evaluate the anomaly (i.e. the difference in means), the associated t-statistic, and the one-tailed probability value (lower tail). Again, these are all point-by-point operations within each field.
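To make "point-by-point" concrete, here is a minimal Python sketch of the anomaly and t-statistic at each grid point. It is not taken from the ncensemble source: the function names are hypothetical, each field is represented as a flat list of values, and the equal-variance (pooled) form of Student's t is an assumption, since this document does not say which form the program uses.

```python
import math
from statistics import mean, variance


def anomaly_and_t(ens1, ens2):
    """Anomaly and t statistic at ONE grid point.

    ens1, ens2: sequences holding one value per ensemble member.
    ASSUMPTION: pooled-variance Student's t with n1 + n2 - 2 degrees
    of freedom; ncensemble itself may use a different form.
    """
    n1, n2 = len(ens1), len(ens2)
    anomaly = mean(ens1) - mean(ens2)  # ensemble 1 minus ensemble 2
    # variance() uses the (n-1) denominator.
    pooled = ((n1 - 1) * variance(ens1) + (n2 - 1) * variance(ens2)) / (n1 + n2 - 2)
    t = anomaly / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return anomaly, t


def pointwise(fields1, fields2):
    """Apply anomaly_and_t independently at every grid point.

    fields1, fields2: one flat list of grid values per ensemble member,
    all of the same length (the "identical structure" requirement).
    """
    points1 = zip(*fields1)  # regroup: one tuple of member values per point
    points2 = zip(*fields2)
    return [anomaly_and_t(a, b) for a, b in zip(points1, points2)]


# Two grid points, three members per ensemble (made-up numbers).
perturbed = [[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]]
control = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
results = pointwise(perturbed, control)
```

The probability is left out of the sketch because it needs the cumulative distribution function of Student's t, which is not in the Python standard library; something like scipy.stats.t.cdf(t, n1 + n2 - 2) would be one way to obtain a lower-tail value under the same assumption.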

Note that the standard deviations which it calculates use (n-1) in the denominator, i.e. they are the estimator of the population standard deviation (as opposed to the form with n in the denominator, which describes the spread of the sample itself). Note also that they are not the standard error of the ensemble mean, which would have an additional factor of n in the denominator (inside the square root).
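The three quantities mentioned above can be compared on a small made-up sample. This sketch uses only the Python standard library and is an illustration of the definitions, not code from ncensemble; the variable names are hypothetical.

```python
import math
from statistics import pstdev, stdev

values = [1.0, 2.0, 3.0, 4.0]  # one grid point, four ensemble members
n = len(values)

sd_n_minus_1 = stdev(values)  # (n-1) in the denominator: the form described above
sd_n = pstdev(values)         # n in the denominator: NOT the form described above
std_err = sd_n_minus_1 / math.sqrt(n)  # standard error of the mean

# Sum of squared deviations from the mean (2.5) is 5.0, so:
#   sd_n_minus_1**2 == 5/3,  sd_n**2 == 5/4,  std_err**2 == 5/(3*4)
```

The last line shows the "additional factor of n inside the square root": the standard error squared is the (n-1) variance divided once more by n.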



Here is a copy of the usage message, which you can also see by running the program without any arguments.

  ncensemble <FLAGS> <INPUT FILES (ensemble 1)> [- <INPUT FILES (ensemble 2)>]

  Calculate fields of statistics for ensembles of NetCDF files.
  N.B. Requires input files to have identical structure.

  FLAGS include:

   one or more of the following:

     -m filename   output file for mean of ensemble 1
     -s filename   output file for s.d. of ensemble 1
     -M filename   output file for mean of ensemble 2
     -S filename   output file for s.d. of ensemble 2
     -a filename   output file for anomaly (ensemble 1 - ensemble 2)
     -t filename   output file for Student's t statistic on means
     -p filename   output file for one-tailed probability of t statistic

     -h filename   output file for maximum of ensemble 1
     -H filename   output file for maximum of ensemble 2
     -l filename   output file for minimum of ensemble 1
     -L filename   output file for minimum of ensemble 2

     (to remember the flags for max and min: h for highest, l for lowest)

   and optionally:

     -V  variables  comma-separated list of variables to process 
                    (defaults to all variables)

     -C  include also the coordinate variables related to the variables
         explicitly specified with -V (has no effect if -V is not used)

     -v  verbose
     -c  permit clobbering (overwriting) of existing output files

     -i  ignore mismatches in dims/vars between files [USE WITH CAUTION]


    calculate mean and standard deviation for single ensemble:

        ncensemble -m -s \

    calculate anomaly and probability for two ensembles, writing only
     the variables "temp", "ps" and related coordinate variables to 
     output file (and allowing output to overwrite existing files):

        ncensemble -c -a -p -V temp,ps -C \

        (NB mnemonic: think of the dash on the command line separating the
         two ensembles as ensemble1 'minus' ensemble2.  This will give the
         correct sense for the anomaly.)
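The argument order in the usage message (flags, then ensemble 1 files, then an optional dash and ensemble 2 files) can be sketched as a small Python helper that assembles a command line. The helper name and all file names below are hypothetical, chosen only for illustration; only flags listed in the usage message are used.

```python
import shlex


def build_command(outputs, ensemble1, ensemble2=None, extra_flags=()):
    """Assemble an ncensemble argument list (hypothetical helper).

    outputs:     mapping of output flag to output filename, e.g. {"-a": "anom.nc"}
    ensemble1:   input files for ensemble 1
    ensemble2:   input files for ensemble 2, or None for a single ensemble
    extra_flags: other flags such as "-c", or "-V", "temp,ps", "-C"
    """
    cmd = ["ncensemble", *extra_flags]
    for flag, filename in outputs.items():
        cmd += [flag, filename]
    cmd += list(ensemble1)
    if ensemble2:
        # The lone dash separates the ensembles: ensemble1 'minus' ensemble2.
        cmd += ["-", *ensemble2]
    return cmd


# Made-up file names: two perturbed runs and two control runs.
cmd = build_command(
    {"-a": "anom.nc", "-p": "prob.nc"},
    ["pert1.nc", "pert2.nc"],
    ["ctrl1.nc", "ctrl2.nc"],
    extra_flags=["-c", "-V", "temp,ps", "-C"],
)
command_line = shlex.join(cmd)  # a shell-quoted string, suitable for printing
```

The list form can be passed to subprocess.run(cmd, check=True) if the program is installed and on the PATH, which avoids shell quoting issues with unusual file names.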


If you would like to copy this, please see the COPYING file included in the distribution.

Last edited: 18 December 2008
Alan Iwi <>