In the MPP implementation of the UM, inter-process communication is handled by calls to the general communication library (GCOM), which in turn calls the appropriate routines in whichever underlying communication library is actually in use (MPI / Cray SHMEM / PVM / etc.).
Specifically, the UM code for the most part calls
GC_RSEND in the GCOM library to send a real array, and
GC_RRECV to receive a real array.
The way the code is written, all processes call
GC_RSEND, and then all processes call
GC_RRECV. This creates an implicit dependence on the data being buffered somewhere between the sending and receiving stages (i.e. the code does not implement simultaneous pairs of sends and receives, which would not require buffering).
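This buffering dependence can be illustrated with a small, self-contained simulation (a toy model in Python, not GCOM code: the bounded queue capacity stands in for whatever buffering the communication library provides):

```python
import threading
import queue

def run_exchange(n_items, buffer_slots, timeout=1.0):
    """Toy model of the UM pattern: two 'processes' each send n_items
    to the other, then receive.  Each channel buffers at most
    buffer_slots items; a send blocks once the buffer is full, just as
    a blocking MPI send may when library buffering runs out.  Returns
    True if the exchange completes, False if it deadlocks."""
    chan = [queue.Queue(maxsize=buffer_slots) for _ in range(2)]
    done = [False, False]

    def proc(rank):
        for i in range(n_items):       # send phase (cf. GC_RSEND)
            chan[1 - rank].put(i)      # blocks when the buffer is full
        for _ in range(n_items):       # receive phase (cf. GC_RRECV)
            chan[rank].get()
        done[rank] = True

    threads = [threading.Thread(target=proc, args=(r,), daemon=True)
               for r in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return all(done)

print(run_exchange(4, buffer_slots=8))    # message fits the buffer: True
print(run_exchange(16, buffer_slots=8))   # buffer overflows: both sides
                                          # stall in the send phase, so no
                                          # receive is ever posted: False
```

When the message fits the buffer, both sends complete immediately and the receives drain the queues; when it does not, every process is stuck in its send phase and the matching receives are never reached, which is exactly the hang described below for undersized MPI buffering.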
GC_RSEND is a wrapper to
MPI_SEND in the case where MPI is in use and GCOM is built without the
BUFFERED_MPI option. The
MPI_SEND routine does no explicit buffering, but some buffering may nonetheless be provided by the underlying MPI implementation. For example, with the Myrinet-enabled version
mpich.1.2.1..7, the available buffering appears to be 16kb (although other MPICH versions may have a different limit).
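A 16kb buffer translates into only a few thousand reals:

```python
# Capacity of a 16 kb (16 x 1024 byte) buffer in reals:
print(16 * 1024 // 4)   # 32-bit reals: 4096
print(16 * 1024 // 8)   # 64-bit reals: 2048
```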
Alternatively, GC_RSEND is a wrapper to
MPI_BSEND, which does explicit buffering in a user-supplied array, provided that GCOM is explicitly built with the
BUFFERED_MPI option. In this case, the buffer size is set to 160,000 reals in
include/gc_limits.h in the GCOM distribution.
In general, the maximum message size depends on the resolution of the model and the MPP configuration in use. But here are a few numbers showing the maximum message size actually found to be passed in a few configurations (found by adding a "print" statement to GCOM):
| Model   | Configuration | Number of reals | Comment |
|---------|---------------|-----------------|---------|
|         | 2x2           | 1824            | = 96 x 19 |
|         | 4x2           | 1558            | = 19 x 82 |
| HadCM3L |               | 7840            | = 98 x 20 x 4. NB this occurs in the ocean steps. |
As you can see, the 7840 reals in the HadCM3L ocean would require about 31kb at 32-bit or about 61kb at 64-bit. This exceeds the 16kb provided by the particular MPICH implementation mentioned above. The result was that an ocean-only or coupled integration with GCOM 1m1s5x5 and this MPI library was found to hang with process deadlock. The limit in GCOM 2.8 (625kb at 32-bit, 1250kb at 64-bit) exceeds the HadCM3L requirements by a factor of 20.
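The arithmetic above can be checked directly (taking kb as 1024 bytes, and a real as 4 or 8 bytes):

```python
def buffer_kb(n_reals, bytes_per_real):
    """Size in kb (1024 bytes) of n_reals reals."""
    return n_reals * bytes_per_real / 1024

print(buffer_kb(7840, 4))     # HadCM3L ocean message, 32-bit: 30.625 kb
print(buffer_kb(7840, 8))     # 64-bit: 61.25 kb
print(buffer_kb(160_000, 4))  # GCOM 2.8 buffer, 32-bit: 625.0 kb
print(buffer_kb(160_000, 8))  # 64-bit: 1250.0 kb
print(160_000 / 7840)         # headroom: roughly a factor of 20
```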
The buffering requirement appears to be proportional to the product of horizontal and vertical resolution (i.e. the number of interface points between different processors' regions) - and does not decrease with the number of processors, because the ocean always has only one processor in the zonal direction.
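This scaling can be sketched numerically. Assuming the HadCM3L table entry 98 x 20 x 4 decomposes as longitudes x levels x some per-point factor (the decomposition into these three factors is an assumption, not stated in the table):

```python
def boundary_message_reals(n_lon, n_lev, k=1):
    """Reals in one inter-processor boundary message for a model
    decomposed only in latitude: the message spans the full zonal row
    (all longitudes) at every level, so its size is independent of how
    many processors the latitude rows are divided among."""
    return n_lon * n_lev * k

# HadCM3L ocean table entry: 98 x 20 x 4 = 7840 reals, whether the
# model runs on 2 processors or 32.
print(boundary_message_reals(98, 20, 4))  # 7840
```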
Use a recent version of GCOM with MPI buffering enabled; there is more than adequate buffering for climate studies, and the buffer size is easily increased if required.
If you do have to use the older GCOM, this mod may help deal with the large message which occurs in the ocean model.
Last edited: 30 November 2001