REAL FILES AND "VIRTUAL FILES"

Introduction

The I/O API provides both real, disk-based files (which may be "volatile" or not, and are implemented on top of netCDF) and "virtual files" that provide safe, structured exchange of data between different modules of a program or between different programs. Virtual files may be of "gridded," "boundary," or "custom" types only. You may safely switch between real files and virtual files in different executions of the same program merely by changing the values of the files' logical names; this lets you examine the data being shared between modules whenever you want to, for example at high temporal resolution. There are two types of virtual files: memory-resident BUFFERED virtual files, which share data between modules of a single program; and PVM-mailbox based COUPLING-MODE virtual files, which share data and coordinate scheduling between different programs, even if they are executing on different machines half a continent apart across the Internet.


VOLATILE Real Files

Real (disk-based) I/O API files may optionally be declared "volatile" by the addition of a trailing " -v" to the value of the file's logical name in order to tell the I/O API to perform disk-synch operations before every input and after every output operation on that file:
        ...
        setenv  QUX  "/tmp/mydir/volatiledata.mymodel -v"
    

As an I/O optimization, netCDF does not write a file's header -- needed in order to interpret the file's contents -- out to disk until either a "synch" operation is performed or the file is closed. As a result, non-volatile output files are unreadable until the program that writes them calls SHUT3() or M3EXIT(), and they remain unreadable if it crashes unexpectedly. The extra "synch" operations performed for volatile files cause some (usually small) performance penalty, but they allow other programs to read I/O API files while they are still being written, and they prevent data loss upon program crashes.
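The effect of a "synch" after every output can be illustrated by analogy with ordinary file I/O. The following is a minimal sketch in Python, not I/O API or netCDF code: the helper name write_and_sync and the demo file name are hypothetical, and a plain text file stands in for a netCDF file.

```python
import os
import tempfile


def write_and_sync(handle, record):
    """Write one record, then push it through to disk before returning."""
    handle.write(record)
    handle.flush()             # empty Python's user-space buffer
    os.fsync(handle.fileno())  # ask the OS to commit the data to disk


# A second handle can read each record while the writer is still open.
path = os.path.join(tempfile.mkdtemp(), "volatile_demo.txt")  # hypothetical file
with open(path, "w") as writer:
    write_and_sync(writer, "step 1\n")
    with open(path) as reader:
        seen_while_open = reader.read()
```

Without the flush-and-sync step, a reader may see nothing until the writer closes the file, which is the behavior the volatile flag is meant to avoid.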

NOTE: There is a bug in the interaction of netCDF 3.4 and the SGI IRIX6 operating system NFS implementation that sometimes causes NFS-based data exchange between concurrently-running programs using volatile files to fail. If you plan to do this, either go back to netCDF 3.3.1 or before, or go to netCDF 3.5-beta2 or later.


BUFFERED Virtual Files

Memory-resident BUFFERED files have two restrictions at present. First, the basic data type of every variable in the virtual file must be either integer or real. Second, only two time steps of data are kept in the buffered file -- the "even step" and the "odd step" (which in normal usage are the last two time steps written). Otherwise, you write code to open, read, write, or interpolate data just as you would with a "real" file. This provides structured, name-based, identity-tagged exchange of data between different modules in the same program: since data are stored and accessed by file name, variable name, date, and time, the system detects at run time any attempt to request data not yet initialized. This is unlike the situation where data are exchanged via Fortran COMMONs; we've detected some obscure use-before-calculate bugs by replacing COMMONs with BUFFERED virtual files.
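The two-slot behavior and the run-time detection of uninitialized data can be modeled with a short sketch. This is a toy model in Python, not the I/O API implementation: the class TwoStepBuffer is hypothetical, time steps are represented by plain integers rather than dates and times, and the slot is assumed to be chosen by step parity.

```python
class TwoStepBuffer:
    """Toy model of a BUFFERED virtual file: two slots per variable."""

    def __init__(self):
        # _slots[vname] = [even_slot, odd_slot]; a slot is (step, data) or None
        self._slots = {}

    def write(self, vname, step, data):
        """Store data for this step, overwriting the slot of the same parity."""
        pair = self._slots.setdefault(vname, [None, None])
        pair[step % 2] = (step, list(data))

    def read(self, vname, step):
        """Return data for exactly this variable and step, or raise."""
        pair = self._slots.get(vname)
        entry = pair[step % 2] if pair else None
        if entry is None or entry[0] != step:
            raise LookupError(f"{vname!r} step {step} is not available")
        return entry[1]


buf = TwoStepBuffer()
for step in (0, 1, 2):
    buf.write("TEMP", step, [float(step)])
latest = buf.read("TEMP", 2)   # steps 1 and 2 are still held
```

Here a request for step 0, which has been overwritten by step 2, or for a variable that was never written, raises an error instead of silently returning stale values, which is the property that distinguishes this style of exchange from a shared COMMON block.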

To set up a buffered virtual file, use setenv to set the file's logical name to the value BUFFERED (instead of to the pathname of a real physical file), as shown below:

 
    ...
    #
    # myprogram uses "qux" for internal data sharing:
    #
    setenv qux BUFFERED
    ...
    /user/mydir/myprogram 
    ...

Restrictions:

  1. For all-variable READ3() and WRITE3() calls, all of the variables in the file must be of type M3REAL.
  2. Prior to the I/O API V 2.2-beta-May-3-2002 release, all variables in buffered virtual files had to be of type M3REAL.

COUPLING-MODE Virtual Files

As part of the Practical Parallel Project, MCNC has developed an extended Model Coupling Mode for the I/O API. This mode, implemented using PVM 3.4 mailboxes, allows the user to specify in the run-script whether "file" means a physical file on disk or a PVM mailbox-based communications channel (a virtual file), on the basis of the value of the file's logical name:

    setenv FOO                "virtual BAR"
    setenv IOAPI_KEEP_NSTEPS  3
    
declares that FOO is the logical name of a virtual file whose physical name (in terms of PVM mailbox names) is BAR. The additional environment variable IOAPI_KEEP_NSTEPS determines the number of time steps to keep in PVM mailbox buffers: if it is 3 (as here), and there are already 3 time steps of variable QUX in the mailboxes for virtual file FOO, then writing a fourth time step of QUX to FOO causes the earliest time step of QUX to be erased, leaving only time steps 2, 3, and 4. This is necessary so that the coupled modeling system does not require an unbounded amount of memory for sustained operation. If not set, IOAPI_KEEP_NSTEPS defaults to 2 (the minimum needed to support INTERP3()'s double-buffering).
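The retention rule can be sketched as a bounded queue per variable. The following Python model is an illustration only and does not use PVM: the names keep_nsteps and Mailbox are hypothetical, and time steps are represented by integers.

```python
import os
from collections import deque


def keep_nsteps(environ=os.environ):
    """Read IOAPI_KEEP_NSTEPS, falling back to the documented default of 2."""
    return int(environ.get("IOAPI_KEEP_NSTEPS", "2"))


class Mailbox:
    """Toy stand-in for the mailbox buffers of one virtual file."""

    def __init__(self, nkeep):
        self._nkeep = nkeep
        self._steps = {}   # variable name -> deque of (step, data)

    def write(self, vname, step, data):
        # A deque with maxlen drops its oldest entry when a new one overflows it.
        queue = self._steps.setdefault(vname, deque(maxlen=self._nkeep))
        queue.append((step, data))

    def steps(self, vname):
        """List the time steps currently retained for a variable."""
        return [step for step, _ in self._steps.get(vname, ())]


foo = Mailbox(keep_nsteps({"IOAPI_KEEP_NSTEPS": "3"}))
for step in (1, 2, 3, 4):
    foo.write("QUX", step, [0.0])
retained = foo.steps("QUX")   # time step 1 has been dropped
```

Memory use stays bounded by the number of variables times the retention count, regardless of how long the coupled run continues.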

Every (UNIX) environment from which the modeler launches a model that reads from or writes to a virtual file must agree on the file's physical name; this is usually achieved by sourcing a common script that contains the relevant setenv commands.
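The forms that a logical name's value can take in this section (a plain pathname, a pathname with a trailing " -v", the value BUFFERED, and "virtual" followed by a physical name) can be summarized in a small sketch. This Python function is hypothetical and is not how the I/O API itself parses the value; in particular, exact-case matching and the handling of extra whitespace are assumptions.

```python
def classify_logical_name(value):
    """Map a logical name's value to (kind, physical_name)."""
    text = value.strip()
    if text == "BUFFERED":
        return ("buffered", None)            # memory-resident virtual file
    parts = text.split()
    if len(parts) == 2 and parts[0] == "virtual":
        return ("coupling", parts[1])        # PVM mailbox name
    if text.endswith(" -v"):
        return ("volatile", text[:-3].rstrip())  # real file, synched on each I/O
    return ("real", text)                    # ordinary disk file


examples = {
    name: classify_logical_name(value)
    for name, value in {
        "QUX": "/tmp/mydir/volatiledata.mymodel -v",
        "qux": "BUFFERED",
        "FOO": "virtual BAR",
    }.items()
}
```

Because the choice is made entirely from the environment, the same program can run against any of these forms without being recompiled.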

For models exchanging data via virtual files of the I/O API's coupling mode, the I/O API schedules the various processes on the basis of data availability:

There are two requirements on the modeler:

Using coupling mode to construct complex modeling systems has several advantages from the model-engineering point of view:

