The EDSS/Models-3 I/O API
Carlie J. Coats, Jr., Ph.D.
NOTES:
- The Models-3 I/O API is not a data format !!
- I/O API files are not synonymous with
netCDF files !!
Instead, netCDF is one of four distinct lower layers on
which the data and metadata structures for I/O API files are
currently available; additional lower layers may be
incorporated in the future. (Does anyone want
to fund development of an MPI 2 lower layer? -- Contact the
author!)
Abstract
The Models-3/EDSS Input/Output Applications Programming Interface
(I/O API) provides the environmental model developer with an
easy-to-learn, easy-to-use programming library for data
storage and access, available from both Fortran and C. The
same routines can be used for both file storage (using netCDF files)
and model coupling (using PVM mailboxes). It is the standard data
access library for both the NCSC/CMAS's EDSS project and EPA's
Models-3. There is an external-package wrapper for the I/O API
in the Weather Research
and Forecasting Model [WRF], which optionally uses the I/O API
coupling mode to couple WRF-Chem with SMOKE.
The I/O API provides a variety of data structure types for
organizing the data, and a set of access routines which offer
selective direct access to the data in terms meaningful to the
modeler. For example,
Read layer 1 of variable 'OZONE' from 'CONCFILE' for
5:00 PM GMT on July 19, 1988 and put the result
into array A.
is a direct English translation of a typical I/O API READ3() call.
"Selective direct access" means that this READ3 call retrieves exactly
this ozone data immediately. It does not have to read through previous
hours of data, nor whatever other variables (such as NOX or PAN) are
in the file. Data can be read or written in any order (or not at all).
This characteristic provides the following advantages:
- performance: visualization and analysis
programs looking at selected parts of the data don't need
to read unrequested data from the files.
- modularity: Data can be read or written
in any order (or not at all). The same input files serve
both MAQSIP engineering models and full-chemistry
models -- the former reading just a few of the variables
from the files, the latter reading most of them.
The modular-model structure used for MAQSIP and Models-3
depends upon this.
- robustness: data are "tagged" by name,
date, and time; miscounted record-numbers don't scramble
temperatures with pressures, for example.
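The idea of selective direct access by name, date, and time can be illustrated with a small toy model. The sketch below is not the I/O API and does not call it: the class `ToyFile`, its `read` and `write` methods, and the helper `to_jdate_jtime` are hypothetical names invented for this illustration, and the integer packing of dates as YYYYDDD and times as HHMMSS is assumed here for illustration. It shows only that a record tagged by variable name, layer, date, and time can be fetched without touching any other record, whatever order the records were written in.

```python
from datetime import datetime


def to_jdate_jtime(when):
    """Pack a datetime into (YYYYDDD, HHMMSS) integers (assumed convention)."""
    jdate = when.year * 1000 + when.timetuple().tm_yday
    jtime = when.hour * 10000 + when.minute * 100 + when.second
    return jdate, jtime


class ToyFile:
    """Hypothetical in-memory stand-in for one file of tagged records."""

    def __init__(self):
        self._records = {}

    def write(self, name, layer, jdate, jtime, values):
        # Each record is stored under its full tag, not a record number.
        self._records[(name, layer, jdate, jtime)] = list(values)

    def read(self, name, layer, jdate, jtime):
        # Direct lookup by tag; returns None if no such record exists.
        return self._records.get((name, layer, jdate, jtime))


concfile = ToyFile()
jdate, jtime = to_jdate_jtime(datetime(1988, 7, 19, 17, 0, 0))

# Records may be written in any order, and other variables may be present.
concfile.write("NOX", 1, jdate, jtime, [0.02, 0.03])
concfile.write("OZONE", 1, jdate, jtime, [0.08, 0.09])

# "Read layer 1 of variable 'OZONE' for 5:00 PM GMT on July 19, 1988."
a = concfile.read("OZONE", 1, jdate, jtime)
```

Because every record carries its own tag, a request for a variable that was never written simply finds nothing, rather than silently returning some other variable's data.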
I/O API files also have the following characteristics:
- They are machine-independent and
network-transparent, except for the NCEP native-binary
mode. Files created on a Cray can be read on a desktop
workstation (or vice versa) either via NFS mounting or
FTP, with no further translation necessary.
- They are self-describing -- that is, they
contain headers which provide a complete set of information
necessary to use and interpret the data they contain.
- The API provides automated built-in mechanisms to support
production application requirements for
histories and audit trails:
- ID of the program execution which produced the
file
- Description of the study scenario in which
the file was generated
- They support a variety of data types used in
the environmental sciences, among them
- gridded (e.g., concentration fields or
meteorology-model temperatures)
- grid-boundary (for air quality model
boundary conditions)
- scattered (e.g., meteorology
observations or source level emissions)
- sparse matrix (as used in the
SMOKE
emissions model)
- They support three different time step
structures:
- time-stepped (with regular time-steps), like
hourly model-output concentration fields or
twice-daily upper air meteorology observation
profiles;
- time-independent, like terrain height; or
- restart, which maintains
"even" and "odd"
time step records, so that restart data
do not consume an inordinate amount of
disk space.
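The three time step structures listed above can be compared with a short sketch. This is a toy model only: the function names `step_number`, `record_slot`, and `slots_used` are hypothetical, and the slot arithmetic is an assumption made for illustration, not a description of how the library lays out records. It shows why a restart structure that alternates between an "even" and an "odd" record needs far less space than a time-stepped one.

```python
from datetime import datetime, timedelta


def step_number(start, tstep, when):
    """Return the 0-based step index of `when` on a regular time-step grid."""
    n, remainder = divmod(when - start, tstep)
    if n < 0 or remainder:
        raise ValueError("time is not on the time-step grid")
    return n


def record_slot(kind, step):
    """Return the storage slot used for step number `step` (toy model)."""
    if kind == "time-independent":
        return 0            # a single record, whatever the time
    if kind == "time-stepped":
        return step         # one record per time step
    if kind == "restart":
        return step % 2     # alternate between "even" and "odd" records
    raise ValueError("unknown time step structure: " + kind)


def slots_used(kind, nsteps):
    """Count the distinct slots occupied after writing `nsteps` steps."""
    return len({record_slot(kind, s) for s in range(nsteps)})


start = datetime(1988, 7, 19, 0, 0, 0)
hourly = timedelta(hours=1)
step = step_number(start, hourly, datetime(1988, 7, 19, 17, 0, 0))
```

In this model, writing 24 hourly steps occupies 24 slots for a time-stepped structure, 2 for a restart structure, and 1 for a time-independent one.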
The I/O API also contains an extensive set of utility routines
for manipulating dates and times, performing coordinate conversions,
storing and recalling grid definitions, sparse matrix arithmetic,
etc., as well as a set of data-manipulation and statistical analysis
programs. It also has an extensive documentation set.
Various extensions of, and research efforts on, the I/O API
have been developed or are under development. Developments
include the use of the I/O API with PVM for model coupling
and the addition of operations to read or write entire time
series (with multiple time steps) as single operations.
Research projects include data-parallel I/O and a very
powerful "geospatial cell complex" data type with
polygonal-cell decompositions that may be either time
independent (as for finite element modeling) or time
dependent (as for moving-mesh plume-in-grid modeling).