4.2. The Database File¶
The database file is a self-contained complete (or nearly complete) binary representation of the ARMI composite model state during a case. The database contains the text of the input files that were used to create the case, and for each time node, the values of all composite parameters as well as layout information to help fully reconstruct the structure of the reactor model.
4.2.1. File format¶
The database file format is built on top of the HDF5 format. There are many tools available for viewing, editing, and scripting HDF5 files. The ARMI database uses the h5py package for interacting with the underlying data and metadata. At a high level there are 3 things to know about HDF5:
Groups - groups are named collections of datasets. You might think of a group as a filesystem folder.
Datasets - Datasets are named values. If a group is a folder, a dataset is a file. Values are strongly typed (think int, float, double, but also whether it is big endian, little endian so that the file is portable across different systems). Values can be scalar, vector, or N-dimensional arrays.
Attributes - attributes can exist on a dataset or a group to provide supplemental information about the group or dataset. We use attributes to idicate the ARMI database version that was used to create the database, the time the case was executed, and whether or not the case completed successfully. We also sometimes apply attributes to datasets to indicate if any special formatting or layout was used to store Parameter values or the like.
There are many other features of HDF5, but from a usability standpoint that is enough information to get started.
4.2.2. Database Structure¶
The database structure is outlined below. This shows the broad strokes of how the database is put together, but many more details may be gleaned from the in-line documentation of the database modules.
Name |
Type |
Description |
|---|---|---|
|
H5Group |
root node |
|
H5Group |
A group that contains all inputs |
|
string |
A representation of the settings file that was used to create the case |
|
string |
A representation of the geometry file used to create the case |
|
string |
A representation of the blueprints file that used to create the case |
|
H5Group |
A group that contains the ARMI model for a specific cycle {CC} and time node
{NN}. For the following, there may be a bit of pseudo-code to explain the origin
of data. Also, it is important to note that all components are flattened and then grouped by type. |
|
H5Group |
A group that contains a description of the ARMI model within this timenode |
|
list of strings |
|
|
list of strings |
|
|
list of int |
|
|
list of 3-tuple floats |
|
|
list of strings |
|
|
list of int |
The components are grouped by |
|
list of int |
|
|
list of 2-tuple floats |
|
|
list of string |
|
|
H5Group |
|
|
list of inferred data |
Values for all parameters for a specific component type, in the order defined by
the |
Python supports a rich and dynamic type system, which is sometimes difficult to represent with the HDF5 format. Namely, HDF5 only supports dense, homogeneous N-dimensional collections of data in any given dataset. Some parameter values do not fit into this mold. Examples of tricky cases are:
Representing
Nonevalues interspersed among a bunch offloatsJagged arrays, where each “row” of a matrix has a different number of entries (or higher-dimensional analogs)
Dictionaries
None of these have a direct representation in HDF5. Therefore, the parameter values on
the composite model sometimes need to be manipulated to fit into the HDF5 format, while
still being able to faithfully reconstruct the original data. To accomplish this, we use
HDF5 dataset attributes to indicate when some manipulation is necessary. Writing
such special data to the HDF5 file and reading it back again is accomplished with the
armi.bookkeeping.db.database3.packSpecialData() and
armi.bookkeeping.db.database3.packSpecialData(). Refer to their implementations
and documentation for more details.