Move depletion chain from XML to HDF (?) #3882

@drewejohnson

Description

The depletion chain files hold lots of information, more than just the transmutation chains. Fission yield distributions and decay sources are part of the chain file. For good reasons!

However, these files can get rather large. Downloading the latest chain file from openmc.org/data produces a 27 MB, 31K-line XML file. The fission yield and decay source data are specific cases where numeric data is encoded as strings that need to be specially handled during the read/write step.
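To make the string-handling point concrete, here is a minimal sketch of turning whitespace-separated text into numbers. The element names are modeled loosely on the chain file layout and should be treated as assumptions, and the values are placeholders rather than evaluated yields.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element names are assumed and values are placeholders.
SNIPPET = """
<fission_yields energy="0.0253">
  <products>Xe135 I135 Cs137</products>
  <data>0.1 0.2 0.3</data>
</fission_yields>
"""


def parse_yields(elem):
    """Split the text payloads and convert the numeric one to floats."""
    products = elem.findtext("products").split()
    values = [float(x) for x in elem.findtext("data").split()]
    if len(products) != len(values):
        raise ValueError("products and data have different lengths")
    return dict(zip(products, values))


yields = parse_yields(ET.fromstring(SNIPPET))
```

Every reader and writer has to repeat this split/convert/validate step, which is the overhead a native array type would remove.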

Additionally, to build part of the chain, you still have to traverse the entire chain file.
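A rough illustration of that cost, using a toy three-nuclide document rather than a real chain file. The helper `find_nuclide` is hypothetical; it streams the XML and counts how many `<nuclide>` elements it had to pass before reaching the one it wanted.

```python
import io
import xml.etree.ElementTree as ET

# Toy stand-in for a chain file; real files hold thousands of nuclides.
CHAIN_XML = b"""<depletion_chain>
  <nuclide name="H1"/>
  <nuclide name="H2"/>
  <nuclide name="U235"/>
</depletion_chain>"""


def find_nuclide(stream, name):
    """Return (element, number of nuclide elements visited)."""
    visited = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "nuclide":
            continue
        visited += 1
        if elem.get("name") == name:
            return elem, visited
        elem.clear()  # drop data we do not need to keep memory flat
    return None, visited


elem, visited = find_nuclide(io.BytesIO(CHAIN_XML), "U235")
missing, scanned = find_nuclide(io.BytesIO(CHAIN_XML), "Pu239")
```

Even with streaming, the parser reads everything ahead of the target, and a nuclide that is absent costs a full pass over the file.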

I've been thinking about translating the chain file to HDF.

Pros

  1. Native handling of arrays (e.g., fission yields)
  2. Compression
  3. Able to load parts of the file contents into memory without loading the entire file
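The three points above can be sketched with h5py and NumPy. The group and dataset names (`U235`, `fission_yields`) are assumptions chosen for illustration, not a proposed layout, and the array holds placeholder numbers.

```python
import os
import tempfile

import h5py
import numpy as np

# Placeholder array standing in for yields at 3 energies x 1000 products.
yields = np.linspace(0.0, 1.0, 3000).reshape(3, 1000)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chain_sketch.h5")

    # Write: the array is stored natively, with gzip compression.
    with h5py.File(path, "w") as f:
        grp = f.create_group("U235")
        grp.create_dataset("fission_yields", data=yields, compression="gzip")

    # Read: opening the file and indexing a dataset does not load the
    # whole file; only the requested slice is returned as an array.
    with h5py.File(path, "r") as f:
        dset = f["U235/fission_yields"]
        stored_shape = dset.shape
        first_ten = dset[0, :10]
```

The slice comes back as a regular NumPy array, so downstream code would not need to know the data came from a compressed file.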

Cons

  1. Can't open the native file up in a text editor for simple manipulations
  2. Still lots of non-numeric data (e.g., reaction types, targets) that, if left as strings in the data file, don't make for the most friendly HDF experience

Alternatives

A middle ground could be XDMF, which combines HDF and XML. That does mean we have two files to pass around: a primary "light" XML file and a secondary "heavy" HDF file. I don't hate that solution, as it would let us offload the array-like things to HDF.

Compatibility

We presently have Chain.from_xml, which could be supported going forward. We could add Chain.from_hdf or Chain.from_xdmf to handle the new file. Same for the export side: export_to_xml / export_to_hdf / export_to_xdmf.
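One way the two readers could coexist is a small dispatcher keyed on file extension. This is only a sketch: `load_chain` and the stand-in reader functions are hypothetical, and the real readers would be the `Chain` classmethods discussed above.

```python
from pathlib import Path


# Stand-ins for the real readers; each just reports which path was taken.
def _from_xml(path):
    return ("xml", str(path))


def _from_hdf(path):
    return ("hdf", str(path))


# Hypothetical mapping from extension to reader.
_READERS = {".xml": _from_xml, ".h5": _from_hdf, ".hdf5": _from_hdf}


def load_chain(path):
    """Pick a reader from the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    try:
        reader = _READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported chain file extension: {suffix!r}") from None
    return reader(path)
```

Existing scripts that call the XML reader directly would keep working, and new code could call the dispatcher without caring which format is on disk.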

Other items

Lazy load

We could defer loading file data until it is needed. This might be a bigger lift for not much gain. Example: if you're doing a decay-only problem, you may not need to load fission yield data (neglecting spontaneous fission). Same for photon sources. Maybe a longer-term thing.
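A minimal sketch of the lazy idea, assuming each nuclide object is handed a callable that knows how to fetch its yields. `LazyNuclide` and `fake_loader` are hypothetical names; in practice the callable would read from the chain file.

```python
from functools import cached_property


class LazyNuclide:
    """Holds a name eagerly and fetches fission yields on first access."""

    def __init__(self, name, load_yields):
        self.name = name
        self._load_yields = load_yields

    @cached_property
    def fission_yields(self):
        # Runs once; later accesses reuse the cached result.
        return self._load_yields(self.name)


calls = []


def fake_loader(name):
    """Stand-in for a file read; records each call so we can count them."""
    calls.append(name)
    return {"Xe135": 0.1}  # placeholder value


nuc = LazyNuclide("U235", fake_loader)
calls_before_access = list(calls)
first = nuc.fission_yields
second = nuc.fission_yields
```

A decay-only calculation that never touches `fission_yields` would never trigger the read at all.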

Structure

I'm not sure about the best data layout. Is it nicer to be able to access all the depletion/decay data for a given nuclide as one group, like the current layout where everything is first under a <nuclide> tag?

Or is it nicer to have all the fission yield data grouped together, so we could have larger 2D arrays of products -> targets for a given energy group? We could also write the non-fission-yield reaction data in a tabular format, like

Nuclide Reaction Target Q
"H1" "(n,gamma)" "H2" 2224648.0

with each column stored as its own vector, since the columns have different data types.
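The column-per-vector table could look roughly like this in h5py. The group name `reactions` and the dataset names are assumptions for illustration, and the sketch relies on h5py 3 or newer for `asstr()`. The single row is the example from the table above.

```python
import os
import tempfile

import h5py
import numpy as np

# One row per reaction: (nuclide, reaction, target, Q).
rows = [("H1", "(n,gamma)", "H2", 2224648.0)]
nuclides, reactions, targets, q_values = zip(*rows)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reactions_sketch.h5")
    str_dt = h5py.string_dtype(encoding="utf-8")  # variable-length strings

    with h5py.File(path, "w") as f:
        grp = f.create_group("reactions")
        # Each column is its own 1D dataset with its own dtype.
        grp.create_dataset("nuclide", data=list(nuclides), dtype=str_dt)
        grp.create_dataset("reaction", data=list(reactions), dtype=str_dt)
        grp.create_dataset("target", data=list(targets), dtype=str_dt)
        grp.create_dataset("Q", data=np.array(q_values, dtype=np.float64))

    with h5py.File(path, "r") as f:
        grp = f["reactions"]
        # asstr() decodes stored bytes back into Python str on read.
        first_row = (
            grp["nuclide"].asstr()[0],
            grp["reaction"].asstr()[0],
            grp["target"].asstr()[0],
            float(grp["Q"][0]),
        )
```

Row `i` of the table is then element `i` of every column, so the columns must be kept the same length when writing.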
