HT Parsers is a Python package developed at Institut Néel to parse and structure data from high-throughput experiments.
The package supports multiple characterization techniques and stores data using the MaMMoS ontology, enabling consistent, machine-readable datasets that can be exported to HDF5 (NeXus-inspired format).
Supported techniques include:
- EDX – elemental composition
- MOKE – magnetic measurements
- XRD – structural characterization
- Profilometry (DEKTAK) – film thickness
- SEM – microstructure imaging
Ontology used in this project: https://github.com/MaMMoS-project/MagneticMaterialsOntology
Clone the repository and install in editable mode:
git clone https://github.com/MaMMoS-project/ht-data-parser.git
cd ht-data-parser
python -m venv .venv
source .venv/bin/activate
pip install -e .
Dependencies are managed through pyproject.toml. Python 3.11 or higher is required.
The Jupyter notebook DataParser.ipynb gives a detailed walkthrough of ht-data-parser. Basic usage is summarized below.
Each measurement is represented by a Meas class containing:
- metadata – instrument metadata
- data – raw measurement data
- results – processed quantities
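As a rough mental model of this three-part layout, the sketch below uses plain dataclasses. `SketchQuantity` and `SketchMeas` are hypothetical names invented for illustration, not the package's classes, and the real `Meas` implementation may differ. The values are made up.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchQuantity:
    """Hypothetical stand-in for one stored quantity (not the package's class)."""
    value: Any
    unit: str
    ontology: str  # assumed here to be a plain label naming an ontology concept


@dataclass
class SketchMeas:
    """Hypothetical container mirroring the metadata / data / results split."""
    metadata: dict[str, Any] = field(default_factory=dict)
    data: dict[str, SketchQuantity] = field(default_factory=dict)
    results: dict[str, SketchQuantity] = field(default_factory=dict)


# Made-up example values, for illustration only.
meas = SketchMeas(metadata={"instrument": "example"})
meas.data["Energy"] = SketchQuantity(value=[0.0, 0.01, 0.02], unit="keV", ontology="Energy")
print(meas.data["Energy"].unit)
```

Keeping raw data and processed results in separate mappings means a processing step can be rerun without touching what the instrument recorded.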
Example with EDX:
import pathlib
from src.measurements.edxmeas import EdxMeas
path = pathlib.Path("Spectrum_(9,9).spx")
edx_spectrum = EdxMeas(path)
fig = edx_spectrum.plot()
fig.show()
Quantities are stored as ontology-aware entities:
energy = edx_spectrum.data["Energy"]
energy.value
energy.unit
energy.ontology
For wafer-scale experiments (~250 positions), the package provides Scan classes that parse entire folders of measurements.
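One way to picture what a scan has to do with a folder is to map each file to its wafer position. The sketch below assumes file names encode the position as `(x,y)`, as in `Spectrum_(9,9).spx` from the EDX example. That convention is inferred from the single file name shown, and `parse_position` and `index_folder` are hypothetical helpers, not part of the package.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming convention: "<anything>(x,y)<anything>", e.g. "Spectrum_(9,9).spx".
_POSITION = re.compile(r"\((-?\d+),\s*(-?\d+)\)")


def parse_position(filename: str) -> Optional[tuple[int, int]]:
    """Return the (x, y) position encoded in a file name, or None if absent."""
    match = _POSITION.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def index_folder(folder: Path, suffix: str = ".spx") -> dict[tuple[int, int], Path]:
    """Map each wafer position to the measurement file found for it."""
    index: dict[tuple[int, int], Path] = {}
    for path in sorted(folder.glob(f"*{suffix}")):
        position = parse_position(path.name)
        if position is not None:  # skip files that carry no position
            index[position] = path
    return index


print(parse_position("Spectrum_(9,9).spx"))  # (9, 9)
```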
Example:
from src.scans.edxscan import EdxScan
edx_scan = EdxScan("EDX_folder")
edx_scan.heatmap("results.Nd.AtomPercent")
edx_scan.list_scalar_quantities()  # List all values that can be plotted
Supported scan classes:
- EdxScan
- MokeScan
- SmartlabScan
- EsrfScan
- ProfilScan
- SemScan
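The string passed to `heatmap` in the scan example, `"results.Nd.AtomPercent"`, reads like a dotted path into nested results. How the package resolves it is not shown here. The sketch below is one plausible way to resolve such a path against nested dictionaries, with `resolve_path` as a hypothetical helper and made-up numbers.

```python
from functools import reduce
from typing import Any


def resolve_path(record: dict[str, Any], dotted: str) -> Any:
    """Walk a nested dict along a dotted key path; raises KeyError if a key is missing."""
    return reduce(lambda node, key: node[key], dotted.split("."), record)


# Made-up values for two wafer positions, for illustration only.
records = {
    (0, 0): {"results": {"Nd": {"AtomPercent": 12.5}}},
    (0, 1): {"results": {"Nd": {"AtomPercent": 13.0}}},
}

# One scalar per position: the kind of mapping a heatmap could be drawn from.
values = {pos: resolve_path(rec, "results.Nd.AtomPercent") for pos, rec in records.items()}
print(values)
```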
Measurements and scans can be exported to HDF5:
edx_scan.to_hdf5("dataset.hdf5")
The resulting structure follows conventions inspired by the NeXus scientific data format: