Releases: NeotomaDB/DataBUS

DataBUS v2.0.0

05 Mar 21:14

[2.0.0] - 2026-03-05

Added

  • Universal YAML template (data/template_example.yml).
  • Example CSV data file (data/data_example.csv) demonstrating the full column set.
  • Comprehensive test suite with coverage reporting via Codecov.
  • CI pipeline with Ruff linting, pytest + coverage, and Codecov upload (.github/workflows/ci.yml).
  • MkDocs documentation site with auto-generated API reference via mkdocstrings.
  • Tutorials rewritten to reflect the actual two-pass workflow (databus_example.py).
  • OpenSSF Best Practices badge tracking.

Changed

  • Major refactor of the validation/upload architecture (BU-334, BU-349): each validator now also handles insertion when a populated databus dict is supplied, eliminating the separate neotomaUploader module and reducing code duplication.
  • Refactored pull_params into smaller, testable helper functions in utils.py, removing the dependency on pandas.
  • Contact handling consolidated: all contact types (PI, collector, processor, analyst) now go through valid_contact, with chronology modeler assignment handled within valid_chronologies. This significantly reduces repeated code.
  • Data upload now tracks inserted IDs so that data uncertainties can be linked correctly.
  • Chronology handling improved to properly manage calendar years, default chronologies, and sample age linkage.
  • Geopolitical unit insertion updated to handle entities like Scotland under the UK.
  • Improved logging with logging_dict and per-file .valid.log output.
  • Adopted Ruff as the sole linter and formatter, replacing previous tooling.
  • Switched to uv for dependency management and script execution.
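The combined validate-then-insert pattern described in the first item above can be sketched roughly as follows. This is a minimal illustration and not DataBUS code: the function name `valid_site_sketch`, the `insert` callback, and the shape of the `databus` dict are all assumptions made for the example.

```python
from typing import Callable, Optional


def valid_site_sketch(
    row: dict,
    databus: Optional[dict] = None,
    insert: Optional[Callable[[dict], int]] = None,
) -> dict:
    """Hypothetical validator: always validates, inserts only on the second pass."""
    messages = []
    name = row.get("sitename")
    valid = bool(name)
    if not valid:
        messages.append("sitename is required")

    result = {"valid": valid, "messages": messages, "id": None}

    # Second pass: a populated databus dict signals that earlier steps
    # validated, so the same function may now perform the insert.
    if valid and databus and insert is not None:
        result["id"] = insert({"sitename": name})
        databus["sites"] = result

    return result


# Pass 1: validation only, nothing is written.
first = valid_site_sketch({"sitename": "Example Lake"})

# Pass 2: same call, now with a populated dict and a stand-in insert callback.
state = {"validated": True}
second = valid_site_sketch(
    {"sitename": "Example Lake"}, databus=state, insert=lambda rec: 101
)
```

Keeping validation and insertion in one function means the checks that run in the first pass are the same checks that guard the insert in the second pass, which is one plausible reason a separate uploader module becomes unnecessary.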

Fixed

  • Chron controls now handle calendar years properly.
  • U-Th series insertion works correctly when the number of geochron indices differs from sample indices.
  • Fixed dataset–publication and dataset–database linking during upload.
  • Fixed collector insertion for NODE community datasets.
  • Fixed variable validation to handle null values without comparing null against null.
  • Numerous typos across chroncontrols.py, sample.py, Chronology.py, and others.
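As an illustration of the null-handling fix for variable validation, one possible reading is that a missing value should be reported as missing instead of being compared for equality. The sketch below assumes that interpretation; `compare_variable_sketch` and its return labels are hypothetical and do not come from the DataBUS source.

```python
from typing import Optional


def compare_variable_sketch(csv_value: Optional[str], db_value: Optional[str]) -> str:
    """Hypothetical check: never treat two missing values as a match."""
    # Guard first, so None == None is never evaluated as a successful comparison.
    if csv_value is None or db_value is None:
        return "missing"
    return "match" if csv_value == db_value else "mismatch"
```

Without the guard, two null values would compare equal and a record with no variable information could pass validation silently.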

DataBUSv1.0.0

01 Dec 18:32

DataBUS is a Python-based bulk uploader for the Neotoma Paleoecology Database. It helps users prepare, validate, and upload large sets of paleoecological records using a YAML + CSV template, validation routines, and an upload script that pushes data into a temporary holding database for subsequent ingestion into Neotoma.

Key Features (v1.0.0)

  • Template-based uploads: Define data using a standardized YAML + CSV template that maps CSV columns to Neotoma database tables and columns via a "crosswalk", enabling consistent and repeatable bulk uploads.
  • Validation suite: Before upload, DataBUS validates submitted CSV data against the template definitions. This includes checks for Site, Collection Unit, Analysis Unit, Dataset, Sample, Data values, dating horizons, and more, reducing the risk of malformed or invalid uploads.
  • Automated upload script: Once validated, users can run a single command (python3 data_upload.py) to push data into either the neotomaholdingtank database or the main neotoma database.
  • Open-source & MIT licensed: DataBUS is released under the MIT license, enabling free use, modification, and redistribution under standard open-source terms.
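To make the template-plus-validation idea above concrete, here is a minimal sketch of checking CSV rows against a crosswalk. It is not the DataBUS implementation: a plain dict stands in for the parsed YAML template, and the column names, the `required` flag, the Neotoma field paths, and `check_rows` are all assumptions made for illustration.

```python
import csv
import io

# Stand-in for a parsed YAML template: CSV column -> Neotoma table.column.
# Column names and field paths are illustrative assumptions.
TEMPLATE = {
    "Site Name": {"neotoma": "ndb.sites.sitename", "required": True},
    "Depth": {"neotoma": "ndb.analysisunits.depth", "required": False},
}

CSV_TEXT = "Site Name,Depth\nExample Lake,10\n,20\n"


def check_rows(csv_text: str, template: dict) -> list[str]:
    """Return one message per problem; an empty list means the file passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = [
        f"missing column: {col}"
        for col in template
        if col not in (reader.fieldnames or [])
    ]
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        for col, rule in template.items():
            if rule["required"] and not (row.get(col) or "").strip():
                problems.append(f"line {line_no}: {col} is empty")
    return problems


problems = check_rows(CSV_TEXT, TEMPLATE)
print(problems)  # ['line 3: Site Name is empty']
```

Collecting every problem before reporting, instead of stopping at the first failure, lets a user fix a whole file in one round of edits.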

Known limitations / Scope & Considerations

  • DataBUS currently expects templates to be prepared as YAML + CSV, and data must be supplied in CSV format.
  • Users must follow template rules carefully (column names, vocabularies, types, etc.); misconfigured templates or CSVs may result in validation failures.
  • Because this is the first official release, the tool will still evolve; future versions might include usability enhancements, more automated checks, or UI tooling.

Full Changelog: v0.0.1...v1.0.0

Alpha DataBUS release

23 Nov 19:29
e4e342e

This is the alpha release of DataBUS, including template generation and initial package development.

Full Changelog: https://github.com/NeotomaDB/DataBUS/commits/v0.0.1