Skip to content

Support get_class and roundtrip on datetime datasets#1313

Open
rly wants to merge 15 commits intodevfrom
get_class_datetime
Open

Support get_class and roundtrip on datetime datasets#1313
rly wants to merge 15 commits intodevfrom
get_class_datetime

Conversation

@rly
Copy link
Copy Markdown
Contributor

@rly rly commented Sep 19, 2025

Motivation

Fixes #429. Related to (now-merged) #1312.

Previously, neurodata types with isodatetime/datetime datasets could not be roundtripped through get_class-generated classes without manually wiring up an @ObjectMapper.constructor_arg converter to parse the stored ISO string back into a datetime. This PR makes that work end to end.

What changed

  • Write path. New _isoformat dtype helper converts datetime/date values to ASCII-encoded ISO bytes on write so they land as fixed-length text in HDF5.
  • Read path. New _parse_isoformat parses stored values back to datetime (or date for date-only strings) for any dataset/attribute whose spec declares dtype='isodatetime' or dtype='datetime', including arrays.
  • Inc-site dtype overrides. construct() now accepts an optional spec_ext so a parent group that declares dtype='isodatetime' on a data_type_inc-extending dataset whose own def has no dtype is honored on read. spec_ext is plumbed through classgenerator, manager, and objectmapper.
  • Scalar datetime in generated classes. The scalar_data docval macro now includes datetime/date, so get_class-generated dtype-less classes accept scalar datetime values without per-class shims.
  • Test scaffolding. _assert_data_equal skips the length comparison for scalar Data (a 0-d datetime has no len), preventing false negatives in roundtrip assertions.

How to test

The new tests cover the main paths:

  • tests/unit/build_tests/mapper_tests/test_build_datetime.py — datetime/date build round-trips for both regular specs and extension specs (ported from [WIP] Fix building datetime datasets with data_type_inc #1314).
  • tests/unit/test_hdf5_roundtrip_extensions.py — full HDF5 roundtrip including the inc-site dtype override case.
  • tests/unit/build_tests/test_classgenerator.pyget_class with datetime fields, including scalar datetime acceptance via the widened scalar_data macro.
pytest tests/unit/build_tests/mapper_tests/test_build_datetime.py \
       tests/unit/test_hdf5_roundtrip_extensions.py \
       tests/unit/build_tests/test_classgenerator.py

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Does the PR clearly describe the problem and the solution?
  • Have you reviewed our Contributing Guide?
  • Does the PR use "Fix #XXX" notation to tell GitHub to close the relevant issue numbered XXX when the PR is merged?

@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 69.51220% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.97%. Comparing base (f82893b) to head (349602f).

Files with missing lines Patch % Lines
src/hdmf/build/objectmapper.py 65.95% 9 Missing and 7 partials ⚠️
src/hdmf/build/classgenerator.py 67.85% 6 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1313      +/-   ##
==========================================
- Coverage   93.17%   92.97%   -0.21%     
==========================================
  Files          41       41              
  Lines       10099    10158      +59     
  Branches     2079     2099      +20     
==========================================
+ Hits         9410     9444      +34     
- Misses        413      428      +15     
- Partials      276      286      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

rly and others added 12 commits February 11, 2026 22:14
_isoformat was returning a Python str while convert_dtype labeled the
result as the 'ascii' dtype. Encode to bytes (matching _ascii's contract)
so the scalar branch produces the expected b'YYYY-MM-DDTHH:MM:SS' values.
Also handle str and bytes passthrough for idempotence when __convert_string
has already pre-converted the value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add Column (Data) and BarWithColumnData (Container) helper classes and
two tests exercising build of datetime/date values through a dataset
spec with data_type_inc set. test_date_array_ext_spec specifically
mirrors the traceback in #1311.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _parse_isoformat helper (reverse of _isoformat) and wire it into
__get_subspec_values so attributes and dataset data with isodatetime/datetime
dtype are converted from stored ISO str/bytes back into datetime/date before
being passed to the container constructor. Also applied to Data containers'
direct const_args['data'] assignment in construct().

Scoped to scalar values; inc-site dtype override and array datasets are not
yet handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Pass ignore_name/ignore_hdmf_attrs to assertContainerEqual so the
  top-level 'root' vs 'my_group' name mismatch and read-time object_id
  reassignment don't fail the equality check.
- my_data3 has no data_type_inc, so it is stored as a named scalar
  dataset field on the group and read back as a plain datetime value.
  Assert against the value directly rather than .data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
construct() now accepts an optional spec_ext (parallel to build()) that
carries per-site overrides from the parent's subspec. Threaded through
BuildManager.construct, TypeMap.construct, and ObjectMapper.construct.

The Data-container construct path uses spec_ext for dtype resolution when
the base def has no dtype, so a parent group spec like
    DatasetSpec(data_type_inc='X', dtype='datetime', ...)
correctly drives datetime parsing on read even when X's def declares no
dtype. __get_sub_builders passes the inc-site subspec as spec_ext at the
construct call site.

spec_ext defaults to None to preserve backward compatibility with all
existing call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Data containers can wrap scalar values (e.g., a single datetime) for which
len() is not defined. Gate the length comparison on _is_collection so the
helper works for both scalar and collection-valued Data instances.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scalar_data now includes datetime.datetime and datetime.date so generated
classes for dtype-less DatasetSpecs accept scalar datetime values on
construct (read path) and on user-side construction (write path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- classgenerator: update inherited `data` docval arg in place so subclasses
  preserve parent-set keys like `default` (e.g., VectorData -> [])
- objectmapper: extend __parse_datetime_on_read to handle lists, tuples,
  numpy arrays, and h5py.Dataset by eager materialization
- test_hdf5_roundtrip_extensions: clean imports/dead setup and add an
  array roundtrip test
- test_classgenerator: regression test that subclasses preserve a parent's
  `data` default

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rly rly changed the title [WIP] Support get_class and roundtrip on datetime datasets Support get_class and roundtrip on datetime datasets Apr 26, 2026
@rly rly marked this pull request as ready for review April 26, 2026 05:44
rly added a commit that referenced this pull request Apr 26, 2026
Adds a write-once `resolved_spec` property on the base `Builder` class
so the build and read paths can record which subspec each builder was
matched to. Defaults to None, matches the existing `parent` and `source`
setter discipline (raises on overwrite). No callers wired up yet;
follow-up phases migrate the matchers and consumers.

Includes resolved_specs_epic.md as the planning doc for the full
multi-phase refactor that supersedes the spec_ext plumbing in #1313.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rly added a commit that referenced this pull request Apr 26, 2026
Introduces the first read-time consumer of `builder.resolved_spec`:
attributes and datasets whose spec declares `dtype='isodatetime'` or
`dtype='datetime'` are parsed back into Python `datetime` / `date`
objects on construct.

* Adds `_parse_isoformat` module helper.
* Adds `ObjectMapper.__parse_datetime_on_read` static method handling
  scalars, lists/tuples, numpy arrays, h5py datasets, and zarr arrays
  via the numpy array protocol.
* Wires the consumer into three sites:
  - attribute reads in `__get_subspec_values` (attribute specs are
    already position-resolved).
  - the inline DatasetSpec branch in `__get_subspec_values`, using
    `builder.resolved_spec or spec`.
  - `ObjectMapper.construct` for `Data` containers, using
    `builder.resolved_spec or self.spec`.

This subsumes the read-side functionality of #1313 without the
`spec_ext` plumbing on `BuildManager.construct` / `TypeMap.construct`
/ `ObjectMapper.construct`. The DynamicTable column case (a
VectorData column whose parent's inc-site declares
`dtype='isodatetime'`) reads back as datetime values via Phase 2's
matcher writing to `resolved_spec`.

Stacked on #1449. Part of #1448.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rly added a commit that referenced this pull request Apr 27, 2026
Introduces the first read-time consumer of `builder.matched_spec`:
attributes and datasets whose spec declares `dtype='isodatetime'` or
`dtype='datetime'` are parsed back into Python `datetime` / `date`
objects on construct.

* Adds `_parse_isoformat` module helper.
* Adds `ObjectMapper.__parse_datetime_on_read` static method handling
  scalars, lists/tuples, numpy arrays, h5py datasets, and zarr arrays
  via the numpy array protocol.
* Wires the consumer into three sites:
  - attribute reads in `__get_subspec_values` (attribute specs are
    already position-resolved).
  - the inline DatasetSpec branch in `__get_subspec_values`, using
    `builder.matched_spec or spec`.
  - `ObjectMapper.construct` for `Data` containers, using
    `builder.matched_spec or self.spec`.

This subsumes the read-side functionality of #1313 without the
`spec_ext` plumbing on `BuildManager.construct` / `TypeMap.construct`
/ `ObjectMapper.construct`. The DynamicTable column case (a
VectorData column whose parent's inc-site declares
`dtype='isodatetime'`) reads back as datetime values via Phase 2's
matcher writing to `matched_spec`.

Stacked on #1449. Part of #1448.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rly added a commit that referenced this pull request Apr 27, 2026
Introduces the first read-time consumer of `builder.matched_spec`:
attributes and datasets whose spec declares `dtype='isodatetime'` or
`dtype='datetime'` are parsed back into Python `datetime` / `date`
objects on construct.

* Adds `_parse_isoformat` module helper.
* Adds `ObjectMapper.__parse_datetime_on_read` static method handling
  scalars, lists/tuples, numpy arrays, h5py datasets, and zarr arrays
  via the numpy array protocol.
* Wires the consumer into three sites:
  - attribute reads in `__get_subspec_values` (attribute specs are
    already position-resolved).
  - the inline DatasetSpec branch in `__get_subspec_values`, using
    `builder.matched_spec or spec`.
  - `ObjectMapper.construct` for `Data` containers, using
    `builder.matched_spec or self.spec`.

This subsumes the read-side functionality of #1313 without the
`spec_ext` plumbing on `BuildManager.construct` / `TypeMap.construct`
/ `ObjectMapper.construct`. The DynamicTable column case (a
VectorData column whose parent's inc-site declares
`dtype='isodatetime'`) reads back as datetime values via Phase 2's
matcher writing to `matched_spec`.

Stacked on #1449. Part of #1448.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_class to handle datetimes

1 participant