test: add e2e test for the full training pipeline with a tiny dataset #72

@mihow

Context

There is no automated test that exercises the full pipeline from a Darwin Core Archive (DwC-A) to a trained model. The individual CLI commands have some test coverage, but integration issues (column mismatches, missing files between steps, incorrect shard patterns) are only caught by running the pipeline manually.

Proposed Changes

Add an end-to-end test that runs the full species classifier pipeline with a tiny dataset:

  • Use a small DwC-A fixture (or a subset of an existing one) with ~10-20 images across 3-5 species
  • Run all pipeline steps: fetch-images -> verify-images -> clean-dataset -> build_species_list.py -> split-dataset -> create-webdataset -> train-model
  • Train for only 1-3 epochs to keep runtime short
  • Use the small-dataset config values: MIN_INSTANCES=0, --val-frac 0.3, --test-frac 0.2 (already documented as commented-out alternatives in scripts/train_species_classifier.sh)
  • Assert that key outputs exist and are valid: category_map.json has expected species, split CSVs are non-empty, webdataset tar files are created, model checkpoint is saved
  • Add GitHub workflows for running the full e2e test locally and in the Docker SLURM environment
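
The output assertions listed above could be sketched roughly as follows. This is a minimal sketch, not the project's actual test: only category_map.json is named in this issue, so the split file names (train.csv, val.csv, test.csv), the checkpoint extensions, and the helper name check_pipeline_outputs are all assumptions to adjust to what the pipeline really writes. Because the layout of category_map.json is not specified here, the helper simply looks for each expected species among all string keys and values in the file.

```python
import csv
import json
import tarfile
from pathlib import Path

# Hypothetical names: only category_map.json is named in the issue.
SPLIT_FILES = ("train.csv", "val.csv", "test.csv")
CHECKPOINT_GLOBS = ("*.pt", "*.pth", "*.ckpt")


def _all_strings(obj):
    """Yield every string key or value found anywhere in a decoded JSON value."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield key
            yield from _all_strings(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from _all_strings(item)


def check_pipeline_outputs(out_dir, expected_species):
    """Return a list of problems found in out_dir (empty list means all checks passed)."""
    out_dir = Path(out_dir)
    problems = []

    # 1. category_map.json exists and mentions every expected species.
    map_path = out_dir / "category_map.json"
    if map_path.is_file():
        found = set(_all_strings(json.loads(map_path.read_text(encoding="utf-8"))))
        for name in expected_species:
            if name not in found:
                problems.append(f"species missing from category_map.json: {name}")
    else:
        problems.append("missing category_map.json")

    # 2. Each split CSV exists and has at least one row after the header.
    for name in SPLIT_FILES:
        path = out_dir / name
        if not path.is_file():
            problems.append(f"missing split file: {name}")
            continue
        with path.open(newline="") as handle:
            rows = list(csv.reader(handle))
        if len(rows) < 2:
            problems.append(f"split file has no data rows: {name}")

    # 3. At least one .tar shard exists, and every shard is a readable, non-empty archive.
    shards = sorted(out_dir.rglob("*.tar"))
    if not shards:
        problems.append("no webdataset .tar shards found")
    for shard in shards:
        if not tarfile.is_tarfile(str(shard)):
            problems.append(f"not a valid tar archive: {shard.name}")
            continue
        with tarfile.open(str(shard)) as archive:
            if not archive.getmembers():
                problems.append(f"empty shard: {shard.name}")

    # 4. At least one non-empty checkpoint file was saved.
    checkpoints = [p for pattern in CHECKPOINT_GLOBS for p in out_dir.rglob(pattern)]
    if not any(p.stat().st_size > 0 for p in checkpoints):
        problems.append("no non-empty model checkpoint found")

    return problems
```

In a pytest test this would typically end with `assert check_pipeline_outputs(out_dir, expected) == []`, so a failure lists every missing or invalid output at once rather than stopping at the first one.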

This could run in CI (CPU-only, training will be slow but feasible for 1-3 epochs on a tiny dataset) or as a local smoke test.
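
For the CPU-only case, one possible way to drive the steps from a test is a small fail-fast runner like the sketch below. The function name run_steps is hypothetical, and the real argument lists for each CLI command are not given in this issue, so the example uses stand-in commands. Setting CUDA_VISIBLE_DEVICES to an empty string hides GPUs from CUDA-based frameworks; whether the training step needs this is an assumption.

```python
import os
import subprocess
import sys


def run_steps(steps, env_overrides=None):
    """Run (name, argv) steps in order and stop at the first failure.

    Returns the names of the completed steps; raises RuntimeError naming
    the failing step so integration breakages are easy to locate.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ""  # assumption: training is CUDA-based; this keeps it on CPU
    env.update(env_overrides or {})

    completed = []
    for name, argv in steps:
        result = subprocess.run(argv, env=env, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {result.returncode}:\n{result.stderr}"
            )
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in commands; replace each argv with the real pipeline invocation.
    demo_steps = [
        ("fetch-images", [sys.executable, "-c", "print('fetch stand-in')"]),
        ("train-model", [sys.executable, "-c", "print('train stand-in')"]),
    ]
    print(run_steps(demo_steps))
```

Running the steps as subprocesses, rather than importing their entry points, exercises the same file hand-offs between steps that the manual runs currently catch.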
