You can retrieve a dataset directly from Zenodo with the following aeon code. By default it stores the data in your home directory, or you can specify the path with the extract_path argument:
from aeon.datasets import load_classification
X,y = load_classification("BasicMotions")
X,y = load_classification("BasicMotions", extract_path="C:\\Temp\\")

If you download the data using aeon, it is stored in the extract_path in a
directory with the name of the problem, in the files
<NAME>_TRAIN.ts and <NAME>_TEST.ts, and possibly <NAME>_TRAIN_eq.ts or <NAME>_TEST_nmv.ts.
The extra files are for cases where time series in the original dataset have unequal
lengths or missing values. Versions with lengths equalised or missing values imputed are
stored in files with the _eq or _nmv suffix.
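
As a rough illustration of the naming convention above, the sketch below builds the file names you would expect for a problem and reports which of them are present on disk. The helpers expected_ts_files and present_ts_files are hypothetical and not part of aeon, and the sketch assumes the _eq and _nmv suffixes can each appear on both the train and the test file.

```python
from pathlib import Path


def expected_ts_files(extract_path, name):
    """Return candidate .ts paths for a problem (hypothetical helper, not part of aeon)."""
    problem_dir = Path(extract_path) / name
    paths = []
    # "" is the original data, "_eq" the equal length version,
    # "_nmv" the version with missing values imputed.
    for suffix in ("", "_eq", "_nmv"):
        for split in ("TRAIN", "TEST"):
            paths.append(problem_dir / f"{name}_{split}{suffix}.ts")
    return paths


def present_ts_files(extract_path, name):
    """Return the names of the candidate files that exist on disk."""
    return [p.name for p in expected_ts_files(extract_path, name) if p.is_file()]
```

Calling present_ts_files("C:\\Temp\\", "BasicMotions") after a download would then list whichever of the six candidate files were actually extracted.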
When you call load_classification again, it looks first in the extract_path or in your home directory to see if the file exists. If it does, it does not download it again. You can load the combined train/test splits or the train/test separately with the argument split:
X,y = load_classification("BasicMotions") # Load combined train/test
trainX,trainy = load_classification("BasicMotions", split="train")
testX,testy = load_classification("BasicMotions", split="test")

Equal length datasets are stored in 3D numpy arrays of shape (n_cases, n_channels, n_timepoints).
Note this is different to some other packages such as TensorFlow, which assume a single
time series has shape (n_timepoints, n_channels), so if you are not using aeon
you may need to transpose the last two axes.
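
A minimal sketch of converting between the two layouts, using a synthetic array in place of a loaded dataset. The sizes are arbitrary and only NumPy is assumed. Swapping the channel and time axes needs a transpose: a plain reshape to the new shape would run without error but would mix values from different channels.

```python
import numpy as np

# Synthetic stand-in for an equal length dataset:
# 4 cases, 3 channels, 5 time points (arbitrary sizes).
n_cases, n_channels, n_timepoints = 4, 3, 5
X = np.arange(n_cases * n_channels * n_timepoints, dtype=float).reshape(
    n_cases, n_channels, n_timepoints
)

# Swap the last two axes to get (n_cases, n_timepoints, n_channels).
X_time_first = np.transpose(X, (0, 2, 1))

# The same value is now indexed with channel and time point swapped.
assert X_time_first.shape == (n_cases, n_timepoints, n_channels)
assert X_time_first[1, 4, 2] == X[1, 2, 4]
```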
To use sklearn classifiers directly on multivariate equal length datasets, one option is to flatten
the data so that the 3D array (n_cases, n_channels, n_timepoints) becomes a 2D array
of shape (n_cases, n_channels*n_timepoints).
flatTrainX = trainX.reshape(trainX.shape[0], trainX.shape[1] * trainX.shape[2])
flatTestX = testX.reshape(testX.shape[0], testX.shape[1] * testX.shape[2])

Unequal length datasets are stored in a list of 2D numpy arrays. You can control
whether to load the equal length version with the parameter load_equal_length.
X,y = load_classification("JapaneseVowels", load_equal_length=False) # Unequal length example

Imputed missing value versions can be loaded with the argument load_no_missing.
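
To make the unequal length format concrete, the sketch below builds a small synthetic list of 2D arrays, each of shape (n_channels, n_timepoints) with a different number of time points, and pads them with zeros into a single 3D array. The helper pad_to_equal_length is hypothetical and not part of aeon. Zero padding at the end is just one simple choice, and it is not claimed to match how the equal length versions in the archive were produced.

```python
import numpy as np


def pad_to_equal_length(X_list, fill_value=0.0):
    """Pad a list of (n_channels, n_timepoints) arrays at the end to a common length.

    Hypothetical helper for illustration, not part of aeon.
    """
    max_len = max(x.shape[1] for x in X_list)
    n_channels = X_list[0].shape[0]
    padded = np.full((len(X_list), n_channels, max_len), fill_value, dtype=float)
    for i, x in enumerate(X_list):
        # Copy the series into the start of the row, leaving the tail as fill_value.
        padded[i, :, : x.shape[1]] = x
    return padded


# Synthetic unequal length data: 3 cases, 2 channels, lengths 4, 6 and 5.
X_unequal = [np.ones((2, length)) for length in (4, 6, 5)]
lengths = [x.shape[1] for x in X_unequal]
X_padded = pad_to_equal_length(X_unequal)
```

The result has shape (n_cases, n_channels, longest length), so it can be handled like the equal length arrays described earlier.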
You can download whole archives from Zenodo or in code:
from aeon.datasets import download_archive
download_archive(archive="UEA", extract_path="C:\\Temp\\")
The archive argument should currently be one of "EEG", "UCR", "UEA", "Imbalanced", "TSR" or "Unequal". See
the aeon documentation for more details. There are lists of datasets in aeon and a
dictionary of all Zenodo keys:
from aeon.datasets.tsc_datasets import multiverse_core, multiverse2026, eeg2026, tsc_zenodo