aertb.core.hdf5tools

class aertb.core.hdf5tools.HDF5File(filename, groups='all')

Bases: object

A wrapper offering convenience methods over an HDF5 file. To access the underlying file object, use the .file attribute.

fixed_train_test_split(n_train, n_test, rand=23)
Parameters
  • n_train – number of train samples per group

  • n_test – number of test samples per group
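The idea of a fixed per-group split can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name fixed_split_sketch, the dictionary input, and the (group, name) return shape are all assumptions made for the example.

```python
import random


def fixed_split_sketch(samples_by_group, n_train, n_test, rand=23):
    """Hypothetical helper: take n_train + n_test samples from every group."""
    rng = random.Random(rand)  # seeded so the split is reproducible
    train, test = [], []
    for group, names in samples_by_group.items():
        if len(names) < n_train + n_test:
            raise ValueError(f"group {group!r} has too few samples")
        shuffled = list(names)
        rng.shuffle(shuffled)
        # The first n_train names go to train, the next n_test names to test.
        train += [(group, n) for n in shuffled[:n_train]]
        test += [(group, n) for n in shuffled[n_train:n_train + n_test]]
    return train, test


# Made-up sample names, for illustration only.
samples = {"car": ["c0", "c1", "c2", "c3"], "bike": ["b0", "b1", "b2"]}
train, test = fixed_split_sketch(samples, n_train=2, n_test=1)
```

With two groups, n_train=2 and n_test=1, the sketch yields four train entries and two test entries, and no sample appears in both lists.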

get_file_stats()

Returns a dictionary mapping each group to its sample count

get_sample_names(n_samples_group='all', rand=-1)

Returns the samples contained in the file

Parameters

rand – if greater than zero, the seed for the random selection; if negative, samples are returned in sequential order
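The meaning of rand described above can be sketched as follows. This is an illustrative stand-in, not the library's code: select_names_sketch is a hypothetical helper, and treating rand=0 or rand=None as sequential is an assumption, since the documentation only covers positive and negative values.

```python
import random


def select_names_sketch(names, n_samples="all", rand=-1):
    """Hypothetical helper mirroring the documented meaning of rand."""
    ordered = list(names)
    if rand is not None and rand > 0:
        # Positive value: use it as the seed for a reproducible shuffle.
        random.Random(rand).shuffle(ordered)
    # Negative value (and, by assumption, 0 or None): keep the stored order.
    return ordered if n_samples == "all" else ordered[:n_samples]


names = ["s0", "s1", "s2", "s3"]
sequential = select_names_sketch(names, n_samples=2)        # rand=-1
seeded_a = select_names_sketch(names, rand=5)
seeded_b = select_names_sketch(names, rand=5)
```

The same positive seed always produces the same order, which is what makes seeded selections repeatable across runs.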

iterator(n_samples_group='all', rand=23)

Returns an iterator over the file samples

Parameters
  • n_samples_group (str, optional) – the samples to consider for each label group, by default ‘all’

  • rand (int, optional) – a seed for shuffling, by default 23

Returns

an iterator that can be consumed with next() or a for loop

Return type

iterator

load_events(group, name)
Parameters
  • group – the group/label of the sample to load

  • name – the name of the sample to load

Returns

a structured array of events

Return type

np.ndarray

train_test_split(test_percentage, stratify=True, rand=23)

Creates a train/test split from a single HDF5 file.

Parameters
  • test_percentage – a float in [0.0, 1) giving the fraction of samples assigned to the test set

  • groups – a list of strings naming the groups from which samples are taken; all other groups are ignored

  • stratify – if True, the test percentage is applied per class, so the test set keeps the same class distribution as the full file. Otherwise test samples are drawn at random regardless of class, which may leave some classes out of the test set

  • rand – the random seed for shuffling the samples; use a negative number or None to return samples in sequential order
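The difference between a stratified and a non-stratified split can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: split_sketch, its dictionary input, the use of round() to size the test set, and the (train, test) return shape are all hypothetical.

```python
import random


def split_sketch(samples_by_group, test_percentage, stratify=True, rand=23):
    """Hypothetical helper returning (train, test) lists of (group, name)."""

    def maybe_shuffle(items):
        items = list(items)
        # Negative seed or None keeps the sequential order.
        if rand is not None and rand >= 0:
            random.Random(rand).shuffle(items)
        return items

    if stratify:
        # Apply the percentage inside every group, so class ratios are kept.
        train, test = [], []
        for group, names in samples_by_group.items():
            pairs = maybe_shuffle((group, n) for n in names)
            n_test = round(len(pairs) * test_percentage)
            test += pairs[:n_test]
            train += pairs[n_test:]
        return train, test

    # Non-stratified: pool every sample and cut once, ignoring the class.
    pairs = maybe_shuffle(
        (g, n) for g, names in samples_by_group.items() for n in names
    )
    n_test = round(len(pairs) * test_percentage)
    return pairs[n_test:], pairs[:n_test]


# Made-up groups: 10 samples of class "a" and 5 of class "b".
samples = {"a": [f"a{i}" for i in range(10)], "b": [f"b{i}" for i in range(5)]}
train, test = split_sketch(samples, test_percentage=0.2, stratify=True)
pooled_train, pooled_test = split_sketch(samples, 0.2, stratify=False)
```

In the stratified case each class contributes 20% of its own samples to the test set. In the pooled case only the total test size is guaranteed, so a small class may be absent from the test set.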

class aertb.core.hdf5tools.HDF5FileIterator(file, samples)

Bases: object

Returns an iterator over an HDF5 file. Suggested usage:

    iterator = HDF5FileIterator(..)
    for elem in iterator:
        # do something …

reset()

Resets the iterator
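The value of reset() is easiest to see with a stand-in class. The sketch below is not the HDF5FileIterator implementation: ResettableIterator is a hypothetical pure-Python class that only mimics the pattern of an iterator that can be rewound and reused.

```python
class ResettableIterator:
    """Hypothetical stand-in: iterates over samples and can be rewound."""

    def __init__(self, samples):
        self._samples = list(samples)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._samples):
            raise StopIteration
        item = self._samples[self._pos]
        self._pos += 1
        return item

    def reset(self):
        # Rewind to the first sample so the iterator can be consumed again.
        self._pos = 0


it = ResettableIterator([("car", "c0"), ("bike", "b0")])
first_pass = list(it)   # consumes every sample
exhausted = list(it)    # nothing left without a reset
it.reset()
second_pass = list(it)  # same samples again after reset()
```

Without the reset() call, a second loop over an exhausted iterator yields nothing, which matters when the same iterator is reused across several passes over the data.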

aertb.core.hdf5tools.create_hdf5_dataset(dataset_name, file_or_dir, ext, polarities=[0, 1], to_secs=True)

Creates an HDF5 file with the specified name from a parent directory containing .dat files. A separate group is created for each subdirectory.

Parameters
  • dataset_name – the name of the HDF5 file, including the file extension

  • file_or_dir – the path to the parent directory where the .dat files reside

  • polarities – the polarity encoding for the data, either [0, 1] or [-1, 1]
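The two polarity encodings differ only in how the "off" event is represented. The sketch below illustrates the mapping. It assumes the raw polarities are stored as 0/1 bits, which the documentation does not state, and encode_polarities_sketch is a hypothetical helper, not part of aertb.

```python
def encode_polarities_sketch(raw, polarities=(0, 1)):
    """Hypothetical helper: map raw 0/1 polarity bits onto the chosen pair."""
    off, on = polarities
    return [on if p else off for p in raw]


raw_bits = [0, 1, 1, 0]  # made-up polarity bits, for illustration only
unsigned = encode_polarities_sketch(raw_bits)                      # [0, 1]
signed = encode_polarities_sketch(raw_bits, polarities=(-1, 1))    # [-1, 1]
```

The signed [-1, 1] form can be convenient when events are summed into frames, because opposite polarities then cancel out.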