trw.train.sequence_rebatch

Module Contents

Classes

RebatchStatistics

SequenceReBatch

This sequence will normalize the batch size of an underlying sequence

Functions

split_in_2_batches(batch: collections.MutableMapping, first_batch_size: int)

Split a single batch into 2 batches. The first batch will have a fixed size.

trw.train.sequence_rebatch.split_in_2_batches(batch: collections.MutableMapping, first_batch_size: int)

Split a single batch into 2 batches. The first batch will have a fixed size.

If there are not enough samples to split the batch, return (batch, None).

Parameters
  • batch – the batch to split

  • first_batch_size – the number of samples in the first batch. The remaining samples are placed in the second batch

Returns

a tuple (first batch, second batch)
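The splitting rule described above can be illustrated with a minimal stand-alone sketch. It is not the library's implementation: `split_in_2_batches_sketch` is a hypothetical name, and the sketch assumes a non-empty batch whose values are plain Python lists of equal length (real batches may hold tensors or arrays). It also assumes that a batch with exactly `first_batch_size` samples counts as "not enough to split".

```python
from typing import Any, Dict, List, Optional, Tuple

Batch = Dict[str, List[Any]]


def split_in_2_batches_sketch(
    batch: Batch, first_batch_size: int
) -> Tuple[Batch, Optional[Batch]]:
    # Assumption: all values are lists sharing the same length (the batch size).
    batch_size = len(next(iter(batch.values())))
    if batch_size <= first_batch_size:
        # Not enough samples to split: hand the batch back untouched.
        return batch, None
    # The first batch takes the leading samples, the second takes the rest.
    first = {name: values[:first_batch_size] for name, values in batch.items()}
    second = {name: values[first_batch_size:] for name, values in batch.items()}
    return first, second


batch = {"uid": [0, 1, 2, 3, 4], "value": ["a", "b", "c", "d", "e"]}
first, second = split_in_2_batches_sketch(batch, 2)
print(first)   # {'uid': [0, 1], 'value': ['a', 'b']}
print(second)  # {'uid': [2, 3, 4], 'value': ['c', 'd', 'e']}
```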

class trw.train.sequence_rebatch.RebatchStatistics

reset(self)

class trw.train.sequence_rebatch.SequenceReBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

This sequence will normalize the batch size of an underlying sequence

If a batch of the underlying sequence is too large, it is split into multiple batches. Conversely, if a batch is too small, several batches are merged until the expected batch size is reached.
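The split-and-merge behaviour can be sketched as a small generator. This is an illustration only, not the library's code: `rebatch_sketch` is a hypothetical name, batches are assumed to be dictionaries of equal-length Python lists, and `batch_size` is assumed to be positive. The handling of the final incomplete batch is a guess based on the `discard_batch_not_full` parameter name.

```python
from typing import Any, Dict, Iterable, Iterator, List

Batch = Dict[str, List[Any]]


def rebatch_sketch(
    batches: Iterable[Batch],
    batch_size: int,
    discard_batch_not_full: bool = False,
) -> Iterator[Batch]:
    pending: Batch = {}  # samples carried over from previous source batches
    for batch in batches:
        # Merge: append the incoming samples to the pending ones.
        for name, values in batch.items():
            pending.setdefault(name, []).extend(values)
        # Split: emit as many full batches as the pending samples allow.
        while pending and len(next(iter(pending.values()))) >= batch_size:
            yield {name: values[:batch_size] for name, values in pending.items()}
            pending = {name: values[batch_size:] for name, values in pending.items()}
    # Assumption: leftover samples form a final, smaller batch unless discarded.
    leftover = len(next(iter(pending.values()))) if pending else 0
    if leftover > 0 and not discard_batch_not_full:
        yield pending


source = [{"uid": [0, 1, 2, 3, 4]}, {"uid": [5]}, {"uid": [6, 7]}]
sizes = [len(b["uid"]) for b in rebatch_sketch(source, batch_size=3)]
print(sizes)  # [3, 3, 2]
```

With `discard_batch_not_full=True`, the same source would yield only the two full batches in this sketch.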

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence, which is particularly useful when exporting augmentations or samples.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the UIDs of the samples to keep. If new_sampler preserves ordering, the samples of the resampled sequence follow the order of uids

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, the existing sampler is reused

Returns

a subsampled Sequence
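The idea of selecting samples by UID while following the order of `uids` can be illustrated on a single dictionary batch. This is a hypothetical helper, not part of the library: `select_by_uids_sketch` and the `"sample_uid"` key are assumptions for illustration, and values are assumed to be plain Python lists. A UID that is absent from the batch raises `KeyError` in this sketch.

```python
from typing import Any, Dict, List, Sequence

Batch = Dict[str, List[Any]]


def select_by_uids_sketch(batch: Batch, uids: Sequence[Any], uids_name: str) -> Batch:
    # Map each UID to its position in the batch.
    position = {uid: i for i, uid in enumerate(batch[uids_name])}
    # Look the requested UIDs up in the order given, so the result follows `uids`.
    indices = [position[uid] for uid in uids]
    return {name: [values[i] for i in indices] for name, values in batch.items()}


batch = {"sample_uid": [10, 11, 12, 13], "value": ["a", "b", "c", "d"]}
subset = select_by_uids_sketch(batch, uids=[13, 10], uids_name="sample_uid")
print(subset)  # {'sample_uid': [13, 10], 'value': ['d', 'a']}
```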

__next__(self)

Returns

The next batch of data

__iter__(self)

Returns

An iterator of batches

close(self)

Special method to close the sequence and clean up its resources