trw.train.sampler

Module Contents

Classes

Sampler

Base class for all Samplers.

_SamplerSequentialIter

Lazily iterate the indices of a sequential batch

SamplerSequential

Samples elements sequentially, always in the same order.

SamplerRandom

Samples elements randomly. If without replacement, then sample from a shuffled dataset.

SamplerSubsetRandom

Samples elements randomly from a given list of indices, without replacement.

SamplerClassResampling

Resample the samples so that class_name classes have equal probability of being sampled.

class trw.train.sampler.Sampler

Bases: object

Base class for all Samplers.

Every Sampler subclass has to provide an __iter__ method, providing a way to iterate over indices of dataset elements, and a __len__ method that returns the length of the returned iterators.

abstract initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

abstract __iter__(self)

Returns: an iterator that returns indices of the original data source

abstract __len__(self)

Returns: the number of elements the sampler will return in a single iteration

abstract get_batch_size(self)

Returns: the size of the batch
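
The four methods above can be illustrated with a minimal sketch. It does not import trw: `SamplerInterface` is a hypothetical stand-in that mirrors the documented method names, and `EveryOtherSampler` is an illustrative subclass. It also assumes the data source supports `len()`, which may not hold for the data sources trw actually passes to `initializer`.

```python
from abc import ABC, abstractmethod


class SamplerInterface(ABC):
    """Hypothetical stand-in mirroring the four documented methods."""

    @abstractmethod
    def initializer(self, data_source):
        ...

    @abstractmethod
    def __iter__(self):
        ...

    @abstractmethod
    def __len__(self):
        ...

    @abstractmethod
    def get_batch_size(self):
        ...


class EveryOtherSampler(SamplerInterface):
    """Illustrative sampler that yields every second index, one at a time."""

    def __init__(self):
        self.indices = []

    def initializer(self, data_source):
        # Assumption: data_source supports len(); real data sources may differ.
        self.indices = list(range(0, len(data_source), 2))

    def __iter__(self):
        # Iterate over indices into the data source, not over its elements.
        return iter(self.indices)

    def __len__(self):
        # Number of indices produced by one pass over the iterator.
        return len(self.indices)

    def get_batch_size(self):
        return 1


sampler = EveryOtherSampler()
sampler.initializer(["a", "b", "c", "d", "e"])
picked = list(sampler)
```

Note that `initializer` is called before iterating, so the sampler only learns the size of the data source at that point rather than at construction time.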

class trw.train.sampler._SamplerSequentialIter(nb_samples, batch_size)

Lazily iterate the indices of a sequential batch
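
As a rough illustration of lazy sequential batching, the sketch below builds each batch of indices only when `__next__` is called. `SequentialBatchIter` is a hypothetical name, and returning a shorter final batch is an assumption; the behavior of the actual `_SamplerSequentialIter` is not documented here.

```python
class SequentialBatchIter:
    """Hypothetical lazy iterator over consecutive batches of indices."""

    def __init__(self, nb_samples, batch_size):
        self.nb_samples = nb_samples
        self.batch_size = batch_size
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.nb_samples:
            raise StopIteration
        start = self.current
        # Assumption: the last batch is truncated rather than dropped or padded.
        stop = min(start + self.batch_size, self.nb_samples)
        self.current = stop
        return list(range(start, stop))


batches = list(SequentialBatchIter(nb_samples=5, batch_size=2))
```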

__next__(self)

class trw.train.sampler.SamplerSequential(batch_size=1)

Bases: Sampler

Samples elements sequentially, always in the same order.

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)

Returns: the size of the batch

class trw.train.sampler.SamplerRandom(replacement=False, nb_samples_to_generate=None, batch_size=1)

Bases: Sampler

Samples elements randomly. Without replacement, indices are drawn from a shuffled dataset. With replacement, the user can specify the number of samples to draw with nb_samples_to_generate.
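
The difference between the two modes can be sketched in plain Python. `random_indices` and its `seed` argument are hypothetical and are not part of trw; the fallback to one draw per element when `nb_samples_to_generate` is `None` is also an assumption.

```python
import random


def random_indices(nb_samples, replacement=False, nb_samples_to_generate=None, seed=None):
    """Hypothetical helper returning a list of sampled indices."""
    rng = random.Random(seed)
    if replacement:
        # Assumption: default to one draw per element when no count is given.
        count = nb_samples if nb_samples_to_generate is None else nb_samples_to_generate
        return [rng.randrange(nb_samples) for _ in range(count)]
    # Without replacement: a permutation, so every index appears exactly once.
    indices = list(range(nb_samples))
    rng.shuffle(indices)
    return indices


without_replacement = random_indices(6, seed=0)
with_replacement = random_indices(6, replacement=True, nb_samples_to_generate=10, seed=0)
```

Without replacement the result always has as many entries as the dataset. With replacement it can be longer or shorter, and the same index may appear more than once.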

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns indices of the original data source

__next__(self)

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)

Returns: the size of the batch

class trw.train.sampler.SamplerSubsetRandom(indices)

Bases: Sampler

Samples elements randomly from a given list of indices, without replacement.

Parameters

indices (sequence) – a sequence of indices
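
A minimal sketch of sampling from a fixed subset without replacement, for example to keep a validation split separate from training data. `subset_random_indices` and `seed` are hypothetical names and do not reflect trw's implementation.

```python
import random


def subset_random_indices(indices, seed=None):
    """Hypothetical helper: return the given indices in random order."""
    rng = random.Random(seed)
    shuffled = list(indices)  # copy so the caller's sequence is left untouched
    rng.shuffle(shuffled)
    return shuffled


subset = subset_random_indices([3, 7, 11, 20], seed=0)
```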

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)

Returns: the size of the batch

class trw.train.sampler.SamplerClassResampling(class_name, nb_samples_to_generate, reuse_class_frequencies_across_epochs=True, batch_size=1)

Bases: Sampler

Resample the samples so that class_name classes have equal probability of being sampled.

Classification problems rarely have balanced classes, so it is often necessary to oversample the minority classes. This avoids penalizing the underrepresented classes and helps the classifier learn good features rather than the class distribution.
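
One way to give every class an equal chance of being sampled is to pick a class uniformly and then pick an element of that class uniformly. The sketch below shows that idea only: `class_balanced_indices` and `seed` are hypothetical, and the actual strategy used by `SamplerClassResampling` may differ.

```python
import random
from collections import defaultdict


def class_balanced_indices(classes, nb_samples_to_generate, seed=None):
    """Hypothetical helper: draw indices so each class is equally likely."""
    rng = random.Random(seed)
    # Group the element indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(classes):
        by_class[label].append(index)
    labels = sorted(by_class)
    sampled = []
    for _ in range(nb_samples_to_generate):
        label = rng.choice(labels)  # uniform over classes
        sampled.append(rng.choice(by_class[label]))  # uniform within the class
    return sampled


# 9 elements of class 0 and 1 element of class 1.
classes = [0] * 9 + [1]
sampled = class_balanced_indices(classes, nb_samples_to_generate=10000, seed=0)
minority_fraction = sum(1 for i in sampled if classes[i] == 1) / len(sampled)
```

Here the minority class makes up 10% of the data but about half of the sampled indices, at the cost of repeating its single element many times.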

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

_fit(self, classes)

__next__(self)

__iter__(self)

Returns: an iterator that returns indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)

Returns: the size of the batch