trw.train.sequence_map

Module Contents

Classes

JobExecutor

Simple job executor using queues as communication channels for input and output

SequenceMap

A Sequence defines how to iterate the data as a sequence of small batches of data.

Functions

single_function_to_run(batch, function_to_run)

Apply a list of functions to a batch of data.

Attributes

logger

default_queue_timeout

trw.train.sequence_map.logger
trw.train.sequence_map.default_queue_timeout = 0.1
trw.train.sequence_map.single_function_to_run(batch, function_to_run)

Apply a list of functions to a batch of data.

class trw.train.sequence_map.JobExecutor(nb_workers, function_to_run, max_jobs_at_once=0, worker_post_process_results_fun=None, output_queue_size=0)

Simple job executor using queues as communication channels for input and output

Feed jobs using JobExecutor.input_queue.put(argument); function_to_run will be called with argument and its result will be pushed to JobExecutor.output_queue.

Jobs that fail will have None pushed to the output queue.
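
Example (a minimal sketch based only on the attributes documented here; square is a hypothetical job function):

from trw.train.sequence_map import JobExecutor

def square(x):
    # hypothetical job: called by a worker with each argument fed to input_queue
    return x * x

executor = JobExecutor(nb_workers=2, function_to_run=square)
for value in range(4):
    executor.input_queue.put(value)  # feed jobs to the workers

# collect one result per job; failed jobs yield None and ordering is not guaranteed
results = [executor.output_queue.get() for _ in range(4)]
executor.close()  # terminate all jobs

Since __enter__ and __exit__ are defined, the executor can also be used in a with statement.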

__exit__(self, exc_type, exc_val, exc_tb)
__enter__(self)
__del__(self)
close(self)

Terminate all jobs

reset(self)

Reset the input and output queues as well as job session IDs.

The results of jobs that have not yet been computed will be discarded.

static worker(input_queue, output_queue, func, post_process_results_fun, job_session_id, channel_worker_to_main, channel_main_to_worker, must_finish)
class trw.train.sequence_map.SequenceMap(source_split, nb_workers, function_to_run, max_jobs_at_once=None, worker_post_process_results_fun=None, queue_timeout=default_queue_timeout, preprocess_fn=None, collate_fn=None)

Bases: trw.train.sequence.Sequence

A Sequence defines how to iterate the data as a sequence of small batches of data.

To train a deep learning model, it is often necessary to split the original data into small chunks: storing the forward pass of the model for the whole dataset at once is too memory hungry, so instead we compute the forward and backward passes on a small chunk of data at a time. This is the interface for batching a dataset.

Examples:

data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
    pass
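
A sketch of constructing a SequenceMap directly (assumptions: SequenceArray is importable from trw.train, and function_to_run receives a batch and returns the transformed batch):

import numpy as np
from trw.train import SequenceArray
from trw.train.sequence_map import SequenceMap

def add_one(batch):
    # hypothetical per-batch transform, executed by the worker processes
    batch['data'] = batch['data'] + 1
    return batch

source = SequenceArray({'data': np.arange(100)}).batch(10)
sequence = SequenceMap(source, nb_workers=2, function_to_run=add_one)
for batch in sequence:
    # each batch has already been transformed by add_one
    pass
sequence.close()  # finish and join the worker pool
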
subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the UIDs to keep. If new_sampler preserves the ordering, the samples of the resampled sequence should follow the uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence
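
For example (the UID column name 'sample_uid' is hypothetical):

# keep only the samples whose 'sample_uid' is one of the given values
smaller = sequence.subsample_uids(uids=[0, 5, 42], uids_name='sample_uid')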

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence
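
For example:

# keep an arbitrary subset of 10 samples, e.g. to export a few augmented samples
smaller = sequence.subsample(10)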

fill_queue(self)

Fill the job queue of the current sequence.

initializer(self)

Initialize the sequence to iterate through batches

__next_local(self, next_fn)

Get the next elements.

Handles a single item or a list of items returned by next_fn.

Parameters

next_fn – a function returning the next element(s)

__next__(self)
Returns

The next batch of data

has_background_jobs(self)
Returns

True if this sequence has a background job to create the next element

next_item(self, blocking)
Parameters

blocking – if True, block the current thread until the next element is ready

Returns

The next batch of data
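
For example (a minimal sketch; what happens when blocking=False and no batch is ready is not specified here):

batch = sequence.next_item(blocking=True)  # wait until the next batch is available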

__iter__(self)
Returns

An iterator of batches

close(self)

Finish and join the existing pool processes