trw.train.sequence_async_reservoir
Module Contents
Classes
Performance |
SequenceAsyncReservoir | This sequence will asynchronously process data and keep a reserve of loaded samples
SequenceAsyncReservoirIterator | Iterate through the SequenceAsyncReservoir sequence
- class trw.train.sequence_async_reservoir.Performance
- add(self, time_elapsed)
- get_average_time(self)
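The source does not document these two methods. Going by their names alone, one plausible reading is a running average of elapsed times. The sketch below illustrates that reading under that assumption; `AverageTimer` and its attributes are hypothetical names, not the library's implementation.

```python
class AverageTimer:
    """Hypothetical stand-in: accumulates durations and reports their mean."""

    def __init__(self):
        self.total_time = 0.0
        self.nb_samples = 0

    def add(self, time_elapsed):
        # Record one more measured duration (in seconds).
        self.total_time += time_elapsed
        self.nb_samples += 1

    def get_average_time(self):
        # Assumption: return 0.0 when nothing has been recorded yet.
        if self.nb_samples == 0:
            return 0.0
        return self.total_time / self.nb_samples


timer = AverageTimer()
for elapsed in (0.5, 1.5, 1.0):
    timer.add(elapsed)
average = timer.get_average_time()
```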
- class trw.train.sequence_async_reservoir.SequenceAsyncReservoir(source_split, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=None, collate_fn=sequence.remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)
Bases: trw.train.sequence.Sequence
This sequence will asynchronously process data and keep a reserve of loaded samples.
The idea is to let long-running loading processes work in the background while making the most efficient use of the data that is already loaded. The loaded data is gradually replaced by freshly loaded data over time.
Jobs are started and results are retrieved at the beginning of each epoch.
This sequence can be interrupted (e.g., after a certain number of batches have been returned). When the sequence is restarted, the reservoir will not be emptied.
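The reservoir idea can be illustrated with a small, self-contained sketch. It is not the library's code: `ToyReservoir` is a hypothetical class, and the "evict the oldest sample first" policy is an assumption made only to keep the example short.

```python
from collections import deque


class ToyReservoir:
    """Illustrative only: a bounded pool where fresh samples evict the oldest."""

    def __init__(self, max_samples):
        # A deque with maxlen silently drops its oldest item when full.
        self.samples = deque(maxlen=max_samples)

    def add_loaded(self, new_samples):
        # Called when background loading has produced new samples.
        for sample in new_samples:
            self.samples.append(sample)

    def epoch(self):
        # An epoch iterates over whatever is currently loaded.
        return list(self.samples)


reservoir = ToyReservoir(max_samples=3)
reservoir.add_loaded(["a", "b", "c"])
first_epoch = reservoir.epoch()

# The reservoir is kept between epochs; one freshly loaded sample
# replaces the oldest one instead of forcing a full reload.
reservoir.add_loaded(["d"])
second_epoch = reservoir.epoch()
```

In this toy version, most of the second epoch reuses data that was already loaded, which is the behaviour the description above aims for.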
- subsample(self, nb_samples)
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence, which is particularly useful when exporting augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- reservoir_size(self)
- Returns
The current number of samples in the reservoir
- subsample_uids(self, uids, uids_name, new_sampler=None)
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the UIDs of the samples to keep. If new_sampler preserves ordering, the samples of the resampled sequence should follow the ordering of uids
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, the existing sampler is re-used
- Returns
a subsampled Sequence
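As a rough illustration of selecting samples by UID while following the ordering of uids, consider the sketch below. It works on a plain list of dictionaries rather than on a Sequence; `select_by_uids` and the `"sample_uid"` key are hypothetical names chosen for the example.

```python
def select_by_uids(samples, uids, uids_name):
    """Return the samples whose UID is in `uids`, in the order given by `uids`."""
    # Index the samples by their UID for direct lookup.
    by_uid = {sample[uids_name]: sample for sample in samples}
    # Assumption: UIDs that are not present are silently skipped.
    return [by_uid[uid] for uid in uids if uid in by_uid]


samples = [
    {"sample_uid": 10, "value": "a"},
    {"sample_uid": 11, "value": "b"},
    {"sample_uid": 12, "value": "c"},
]
selected = select_by_uids(samples, uids=[12, 10], uids_name="sample_uid")
selected_values = [sample["value"] for sample in selected]
```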
- initializer(self)
- fill_queue(self)
Fill the input queue with jobs to be completed
- _retrieve_results_and_fill_queue(self)
Retrieve results from the output queue
- _wait_for_job_completion(self)
Block processing until there are enough results in the reservoir
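The fill, retrieve, and wait steps described by these methods follow a common producer/consumer pattern. The sketch below shows that pattern with the standard library only. It uses threads instead of pool processes for brevity, and `worker`, `load`, and the sentinel-based shutdown are assumptions of the example, not the library's implementation.

```python
import queue
import threading


def worker(input_queue, output_queue, function_to_run):
    # Process jobs until a None sentinel is received.
    while True:
        job = input_queue.get()
        if job is None:
            break
        output_queue.put(function_to_run(job))


def load(item):
    # Stand-in for a slow loading function.
    return item * 10


input_queue = queue.Queue()
output_queue = queue.Queue()
threads = [
    threading.Thread(target=worker, args=(input_queue, output_queue, load), daemon=True)
    for _ in range(2)
]
for thread in threads:
    thread.start()

# Fill the input queue with the jobs to be completed.
for job in range(5):
    input_queue.put(job)

# Block until the reservoir holds a minimum number of results.
min_reservoir_samples = 3
reservoir = []
while len(reservoir) < min_reservoir_samples:
    reservoir.append(output_queue.get())

# Shut down: one sentinel per worker, then join.
for _ in threads:
    input_queue.put(None)
for thread in threads:
    thread.join()

# Retrieve the remaining results without blocking.
while True:
    try:
        reservoir.append(output_queue.get_nowait())
    except queue.Empty:
        break
```

Because consumption only blocks until the minimum is reached, iteration could start on a partly filled reservoir while the remaining jobs finish in the background.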
- __iter__(self)
- Returns
An iterator of batches
- close(self)
Finish and join the existing pool processes
- class trw.train.sequence_async_reservoir.SequenceAsyncReservoirIterator(base_sequence, reservoir_sampler)
Bases: trw.train.sequence.SequenceIterator
Iterate through the SequenceAsyncReservoir sequence
- _reset_iter_reservoir(self)
- __next__(self)
- Returns
The next batch of data
- close(self)
Special method to close and clean the resources of the sequence