trw.train

Submodules

Package Contents

Classes

time_it

Simple decorator to measure the time taken to execute a function

CleanAddedHooks

Context manager that automatically tracks hooks added to the model and removes them when the context is released

Output

This is a tag name to find the output reference back from outputs

OutputClassification

Classification output

OutputRegression

Regression output

OutputEmbedding

Represent an embedding

OutputRecord

Record the raw value, but do not compute any loss from it.

OutputSegmentation

Segmentation output

Trainer

This is the main class to train a model

Callback

Defines a callback function that may be called before training, during training, after training

GradCam

Gradient-weighted Class Activation Mapping

GuidedBackprop

Produces gradients generated with guided back propagation from the given image

IntegratedGradients

Implementation of Integrated gradients, a method of attributing the prediction of a deep network

CallbackEpochSummary

Summarizes the last epoch and displays useful information such as metrics per dataset/split

CallbackExportSamples

Defines a callback function that may be called before training, during training, after training

CallbackSaveLastModel

When the training is finished, save the full model and result

CallbackExportHistory

Summarize the training history of a model (i.e., as a function of iteration)

CallbackExportClassificationReport

Export the main classification measures for the classification outputs of the model

CallbackExportAugmentations

Export samples

CallbackDataSummary

Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset

CallbackModelSummary

Display important characteristics of the model (e.g., FLOPS, number of parameters, layers, shapes)

CallbackSkipEpoch

Run its callbacks every few epochs

CallbackClearTensorboardLog

Remove any existing logger

CallbackTensorboardEmbedding

This callback records the embedding to be displayed with tensorboard

CallbackTensorboardRecordHistory

This callback records the history to a tensorboard readable log

CallbackTensorboardRecordModel

This callback will export the model to tensorboard

CallbackExportBestHistory

Export the best value of the history and epoch for each metric in a single file

CallbackExportClassificationErrors

Export the classification errors

CallbackLearningRateFinder

Identify a good range for the learning rate parameter.

CallbackLearningRateRecorder

Record the learning rate of the optimizers.

CallbackExplainDecision

Explain the decision of a model

ExplainableAlgorithm

Generic enumeration.

CallbackWorstSamplesByEpoch

The purpose of this callback is to track the samples with the worst loss during the training of the model

CallbackActivationStatistics

Calculate activation statistics of each layer of a neural network.

CallbackZipSources

Record important info relative to the training such as the sources & configuration info

CallbackExportConvolutionKernel

Simply export convolutional kernel.

Sequence

A Sequence defines how to iterate the data as a sequence of small batches of data.

SequenceMap

A Sequence defines how to iterate the data as a sequence of small batches of data.

JobExecutor

Simple job executor using queues as communication channels for input and output

SequenceArray

Create a sequence of batches from numpy arrays, lists and torch.Tensor

SequenceBatch

Group several samples into a single data batch

SequenceAsyncReservoir

This sequence will asynchronously process data and keep a reserve of loaded samples

SequenceAdaptorTorch

Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface

SequenceCollate

Group the data into a sequence of dictionary of torch.Tensor

SequenceReBatch

This sequence will normalize the batch size of an underlying sequence

SamplerRandom

Samples elements randomly. If without replacement, then sample from a shuffled dataset.

SamplerSequential

Samples elements sequentially, always in the same order.

SamplerSubsetRandom

Samples elements randomly from a given list of indices, without replacement.

SamplerClassResampling

Resample the samples so that class_name classes have equal probability of being sampled.

Sampler

Base class for all Samplers.

LossDiceMulticlass

Implementation of the Dice Loss (multi-class) for N-d images

FilterFixed

Apply a fixed filter to n-dimensional images

FilterGaussian

Implement a gaussian filter as a torch.nn.Module

MeaningfulPerturbation

Implementation of "Interpretable Explanations of Black Boxes by Meaningful Perturbation", arXiv:1704.03296

Functions

create_default_options(logging_directory=None, num_epochs=50, device=None)

Create default options for the training and evaluation process.

len_batch(batch)

param batch

a data split or a collections.Sequence

create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)

Check if the path exists. If it does, remove the folder and recreate it; otherwise create it

to_value(v)

Convert where appropriate from tensors to numpy arrays

set_optimizer_learning_rate(optimizer, learning_rate)

Set the learning rate of the optimizer to a specific value

default_collate_fn(batch, device, pin_memory=False, non_blocking=False)

param batches

a dictionary of features or a list of dictionary of features

collate_dicts(batch, device, pin_memory=False, non_blocking=False)

Default function to collate a dictionary of samples to a dictionary of torch.Tensor

collate_list_of_dicts(batches, device, pin_memory=False, non_blocking=False)

Default function to collate a list of dictionary to a dictionary of `torch.Tensor`s

safe_filename(string)

Return a string good as name for a file by removing all special characters

get_device(module, batch=None)

Return the device of a module. This may be incorrect if we have a module split across different devices

transfer_batch_to_device(batch, device, non_blocking=False)

Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.

find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)

Return a good choice of dataset name and split name, possibly not the train split.

create_losses_fn(datasets, generic_loss)

Create a dictionary of loss functions for each of the dataset

epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, eval_loop_fn=eval_loop, train_loop_fn=train_loop)

Orchestrate the train and evaluation loops

eval_loop(device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)

Run the eval loop (i.e., the model parameters will NOT be updated)

train_loop(device, dataset_name, split_name, split, optimizer, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, apply_backward=True)

Run the train loop (i.e., the model parameters will be updated)

run_trainer_repeat(trainer, options, inputs_fn, model_fn, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, run_prefix='default', eval_every_X_epoch=1, number_of_training_runs=10, post_init_fn=None)

Manages multiple runs of a trainer, for example to repeat the training and get an idea of the variance of a model

default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True)

Default callbacks to be performed after the model has been trained

default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None)

Default callbacks to be performed at the end of each epoch

default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True)

Default callbacks to be performed before the fitting of the model

default_sum_all_losses(dataset_name, batch, loss_terms)

Default loss is the sum of all loss terms

create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, scheduler_fn=None)

Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler

create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9)

Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler

create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)

Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma

create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, scheduler_fn=None)

Create an ADAM optimizer for each of the dataset with optional scheduler

create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0)

Create an ADAM optimizer for each of the dataset with step learning rate scheduler

create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None)

Create an optimizer and scheduler

plot_group_histories(root, history_values, title, xlabel, ylabel, max_nb_plots_per_group=5, colors=utilities.make_unique_colors_f())

Plot groups of histories

confusion_matrix(export_path, classes_predictions, classes_trues, classes: list = None, normalize=False, title='Confusion matrix', cmap=plt.cm.plasma, display_numbers=True, maximum_chars_per_line=50, rotate_x=None, rotate_y=None, display_names_x=True, sort_by_decreasing_sample_size=True, excludes_classes_with_samples_less_than=None, main_font_size=16, sub_font_size=8, normalize_unit_percentage=False)

Plot the confusion matrix of a predicted class versus the true class

classification_report(prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: collections.Mapping = None)

Summarizes the important statistics for a classification problem

list_classes_from_mapping(mappinginv: collections.Mapping, default_name='unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories: e.g., compare 2 configurations, which one has the best results for a given

export_figure(path, name, maximum_length=259, dpi=300)

Export a figure

auroc(trues, found_1_scores)

Calculate the area under the curve of the ROC plot (AUROC)

find_tensor_leaves_with_grad(tensor)

Find the input leaves of a tensor.

find_last_forward_convolution(model, inputs, types=(nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the last convolutional layer

find_last_forward_types(model, inputs, types, relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type

find_first_forward_convolution(model, inputs=None, types=(nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the first convolutional layer

post_process_output_for_gradient_attribution(output)

Postprocess the output to be suitable for gradient attribution.

model_summary(model, batch, logger)

as_rgb_image(value)

Try interpreting the value as an image (e.g., 2D, RGB) and return an RGB image

as_image_ui8(image, min_value=None, max_value=None)

Rescale the image to fit in [0..255] range.

export_image(image, path)

Export an image

upsample

default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23)

Default information removal (smoothing).

Attributes

default_sample_uid_name

trw.train.create_default_options(logging_directory=None, num_epochs=50, device=None)

Create default options for the training and evaluation process.

Parameters
  • logging_directory – the base directory where the logs will be exported for each trained model. If None and if the environment variable LOGGING_DIRECTORY exists, it will be used as root directory. Else a default folder will be used

  • num_epochs – the number of epochs

  • device – the device to train the model on. If None, we will try first any available GPU then revert to CPU

Returns

the options
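For instance, a minimal sketch of creating options (the logging directory is a hypothetical path):

import trw

options = trw.train.create_default_options(
    logging_directory='/tmp/trw_logs',  # hypothetical path; defaults to LOGGING_DIRECTORY or a default folder
    num_epochs=10,
    device=None)  # None: pick an available GPU, else CPU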

trw.train.len_batch(batch)
Parameters

batch – a data split or a collections.Sequence

Returns

the number of elements within a data split
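A small sketch (the feature names are made up):

import torch
import trw

batch = {'images': torch.zeros(16, 1, 28, 28), 'targets': torch.zeros(16, dtype=torch.long)}
nb_samples = trw.train.len_batch(batch)  # expected to be 16, the number of samples in the split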

trw.train.create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)

Check if the path exists. If it does, remove the folder and recreate it; otherwise create it

trw.train.to_value(v)

Convert where appropriate from tensors to numpy arrays

trw.train.set_optimizer_learning_rate(optimizer, learning_rate)

Set the learning rate of the optimizer to a specific value

Parameters
  • optimizer – the optimizer to update

  • learning_rate – the learning rate to set

Returns

None

trw.train.default_collate_fn(batch, device, pin_memory=False, non_blocking=False)
Parameters
  • batches – a dictionary of features or a list of dictionary of features

  • device – the device where to create the torch.Tensor

  • pin_memory – if True, pin the memory (typically required for asynchronous transfer to a CUDA device)

Returns

a dictionary of torch.Tensor

trw.train.collate_dicts(batch, device, pin_memory=False, non_blocking=False)

Default function to collate a dictionary of samples to a dictionary of torch.Tensor

Parameters
  • batch – a dictionary of features

  • device – the device where to create the torch.Tensor

  • pin_memory – if True, pin the memory (typically required for asynchronous transfer to a CUDA device)

Returns

a dictionary of torch.Tensor

trw.train.collate_list_of_dicts(batches, device, pin_memory=False, non_blocking=False)

Default function to collate a list of dictionary to a dictionary of `torch.Tensor`s

Parameters
  • batches – a list of dictionary of features

  • device – the device where to create the torch.Tensor

  • pin_memory – if True, pin the memory (typically required for asynchronous transfer to a CUDA device)

Returns

a dictionary of torch.Tensor

class trw.train.time_it(time_name=None, log=None)

Simple decorator to measure the time taken to execute a function.

Parameters
  • time_name – the name of the function to time, else we will use fn.__str__()

  • log – how to log the timing

__call__(self, fn, *args, **kwargs)
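A usage sketch, assuming the class is applied as a decorator factory as the docstring suggests (the function and the logging choice are hypothetical):

import logging
import trw

@trw.train.time_it(time_name='heavy_computation', log=logging.info)
def heavy_computation():
    # hypothetical expensive function; the elapsed time is reported through `log`
    return sum(range(1_000_000))

heavy_computation()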
class trw.train.CleanAddedHooks(model)

Context manager that automatically tracks hooks added to the model and removes them when the context is released

__enter__(self)
__exit__(self, type, value, traceback)
static record_hooks(module_source)

Record hooks.

Parameters

module_source – the module to track the hooks

Returns

a tuple (forward, backward), where forward and backward are dictionaries of hook IDs keyed by module

trw.train.safe_filename(string)

Return a string suitable as a file name by removing all special characters

trw.train.get_device(module, batch=None)

Return the device of a module. This may be incorrect if we have a module split across different devices

trw.train.transfer_batch_to_device(batch, device, non_blocking=False)

Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.

Parameters
  • batch – the batch of data to be transferred

  • device – the device to move the tensors to

  • non_blocking – non blocking memory transfer to GPU

Returns

a batch of data on the specified device
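A brief sketch (the feature names are illustrative):

import numpy as np
import torch
import trw

batch = {
    'images': torch.zeros(8, 1, 32, 32),  # tensors are moved to `device`
    'uids': np.arange(8),                 # numpy arrays are transferred as well, per the docstring
    'names': ['a'] * 8,                   # other types are left untouched
}
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
batch_on_device = trw.train.transfer_batch_to_device(batch, device)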

trw.train.find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)

Return a good choice of dataset name and split name, possibly not the train split.

Parameters
  • datasets – the datasets

  • default_dataset_name – a possible dataset name. If None, find a suitable dataset, if not, the dataset must be present

  • default_split_name – a possible split name. If None, find a suitable split, if not, the dataset must be present. if train_split_name is specified, the selected split name will be different from train_split_name

  • train_split_name – if not None, exclude the train split

Returns

a tuple (dataset_name, split_name)

class trw.train.Output(output, criterion_fn, collect_output=False, sample_uid_name=None)

This is a tag name to find the output reference back from outputs

output_ref_tag = output_ref
evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored in the loss term

extract_history(self, outputs)

Summarizes epoch statistics from the calculated outputs to populate a history.

Parameters

outputs – the aggregated evaluate_batch output

Returns

a dictionary

class trw.train.OutputClassification(output, classes_name, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics=metrics.default_classification_metrics(), loss_reduction=torch.mean, weight_name=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1), maybe_optional=False, sample_uid_name=default_sample_uid_name)

Bases: Output

Classification output

extract_history(self, outputs)

Summarizes epoch statistics from the calculated outputs to populate a history.

Parameters

outputs – the aggregated evaluate_batch output

Returns

a dictionary

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary
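As an illustration, a minimal sketch of a model producing a classification output (layer sizes and feature names are assumptions; the model's forward receives the batch dictionary, as described in Trainer.fit below):

import torch.nn as nn
import trw

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, batch):
        # `batch` is a dictionary of features; 'images' and 'targets' are hypothetical names
        x = batch['images'].view(batch['images'].shape[0], -1)
        logits = self.fc(x)
        # the classification loss will be computed against the 'targets' feature of the batch
        return {'classification': trw.train.OutputClassification(logits, classes_name='targets')}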

class trw.train.OutputRegression(output, target_name, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics=metrics.default_regression_metrics(), loss_reduction=mean_all, weight_name=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., sample_uid_name=default_sample_uid_name)

Bases: Output

Regression output

extract_history(self, outputs)

Summarizes epoch statistics from the calculated outputs to populate a history.

Parameters

outputs – the aggregated evaluate_batch output

Returns

a dictionary

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary

class trw.train.OutputEmbedding(output, clean_loss_term_each_batch=False, sample_uid_name=default_sample_uid_name)

Bases: Output

Represent an embedding

This is only used to record a tensor that we consider an embedding (e.g., to be exported to tensorboard)

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored in the loss term

class trw.train.OutputRecord(output)

Bases: Output

Record the raw value, but do not compute any loss from it.

This is useful, e.g., to collect UIDs so that we can save them in the network result and further post-process it (e.g., k-fold cross validation)

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary

class trw.train.OutputSegmentation(output, target_name, criterion_fn=lambda : ..., collect_only_non_training_output=True, metrics=metrics.default_segmentation_metrics(), loss_reduction=torch.mean, weight_name=None, loss_scaling=1.0, collect_output=True, output_postprocessing=functools.partial(torch.argmax, dim=1), sample_uid_name=default_sample_uid_name)

Bases: Output

Segmentation output

extract_history(self, outputs)

Summarizes epoch statistics from the calculated outputs to populate a history.

Parameters

outputs – the aggregated evaluate_batch output

Returns

a dictionary

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs.

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a dictionary

trw.train.default_sample_uid_name = sample_uid
class trw.train.Trainer(callbacks_per_batch_fn=None, callbacks_per_batch_loss_terms_fn=None, callbacks_per_epoch_fn=default_per_epoch_callbacks, callbacks_pre_training_fn=default_pre_training_callbacks, callbacks_post_training_fn=default_post_training_callbacks, trainer_callbacks_per_batch=trainer_callbacks_per_batch, run_epoch_fn=epoch_train_eval)

This is the main class to train a model

static save_model(model, result, path)

Save a model.

Parameters
  • model – a PyTorch model

  • result – None or the result of the model

  • path – where to store the model. The result will be saved at path + '.result'

static load_model(path, with_result=False, device=None)

load a saved model

Parameters
  • path – where to store the model. Results will be loaded from path + '.result'

  • with_result – if True, the results of the model will be loaded

  • device – where to load the model. For example, models are typically trained on GPU, but for deployment, CPU might be good enough. If None, use the same device as when the model was exported

Returns

a tuple model, result

fit(self, options, inputs_fn, model_fn, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, run_prefix='default', with_final_evaluation=True, eval_every_X_epoch=1)

Fit the model

Requirements:

  • enough main memory to store the outputs of all the datasets of a single epoch.

    If this cannot be satisfied, sub-sample the epoch so that it can fit in main memory.

Notes:

  • if a feature value is Callable, its value will be replaced by the result of the call

    (e.g., this can be useful to generate z embedding in GANs)

Parameters
  • options

  • inputs_fn

    a functor returning a dictionary of datasets. Alternatively, datasets infos can be specified. inputs_fn must return one of:

    • datasets: dictionary of dataset

    • (datasets, datasets_infos): dictionary of dataset and additional infos

    We define:

    • datasets: a dictionary of dataset. a dataset is a dictionary of splits. a split is a dictionary of batched features.

    • Datasets infos are additional infos useful for the debugging of the dataset (e.g., class mappings, sample UIDs).

    Datasets infos are typically much smaller than datasets and should be loadable in memory

  • model_fn – a functor with parameter options and returning a Module or a ModuleDict

Depending on the type of the model, this is how it will be used:

  • Module: optimizer will optimize model.parameters()

  • ModuleDict: for each dataset name, the optimizer will optimize

    model[dataset_name].parameters(). Note that a forward method will need to be implemented

Parameters
  • losses_fn

  • optimizers_fn

  • loss_creator

  • eval_every_X_epoch – evaluate the model every X epochs

  • run_prefix – the prefix of the output folder

  • with_final_evaluation – if True, once the model is fitted, evaluate all the data again in eval mode

Returns

a tuple model, result
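Putting the pieces together, a minimal, hedged sketch of a fit() call using only the entry points documented here (the dataset, feature names and hyper-parameters are made up; the calling conventions of inputs_fn and optimizers_fn are assumptions based on the descriptions above):

import torch
import torch.nn as nn
import trw

def create_datasets():
    # hypothetical inputs_fn: one dataset ('mnist') with a single 'train' split
    images = torch.randn(128, 1, 28, 28)
    targets = torch.randint(0, 10, (128,))
    split = trw.train.SequenceArray({'images': images, 'targets': targets}).batch(32)
    return {'mnist': {'train': split}}

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, batch):
        logits = self.fc(batch['images'].view(batch['images'].shape[0], -1))
        return {'classification': trw.train.OutputClassification(logits, classes_name='targets')}

options = trw.train.create_default_options(num_epochs=5)
trainer = trw.train.Trainer()
model, result = trainer.fit(
    options,
    inputs_fn=create_datasets,
    model_fn=lambda options: Net(),
    optimizers_fn=lambda datasets, model: trw.train.create_sgd_optimizers_fn(
        datasets, model, learning_rate=0.1),
    run_prefix='sketch')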

trw.train.create_losses_fn(datasets, generic_loss)

Create a dictionary of loss functions for each of the dataset

Parameters
  • datasets – the datasets

  • generic_loss – a loss function

Returns

A dictionary of losses for each of the dataset

trw.train.epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, eval_loop_fn=eval_loop, train_loop_fn=train_loop)

Orchestrate the train and evaluation loops

Parameters
  • options

  • datasets

  • optimizers – if None, no optimization will be performed on the train split, else a dictionary of optimizers (one for each dataset)

  • model

  • losses

  • schedulers

  • history

  • callbacks_per_batch

  • callbacks_per_batch_loss_terms

  • run_eval – if True, run the evaluation

  • eval_loop_fn – the eval function to be used

  • train_loop_fn – the train function to be used

trw.train.eval_loop(device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)

Run the eval loop (i.e., the model parameters will NOT be updated)

Note

If callbacks_per_batch or callbacks_per_batch_loss_terms raise StopIteration, the eval loop will be stopped

Parameters
  • device

  • dataset_name

  • split_name

  • split

  • model

  • loss_fn

  • history

  • callbacks_per_batch

  • callbacks_per_batch_loss_terms

Returns

trw.train.train_loop(device, dataset_name, split_name, split, optimizer, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, apply_backward=True)

Run the train loop (i.e., the model parameters will be updated)

Note

If callbacks_per_batch or callbacks_per_batch_loss_terms raise an exception StopIteration, the train loop will be stopped

Parameters
  • device – the device to be used to optimize the model

  • dataset_name – the name of the dataset

  • split_name – the name of the split

  • split – a dictionary of feature name and values

  • optimizer – an optimizer to optimize the model

  • model – the model to be optimized

  • loss_fn – the loss function

  • history – a list of history step

  • callbacks_per_batch – the callbacks to be performed on each batch. if None, no callbacks to be run

  • callbacks_per_batch_loss_terms – the callbacks to be performed on each loss term. if None, no callbacks to be run

  • apply_backward – if True, the gradient will be back-propagated

trw.train.run_trainer_repeat(trainer, options, inputs_fn, model_fn, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, run_prefix='default', eval_every_X_epoch=1, number_of_training_runs=10, post_init_fn=None)

Manages multiple runs of a trainer, for example to repeat the training and get an idea of the variance of a model

Parameters
  • trainer

  • options

  • inputs_fn

  • model_fn

  • optimizers_fn

  • losses_fn

  • loss_creator

  • run_prefix

  • eval_every_X_epoch

  • number_of_training_runs

  • post_init_fn – if not None, a function to be called before each training repeat

Returns

a tuple model, result of the last model trained

trw.train.default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True)

Default callbacks to be performed after the model has been trained

trw.train.default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None)

Default callbacks to be performed at the end of each epoch

trw.train.default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True)

Default callbacks to be performed before the fitting of the model

trw.train.default_sum_all_losses(dataset_name, batch, loss_terms)

Default loss is the sum of all loss terms
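As a sketch, a custom loss aggregation with the same signature could weight the loss terms; the structure of each loss term (a dictionary exposing a 'loss' entry) is an assumption, as are the term names:

def weighted_sum_losses(dataset_name, batch, loss_terms):
    # hypothetical replacement for default_sum_all_losses
    total = 0.0
    for name, term in loss_terms.items():
        weight = 0.5 if name == 'regularization' else 1.0  # made-up weighting
        total = total + weight * term['loss']  # assumes each term stores its loss under 'loss'
    return total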

trw.train.create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, scheduler_fn=None)

Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • scheduler_fn – a scheduler, or None

  • momentum – the momentum of the SGD

  • weight_decay – the weight decay

Returns

An optimizer

trw.train.create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9)

Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • step_size – the number of epochs composing a step. Each step the learning rate will be multiplied by gamma

  • gamma – the factor to apply to the learning rate every step

  • weight_decay – the weight decay

Returns

An optimizer with a step scheduler

trw.train.create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)

Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma

Parameters
  • optimizer – the optimizer

  • step_size – the number of epochs composing one step. Each step the learning rate will be decreased

  • gamma – apply this factor to the learning rate every time it is adjusted

Returns

a learning rate scheduler

trw.train.create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, scheduler_fn=None)

Create an ADAM optimizer for each of the dataset with optional scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • weight_decay – the weight decay

  • scheduler_fn – a scheduler, or None

Returns

An optimizer

trw.train.create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0)

Create an ADAM optimizer for each of the dataset with step learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • step_size – the number of epochs composing a step. Each step the learning rate will be multiplied by gamma

  • gamma – the factor to apply to the learning rate every step

  • weight_decay – the weight decay

Returns

An optimizer with a step scheduler

trw.train.create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None)

Create an optimizer and scheduler

Note

if model is an instance of ModuleDict, then the optimizer will only consider the parameters model[dataset_name].parameters(), else model.parameters()

Parameters
  • datasets – a dictionary of dataset

  • model – the model. Should be a Module or a ModuleDict

  • optimizer_fn – the functor to instantiate the optimizer

  • scheduler_fn – the functor to instantiate the scheduler. May be None, in that case there will be no scheduler

Returns

a dict of optimizers, one per dataset
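For example, a sketch of plugging a custom optimizer and scheduler into this factory; it assumes optimizer_fn receives the model parameters and scheduler_fn receives the optimizer, consistently with create_scheduler_step_lr above:

import functools
import torch
import trw

optimizers_fn = lambda datasets, model: trw.train.create_optimizers_fn(
    datasets,
    model,
    # hypothetical functors: Adam optimizer with a step learning rate scheduler
    optimizer_fn=functools.partial(torch.optim.Adam, lr=1e-3, weight_decay=1e-5),
    scheduler_fn=functools.partial(torch.optim.lr_scheduler.StepLR, step_size=30, gamma=0.1))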

trw.train.plot_group_histories(root, history_values, title, xlabel, ylabel, max_nb_plots_per_group=5, colors=utilities.make_unique_colors_f())

Plot groups of histories.

Parameters
  • root – the directory where the plot will be exported

  • history_values – a map of list of list of (epoch, value)

  • title – the title of the graph

  • xlabel – the x label

  • ylabel – the y label

  • max_nb_plots_per_group – the maximum number of plots per group

  • colors – the colors to be used

trw.train.confusion_matrix(export_path, classes_predictions, classes_trues, classes: list = None, normalize=False, title='Confusion matrix', cmap=plt.cm.plasma, display_numbers=True, maximum_chars_per_line=50, rotate_x=None, rotate_y=None, display_names_x=True, sort_by_decreasing_sample_size=True, excludes_classes_with_samples_less_than=None, main_font_size=16, sub_font_size=8, normalize_unit_percentage=False)

Plot the confusion matrix of a predicted class versus the true class

Parameters
  • export_path – the folder where the confusion matrix will be exported

  • classes_predictions – the classes that were predicted by the classifier

  • classes_trues – the true classes

  • classes – a list of labels. Label 0 for class 0, label 1 for class 1…

  • normalize – if True, the confusion matrix will be normalized to 1.0 per row

  • title – the title of the plot

  • cmap – the color map to use

  • display_numbers – if True, display the numbers within each cell of the confusion matrix

  • maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues

  • rotate_x – if not None, indicates the rotation of the label on x axis

  • rotate_y – if not None, indicates the rotation of the label on y axis

  • display_names_x – if True, the class name, if specified, will also be displayed on the x axis

  • sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show if the errors may be due to low number of samples

  • excludes_classes_with_samples_less_than – if not None, the classes with less than excludes_classes_with_samples_less_than samples will be excluded

  • normalize_unit_percentage – if True, use 100% base as unit instead of 1.0

  • main_font_size – the font size of the text

  • sub_font_size – the font size of the sub-elements (e.g., ticks)

trw.train.classification_report(prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: collections.Mapping = None)

Summarizes the important statistics for a classification problem.

Parameters
  • prediction_scores – the scores for each class, for each sample

  • trues – the true class for each sample

  • class_mapping – the class mapping (class id, class name)

Returns

a dictionary of statistics or sub-report
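A small usage sketch (the scores and class mapping are made up):

import numpy as np
import trw

prediction_scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])  # one row of per-class scores per sample
trues = [0, 1, 0]
report = trw.train.classification_report(
    prediction_scores,
    trues,
    class_mapping={0: 'negative', 1: 'positive'})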

trw.train.list_classes_from_mapping(mappinginv: collections.Mapping, default_name='unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

Parameters
  • mappinginv – a dictionary like structure encoded as (class id, class_name)

  • default_name – if there is no class name, use this as default

Returns

a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None

trw.train.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

Supports multiple ROC curves.

Parameters
  • export_path – the folder where the plot will be exported

  • trues – the expected class. Can be a list for multiple ROC curves

  • found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves

  • title – the title of the ROC

  • label_name – the name of the ROC curve. Can be a list for multiple ROC curves

  • colors – if None use default colors. Else, a numpy array of dim (Nx3) where N is the number of colors. Must be in [0..1] range

trw.train.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?

Parameters
  • export_path – where to export the figure

  • features_trials – a dictionary of list. Each list representing a feature

  • title – the title of the plot

  • ylabel – the label for axis y

  • xlabel – the label for axis x

  • meanline – if True, draw a line from the center of the plot for each history name to the next

  • maximum_chars_per_line – the maximum of characters allowed per line of title. If exceeded, newline will be created.

  • plot_trials – if True, each trial of a feature will be plotted

  • scale – the axis scale to be used

  • y_range – if not None, the (min, max) of the y-axis

  • rotate_x – if not None, the rotation of the x axis labels in degree

  • showfliers – if True, plot the outliers

  • maximum_chars_per_line – the maximum number of characters of the title per line

  • title_line_height – the height of the title lines

trw.train.export_figure(path, name, maximum_length=259, dpi=300)

Export a figure

Parameters
  • path – the folder where to export the figure

  • name – the name of the figure.

  • maximum_length – the maximum length of the full path of a figure. If the full path name is greater than maximum_length, the name will be subsampled to the maximal allowed length

  • dpi – Dots Per Inch: the density of the figure

trw.train.auroc(trues, found_1_scores)

Calculate the area under the curve of the ROC plot (AUROC)

Parameters
  • trues – the expected class

  • found_1_scores – the score found for the class 1. Must be a numpy array of floats

Returns

the AUROC

class trw.train.Callback

Defines a callback function that may be called before training, during training, after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
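A sketch of a minimal custom callback honouring this signature (the reported quantity is only an illustration; history is a list of history steps, as documented for train_loop above):

import trw

class CallbackPrintProgress(trw.train.Callback):
    # hypothetical callback: report how many epochs have been recorded so far
    def __call__(self, options, history, model, losses, outputs, datasets,
                 datasets_infos, callbacks_per_batch, **kwargs):
        print(f'completed epochs={len(history)}')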
trw.train.find_tensor_leaves_with_grad(tensor)

Find the input leaves of a tensor.

Input leaves must have requires_grad=True, else they will not be found

Parameters

tensor – a torch.Tensor

Returns

a list of torch.Tensor with attribute requires_grad=True that is an input of tensor

trw.train.find_last_forward_convolution(model, inputs, types=(nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the last convolutional layer

Parameters
  • inputs – the input of the model so that we can call model(inputs)

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index (int) – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
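An illustrative sketch with a toy model (the model and the key access are assumptions based on the return description above):

import torch
import torch.nn as nn
import trw

model = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Conv2d(8, 4, 3))
inputs = torch.randn(2, 1, 16, 16)
found = trw.train.find_last_forward_convolution(model, inputs)
if found is not None:
    # expected to describe the last Conv2d reached during the forward pass
    print(found['matched_module'])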

trw.train.find_last_forward_types(model, inputs, types, relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type

Parameters
  • inputs – the input of the model so that we can call model(inputs)

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index (int) – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found

trw.train.find_first_forward_convolution(model, inputs=None, types=(nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0)

Perform a forward pass of the model with given inputs and retrieve the first convolutional layer

Parameters
  • inputs – NOT USED

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index (int) – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found

class trw.train.GradCam(model, find_convolution=graph_reflection.find_last_forward_convolution, post_process_output=guided_back_propagation.post_process_output_id)

Gradient-weighted Class Activation Mapping

This is based on the paper “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”, Ramprasaath R et al.

__call__(self, inputs, target_class_name=None, target_class=None)
Parameters
  • inputs – the inputs to be fed to the model

  • target_class_name

    the output node to be used. If None:

    • if the model output is a single tensor, use this as the target output

    • else use the first OutputClassification output

  • target_class – the index of the class to explain the decision. If None, the class output will be used

Returns

a tuple (output name, a dictionary (input_name, GradCAMs))
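A rough usage sketch, assuming a model with a convolutional layer and a batch dictionary like the earlier examples (all names and the class index are arbitrary):

import torch
import torch.nn as nn
import trw

class TinyConvNet(nn.Module):
    # hypothetical model: Grad-CAM requires at least one convolutional layer
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)

    def forward(self, batch):
        features = torch.relu(self.conv(batch['images']))
        logits = self.fc(features.view(features.shape[0], -1))
        return {'classification': trw.train.OutputClassification(logits, classes_name='targets')}

batch = {'images': torch.randn(4, 1, 28, 28), 'targets': torch.randint(0, 10, (4,))}
grad_cam = trw.train.GradCam(TinyConvNet())
output_name, cams_by_input = grad_cam(batch, target_class_name='classification', target_class=3)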

class trw.train.GuidedBackprop(model, unguided_gradient=False, post_process_output=post_process_output_id)

Produces gradients generated with guided back propagation from the given image

update_relus(self)
Updates relu activation functions so that:

1. the output is stored in the forward pass

2. zero is imputed for gradient values that are less than zero

static get_floating_inputs_with_gradients(inputs)

Extract inputs that have a gradient

Parameters

inputs – a tensor or dictionary of tensors

Returns

Return a list of tuple (name, input) for the input that have a gradient

__call__(self, inputs, target_class, target_class_name)

Generate the guided back-propagation gradient

Parameters
  • inputs – a tensor or dictionary of tensors

  • target_class – the target class to be explained

  • target_class_name – the name of the output class if multiple outputs

Returns

a tuple (output_name, dictionary (input, gradient))

static get_positive_negative_saliency(gradient)

Generates positive and negative saliency maps based on the gradient

Parameters

gradient (numpy arr) – Gradient of the operation to visualize

Returns

a tuple (pos_saliency, neg_saliency), the positive and negative saliency maps

trw.train.post_process_output_for_gradient_attribution(output)

Postprocess the output to be suitable for gradient attribution.

In particular, if we have a trw.train.OutputClassification, we need to apply a softmax operation so that we can backpropagate the loss of a particular class with the appropriate value (1.0).

Parameters

output – a trw.train.OutputClassification

Returns

a torch.Tensor

class trw.train.IntegratedGradients(model, steps=100, baseline_inputs=None, use_output_as_target=False, post_process_output=guided_back_propagation.post_process_output_id)
Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features.

This is implementing the paper Axiomatic Attribution for Deep Networks, Mukund Sundararajan, Ankur Taly, Qiqi Yan as described in https://arxiv.org/abs/1703.01365

__call__(self, inputs, target_class_name, target_class=None)

Generate the integrated gradients attribution

Parameters
  • inputs – a tensor or dictionary of tensors. Must have requires_grad set for the inputs to be explained

  • target_class – the index of the class to explain the decision. If None, the class output will be used

  • target_class_name

    the output node to be used. If None:

    • if the model output is a single tensor, use this as the target output

    • else use the first OutputClassification output

Returns

a tuple (output_name, dictionary (input, integrated gradient))

class trw.train.CallbackEpochSummary(logger=utilities.log_and_print, track_best_so_far=True)

Bases: trw.train.callback.Callback

Summarizes the last epoch and display useful information such as metric per dataset/split

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportSamples(max_samples=20, dirname='samples', loss_terms_inclusion=None, feature_exclusions=None, dataset_exclusions=None, split_exclusions=None)

Bases: trw.train.callback.Callback

Defines a callback function that may be called before training, during training, after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackSaveLastModel(model_name='last')

Bases: trw.train.callback.Callback

When the training is finished, save the full model and result

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportHistory(export_dirname='history', metrics_to_report=default_metrics())

Bases: trw.train.callback.Callback

Summarize the training history of a model (i.e., as a function of iteration)

  • One plot per dataset

  • splits are plotted together

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportClassificationReport(with_confusion_matrix=True, with_ROC=True, with_history=True, with_report=True)

Bases: trw.train.callback.Callback

Export the main classification measures for the classification outputs of the model

This includes:

  • text report (e.g., accuracy, sensitivity, specificity, F1, typical errors & confusion matrix)

  • confusion matrix plot

  • ROC & AUC for binary classification problems

max_class_names = 40
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportAugmentations(nb_samples=10, nb_augmentation=5, dirname='augmentations', split_name=None, uid_name='sample_uid', keep_samples=False)

Bases: trw.train.callback.Callback

Export samples

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackDataSummary(logger=utilities.log_and_print, collect_stats=True)

Bases: trw.train.callback.Callback

Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackModelSummary(logger=utilities.log_and_print, dataset_name=None, split_name=None)

Bases: trw.train.callback.Callback

Display important characteristics of the model (e.g., FLOPS, number of parameters, layers, shapes)

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
trw.train.model_summary(model, batch, logger)
class trw.train.CallbackSkipEpoch(nb_epochs, callbacks)

Bases: trw.train.callback.Callback

Run its callbacks every few epochs

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackClearTensorboardLog

Bases: CallbackTensorboardBased

Remove any existing logger

This is useful when we train multiple models so that they have their own tensorboard log file

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackTensorboardEmbedding(embedding_name, dataset_name=None, split_name=None, image_name=None, maximum_samples=2000, keep_features_fn=keep_small_features)

Bases: trw.train.callback_tensorboard.CallbackTensorboardBased

This callback records the embedding to be displayed with tensorboard

Note: we must recalculate the embedding as we need to associate a specific input (i.e., we can’t store everything in memory so we need to collect what we need batch by batch)

first_time(self, datasets, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackTensorboardRecordHistory

Bases: trw.train.callback_tensorboard.CallbackTensorboardBased

This callback records the history to a tensorboard readable log

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackTensorboardRecordModel(dataset_name=None, split_name=None, onnx_folder='onnx')

Bases: trw.train.callback_tensorboard.CallbackTensorboardBased

This callback will export the model to tensorboard

@TODO ONNX probably adds hooks that are not removed. To be investigated.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportBestHistory(filename='best_history.txt', metric_to_discard=[])

Bases: trw.train.callback.Callback

Export the best value of the history and epoch for each metric in a single file

This can be useful to accurately get the best value of a metric and in particular at which step it occurred.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportClassificationErrors(max_samples=100, discard_train=True, dirname='errors')

Bases: trw.train.callback.Callback

Export the classification errors

Note: since we can’t guarantee the repeatability of the input (i.e., from the outputs, we can’t associate the corresponding batches), we need to re-run the evaluation and collect the errors batch by batch.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackLearningRateFinder(nb_samples_per_learning_rate=1000, learning_rate_start=1e-06, learning_rate_stop=10.0, learning_rate_mul=1.2, dataset_name=None, split_name=None, dirname='lr_finder', identify_learning_rate_section=default_identify_learning_rate_section, set_new_learning_rate=False, param_maximum_loss_ratio=0.8)

Bases: trw.train.callback.Callback

Identify a good range for the learning rate parameter.

See “Cyclical Learning Rates for Training Neural Networks”, Leslie N. Smith. https://arxiv.org/abs/1506.01186

Start from a small learning rate and every iteration, increase the learning rate by a factor. At the same time record the loss per epoch. Suitable learning rates will make the loss function decrease. We should select the highest learning rate which decreases the loss function.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)

Note

The model will be deep copied so that we don’t influence the training

Parameters

**kwargs – required optimizers_fn

class trw.train.CallbackLearningRateRecorder(dirname='lr_recorder')

Bases: trw.train.callback.Callback

Record the learning rate of the optimizers.

This is useful as a debugging tool.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
__del__(self)
class trw.train.CallbackExplainDecision(max_samples=10, dirname='explained', dataset_name=None, split_name=None, algorithm=(ExplainableAlgorithm.MeaningfulPerturbations, ExplainableAlgorithm.GuidedBackPropagation, ExplainableAlgorithm.GradCAM, ExplainableAlgorithm.Gradient, ExplainableAlgorithm.IntegratedGradients), output_name=None, nb_explanations=1, algorithms_kwargs=default_algorithm_args(), average_filters=True)

Bases: trw.train.callback.Callback

Explain the decision of a model

first_time(self, datasets, options)
static find_output_name(outputs, name)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.ExplainableAlgorithm

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

GuidedBackPropagation
GradCAM
Gradient
IntegratedGradients
MeaningfulPerturbations
class trw.train.CallbackWorstSamplesByEpoch(split_names=None, output_name=None, dataset_name=None, dirname='worst_samples_by_epoch', sort_samples_by_loss_error=True, worst_k_samples=1000, export_top_k_samples=50, uids_name=sequence_array.sample_uid_name, output_of_interest=(trw_outputs.OutputClassification, trw_outputs.OutputSegmentation, trw_outputs.OutputRegression))

Bases: trw.train.callback.Callback

The purpose of this callback is to track the samples with the worst loss during the training of the model

It is interesting to understand what the difficult samples are (train and test split): are they always classified wrongly during training, or is it random? Are they the same samples with different models (i.e., initialization or model dependent)?

first_time(self, datasets, outputs)
static sort_split_data(errors_by_sample, worst_k_samples, discard_first_n_epochs=0)

Helper function to sort the samples

Parameters
  • errors_by_sample – the data

  • worst_k_samples – the number of samples to select or None

  • discard_first_n_epochs – the first few epochs are typically very noisy, so don’t use these

Returns

sorted data

export_stats(self, model, losses, datasets, datasets_infos, options, callbacks_per_batch)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackActivationStatistics(dataset_name=None, split_name='train', logger_fn=utilities.log_and_print)

Bases: trw.train.callback.Callback

Calculate activation statistics of each layer of a neural network.

This can be useful to detect connectivity issues within the network, overflow and underflow which may impede the training of the network.

first_time(self, datasets)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackZipSources(folders_to_record, extensions=default_extensions(), filename='sources.zip', max_width=200)

Bases: trw.train.callback.Callback

Record important info relative to the training such as the sources & configuration info

This is to make sure a result can always be easily reproduced. Any configuration option can be safely appended in options[‘runtime’]

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.CallbackExportConvolutionKernel(export_frequency=500, dirname='convolution_kernels', find_convolution_fn=graph_reflection.find_first_forward_convolution, dataset_name=None, split_name=None, export_filter_fn=default_export_filter)

Bases: trw.train.callback.Callback

Simply export convolutional kernels.

This can be useful to check over time whether the kernels have converged.

first_time(self, options, datasets, model)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.train.Sequence(source_split)

A Sequence defines how to iterate the data as a sequence of small batches of data.

To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the forward pass of our model for all the data at once is memory hungry; instead, we calculate the forward and backward passes on a small chunk of data. This is the interface for batching a dataset.

Examples:

data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
    print(batch)
abstract __iter__(self)
Returns

An iterator of batches

collate(self, collate_fn=utilities.default_collate_fn, device=None)

Aggregate the input batch as a dictionary of torch.Tensor and move the data to the appropriate device

Parameters
  • collate_fn – the function to collate the input batch

  • device – the device where to send the samples. If None, the default device is CPU

Returns

a collated sequence of batches

map(self, function_to_run, nb_workers=0, max_jobs_at_once=None, worker_post_process_results_fun=None, queue_timeout=0.1, preprocess_fn=None, collate_fn=None)

Transform a sequence using a given function.

Note

The map may create more samples than the original sequence.

Parameters
  • function_to_run – the mapping function

  • nb_workers – the number of workers that will process the split. If 0, no workers will be created.

  • max_jobs_at_once – the maximum number of results that can be pushed in the result queue at once. If 0, no limit. If None, it will be set equal to the number of workers

  • worker_post_process_results_fun – a function used to post-process the worker results (executed by the worker)

  • queue_timeout – the timeout used to pull results from the output queue

  • preprocess_fn – a function that will preprocess the batch just prior to sending it to the other processes

  • collate_fn – a function to collate each batch of data

Returns

a sequence of batches

batch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)

Group several batches of samples into a single batch

Parameters
  • batch_size – the number of samples of the batch

  • discard_batch_not_full – if True, batches that are not full are discarded

  • collate_fn – a function to collate the batches. If None, no collation performed

Returns

a sequence of batches
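A hedged sketch chaining these operations into a small pipeline (the data and the augmentation function are made up; collate() is documented above):

import numpy as np
import trw

def add_noise(batch):
    # hypothetical mapping function: perturb the 'images' feature of a batch
    batch['images'] = batch['images'] + np.random.randn(*batch['images'].shape).astype(np.float32) * 0.01
    return batch

data = {'images': np.random.randn(100, 1, 16, 16).astype(np.float32)}
sequence = trw.train.SequenceArray(data).map(add_noise, nb_workers=0).batch(10).collate()
for batch in sequence:
    print(trw.train.len_batch(batch))  # expected to be 10 for full batches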

rebatch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)

Normalize a sequence to identical batch size given an input sequence with varying batch size

Parameters
  • batch_size – the size of the batches created by this sequence

  • discard_batch_not_full – if True, the last batch will be discarded if not full

  • collate_fn – function to merge multiple batches

async_reservoir(self, max_reservoir_samples, function_to_run, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=sampler.SamplerSequential(), collate_fn=remove_nested_list, maximum_number_of_samples_per_epoch=None)

Create a sequence fed from a reservoir. The purpose of this sequence is to maximize GPU utilization at the expense of recycling previously processed samples.

Parameters
  • max_reservoir_samples – the maximum number of samples of the reservoir

  • function_to_run – the function to run asynchronously

  • min_reservoir_samples – the minimum of samples of the reservoir needed before an output sequence can be created

  • nb_workers – the number of workers that will process function_to_run

  • max_jobs_at_once – the maximum number of jobs that can be pushed in the result list at once. If 0, no limit. If None: set to the number of workers

  • reservoir_sampler – a sampler that will be used to sample the reservoir or None if no sampling needed

  • collate_fn – a function to post-process the samples into a single batch. If None, return the items as they were in source_split

  • maximum_number_of_samples_per_epoch – the maximum number of samples per epoch to generate. If this maximum is reached, the reservoir is not emptied; the sequence is simply interrupted so that it can be restarted.

fill_queue(self)

Fill the job queue of the current sequence

fill_queue_all_sequences(self)

Go through all the sequences and fill their input queue

__next__(self)
Returns

The next batch of data

next_item(self, blocking)
Parameters

blocking – if True, the call blocks the current thread until the next element is ready

Returns

The next batch of data

has_background_jobs(self)
Returns

True if this sequence has a background job to create the next element

has_background_jobs_previous_sequences(self)
Returns

the number of sequences that have background jobs currently running to create the next element

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence
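
Example (a minimal sketch; the UID feature name 'sample_uid' is an assumption, use the UID feature actually present in your split):

import numpy as np
from trw.train import SequenceArray

data = np.arange(100)
sequence = SequenceArray({'data': data})
# keep only the samples whose UID is 0, 5 or 42
smaller_sequence = sequence.subsample_uids(uids=[0, 5, 42], uids_name='sample_uid')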

class trw.train.SequenceMap(source_split, nb_workers, function_to_run, max_jobs_at_once=None, worker_post_process_results_fun=None, queue_timeout=default_queue_timeout, preprocess_fn=None, collate_fn=None)

Bases: trw.train.sequence.Sequence

A Sequence defines how to iterate the data as a sequence of small batches of data.

To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the forward pass of the model for all the data at once is memory hungry; instead, we calculate the forward and backward passes on small chunks of data. This is the interface for batching a dataset.

Examples:

from trw.train import SequenceArray

data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    print(batch['data'])  # do something with our batch
subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampler sequence. If None, re-use the existing

Returns

a subsampled Sequence

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

fill_queue(self)

Fill the job queue of the current sequence

initializer(self)

Initialize the sequence to iterate through batches

__next_local(self, next_fn)

Get the next elements

Handles a single item or a list of items returned by next_fn

Parameters

next_fn – a function returning the next element(s)

__next__(self)
Returns

The next batch of data

has_background_jobs(self)
Returns

True if this sequence has a background job to create the next element

next_item(self, blocking)
Parameters

blocking – if True, the call blocks the current thread until the next element is ready

Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Finish and join the existing pool processes

class trw.train.JobExecutor(nb_workers, function_to_run, max_jobs_at_once=0, worker_post_process_results_fun=None, output_queue_size=0)

Simple job executor using queues as communication channels for input and output

Feed jobs using JobExecutor.input_queue.put(argument). function_to_run will be called with argument and the output will be pushed to JobExecutor.output_queue

Jobs that failed will have None pushed to the output queue.
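
Example (a minimal sketch; square is a hypothetical job function and should be defined at module level so the worker processes can import it):

from trw.train import JobExecutor

def square(x):
    # executed by a worker for each argument fed to the input queue
    return x * x

executor = JobExecutor(nb_workers=2, function_to_run=square)
executor.input_queue.put(3)
result = executor.output_queue.get()  # 9, or None if the job failed
executor.close()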

__exit__(self, exc_type, exc_val, exc_tb)
__enter__(self)
__del__(self)
close(self)

Terminate all jobs

reset(self)

Reset the input and output queues as well as job session IDs.

The results of the jobs that have not yet been calculated will be discarded

static worker(input_queue, output_queue, func, post_process_results_fun, job_session_id, channel_worker_to_main, channel_main_to_worker, must_finish)
class trw.train.SequenceArray(split, sampler=sampler.SamplerRandom(), transforms=None, use_advanced_indexing=True, sample_uid_name=sample_uid_name)

Bases: trw.train.sequence.Sequence

Create a sequence of batches from numpy arrays, lists and torch.Tensor
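
Example (a minimal sketch mixing numpy and torch features; names and sizes are illustrative, and the sampler's batch_size is assumed to control the number of samples per batch):

import numpy as np
import torch
from trw.train import SequenceArray, SamplerSequential

split = {
    'images': torch.randn(100, 1, 28, 28),     # torch.Tensor feature
    'targets': np.random.randint(0, 10, 100),  # numpy feature
}
sequence = SequenceArray(split, sampler=SamplerSequential(batch_size=10))
for batch in sequence:
    # each batch is a dictionary with the 'images' and 'targets' features
    pass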

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

initializer(self)
static get(split, nb_samples, indices, transforms, use_advanced_indexing)

Collect the split indices given and apply a series of transformations

Parameters
  • nb_samples – the total number of samples of split

  • split – a mapping of np.ndarray or torch.Tensor

  • indices – a list of indices as numpy array

  • transforms – a transformation or list of transformations or None

  • use_advanced_indexing – if True, use the advanced indexing mechanism, else use a simple list (the original data is referenced). Advanced indexing is typically faster for small objects; however, for large objects (e.g., 3D data) it makes a copy of the data, making it very slow.

Returns

a split with the indices provided

get_next(self)
__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

class trw.train.SequenceBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)

Bases: trw.train.sequence.Sequence

Group several samples into a single data batch

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

class trw.train.SequenceAsyncReservoir(source_split, max_reservoir_samples, function_to_run, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=None, collate_fn=sequence.remove_nested_list, maximum_number_of_samples_per_epoch=None)

Bases: trw.train.sequence.Sequence

This sequence will asynchronously process data and keep a reserve of loaded samples

The idea is to have long loading processes work in the background while using the currently loaded data as efficiently as possible. The data is slowly replaced by freshly loaded data over time.

Jobs are started and results retrieved at the beginning of each epoch

This sequence can be interrupted (e.g., after a certain number of batches have been returned). When the sequence is restarted, the reservoir will not be emptied.
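
Example (a minimal sketch of a reservoir-backed pipeline; load_sample is a hypothetical, slow loading function and each incoming batch is assumed to hold a single sample):

import numpy as np
from trw.train import SequenceArray

def load_sample(batch):
    # hypothetical expensive I/O or preprocessing executed by the background workers
    batch['volume'] = np.random.randn(1, 32, 32, 32).astype(np.float32)
    return batch

paths = ['volume_{}.nii'.format(i) for i in range(1000)]
sequence = SequenceArray({'path': paths}).async_reservoir(
    max_reservoir_samples=100,
    function_to_run=load_sample,
    min_reservoir_samples=10,
    nb_workers=2).batch(4)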

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

reservoir_size(self)
Returns

The current number of samples in the reservoir

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

initializer(self)
_reset_iter_reservoir(self)

Restart the reservoir iterator

fill_queue(self)

Fill the input queue of jobs to be completed

_retrieve_results_and_fill_queue(self)

Retrieve results from the output queue

_wait_for_job_completion(self)

Block the processing until we have enough result in the reservoir

_get_next(self)
Returns

the next batch of samples

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Finish and join the existing pool processes

class trw.train.SequenceAdaptorTorch(torch_dataloader, features=None)

Bases: trw.train.Sequence

Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface

The main purpose is to enable compatibility with the torch data loader and any existing third party code.
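
Example (a minimal sketch; the dataset is illustrative and the optional features argument is left to its default):

import torch
import trw.train

dataset = torch.utils.data.TensorDataset(torch.randn(100, 10), torch.randint(0, 2, (100,)))
loader = torch.utils.data.DataLoader(dataset, batch_size=16)
sequence = trw.train.SequenceAdaptorTorch(loader)
for batch in sequence:
    # batches are produced by the underlying torch DataLoader
    pass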

__len__(self)
__iter__(self)
Returns

An iterator of batches

__next__(self)
Returns

The next batch of data

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

class trw.train.SequenceCollate(source_split, collate_fn=utilities.default_collate_fn, device=None)

Bases: trw.train.sequence.Sequence

Group the data into a sequence of dictionaries of torch.Tensor

This can be useful to combine batches of dictionaries into a single batch with all features concatenated on axis 0. Often used in conjunction with trw.train.SequenceAsyncReservoir and trw.train.SequenceMap.

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

class trw.train.SequenceReBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)

Bases: trw.train.sequence.Sequence

This sequence will normalize the batch size of an underlying sequence

If a batch of the underlying sequence is too large, it will be split into multiple batches. Conversely, if a batch is too small, several batches will be merged until the expected batch size is reached.
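
Example (a minimal sketch, assuming this sequence is typically created through Sequence.rebatch; sizes are illustrative):

import numpy as np
from trw.train import SequenceArray

data = np.random.randn(1000, 10).astype(np.float32)
# upstream batches of 7 samples are normalized to batches of exactly 32 samples
sequence = SequenceArray({'data': data}).batch(7).rebatch(32)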

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting samples and augmentations.

Parameters

nb_samples – the number of samples to keep from the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

class trw.train.SamplerRandom(replacement=False, nb_samples_to_generate=None, batch_size=1)

Bases: Sampler

Samples elements randomly. If sampling without replacement, elements are drawn from a shuffled dataset. If sampling with replacement, the user can specify the number of samples to draw (nb_samples_to_generate).

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator returning indices of the original data source

__next__(self)
__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)
Returns

the size of the batch

class trw.train.SamplerSequential(batch_size=1)

Bases: Sampler

Samples elements sequentially, always in the same order.

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator returning indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)
Returns

the size of the batch

class trw.train.SamplerSubsetRandom(indices)

Bases: Sampler

Samples elements randomly from a given list of indices, without replacement.

Parameters

indices (sequence) – a sequence of indices

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator returning indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)
Returns

the size of the batch

class trw.train.SamplerClassResampling(class_name, nb_samples_to_generate, reuse_class_frequencies_across_epochs=True, batch_size=1)

Bases: Sampler

Resample the samples so that class_name classes have equal probability of being sampled.

Classification problems rarely have balanced classes, so it is often required to super-sample the minority classes to avoid penalizing the under-represented classes and to help the classifier learn good features (as opposed to learning the class distribution).
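
Example (a minimal sketch re-balancing an imbalanced binary split; feature names and sizes are illustrative, and class_name is assumed to name the feature holding the class of each sample):

import numpy as np
from trw.train import SequenceArray, SamplerClassResampling

images = np.random.randn(1000, 1, 28, 28).astype(np.float32)
targets = np.random.binomial(1, 0.1, 1000)  # heavily imbalanced classes

sampler = SamplerClassResampling(class_name='targets', nb_samples_to_generate=1000)
sequence = SequenceArray({'images': images, 'targets': targets}, sampler=sampler).batch(32)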

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

_fit(self, classes)
__next__(self)
__iter__(self)

Returns: an iterator returning indices of the original data source

__len__(self)

Returns: the number of elements the sampler will return in a single iteration

get_batch_size(self)
Returns

the size of the batch

class trw.train.Sampler

Bases: object

Base class for all Samplers.

Every Sampler subclass has to provide an __iter__ method, providing a way to iterate over indices of dataset elements, and a __len__ method that returns the length of the returned iterator.
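
Example (a minimal sketch of a custom sampler honoring this contract; it assumes the number of samples of the data source can be obtained with len()):

import numpy as np
from trw.train import Sampler

class SamplerEvenIndices(Sampler):
    """Hypothetical sampler returning only the even indices of the data source."""
    def initializer(self, data_source):
        self.indices = np.arange(0, len(data_source), 2)

    def __iter__(self):
        return iter(self.indices)

    def __len__(self):
        return len(self.indices)

    def get_batch_size(self):
        return 1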

abstract initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

abstract __iter__(self)

Returns: an iterator returning indices of the original data source

abstract __len__(self)

Returns: the number of elements the sampler will return in a single iteration

abstract get_batch_size(self)
Returns

the size of the batch

trw.train.as_rgb_image(value)

Try interpreting the value as an image (e.g., 2D, RGB) and return an RGB image.

Parameters

value – an array of shape (y, x), (1, y, x) or (3, y, x)

Returns

a (3, y, x) array

trw.train.as_image_ui8(image, min_value=None, max_value=None)

Rescale the image to fit in [0..255] range.

The image minimum will be mapped to 0 and the maximum to 255; values in between are interpolated.

Parameters

image – an RGB float image

Returns

an RGB unsigned char image
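
Example (a minimal sketch chaining the image helpers; the array content and output path are illustrative, and export_image is documented just below):

import numpy as np
import trw.train

heatmap = np.random.rand(64, 64).astype(np.float32)  # (y, x) float array
rgb = trw.train.as_rgb_image(heatmap)                # (3, y, x) float image
rgb_ui8 = trw.train.as_image_ui8(rgb)                # rescaled to [0..255]
trw.train.export_image(rgb_ui8, 'heatmap.png')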

trw.train.export_image(image, path)

Export an image

Parameters
  • image – a RGB image (float or ui8) with format (channels, height, width)

  • path – where to write the image

Returns

class trw.train.LossDiceMulticlass(normalization_fn=nn.Sigmoid, eps=0.0001)

Bases: torch.nn.Module

Implementation of the Dice Loss (multi-class) for N-d images

If multi-class, compute the loss for each class then average the losses

forward(self, output, target)
Parameters
  • output – must have W x C x d0 x … x dn shape, where C is the total number of classes to predict

  • target – must have W x d0 x … x dn shape

Returns

The dice score
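
Example (a minimal sketch with 2D inputs following the shape convention above; values are random for illustration):

import torch
import trw.train

loss_fn = trw.train.LossDiceMulticlass()
output = torch.randn(4, 3, 32, 32)         # 4 samples, C=3 classes, 32x32 images
target = torch.randint(0, 3, (4, 32, 32))  # one class index per pixel
loss = loss_fn(output, target)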

trw.train.upsample(tensor, size, mode='linear')

Upsample a 1D, 2D, 3D tensor

This is a wrapper around torch.nn.Upsample to make it more practical. Supports integer-based tensors.

Note

PyTorch as of version 1.3 doesn’t support non-floating point upsampling (see https://github.com/pytorch/pytorch/issues/13218 and https://github.com/pytorch/pytorch/issues/5580). A workaround is used instead (TODO: assess the speed impact!).

Parameters
  • tensor – 1D (shape = b x c x n), 2D (shape = b x c x h x w) or 3D (shape = b x c x d x h x w)

  • size – if 1D, shape = n, if 2D shape = h x w, if 3D shape = d x h x w

  • mode – linear or nearest

Returns

an up-sampled tensor with same batch size and filter size as the input
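
Example (a minimal sketch; tuple sizes and tensor content are illustrative):

import torch
import trw.train

t = torch.arange(2 * 1 * 4 * 4, dtype=torch.float32).reshape((2, 1, 4, 4))  # b x c x h x w
up = trw.train.upsample(t, size=(8, 8), mode='linear')

# integer tensors are also supported (through the workaround mentioned in the note above)
t_int = torch.arange(2 * 1 * 4 * 4).reshape((2, 1, 4, 4))
up_nearest = trw.train.upsample(t_int, size=(8, 8), mode='nearest')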

class trw.train.FilterFixed(kernel, groups=1, padding=0)

Bases: torch.nn.Module

Apply a fixed filter to n-dimensional images

__call__(self, value)
class trw.train.FilterGaussian(input_channels, nb_dims, sigma, kernel_sizes=None, padding='same', device=None)

Bases: FilterFixed

Implement a gaussian filter as a torch.nn.Module
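
Example (a minimal sketch smoothing a batch of 2D images; parameter values are illustrative):

import torch
import trw.train

gaussian = trw.train.FilterGaussian(input_channels=1, nb_dims=2, sigma=2.0)
images = torch.randn(4, 1, 64, 64)  # (batch, channels, height, width)
smoothed = gaussian(images)         # gaussian-smoothed images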

class trw.train.MeaningfulPerturbation(model, iterations=150, l1_coeff=0.01, tv_coeff=0.2, tv_beta=3, noise=0.2, model_output_postprocessing=functools.partial(F.softmax, dim=1), mask_reduction_factor=8, optimizer_fn=default_optimizer, information_removal_fn=default_information_removal_smoothing, export_fn=None)

Implementation of “Interpretable Explanations of Black Boxes by Meaningful Perturbation”, arXiv:1704.03296

Handles only 2D and 3D inputs. Other inputs will be discarded.

Deviations: a globally smoothed image is used to speed up the processing.

__call__(self, inputs, target_class_name, target_class=None)
Parameters
  • inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad set

  • target_class – the index of the class to explain the decision. If None, the class output will be used

  • target_class_name – the output node to be used. If None:

      • if the model output is a single tensor, use it as the target output

      • else use the first OutputClassification output

Returns

a tuple (output_name, dictionary (input, explanation mask))

static _get_output(target_class_name, outputs, postprocessing)
trw.train.default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23)

Default information removal (smoothing).

Parameters
  • image – an image

  • blurring_sigma – the sigma of the blurring kernel used to “remove” information from the image

  • blurring_kernel_size – the size of the kernel to be used. This is an internal parameter to approximate the gaussian kernel. It is exposed since, in the 3D case, the memory consumption may be high and a faithful gaussian blurring is not crucial.

Returns

a smoothed image