trw.train

Submodules

Package Contents

Classes

Options

Create default options for the training and evaluation process.

CleanAddedHooks

Context manager that automatically tracks added hooks on the model and removes them when the context is released

Output

Base class for the outputs of a model

OutputClassification

Classification output

OutputRegression

Regression output

OutputEmbedding

Represent an embedding

OutputTriplets

Output for triplet-based losses, taking anchor, positive and negative samples

OutputLoss

Represent a given loss as an output.

OutputSegmentation

Segmentation output

OutputSegmentationBinary

Output for binary segmentation.

OutputClassificationBinary

Classification output for binary classification

LossDiceMulticlass

Implementation of the soft Dice Loss (multi-class) for N-d images

LossFocalMulticlass

This criterion is an implementation of Focal Loss, which is proposed in

LossTriplets

Implement a triplet loss

LossCenter

Center loss, penalize the features falling further from the feature class center.

LossContrastive

Implementation of the contrastive loss.

LossCrossEntropyCsiMulticlass

Optimize a metric similar to Critical Success Index (CSI) on the cross-entropy

LossBinaryF1

The macro F1-score is non-differentiable. Instead use a surrogate that is differentiable

LossMsePacked

Mean squared error loss with target packed as an integer (e.g., classification)

TrainerV2

ClippingGradientNorm

Clips the gradient norm during optimization

Optimizer

OptimizerAdam

OptimizerSGD

OptimizerAdamW

GradCam

Gradient-weighted Class Activation Mapping

GuidedBackprop

Produces gradients generated with guided back propagation from the given image

IntegratedGradients

Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features

Sequence

A Sequence defines how to iterate the data as a sequence of small batches of data.

SequenceMap

A Sequence defines how to iterate the data as a sequence of small batches of data.

SequenceArray

Create a sequence of batches from numpy arrays, lists and torch.Tensor

SequenceBatch

Group several batches into a single batch

SequenceAsyncReservoir

This sequence will asynchronously process data and keep a reserve of loaded samples

SequenceAdaptorTorch

Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface

SequenceCollate

Group the data into a sequence of dictionary of torch.Tensor

SequenceReBatch

This sequence will normalize the batch size of an underlying sequence

SequenceSubBatch

This sequence will split batches in smaller batches if the underlying sequence batch is too large.

Metric

A metric base class

MetricClassificationError

Calculate the 1 - accuracy using the output_truth and output

MetricClassificationBinarySensitivitySpecificity

Calculate the sensitivity and specificity for a binary classification using the output_truth and output

MetricLoss

Extract the loss from the outputs

MetricClassificationBinaryAUC

Calculate the Area under the Receiver operating characteristic (ROC) curve.

MetricClassificationF1

Calculate the F1 score using the output_truth and output

SamplerRandom

Samples elements randomly. If without replacement, then sample from a shuffled dataset.

SamplerSequential

Samples elements sequentially, always in the same order.

SamplerSubsetRandom

Samples elements randomly from a given list of indices, without replacement.

SamplerClassResampling

Resample the samples so that class_name classes have equal probability of being sampled.

Sampler

Base class for all Samplers.

SamplerSubsetRandomByListInterleaved

Elements from a given list of list of indices are randomly drawn without replacement,

FilterFixed

Apply a fixed filter to n-dimensional images

FilterGaussian

Implement a Gaussian filter as a torch.nn.Module

MeaningfulPerturbation

Implementation of "Interpretable Explanations of Black Boxes by Meaningful Perturbation", arXiv:1704.03296

DataParallelExtended

Customized version of torch.nn.DataParallel to support model with

Functions

get_logging_root(logging_root: Optional[str] = None) → str

Return the logging root directory

create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)

Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it

set_optimizer_learning_rate(optimizer, learning_rate)

Set the learning rate of the optimizer to a specific value

safe_filename(filename)

Clean the filename so that it can be used as a valid filename

get_device(module, batch=None)

Return the device of a module. This may be incorrect if we have a module split across different devices

transfer_batch_to_device(batch, device, non_blocking=True)

Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.

find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)

Return a good choice of dataset name and split name, possibly not the train split.

get_class_name(mapping, classid)

get_classification_mapping(datasets_infos, dataset_name, split_name, output_name)

Return the output mappings of a classification output from the datasets infos

get_classification_mappings(datasets_infos, dataset_name, split_name)

Return the output mappings of all classification outputs from the datasets infos

make_triplet_indices(targets)

Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target

make_pair_indices(targets, same_target_ratio=0.5)

Make random indices of pairs of samples that do or do not belong to the same target.

make_unique_colors()

Return a set of unique and easily distinguishable colors

make_unique_colors_f()

Return a set of unique and easily distinguishable colors

apply_spectral_norm(module, n_power_iterations=1, eps=1e-12, dim=None, name='weight', discard_layer_types=(torch.nn.InstanceNorm2d, torch.nn.InstanceNorm3d))

Apply spectral norm on every sub-modules

apply_gradient_clipping(module: torch.nn.Module, value)

Apply gradient clipping recursively on a module as callback.

segmentation_criteria_ce_dice(output, truth, per_voxel_weights=None, ce_weight=0.5, per_class_weights=None, power=1.0, smooth=1.0, focal_gamma=None)

Loss combining cross entropy and multi-class Dice

total_variation_norm(x, beta)

Calculate the total variation norm

one_hot(targets: trw.basic_typing.TorchTensorNX, num_classes: int, dtype=torch.float32, device: Optional[torch.device] = None) → trw.basic_typing.TorchTensorNCX

Encode the targets (a tensor of integers representing a class)

create_losses_fn(datasets, generic_loss)

Create a dictionary of loss functions for each of the datasets

epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, per_step_schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, force_eval_mode, eval_loop_fn=eval_loop, train_loop_fn=train_loop)

param options

eval_loop(options, device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)

Run the eval loop (i.e., the model parameters will NOT be updated)

train_loop(options, device, dataset_name, split_name, split, optimizer, per_step_scheduler, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, gradient_scaler=None)

Run the train loop (i.e., the model parameters will be updated)

default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True, additional_callbacks=None)

Default callbacks to be performed after the model has been trained

default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None, additional_callbacks=None)

Default callbacks to be performed at the end of each epoch

default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True, with_reporting_server=True, with_profiler=False, additional_callbacks=None)

Default callbacks to be performed before the fitting of the model

default_sum_all_losses(dataset_name, batch, loss_terms)

Default loss is the sum of all loss terms

create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, nesterov=False, scheduler_fn=None, per_step_scheduler_fn=None)

Create a Stochastic gradient descent optimizer for each of the datasets with an optional scheduler

create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9, nesterov=False)

Create a Stochastic gradient descent optimizer for each of the datasets with a step learning rate scheduler

create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)

Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma

create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, scheduler_fn=None, per_step_scheduler_fn=None)

Create an ADAM optimizer for each of the datasets with an optional scheduler

create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, betas=(0.9, 0.999))

Create an ADAM optimizer for each of the datasets with a step learning rate scheduler

create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None, per_step_scheduler_fn=None)

Create an optimizer and scheduler

create_sgd_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3, nesterov=False)

Create a Stochastic gradient descent optimizer for each of the datasets with a one-cycle learning rate scheduler

create_adam_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3)

Create an ADAM optimizer for each of the datasets with a one-cycle learning rate scheduler

plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) → None

Plot groups of histories

confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) → None

Plot the confusion matrix of a predicted class versus the true class

classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)

Summarizes the important statistics for a classification problem

list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories: e.g., compare 2 configuration, which one has the best results for a given

export_figure(path, name, maximum_length=259, dpi=None)

Export a figure

auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) → float

Calculate the area under the curve of the ROC plot (AUROC)

find_tensor_leaves_with_grad(tensor: torch.Tensor) → Sequence[torch.Tensor]

Find the input leaves of a tensor.

find_last_forward_convolution(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) → Optional[Mapping]

Perform a forward pass of the model with given inputs and retrieve the last convolutional layer

find_last_forward_types(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]], relative_index: int = 0) → Optional[Mapping]

Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type

find_first_forward_convolution(model: torch.nn.Module, inputs: Any = None, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) → Optional[Mapping]

Retrieve the first convolutional layer of the model

post_process_output_for_gradient_attribution(output: trw.train.outputs_trw.Output)

Postprocess the output to be suitable for gradient attribution.

default_collate_fn(batch: Union[Sequence[Any], Mapping[str, Any]], device: torch.device, pin_memory: bool = False, non_blocking: bool = False)

param batch

a dictionary of features or a list of dictionary of features

default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23, explanation_for=None)

Default information removal (smoothing).

grid_sample(input: torch.Tensor, grid: torch.Tensor, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = None) → torch.Tensor

Compatibility layer for argument change between pytorch <= 1.2 and pytorch > 1.3

Attributes

default_sample_uid_name

class trw.train.Options(logging_directory: Optional[str] = None, num_epochs: int = 50, device: Optional[torch.device] = None, mixed_precision_enabled: bool = False, gradient_update_frequency: int = 1)

Create default options for the training and evaluation process.

__repr__(self) → str

Return repr(self).
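
For illustration, a minimal way to construct options is sketched below; only the constructor arguments documented above are used, and the attribute layout of the returned object is not shown here:

    import torch
    import trw.train

    # train for 100 epochs, on GPU if available
    device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
    options = trw.train.Options(num_epochs=100, device=device)
    print(options)  # __repr__ summarizes the configured options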

trw.train.get_logging_root(logging_root: Optional[str] = None) → str

Return the logging root directory

trw.train.create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)

Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it

Parameters
  • path – the path to create or recreate

  • nb_tries – the number of tries to be performed before failure

  • wait_time_between_tries – the time to wait before the next try

Returns

True if successful or False if failed.

trw.train.set_optimizer_learning_rate(optimizer, learning_rate)

Set the learning rate of the optimizer to a specific value

Parameters
  • optimizer – the optimizer to update

  • learning_rate – the learning rate to set

Returns

None

class trw.train.CleanAddedHooks(model)

Context manager that automatically tracks added hooks on the model and removes them when the context is released

__enter__(self)
__exit__(self, type, value, traceback)
static record_hooks(module_source)

Record hooks

Parameters

module_source – the module whose hooks will be tracked

Returns

a tuple (forward, backward). forward and backward are dictionaries of hook IDs by module
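
A short sketch of the intended context-manager usage; the lambda hook registered inside the block is purely illustrative:

    import torch.nn as nn
    import trw.train

    model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
    with trw.train.CleanAddedHooks(model):
        # hooks registered inside the context (e.g., by an explanation method) are tracked...
        model[0].register_forward_hook(lambda module, args, output: None)
    # ...and removed automatically when the context is released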

trw.train.safe_filename(filename)

Clean the filename so that it can be used as a valid filename

trw.train.get_device(module, batch=None)

Return the device of a module. This may be incorrect if we have a module split across different devices

trw.train.transfer_batch_to_device(batch, device, non_blocking=True)

Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.

Parameters
  • batch – the batch of data to be transferred

  • device – the device to move the tensors to

  • non_blocking – non blocking memory transfer to GPU

Returns

a batch of data on the specified device
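
A hedged example of moving a mixed batch; per the description above, only tensors and numpy arrays are transferred while other values stay untouched:

    import numpy as np
    import torch
    import trw.train

    batch = {
        'images': torch.zeros(4, 1, 32, 32),
        'targets': np.asarray([0, 1, 1, 0]),
        'uid': ['a', 'b', 'c', 'd'],   # non-array data is left as-is
    }
    device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
    batch_on_device = trw.train.transfer_batch_to_device(batch, device)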

trw.train.find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)

Return a good choice of dataset name and split name, possibly not the train split.

Parameters
  • datasets – the datasets

  • default_dataset_name – a possible dataset name. If None, find a suitable dataset, if not, the dataset must be present

  • default_split_name – a possible split name. If None, find a suitable split, if not, the dataset must be present. if train_split_name is specified, the selected split name will be different from train_split_name

  • train_split_name – if not None, exclude the train split

Returns

a tuple (dataset_name, split_name)

trw.train.get_class_name(mapping, classid)
trw.train.get_classification_mapping(datasets_infos, dataset_name, split_name, output_name)

Return the output mappings of a classification output from the datasets infos

Parameters
  • datasets_infos – the info of the datasets

  • dataset_name – the name of the dataset

  • split_name – the split name

  • output_name – the output name

Returns

a dictionary {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}

trw.train.get_classification_mappings(datasets_infos, dataset_name, split_name)

Return the output mappings of all classification outputs from the datasets infos

Parameters
  • datasets_infos – the info of the datasets

  • dataset_name – the name of the dataset

  • split_name – the split name

Returns

a dictionary {outputs: {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}}

trw.train.make_triplet_indices(targets)

Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target

Parameters

targets – a 1D integral tensor in range [0..C]

Returns

a tuple of indices (samples, samples_positive, samples_negative)

trw.train.make_pair_indices(targets, same_target_ratio=0.5)

Make random indices of pairs of samples that do or do not belong to the same target.

Parameters
  • same_target_ratio – specify the ratio of same target to be generated for sample pairs

  • targets – a 1D integral tensor in range [0..C] to be used to group the samples into same or different target

Returns

a tuple with (samples_0 indices, samples_1 indices, same_target)
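
A small sketch of both index helpers, following the documented argument and return conventions (the toy targets are illustrative):

    import torch
    import trw.train

    targets = torch.tensor([0, 0, 1, 1, 2, 2])

    # triplets: anchor and positive share a target, negative does not
    anchors, positives, negatives = trw.train.make_triplet_indices(targets)

    # pairs: roughly half of the pairs share the same target
    samples_0, samples_1, same_target = trw.train.make_pair_indices(targets, same_target_ratio=0.5)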

trw.train.make_unique_colors()

Return a set of unique and easily distinguishable colors

Returns

a list of RGB colors

trw.train.make_unique_colors_f()

Return a set of unique and easily distinguishable colors

Returns

a list of RGB colors

trw.train.apply_spectral_norm(module, n_power_iterations=1, eps=1e-12, dim=None, name='weight', discard_layer_types=(torch.nn.InstanceNorm2d, torch.nn.InstanceNorm3d))

Apply spectral norm on every sub-modules

Parameters
  • module – the parent module to apply spectral norm

  • discard_layer_types – the layers of this type will not have spectral norm applied

  • n_power_iterations – number of power iterations to calculate spectral norm

  • eps – epsilon for numerical stability in calculating norms

  • dim – dimension corresponding to number of outputs, the default is 0, except for modules that are instances of ConvTranspose{1,2,3}d, when it is 1

  • name – name of weight parameter

Returns

the same module as input module

trw.train.apply_gradient_clipping(module: torch.nn.Module, value)

Apply gradient clipping recursively on a module as callback.

Every time the gradient is calculated, it is intercepted and clipping applied.

Parameters
  • module – a module where sub-modules will have their gradients clipped

  • value – the maximum absolute value of the gradient
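
A brief sketch of typical usage; the toy model and clipping value are illustrative:

    import torch
    import torch.nn as nn
    import trw.train

    model = nn.Linear(10, 2)
    # every gradient of `model` (and its sub-modules) will be clipped to [-1.0, 1.0]
    trw.train.apply_gradient_clipping(model, value=1.0)

    loss = model(torch.randn(4, 10)).sum()
    loss.backward()  # the registered callbacks intercept and clip the gradients here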

class trw.train.Output(metrics, output, criterion_fn, collect_output=False, sample_uid_name=None)

Base class for the outputs of a model. The output_ref_tag attribute is a tag name used to find the output reference back from the outputs.

output_ref_tag = output_ref
evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored or release CUDA memory

class trw.train.OutputClassification(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassification.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)

Bases: Output

Classification output

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored or release CUDA memory
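
A hedged sketch of how a classification output is usually declared in a model's forward pass; the feature names ('images', 'targets'), the output name ('clf') and the toy architecture are illustrative only:

    import torch.nn as nn
    import trw.train

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

        def forward(self, batch):
            logits = self.classifier(batch['images'])
            # the returned dictionary of outputs drives the losses and the metrics
            return {'clf': trw.train.OutputClassification(logits, batch['targets'])}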

class trw.train.OutputRegression(output, output_truth, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics=metrics.default_regression_metrics(), loss_reduction=mean_all, weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., target_name=None, sample_uid_name=default_sample_uid_name)

Bases: Output

Regression output

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

class trw.train.OutputEmbedding(output, clean_loss_term_each_batch=False, sample_uid_name=default_sample_uid_name, functor=None)

Bases: Output

Represent an embedding

This is only used to record a tensor that we consider an embedding (e.g., to be exported to tensorboard)

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored or release CUDA memory

trw.train.default_sample_uid_name = sample_uid
trw.train.segmentation_criteria_ce_dice(output, truth, per_voxel_weights=None, ce_weight=0.5, per_class_weights=None, power=1.0, smooth=1.0, focal_gamma=None)

Loss combining cross entropy and multi-class Dice

Parameters
  • output – the output value, with shape [N, C, Dn…D0]

  • truth – the truth, with shape [N, 1, Dn..D0]

  • ce_weight – the weight of the cross entropy to use. This controls the importance of the cross entropy loss to the overall segmentation loss. Range in [0..1]

  • per_class_weights – the weight per class. A 1D vector of size C indicating the weight of the classes. This will be used for the cross-entropy loss

  • per_voxel_weights – the weight of each truth voxel. Must be of shape [N, Dn..D0]

Returns

a torch tensor

class trw.train.OutputTriplets(samples, positive_samples, negative_samples, criterion_fn=lambda : ..., metrics=metrics.default_generic_metrics(), loss_reduction=mean_all, weight_name=None, loss_scaling=1.0, sample_uid_name=default_sample_uid_name)

Bases: Output

Output for triplet-based losses, taking anchor, positive and negative samples

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

class trw.train.OutputLoss(losses, loss_reduction=torch.mean, metrics=metrics.default_generic_metrics(), sample_uid_name=default_sample_uid_name)

Bases: Output

Represent a given loss as an output.

This can be useful to add an additional regularizer to the training (e.g., trw.train.LossCenter).

evaluate_batch(self, batch, is_training)

Evaluate a batch of data and extract important outputs

Parameters
  • batch – the batch of data

  • is_training – if True, this was a training batch

Returns

a tuple (dictionary of values, dictionary of metrics)

loss_term_cleanup(self, loss_term)

This function is called for each batch just before switching to another batch.

It can be used to clean up large arrays stored or release CUDA memory

class trw.train.OutputSegmentation(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentation.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, sample_uid_name=default_sample_uid_name)

Bases: OutputClassification

Segmentation output

class trw.train.OutputSegmentationBinary(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentationBinary.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, sample_uid_name=default_sample_uid_name)

Bases: OutputSegmentation

Output for binary segmentation.

Parameters
  • output – a tensor of shape [N, 1, X]; must be raw logits

  • output_truth – a tensor of shape [N, 1, X], with values 0 or 1

class trw.train.OutputClassificationBinary(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassificationBinary.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)

Bases: OutputClassification

Classification output for binary classification

Parameters
  • output – the output with shape [N, 1, {X}], without any activation applied (i.e., logits)

  • output_truth – the truth with shape [N, 1, {X}]

class trw.train.LossDiceMulticlass(normalization_fn: Callable[[torch.Tensor], torch.Tensor] = partial(nn.Softmax, dim=1), eps: float = 1e-05, return_dice_by_class: bool = False, smooth: float = 0.001, power: float = 1.0, per_class_weights: Sequence[float] = None, discard_background_loss: bool = True)

Bases: torch.nn.Module

Implementation of the soft Dice Loss (multi-class) for N-d images

If multi-class, compute the loss for each class then average the losses

References

[1] “V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation” https://arxiv.org/pdf/1606.04797.pdf

forward(self, output, target)
Parameters
  • output – must have N x C x d0 x … x dn shape, where C is the total number of classes to predict

  • target – must have N x 1 x d0 x … x dn shape

Returns

if return_dice_by_class is False, return 1 - dice score suitable for optimization. Else, return the (numerator, cardinality) by class and by sample
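
A minimal usage sketch following the shapes documented above (the tensor sizes are illustrative):

    import torch
    import trw.train

    loss_fn = trw.train.LossDiceMulticlass()
    logits = torch.randn(2, 3, 32, 32)              # N x C x d1 x d0, C = 3 classes
    target = torch.randint(0, 3, (2, 1, 32, 32))    # N x 1 x d1 x d0
    loss = loss_fn(logits, target)                  # 1 - Dice, suitable for optimization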

class trw.train.LossFocalMulticlass(alpha=None, gamma=2, reduction='mean')

Bases: torch.nn.Module

This criterion is an implementation of Focal Loss, which is proposed in Focal Loss for Dense Object Detection, https://arxiv.org/pdf/1708.02002.pdf

Loss(x, class) = - alpha (1-softmax(x)[class])^gamma log(softmax(x)[class])

Parameters
  • alpha (1D Tensor, Variable) – the scalar factor for this criterion. One weight factor for each class.

  • gamma (float, double) – gamma > 0; reduces the relative loss for well-classified examples (p > .5), putting more focus on hard, misclassified examples

forward(self, outputs, targets)
class trw.train.LossTriplets(margin=1.0, distance=nn.PairwiseDistance(p=2))

Bases: torch.nn.Module

Implement a triplet loss

The goal of the triplet loss is to make sure that:

  • Two examples with the same label have their embeddings close together in the embedding space

  • Two examples with different labels have their embeddings far away.

However, we don’t want to push the train embeddings of each label to collapse into very small clusters. The only requirement is that given two positive examples of the same class and one negative example, the negative should be farther away than the positive by some margin. This is very similar to the margin used in SVMs, and here we want the clusters of each class to be separated by the margin.

The loss implements the following equation:

L = max(d(a, p) - d(a, n) + margin, 0)

forward(self, samples, positive_samples, negative_samples)

Calculate the triplet loss

Parameters
  • samples – the samples

  • positive_samples – the samples that belong to the same group as samples

  • negative_samples – the samples that belong to a different group than samples

Returns

a 1D tensor (N) representing the loss per sample
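
A hedged sketch combining the loss with make_triplet_indices; the embedding tensor is illustrative and the returned indices are assumed to be directly usable for tensor indexing:

    import torch
    import trw.train

    embeddings = torch.randn(16, 64)                  # one 64-d embedding per sample
    targets = torch.randint(0, 4, (16,))
    a, p, n = trw.train.make_triplet_indices(targets)

    loss_fn = trw.train.LossTriplets(margin=1.0)
    per_sample_loss = loss_fn(embeddings[a], embeddings[p], embeddings[n])  # 1D tensor of losses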

class trw.train.LossCenter(number_of_classes, number_of_features, alpha=1.0)

Bases: torch.nn.Module

Center loss, penalize the features falling further from the feature class center.

In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this loss can be used as a new supervision signal. Specifically, the center loss simultaneously learns a center for deep features of each class and penalizes the distances between the deep features and their corresponding class centers.

An implementation of center loss: Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.

Note

This loss must be part of a parent module or explicitly optimized by an optimizer. If not, the centers will not be modified.

forward(self, x, classes)
Parameters
  • x – the features, an arbitrary n-d tensor (N * C * …). Features should ideally be in range [0..1]

  • classes – a 1D integral tensor (N) representing the class of each x

Returns

a 1D tensor (N) representing the loss per sample

class trw.train.LossContrastive(margin=1.0)

Bases: torch.nn.Module

Implementation of the contrastive loss.

L(x0, x1, y) = 0.5 * (1 - y) * d(x0, x1)^2 + 0.5 * y * max(0, m - d(x0, x1))^2

with y = 0 for samples x0 and x1 deemed dissimilar and y = 1 for similar samples. Dissimilar pairs contribute to the loss only if their distance is within the radius m, while the loss minimizes d(x0, x1) over the set of all similar pairs.

See Dimensionality Reduction by Learning an Invariant Mapping, Raia Hadsell, Sumit Chopra, Yann LeCun, 2006.

forward(self, x0, x1, same_target)
Parameters
  • x0 – N-D tensor

  • x1 – N-D tensor

  • same_target – a 1D tensor of 0 or 1 values. 1 means x0 and x1 belong to the same class, while 0 means they are from different classes

Returns

a 1D tensor (N) representing the loss per sample

trw.train.total_variation_norm(x, beta)

Calculate the total variation norm

Parameters
  • x – a tensor with format (samples, components, dn, …, d0)

  • beta – the exponent

Returns

a scalar

class trw.train.LossCrossEntropyCsiMulticlass

Bases: torch.nn.Module

Optimize a metric similar to Critical Success Index (CSI) on the cross-entropy

A loss for heavily unbalanced data (orders of magnitude more negatives than positives). Calculate the cross-entropy and keep only the loss coming from the TP, FP and FN; the loss from TN is simply discarded.

forward(self, outputs, targets, important_class=1)
Parameters
  • outputs – a N x C tensor with N the number of samples and C the number of classes

  • targets – a N integral tensor

  • important_class – the class for which the cross-entropy loss is kept even if the classification is correct

Returns

a N floating tensor representing the loss of each sample

class trw.train.LossBinaryF1(eps=0.0001)

Bases: torch.nn.Module

The macro F1-score is non-differentiable. Instead, use a surrogate that is differentiable and correlates well with the macro F1-score by working on the class probabilities rather than the discrete classification.

For example, if the ground truth is 1 and the model prediction is 0.8, we calculate it as 0.8 true positive and 0.2 false negative.

forward(self, outputs, targets)
trw.train.one_hot(targets: trw.basic_typing.TorchTensorNX, num_classes: int, dtype=torch.float32, device: Optional[torch.device] = None) → trw.basic_typing.TorchTensorNCX

Encode the targets (a tensor of integers representing a class) as a one-hot encoding.

Support target as N-dimensional data (e.g., 3D segmentation map).

Equivalent to torch.nn.functional.one_hot for backward compatibility with pytorch 1.0

Parameters
  • num_classes – the total number of classes

  • targets – a N-dimensional integral tensor (e.g., 1D for classification, 2D for 2D segmentation map…)

  • dtype – the type of the output tensor

  • device – the device of the one-hot encoded tensor. If None, use the target’s device

Returns

a one-hot encoding of an N-dimensional integral tensor
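
Two small examples of the supported shapes (the sizes are illustrative):

    import torch
    import trw.train

    targets = torch.tensor([0, 2, 1])                       # N class IDs
    encoded = trw.train.one_hot(targets, num_classes=3)     # shape N x C

    seg = torch.randint(0, 3, (2, 16, 16))                  # 2D segmentation maps
    encoded_seg = trw.train.one_hot(seg, num_classes=3)     # shape N x C x 16 x 16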

class trw.train.LossMsePacked(reduction: typing_extensions.Literal[mean, none] = 'mean')

Bases: torch.nn.Module

Mean squared error loss with target packed as an integer (e.g., classification)

The packed_target will be one-hot encoded and the mean squared error is computed against the tensor.

forward(self, tensor, packed_target)
Parameters
  • tensor – a NxCx… tensor

  • packed_target – a Nx1x… tensor

trw.train.create_losses_fn(datasets, generic_loss)

Create a dictionary of loss functions for each of the datasets

Parameters
  • datasets – the datasets

  • generic_loss – a loss function

Returns

A dictionary of losses for each of the datasets

trw.train.epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, per_step_schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, force_eval_mode, eval_loop_fn=eval_loop, train_loop_fn=train_loop)
Parameters
  • options

  • datasets

  • optimizers

  • model

  • losses

  • schedulers

  • per_step_schedulers

  • history

  • callbacks_per_batch

  • callbacks_per_batch_loss_terms

  • run_eval

  • force_eval_mode

  • eval_loop_fn

  • train_loop_fn

Returns:

trw.train.eval_loop(options, device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)

Run the eval loop (i.e., the model parameters will NOT be updated)

Note

If callbacks_per_batch or callbacks_per_batch_loss_terms raise StopIteration, the eval loop will be stopped

Parameters
  • device

  • dataset_name

  • split_name

  • split

  • model

  • loss_fn

  • history

  • callbacks_per_batch

  • callbacks_per_batch_loss_terms

Returns

trw.train.train_loop(options, device, dataset_name, split_name, split, optimizer, per_step_scheduler, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, gradient_scaler=None)

Run the train loop (i.e., the model parameters will be updated)

Note

If callbacks_per_batch or callbacks_per_batch_loss_terms raise an exception StopIteration, the train loop will be stopped

Parameters
  • device – the device to be used to optimize the model

  • dataset_name – the name of the dataset

  • split_name – the name of the split

  • split – a dictionary of feature name and values

  • optimizer – an optimizer to optimize the model

  • per_step_scheduler – scheduler to be applied per-batch

  • model – the model to be optimized

  • loss_fn – the loss function

  • history – a list of history step

  • callbacks_per_batch – the callbacks to be performed on each batch. if None, no callbacks to be run

  • callbacks_per_batch_loss_terms – the callbacks to be performed on each loss term. if None, no callbacks to be run

  • gradient_scaler – if mixed precision is enabled, this is the scale to be used for the gradient update

Notes

if optimizer is None, there MUST be a .backward() to free graph and memory.

trw.train.default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True, additional_callbacks=None)

Default callbacks to be performed after the model has been trained

trw.train.default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None, additional_callbacks=None)

Default callbacks to be performed at the end of each epoch

trw.train.default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True, with_reporting_server=True, with_profiler=False, additional_callbacks=None)

Default callbacks to be performed before the fitting of the model

trw.train.default_sum_all_losses(dataset_name, batch, loss_terms)

Default loss is the sum of all loss terms

class trw.train.TrainerV2(callbacks_per_batch=None, callbacks_per_batch_loss_terms=None, callbacks_per_epoch=default_per_epoch_callbacks(), callbacks_pre_training=default_pre_training_callbacks(), callbacks_post_training=default_post_training_callbacks(), trainer_callbacks_per_batch=trainer_callbacks_per_batch, run_epoch_fn=epoch_train_eval, logging_level=logging.DEBUG, skip_eval_epoch_0=True)
static save_model(model, metadata: trw.train.utilities.RunMetadata, path, pickle_module=pickle)

Save a model to file

Parameters
  • model – the model to serialize

  • metadata – an optional result file associated with the model

  • path – the base path to save the model

  • pickle_module – the serialization module that will be used to save the model and results

static load_state(model: torch.nn.Module, path: str, device: torch.device = None, pickle_module: Any = pickle, strict: bool = True) → None

Load the state of a model

Parameters
  • model – where to load the state

  • path – where the model’s state was saved

  • device – where to locate the model

  • pickle_module – how to read the model parameters and metadata

  • strict – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function

static load_model(path: str, model_kwargs: Optional[Dict[Any, Any]] = None, with_result: bool = False, device: torch.device = None, pickle_module: Any = pickle) → Tuple[torch.nn.Module, trw.train.utilities.RunMetadata]

Load a previously saved model

Construct a model from the RunMetadata.class_name class and with arguments model_kwargs

Parameters
  • path – where the model is stored. The results will be loaded from path + ‘.result’

  • model_kwargs – arguments used to instantiate the model stored in RunMetadata.class_name

  • with_result – if True, the results of the model will be loaded

  • device – where to load the model. For example, models are typically trained on GPU, but for deployment, CPU might be good enough. If None, use the same device as when the model was exported

  • pickle_module – the de-serialization module to be used to load model and results

Returns

a tuple model, metadata

fit(self, options, datasets, model: torch.nn.Module, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, log_path=None, with_final_evaluation=True, history=None, erase_logging_folder=True, eval_every_X_epoch=1) → trw.train.utilities.RunMetadata

Fit the model

Parameters
  • options

  • datasets

    a functor returning a dictionary of datasets. Alternatively, datasets infos can be specified. The functor must return one of:

    • datasets: dictionary of dataset

    • (datasets, datasets_infos): dictionary of dataset and additional infos

    We define:

    • datasets: a dictionary of dataset. a dataset is a dictionary of splits. a split is a dictionary of batched features.

    • Datasets infos are additional infos useful for the debugging of the dataset (e.g., class mappings, sample UIDs). Datasets infos are typically much smaller than the datasets and should be loadable in memory

  • model – a Module or a ModuleDict

  • optimizers_fn

  • losses_fn

  • loss_creator

  • log_path – the path of the logs to be exported during the training of the model. if the log_path is not an absolute path, the options.workflow_options.logging_directory is used as root

  • with_final_evaluation

  • history

  • erase_logging_folder – if True, the logging will be erased when fitting starts

  • eval_every_X_epoch – evaluate the model every X epochs

Returns:
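
A heavily hedged end-to-end sketch; make_datasets is a hypothetical functor returning a dictionary of datasets, Net is the toy model sketched earlier, and passing an Optimizer instance as optimizers_fn relies on its documented __call__(datasets, model) interface:

    import trw.train

    options = trw.train.Options(num_epochs=50)
    trainer = trw.train.TrainerV2()
    results = trainer.fit(
        options,
        datasets=make_datasets(),   # hypothetical: {'mnist': {'train': ..., 'valid': ...}}
        model=Net(),
        optimizers_fn=trw.train.OptimizerAdam(learning_rate=1e-3),
        log_path='my_experiment')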

trw.train.create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, nesterov=False, scheduler_fn=None, per_step_scheduler_fn=None)

Create a Stochastic gradient descent optimizer for each of the datasets with an optional scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • scheduler_fn – a scheduler, or None

  • momentum – the momentum of the SGD

  • weight_decay – the weight decay

  • nesterov – enables Nesterov momentum

  • per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)

Returns

An optimizer

trw.train.create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9, nesterov=False)

Create a Stochastic gradient descent optimizer for each of the datasets with a step learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • step_size – the number of epochs composing a step. Each step, the learning rate will be multiplied by gamma

  • gamma – the factor to apply to the learning rate every step

  • weight_decay – the weight decay

  • nesterov – enables Nesterov momentum

  • momentum – the momentum of the SGD

Returns

An optimizer with a step scheduler

trw.train.create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)

Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma

Parameters
  • optimizer – the optimizer

  • step_size – the number of epochs composing one step. Each step, the learning rate will be decreased

  • gamma – apply this factor to the learning rate every time it is adjusted

Returns

a learning rate scheduler

trw.train.create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, scheduler_fn=None, per_step_scheduler_fn=None)

Create an ADAM optimizer for each of the datasets with an optional scheduler

Parameters
  • datasets – a dictionary of datasets

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • weight_decay – the weight decay

  • scheduler_fn – a scheduler, or None

  • betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • eps – term to add to denominator to avoid division by zero

  • per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)

Returns

An optimizer

trw.train.create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, betas=(0.9, 0.999))

Create an ADAM optimizer for each of the datasets with a step learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • learning_rate – the initial learning rate

  • step_size – the number of epochs composing a step. Each step, the learning rate will be multiplied by gamma

  • gamma – the factor to apply to the learning rate every step

  • weight_decay – the weight decay

  • betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

Returns

An optimizer with a step scheduler

trw.train.create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None, per_step_scheduler_fn=None)

Create an optimizer and scheduler

Note

if model is an instance of ModuleDict, then the optimizer will only consider the parameters model[dataset_name].parameters(); otherwise, model.parameters()

Parameters
  • datasets – a dictionary of dataset

  • model – the model. Should be a Module or a ModuleDict

  • optimizer_fn – the functor to instantiate the optimizer

  • scheduler_fn – the functor to instantiate the scheduler to be run by epoch. May be None, in that case there will be no schedule

  • per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)

trw.train.create_sgd_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3, nesterov=False)

Create a Stochastic gradient descent optimizer for each of the datasets with a one-cycle learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • max_learning_rate – the maximum learning rate

  • epochs – The number of epochs to train for

  • steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only

  • learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = max_learning_rate / learning_rate_start_div_factor

  • learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor

  • percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate

  • additional_scheduler_kwargs – additional arguments provided to the scheduler

  • weight_decay – the weight decay

  • nesterov – enables Nesterov momentum

  • momentum – the momentum of the SGD

Returns

An optimizer with a step scheduler

trw.train.create_adam_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3)

Create an ADAM optimizer for each of the datasets with a one-cycle learning rate scheduler

Parameters
  • datasets – a dictionary of dataset

  • model – a model to optimize

  • max_learning_rate – the maximum learning rate

  • epochs – The number of epochs to train for

  • steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only

  • learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = max_learning_rate / learning_rate_start_div_factor

  • learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor

  • percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate

  • additional_scheduler_kwargs – additional arguments provided to the scheduler

  • weight_decay – the weight decay

  • betas – the betas of the ADAM optimizer

  • eps – the eps of the ADAM optimizer

Returns

An optimizer with a step scheduler

class trw.train.ClippingGradientNorm(optimizer_base: torch.optim.Optimizer, max_norm: float = 1.0, norm_type: float = 2.0)

Bases: torch.optim.Optimizer

Clips the gradient norm during optimization

step(self, closure=None)

Performs a single optimization step (parameter update).

Parameters

closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.

class trw.train.Optimizer(optimizer_fn: Callable[[Iterator[torch.nn.parameter.Parameter]], torch.optim.Optimizer], scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]] = None, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]] = None)
set_scheduler_fn(self, scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]])
set_step_scheduler_fn(self, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]])
__call__(self, datasets: trw.basic_typing.Datasets, model: torch.nn.Module) → Tuple[Dict[str, torch.optim.Optimizer], Optional[Dict[str, SchedulerType]], Optional[Dict[str, StepSchedulerType]]]
scheduler_step_lr(self, step_size: int, gamma: float = 0.1) → Optimizer

Apply a scheduler on the learning rate.

Decays the learning rate of each parameter group by gamma every step_size epochs.

scheduler_cosine_annealing_warm_restart(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1) → Optimizer

Apply a scheduler on the learning rate.

Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.

References

https://arxiv.org/pdf/1608.03983v5.pdf

scheduler_cosine_annealing_warm_restart_decayed(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1, decay_factor=0.7) → Optimizer

Apply a scheduler on the learning rate. Each time the learning rate is restarted, the base learning rate is decayed

Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.

References

https://arxiv.org/pdf/1608.03983v5.pdf

scheduler_one_cycle(self, max_learning_rate: float, epochs: int, steps_per_epoch: int, learning_rate_start_div_factor: float = 25.0, learning_rate_end_div_factor: float = 10000.0, percentage_cycle_increase: float = 0.3, anneal_strategy: str = 'cos', cycle_momentum: bool = True, base_momentum: float = 0.85, max_momentum: float = 0.95)

This scheduler should not be used with another scheduler!

The learning rate or momentum provided by the Optimizer will be overridden by this scheduler.

clip_gradient_norm(self, max_norm: float = 1.0, norm_type: float = 2.0)

Clips the gradient norm during optimization

Parameters
  • max_norm – the maximum norm of the concatenated gradients of the optimizer. Note: the gradient is modulated by the learning rate

  • norm_type – type of the used p-norm. Can be 'inf' for infinity norm

See:

torch.nn.utils.clip_grad_norm_()

class trw.train.OptimizerAdam(learning_rate: float, weight_decay: float = 0, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)

Bases: Optimizer

class trw.train.OptimizerSGD(learning_rate: float, momentum: float = 0.9, weight_decay: float = 0, nesterov: bool = False)

Bases: Optimizer

class trw.train.OptimizerAdamW(learning_rate: float, weight_decay: float = 0.01, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)

Bases: Optimizer
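
A sketch of the fluent configuration these classes allow; the chaining relies on scheduler_step_lr returning the Optimizer, as documented above, and the hyper-parameter values are illustrative:

    import trw.train

    # ADAM with weight decay, decaying the learning rate by 10x every 30 epochs
    optimizers_fn = trw.train.OptimizerAdam(
        learning_rate=1e-3, weight_decay=1e-5).scheduler_step_lr(step_size=30, gamma=0.1)

    # the resulting object is typically passed as `optimizers_fn` to TrainerV2.fit (see above)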

trw.train.plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) → None

Plot groups of histories

Parameters
  • root – the directory where the plot will be exported

  • history_values – a map of list of list of (epoch, value)

  • title – the title of the graph

  • xlabel – the x label

  • ylabel – the y label

  • max_nb_plots_per_group – the maximum number of plots per group

  • colors – the colors to be used

trw.train.confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) → None

Plot the confusion matrix of a predicted class versus the true class

Parameters
  • export_path – the folder where the confusion matrix will be exported

  • classes_predictions – the classes that were predicted by the classifier

  • classes_trues – the true classes

  • classes – a list of labels. Label 0 for class 0, label 1 for class 1…

  • normalize – if True, the confusion matrix will be normalized to 1.0 per row

  • title – the title of the plot

  • cmap – the color map to use

  • display_numbers – if True, display the numbers within each cell of the confusion matrix

  • maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues

  • rotate_x – if not None, indicates the rotation of the label on x axis

  • rotate_y – if not None, indicates the rotation of the label on y axis

  • display_names_x – if True, the class name, if specified, will also be displayed on the x axis

  • sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show whether the errors may be due to a low number of samples

  • excludes_classes_with_samples_less_than – if not None, the classes with fewer than excludes_classes_with_samples_less_than samples will be excluded

  • normalize_unit_percentage – if True, use 100% as the base unit instead of 1.0

  • main_font_size – the font size of the text

  • sub_font_size – the font size of the sub-elements (e.g., ticks)

  • max_size_x_label – the maximum length of a label on the x-axis
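
An illustrative call; the export path and class names are placeholders:

    import numpy as np
    import trw.train

    y_pred = np.asarray([0, 1, 1, 2, 2, 2])
    y_true = np.asarray([0, 1, 2, 2, 2, 1])
    trw.train.confusion_matrix(
        export_path='confusion_matrix',     # placeholder export location
        classes_predictions=y_pred,
        classes_trues=y_true,
        classes=['cat', 'dog', 'bird'],
        normalize=True)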

trw.train.classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)

Summarizes the important statistics for a classification problem

Parameters
  • predictions – the classes predicted

  • prediction_scores – the scores for each class, for each sample

  • trues – the true class for each sample

  • class_mapping – the class mapping (class id, class name)

Returns

a dictionary of statistics or sub-reports

trw.train.list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

Parameters
  • mappinginv – a dictionary like structure encoded as (class id, class_name)

  • default_name – if there is no class name, use this as default

Returns

a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None

trw.train.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

Supports multiple ROC curves.

Parameters
  • export_path – the folder where the plot will be exported

  • trues – the expected class. Can be a list for multiple ROC curves

  • found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves

  • title – the title of the ROC

  • label_name – the name of the ROC curve. Can be a list for multiple ROC curves

  • colors – if None use default colors. Else, a numpy array of dim (Nx3) where N is the number of colors. Must be in [0..1] range

trw.train.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?

Parameters
  • export_path – where to export the figure

  • features_trials – a dictionary of list. Each list representing a feature

  • title – the title of the plot

  • ylabel – the label for axis y

  • xlabel – the label for axis x

  • meanline – if True, draw a line from the center of the plot for each history name to the next

  • maximum_chars_per_line – the maximum number of characters allowed per line of the title. If exceeded, a newline will be inserted.

  • plot_trials – if True, each trial of a feature will be plotted

  • scale – the axis scale to be used

  • y_range – if not None, the (min, max) of the y-axis

  • rotate_x – if not None, the rotation of the x axis labels in degree

  • showfliers – if True, plot the outliers

  • title_line_height – the height of the title lines

trw.train.export_figure(path, name, maximum_length=259, dpi=None)

Export a figure

Parameters
  • path – the folder where to export the figure

  • name – the name of the figure.

  • maximum_length – the maximum length of the full path of a figure. If the full path is longer than maximum_length, the name will be shortened to the maximum allowed length

  • dpi – Dots Per Inch: the density of the figure

trw.train.auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) float

Calculate the area under the curve of the ROC plot (AUROC)

Parameters
  • trues – the expected class

  • found_1_scores – the score found for the class 1. Must be a numpy array of floats

Returns

the AUROC
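
A small self-contained example (toy values chosen for illustration):

import numpy as np
import trw.train

trues = np.asarray([0, 0, 1, 1])
scores_class_1 = np.asarray([0.1, 0.4, 0.35, 0.8], dtype=np.float32)

value = trw.train.auroc(trues, scores_class_1)
# 3 of the 4 (positive, negative) pairs are ranked correctly, so the AUROC is 0.75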

trw.train.find_tensor_leaves_with_grad(tensor: torch.Tensor) Sequence[torch.Tensor]

Find the input leaves of a tensor.

Input leaves must have requires_grad=True, otherwise they will not be found.

Parameters

tensor – a torch.Tensor

Returns

a list of torch.Tensor with attribute requires_grad=True that are inputs of tensor
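
An illustrative sketch of the expected behaviour:

import torch
import trw.train

x = torch.randn(4, 3, requires_grad=True)
w = torch.randn(3, 2, requires_grad=True)
y = (x @ w).sum()

leaves = trw.train.find_tensor_leaves_with_grad(y)
# expected to contain x and w, since both are leaves with requires_grad=True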

trw.train.find_last_forward_convolution(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping]

Perform a forward pass of the model with given inputs and retrieve the last convolutional layer

Parameters
  • inputs – the input of the model so that we can call model(inputs)

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index (int) – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
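
An illustrative sketch using a toy model; the dictionary keys follow the return value documented above:

import torch
import torch.nn as nn
import trw.train

model = nn.Sequential(nn.Conv2d(1, 8, kernel_size=3), nn.ReLU(), nn.Conv2d(8, 4, kernel_size=3))
inputs = torch.randn(2, 1, 16, 16)

found = trw.train.find_last_forward_convolution(model, inputs)
if found is not None:
    last_conv = found['matched_module']            # expected: the second Conv2d
    activation = found['matched_module_output']    # feature maps captured during the forward pass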

trw.train.find_last_forward_types(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]], relative_index: int = 0) Optional[Mapping]

Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type

Parameters
  • inputs – the input of the model so that we can call model(inputs)

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found

trw.train.find_first_forward_convolution(model: torch.nn.Module, inputs: Any = None, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping]

Retrieve the first convolutional layer of the model (inputs is not used).

Parameters
  • inputs – NOT USED

  • model – the model

  • types – the types to be captured. Can be a single type or a tuple of types

  • relative_index (int) – indicate which module to return from the last collected module

Returns

None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found

class trw.train.GradCam(model: torch.nn.Module, find_convolution: Callable[[torch.nn.Module, Union[trw.basic_typing.Batch, torch.Tensor]], Optional[Mapping]] = graph_reflection.find_last_forward_convolution, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)

Gradient-weighted Class Activation Mapping

This is based on the paper “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”, Ramprasaath R et al.

__call__(self, inputs: Union[trw.basic_typing.Batch, torch.Tensor], target_class_name: str = None, target_class: int = None) Optional[Tuple[str, Mapping]]
Parameters
  • inputs – the inputs to be fed to the model

  • target_class_name

    the output node to be used. If None:

    • if the model output is a single tensor, use this as the target output

    • else it will use the first OutputClassification output

  • target_class – the index of the class to explain the decision. If None, the class output will be used

Returns

a tuple (output name, a dictionary (input_name, GradCAMs))

class trw.train.GuidedBackprop(model: torch.nn.Module, unguided_gradient: bool = False, post_process_output: Callable[[Any], torch.Tensor] = post_process_output_id)

Produces gradients generated with guided back propagation from the given image

update_relus(self) None
Updates the ReLU activation functions so that:

1. the output is stored during the forward pass

2. zero is imputed for gradient values that are less than zero

static get_floating_inputs_with_gradients(inputs)

Extract inputs that have a gradient

Parameters

inputs – a tensor or dictionary of tensors

Returns

a list of tuples (name, input) for the inputs that have a gradient

__call__(self, inputs: Tuple[torch.Tensor, trw.basic_typing.Batch], target_class: int, target_class_name: str) Optional[Tuple[str, Mapping]]

Generate the guided back-propagation gradient

Parameters
  • inputs – a tensor or dictionary of tensors

  • target_class – the target class to be explained

  • target_class_name – the name of the output class if multiple outputs

Returns

a tuple (output_name, dictionary (input, gradient))

static get_positive_negative_saliency(gradient: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]

Generates positive and negative saliency maps based on the gradient

Parameters

gradient (torch.Tensor) – gradient of the operation to visualize

Returns

a tuple (pos_saliency, neg_saliency)

trw.train.post_process_output_for_gradient_attribution(output: trw.train.outputs_trw.Output)

Post-process the output to be suitable for gradient attribution.

In particular, if we have a trw.train.OutputClassification, we need to apply a softmax operation so that we can backpropagate the loss of a particular class with the appropriate value (1.0).

Parameters

output – a trw.train.OutputClassification

Returns

a torch.Tensor

class trw.train.IntegratedGradients(model: torch.nn.Module, steps: int = 100, baseline_inputs: Any = None, use_output_as_target: bool = False, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)
Implementation of Integrated Gradients, a method of attributing the prediction of a deep network to its input features.

This implements the paper “Axiomatic Attribution for Deep Networks”, Mukund Sundararajan, Ankur Taly, Qiqi Yan, as described in https://arxiv.org/abs/1703.01365

__call__(self, inputs: Any, target_class_name: str, target_class: Optional[int] = None) Optional[Tuple[str, Mapping]]

Generate the integrated gradients of the inputs

Parameters
  • inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad=True

  • target_class – the index of the class to explain the decision. If None, the class output will be used

  • target_class_name

    the output node to be used. If None:

    • if the model output is a single tensor, use this as the target output

    • else it will use the first OutputClassification output

Returns

a tuple (output_name, dictionary (input, integrated gradient))

trw.train.default_collate_fn(batch: Union[Sequence[Any], Mapping[str, Any]], device: torch.device, pin_memory: bool = False, non_blocking: bool = False)
Parameters
  • batch – a dictionary of features or a list of dictionary of features

  • device – the device where to create the torch.Tensor

  • pin_memory – if True, pin the memory. Required for asynchronous (non-blocking) transfers to a CUDA device

  • non_blocking – if True, use non blocking memory transfer

Returns

a dictionary of torch.Tensor
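
A minimal sketch (illustrative only), collating a list of per-sample dictionaries on the CPU:

import numpy as np
import torch
import trw.train

batch = [
    {'x': np.zeros(3, dtype=np.float32), 'label': 0},
    {'x': np.ones(3, dtype=np.float32), 'label': 1},
]

collated = trw.train.default_collate_fn(batch, device=torch.device('cpu'))
# expected: a dictionary such as {'x': tensor of shape [2, 3], 'label': tensor([0, 1])}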

class trw.train.Sequence(source_split)

A Sequence defines how to iterate the data as a sequence of small batches of data.

To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the forward pass of our model for all the data at once is memory hungry; instead, we calculate the forward and backward passes on a small chunk of data. This is the interface for batching a dataset.

Examples:

data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
    pass
abstract __iter__(self)
Returns

An iterator of batches

collate(self, collate_fn=default_collate_fn, device=None)

Aggregate the input batch as a dictionary of torch.Tensor and move the data to the appropriate device

Parameters
  • collate_fn – the function to collate the input batch

  • device – the device where to send the samples. If None, the default device is CPU

Returns

a collated sequence of batches

map(self, function_to_run, nb_workers=0, max_jobs_at_once=None, queue_timeout=default_queue_timeout, collate_fn=None, max_queue_size_pin=None)

Transform a sequence using a given function.

Note

The map may create more samples than the original sequence.

Parameters
  • function_to_run – the mapping function

  • nb_workers – the number of workers that will process the split. If 0, no workers will be created.

  • max_jobs_at_once – the maximum number of results that can be pushed in the result queue at once. If 0, no limit. If None, it will be set equal to the number of workers

  • queue_timeout – the timeout used to pull results from the output queue

  • collate_fn – a function to collate each batch of data

  • max_queue_size_pin – defines the maximum number of batches prefetched. If None, defaults to a size based on the number of workers. This only controls the final queue size of the pin thread (the workers queue can be set independently)

Returns

a sequence of batches
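
A minimal pipeline sketch (illustrative only; the transform shown is hypothetical):

import numpy as np
import trw.train

def double_data(batch):
    # hypothetical per-batch transform; must return a batch (or a list of batches)
    batch['data'] = batch['data'] * 2
    return batch

sequence = trw.train.SequenceArray({'data': np.arange(100)}) \
    .map(double_data, nb_workers=0) \
    .batch(10)

for batch in sequence:
    pass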

batch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)

Group several batches of samples into a single batch

Parameters
  • batch_size – the number of samples of the batch

  • discard_batch_not_full – if True, discard batches that are not full

  • collate_fn – a function to collate the batches. If None, no collation performed

Returns

a sequence of batches

sub_batch(self, batch_size, discard_batch_not_full=False)

This sequence will split batches in smaller batches if the underlying sequence batch is too large.

This sequence can be useful to manage very large tensors. Indeed, this class avoids concatenating tensors (as opposed to trw.train.SequenceReBatch), since this operation can be costly: the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.

Parameters
  • batch_size – the maximum size of a batch

  • discard_batch_not_full – if True, batches that do not have batch_size samples will be discarded

rebatch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)

Normalize a sequence to identical batch size given an input sequence with varying batch size

Parameters
  • batch_size – the size of the batches created by this sequence

  • discard_batch_not_full – if True, the last batch will be discarded if not full

  • collate_fn – function to merge multiple batches

max_samples(self, max_samples)
Virtual resize of the sequence: the sequence will terminate once a certain number of samples has been produced, and will restart where it was stopped.

Parameters

max_samples – the number of samples this sequence will produce before stopping

async_reservoir(self, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=sampler.SamplerSequential(), collate_fn=remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)
Parameters
  • max_reservoir_samples – the maximum number of samples of the reservoir

  • function_to_run – the function to run asynchronously

  • min_reservoir_samples – the minimum of samples of the reservoir needed before an output sequence can be created

  • nb_workers – the number of workers that will process function_to_run to fill the reservoir. Must be >= 1

  • max_jobs_at_once – the maximum number of jobs that can be started and stored by epoch by the workers. If 0, no limit. If None: set to the number of workers

  • reservoir_sampler – a sampler that will be used to sample the reservoir or None for sequential sampling of the reservoir

  • collate_fn – a function to post-process the samples into a single batch, or None if not to be collated

  • maximum_number_of_samples_per_epoch – the maximum number of samples that will be generated per epoch. If we reach this maximum, the sequence will be interrupted

  • max_reservoir_replacement_size – Specify the maximum number of samples replaced in the reservoir by epoch. If None, we will use the whole result queue. This can be useful to control explicitly how the reservoir is updated and depend less on the speed of hardware. Note that to have an effect, max_jobs_at_once should be greater than max_reservoir_replacement_size.

fill_queue(self)

Fill the job queue of the current sequence

fill_queue_all_sequences(self)

Go through all the sequences and fill their input queue

has_background_jobs(self)
Returns

True if this sequence has a background job to create the next element

has_background_jobs_previous_sequences(self)
Returns

the number of sequences that have background jobs currently running to create the next element

abstract subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

abstract subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

abstract close(self)
class trw.train.SequenceMap(source_split, nb_workers, function_to_run, max_jobs_at_once=None, queue_timeout=default_queue_timeout, debug_job_report_timeout=30.0, collate_fn=None, max_queue_size_pin=None)

Bases: trw.train.sequence.Sequence

A Sequence defines how to iterate the data as a sequence of small batches of data.

To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the forward pass of our model for all the data at once is memory hungry; instead, we calculate the forward and backward passes on a small chunk of data. This is the interface for batching a dataset.

Examples:

data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
    pass
subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

fill_queue(self)

Fill the job queue of the current sequence

initializer(self)

Initialize the sequence to iterate through batches

__next_local(self, next_fn)

Get the next elements

Handles a single item or a list of items returned by next_fn

Parameters

next_fn – a function returning the next element(s)

__next__(self)
has_background_jobs(self)
Returns

True if this sequence has a background job to create the next element

next_item(self, blocking)
__iter__(self)
Returns

An iterator of batches

close(self)

Finish and join the existing pool processes

class trw.train.SequenceArray(split, sampler=sampler_trw.SamplerRandom(), transforms=None, use_advanced_indexing=True, sample_uid_name=sample_uid_name)

Bases: trw.train.sequence.Sequence

Create a sequence of batches from numpy arrays, lists and torch.Tensor

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__iter__(self)
Returns

An iterator of batches

close(self)
class trw.train.SequenceBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

Group several batches into a single batch

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Special method to close and clean the resources of the sequence

class trw.train.SequenceAsyncReservoir(source_split, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=None, collate_fn=sequence.remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)

Bases: trw.train.sequence.Sequence

This sequence will asynchronously process data and keep a reserve of loaded samples

The idea is to have long loading processes work in the background while still using as efficiently as possible the data that is currently loaded. The data is slowly being replaced by freshly loaded data over time.

Jobs are started and results retrieved at the beginning of each epoch

This sequence can be interrupted (e.g., after a certain number of batches have been returned). When the sequence is restarted, the reservoir will not be emptied.

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

reservoir_size(self)
Returns

The current number of samples in the reservoir

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

initializer(self)
fill_queue(self)

Fill the input queue of jobs to be completed

_retrieve_results_and_fill_queue(self)

Retrieve results from the output queue

_wait_for_job_completion(self)

Block the processing until we have enough result in the reservoir

__iter__(self)
Returns

An iterator of batches

close(self)

Finish and join the existing pool processes

class trw.train.SequenceAdaptorTorch(torch_dataloader, features=None)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface

The main purpose is to enable compatibility with the torch data loader and any existing third party code.
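
A minimal adaptation sketch (illustrative only; using features to name the tensors returned by the loader is an assumption):

import torch
import torch.utils.data
import trw.train

dataset = torch.utils.data.TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
loader = torch.utils.data.DataLoader(dataset, batch_size=10)

sequence = trw.train.SequenceAdaptorTorch(loader, features=['x', 'label'])
for batch in sequence:
    pass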

__len__(self)
__iter__(self)
Returns

An iterator of batches

__next__(self)
Returns

The next batch of data

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

close(self)

Special method to close and clean the resources of the sequence

class trw.train.SequenceCollate(source_split, collate_fn=collate.default_collate_fn, device=None)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

Group the data into a sequence of dictionary of torch.Tensor

This can be useful to combine batches of dictionaries into a single batch with all features concatenated on axis 0. Often used in conjunction with trw.train.SequenceAsyncReservoir and trw.train.SequenceMap.

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Special method to close and clean the resources of the sequence

class trw.train.SequenceReBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

This sequence will normalize the batch size of an underlying sequence

If the underlying sequence batch is too large, it will be split into multiple batches. Conversely, if the batch is too small, several batches will be merged until we reach the expected batch size.

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Special method to close and clean the resources of the sequence

class trw.train.SequenceSubBatch(source_split, batch_size, discard_batch_not_full=False)

Bases: trw.train.sequence.Sequence, trw.train.sequence.SequenceIterator

This sequence will split batches in smaller batches if the underlying sequence batch is too large.

This sequence can be useful to manage very large tensors. Indeed, this class avoids concatenating tensors (as opposed to trw.train.SequenceReBatch), since this operation can be costly: the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.

subsample(self, nb_samples)

Sub-sample a sequence to a fixed number of samples.

The purpose is to obtain a smaller sequence; this is particularly useful for exporting augmentations or samples.

Parameters

nb_samples – the number of samples desired in the original sequence

Returns

a subsampled Sequence

subsample_uids(self, uids, uids_name, new_sampler=None)

Sub-sample a sequence to samples with specified UIDs.

Parameters
  • uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering

  • uids_name (str) – the name of the UIDs

  • new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler

Returns

a subsampled Sequence

__next__(self)
Returns

The next batch of data

__iter__(self)
Returns

An iterator of batches

close(self)

Special method to close and clean the resources of the sequence

class trw.train.Metric

Bases: abc.ABC

A metric base class

Calculate interesting metric

abstract __call__(self, outputs: Dict) Optional[Dict]
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

abstract aggregate_metrics(self, metric_by_batch: List[Dict]) Dict[str, float]

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.MetricClassificationError

Bases: Metric

Calculate the 1 - accuracy using the output_truth and output

__call__(self, outputs)
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

aggregate_metrics(self, metric_by_batch)

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.MetricClassificationBinarySensitivitySpecificity

Bases: Metric

Calculate the sensitivity and specificity for a binary classification using the output_truth and output

__call__(self, outputs)
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

aggregate_metrics(self, metric_by_batch)

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.MetricLoss

Bases: Metric

Extract the loss from the outputs

__call__(self, outputs)
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

aggregate_metrics(self, metric_by_batch)

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.MetricClassificationBinaryAUC

Bases: Metric

Calculate the Area under the Receiver operating characteristic (ROC) curve.

For this, the output needs to provide an output_raw of shape [N, 2] (i.e., binary classification framed as a multi-class classification) or of shape [N, 1] (binary classification)

__call__(self, outputs)
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

aggregate_metrics(self, metric_by_batch)

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.MetricClassificationF1(average=None)

Bases: Metric

Calculate the F1-score of a classification output

__call__(self, outputs)
Parameters

outputs – the outputs of a batch

Returns

a dictionary of metric names/values or None

aggregate_metrics(self, metric_by_batch)

Aggregate all the metrics into a consolidated metric.

Parameters

metric_by_batch – a list of metrics, one for each batch

Returns

a dictionary of result name and value

class trw.train.SamplerRandom(replacement=False, nb_samples_to_generate=None, batch_size=1)

Bases: Sampler

Samples elements randomly. If without replacement, then sample from a shuffled dataset. If with replacement, then the user can specify nb_samples_to_generate to draw.

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns the indices of the original data source

__next__(self)
class trw.train.SamplerSequential(batch_size=1)

Bases: Sampler

Samples elements sequentially, always in the same order.

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns the indices of the original data source

class trw.train.SamplerSubsetRandom(indices)

Bases: Sampler

Samples elements randomly from a given list of indices, without replacement.

Parameters

indices (sequence) – a sequence of indices

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns the indices of the original data source

class trw.train.SamplerClassResampling(class_name, nb_samples_to_generate, reuse_class_frequencies_across_epochs=True, batch_size=1)

Bases: Sampler

Resample the samples so that class_name classes have equal probability of being sampled.

Classification problems rarely have balanced classes, so it is often required to super-sample the minority class to avoid penalizing the under-represented classes and to help the classifier learn good features (as opposed to learning the class distribution).
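
A minimal re-balancing sketch with trw.train.SequenceArray (illustrative only; feature names are hypothetical):

import numpy as np
import trw.train

split = {
    'data': np.random.randn(100, 3).astype(np.float32),
    'label': np.concatenate([np.zeros(90, dtype=np.int64), np.ones(10, dtype=np.int64)]),
}

sampler = trw.train.SamplerClassResampling(class_name='label', nb_samples_to_generate=100)
sequence = trw.train.SequenceArray(split, sampler=sampler).batch(20)
# each epoch is expected to draw classes 0 and 1 with roughly equal frequency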

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

_fit(self, classes)
__next__(self)
__iter__(self)

Returns: an iterator that returns the indices of the original data source

class trw.train.Sampler

Bases: object

Base class for all Samplers.

Every Sampler subclass has to provide an __iter__ method, providing a way to iterate over indices of dataset elements, and a __len__ method that returns the length of the returned iterators.

abstract initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

abstract __iter__(self)

Returns: an iterator that returns the indices of the original data source

class trw.train.SamplerSubsetRandomByListInterleaved(indices: Sequence[Sequence[int]])

Bases: Sampler

Elements from a given list of lists of indices are randomly drawn without replacement, one element per list at a time.

For sequences with different sizes, the longest of the sequences will be trimmed to the size of the shortest sequence.

This can be used for example to resample without replacement imbalanced classes in a classification task.

Examples:

>>> l1 = np.asarray([1, 2])
>>> l2 = np.asarray([3, 4, 5])
>>> sampler = trw.train.SamplerSubsetRandomByListInterleaved([l1, l2])
>>> sampler.initializer(None)
>>> indices = [i for i in sampler]
# indices could be [1, 5, 2, 4]
Parameters

indices – a sequence of sequences of indices

initializer(self, data_source)

Initialize the sequence iteration

Parameters

data_source – the data source to iterate

__iter__(self)

Returns: an iterator that returns the indices of the original data source

class trw.train.FilterFixed(kernel: torch.Tensor, groups: int = 1, padding: int = 0)

Bases: torch.nn.Module

Apply a fixed filter to n-dimensional images

__call__(self, value: trw.basic_typing.TorchTensorNCX) trw.basic_typing.TorchTensorNCX
class trw.train.FilterGaussian(input_channels: int, nb_dims: int, sigma: Union[float, Sequence[float]], kernel_sizes: Optional[Union[int, Sequence[int]]] = None, padding: typing_extensions.Literal[same, none] = 'same', device: Optional[torch.device] = None)

Bases: FilterFixed

Implement a gaussian filter as a torch.nn.Module
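
A minimal smoothing sketch based on the signature above (illustrative only):

import torch
import trw.train

images = torch.randn(2, 3, 64, 64)   # NCX layout: batch, channels, then spatial dimensions
gaussian = trw.train.FilterGaussian(input_channels=3, nb_dims=2, sigma=1.5)
smoothed = gaussian(images)
# with padding='same' (the default), smoothed is expected to keep the input shape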

class trw.train.MeaningfulPerturbation(model, iterations=150, l1_coeff=0.1, tv_coeff=0.2, tv_beta=3, noise=0.2, model_output_postprocessing=functools.partial(F.softmax, dim=1), mask_reduction_factor=8, optimizer_fn=default_optimizer, information_removal_fn=default_information_removal_smoothing, export_fn=None)

Implementation of “Interpretable Explanations of Black Boxes by Meaningful Perturbation”, arXiv:1704.03296

Handle only 2D and 3D inputs. Other inputs will be discarded.

Deviations: a globally smoothed image is used to speed up the processing.

__call__(self, inputs, target_class_name, target_class=None)
Parameters
  • inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad=True

  • target_class – the index of the class to explain the decision. If None, the class output will be used

  • target_class_name

    the output node to be used. If None:

    • if the model output is a single tensor, use this as the target output

    • else it will use the first OutputClassification output

Returns

a tuple (output_name, dictionary (input, explanation mask))

static _get_output(target_class_name, outputs, postprocessing)
trw.train.default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23, explanation_for=None)

Default information removal (smoothing).

Parameters
  • image – an image

  • blurring_sigma – the sigma of the blurring kernel used to “remove” information from the image

  • blurring_kernel_size – the size of the kernel to be used. This is an internal parameter to approximate the gaussian kernel. It is exposed since, in the 3D case, the memory consumption may be high and a faithful gaussian blur is not crucial.

  • explanation_for – the class to explain

Returns

a smoothed image

class trw.train.DataParallelExtended(*arg, **argv)

Bases: torch.nn.DataParallel

Customized version of torch.nn.DataParallel to support models with complex outputs such as trw.train.Output

gather(self, outputs, output_device)
trw.train.grid_sample(input: torch.Tensor, grid: torch.Tensor, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = None) torch.Tensor

Compatibility layer for the argument change between pytorch <= 1.2 and pytorch >= 1.3

See torch.nn.functional.grid_sample()