trw.train¶
Submodules¶
trw.train.analysis_plots
trw.train.collate
trw.train.compatibility
trw.train.data_parallel_extented
trw.train.filter_gaussian
trw.train.grad_cam
trw.train.graph_reflection
trw.train.guided_back_propagation
trw.train.integrated_gradients
trw.train.job_executor2
trw.train.losses
trw.train.meaningful_perturbation
trw.train.metrics
trw.train.optimizer_clipping
trw.train.optimizers
trw.train.optimizers_v2
trw.train.options
trw.train.outputs_trw
trw.train.sample_export
trw.train.sampler
trw.train.sequence
trw.train.sequence_adaptor
trw.train.sequence_array
trw.train.sequence_async_reservoir
trw.train.sequence_batch
trw.train.sequence_collate
trw.train.sequence_map
trw.train.sequence_max_samples
trw.train.sequence_rebatch
trw.train.sequence_sub_batch
trw.train.trainer
trw.train.trainer_v2
trw.train.utilities
Package Contents¶
Classes¶
- Create default options for the training and evaluation process.
- Context manager that automatically tracks added hooks on the model and removes them when the context is released.
- This is a tag name to find the output reference back from outputs
- Classification output
- Regression output
- Represent an embedding
- This is a tag name to find the output reference back from outputs
- Represent a given loss as an output.
- Classification output
- Output for binary segmentation.
- Classification output for binary classification
- Implementation of the soft Dice Loss (multi-class) for N-d images
- This criterion is an implementation of Focal Loss, which is proposed in Focal Loss for Dense Object Detection.
- Implement a triplet loss
- Center loss, penalize the features falling further from the feature class center.
- Implementation of the contrastive loss.
- Optimize a metric similar to the Critical Success Index (CSI) on the cross-entropy.
- The macro F1-score is non-differentiable. Instead use a surrogate that is differentiable.
- Mean squared error loss with target packed as an integer (e.g., classification)
- Clips the gradient norm during optimization
- Gradient-weighted Class Activation Mapping
- Produces gradients generated with guided back propagation from the given image
- Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features.
- A Sequence defines how to iterate the data as a sequence of small batches of data.
- A Sequence defines how to iterate the data as a sequence of small batches of data.
- Create a sequence of batches from numpy arrays, lists and
- Group several batches into a single batch
- This sequence will asynchronously process data and keep a reserve of loaded samples
- Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface
- Group the data into a sequence of dictionary of torch.Tensor
- This sequence will normalize the batch size of an underlying sequence
- This sequence will split batches in smaller batches if the underlying sequence batch is too large.
- A metric base class
- Calculate the
- Calculate the sensitivity and specificity for a binary classification using the output_truth and output
- Extract the loss from the outputs
- Calculate the Area under the Receiver operating characteristic (ROC) curve.
- A metric base class
- Samples elements randomly. If without replacement, then sample from a shuffled dataset.
- Samples elements sequentially, always in the same order.
- Samples elements randomly from a given list of indices, without replacement.
- Resample the samples so that class_name classes have equal probability of being sampled.
- Base class for all Samplers.
- Elements from a given list of list of indices are randomly drawn without replacement,
- Apply a fixed filter to n-dimensional images
- Implement a gaussian filter as a
- Implementation of "Interpretable Explanations of Black Boxes by Meaningful Perturbation", arXiv:1704.03296
- Customized version of
Functions¶
- Return the data root directory
- Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it
- Set the learning rate of the optimizer to a specific value
- Clean the filename so that it can be used as a valid filename
- Return the device of a module. This may be incorrect if we have a module split across different devices
- Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.
- Return a good choice of dataset name and split name, possibly not the train split.
- Return the output mappings of a classification output from the datasets infos
- Return the output mappings of a classification output from the datasets infos
- Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target
- Make random indices of pairs of samples that belong or not to the same target.
- Return a set of unique and easily distinguishable colors
- Return a set of unique and easily distinguishable colors
- Apply spectral norm on every sub-module
- Apply gradient clipping recursively on a module as callback.
- Loss combining cross entropy and multi-class dice
- Calculate the total variation norm
- Encode the targets (a tensor of integers representing a class)
- Create a dictionary of loss functions for each of the dataset
- Run the eval loop (i.e., the model parameters will NOT be updated)
- Run the train loop (i.e., the model parameters will be updated)
- Default callbacks to be performed after the model has been trained
- Default callbacks to be performed at the end of each epoch
- Default callbacks to be performed before the fitting of the model
- Default loss is the sum of all loss terms
- Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler
- Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma
- Create an ADAM optimizer for each of the dataset with optional scheduler
- Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Create an optimizer and scheduler
- Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Plot groups of histories
- Plot the confusion matrix of a predicted class versus the true class
- Summarizes the important statistics for a classification problem
- Create a contiguous list of label names ordered from 0..N from the class mapping
- Calculate the ROC and AUC of a binary classifier
- Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?
- Export a figure
- Calculate the area under the curve of the ROC plot (AUROC)
- Find the input leaves of a tensor.
- Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type
- Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Postprocess the output to be suitable for gradient attribution.
- Default information removal (smoothing).
- Compatibility layer for argument changes between pytorch <= 1.2 and pytorch > 1.3
Attributes¶
- class trw.train.Options(logging_directory: Optional[str] = None, num_epochs: int = 50, device: Optional[torch.device] = None, mixed_precision_enabled: bool = False, gradient_update_frequency: int = 1)¶
Create default options for the training and evaluation process.
- __repr__(self) str ¶
Return repr(self).
- trw.train.get_logging_root(logging_root: Optional[str] = None) str ¶
Return the data root directory
- trw.train.create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)¶
Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it
- Parameters
path – the path to create or recreate
nb_tries – the number of tries to be performed before failure
wait_time_between_tries – the time to wait before the next try
- Returns
True if successful, False if failed.
- trw.train.set_optimizer_learning_rate(optimizer, learning_rate)¶
Set the learning rate of the optimizer to a specific value
- Parameters
optimizer – the optimizer to update
learning_rate – the learning rate to set
- Returns
None
- class trw.train.CleanAddedHooks(model)¶
Context manager that automatically tracks added hooks on the model and removes them when the context is released
- __enter__(self)¶
- __exit__(self, type, value, traceback)¶
- static record_hooks(module_source)¶
Record hooks :param module_source: the module to track the hooks
- Returns
a tuple (forward, backward). forward and backward are dictionaries of hook IDs by module
- trw.train.safe_filename(filename)¶
Clean the filename so that it can be used as a valid filename
- trw.train.get_device(module, batch=None)¶
Return the device of a module. This may be incorrect if the module is split across different devices
- trw.train.transfer_batch_to_device(batch, device, non_blocking=True)¶
Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.
- Parameters
batch – the batch of data to be transferred
device – the device to move the tensors to
non_blocking – non blocking memory transfer to GPU
- Returns
a batch of data on the specified device
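For illustration, a minimal sketch of moving a heterogeneous batch to a device, used exactly as documented above; the feature names are made up:

    import numpy as np
    import torch
    from trw.train import transfer_batch_to_device

    batch = {
        'images': torch.zeros(4, 1, 32, 32),   # torch.Tensor: transferred to the device
        'uids': np.arange(4),                   # numpy array: transferred to the device
        'names': ['a', 'b', 'c', 'd'],          # other types: left untouched
    }
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    batch_on_device = transfer_batch_to_device(batch, device)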
- trw.train.find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)¶
Return a good choice of dataset name and split name, possibly not the train split.
- Parameters
datasets – the datasets
default_dataset_name – a possible dataset name. If None, find a suitable dataset; otherwise, the dataset must be present
default_split_name – a possible split name. If None, find a suitable split; otherwise, the split must be present. If train_split_name is specified, the selected split name will be different from train_split_name
train_split_name – if not None, exclude the train split
- Returns
a tuple (dataset_name, split_name)
- trw.train.get_class_name(mapping, classid)¶
- trw.train.get_classification_mapping(datasets_infos, dataset_name, split_name, output_name)¶
Return the output mappings of a classification output from the datasets infos
- Parameters
datasets_infos – the info of the datasets
dataset_name – the name of the dataset
split_name – the split name
output_name – the output name
- Returns
a dictionary {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}
- trw.train.get_classification_mappings(datasets_infos, dataset_name, split_name)¶
Return the output mappings of a classification output from the datasets infos
- Parameters
datasets_infos – the info of the datasets
dataset_name – the name of the dataset
split_name – the split name
- Returns
a dictionary {outputs: {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}}
- trw.train.make_triplet_indices(targets)¶
- Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target
- Parameters
targets – a 1D integral tensor in range [0..C]
- Returns
a tuple of indices (samples, samples_positive, samples_negative)
- trw.train.make_pair_indices(targets, same_target_ratio=0.5)¶
Make random indices of pairs of samples that belong or not to the same target.
- Parameters
same_target_ratio – specify the ratio of same target to be generated for sample pairs
targets – a 1D integral tensor in range [0..C] to be used to group the samples into same or different target
- Returns
a tuple with (samples_0 indices, samples_1 indices, same_target)
- trw.train.make_unique_colors()¶
Return a set of unique and easily distinguishable colors :return: a list of RGB colors
- trw.train.make_unique_colors_f()¶
Return a set of unique and easily distinguishable colors :return: a list of RGB colors
- trw.train.apply_spectral_norm(module, n_power_iterations=1, eps=1e-12, dim=None, name='weight', discard_layer_types=(torch.nn.InstanceNorm2d, torch.nn.InstanceNorm3d))¶
Apply spectral norm on every sub-modules
- Parameters
module – the parent module to apply spectral norm
discard_layer_types – layers of these types will not have spectral norm applied
n_power_iterations – number of power iterations to calculate spectral norm
eps – epsilon for numerical stability in calculating norms
dim – dimension corresponding to number of outputs; the default is 0, except for modules that are instances of ConvTranspose{1,2,3}d, where it is 1
name – name of the weight parameter
- Returns
the same module as input module
- trw.train.apply_gradient_clipping(module: torch.nn.Module, value)¶
Apply gradient clipping recursively on a module as callback.
Every time the gradient is calculated, it is intercepted and clipping applied.
- Parameters
module – a module where sub-modules will have their gradients clipped
value – the maximum value of the gradient
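A minimal sketch on a toy model; the clipping behaviour is as described above:

    import torch
    import torch.nn as nn
    from trw.train import apply_gradient_clipping

    model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))
    apply_gradient_clipping(model, value=1.0)

    # every subsequent backward pass has its gradients intercepted and clipped
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()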
- class trw.train.Output(metrics, output, criterion_fn, collect_output=False, sample_uid_name=None)¶
This is a tag name to find the output reference back from outputs
- output_ref_tag = output_ref¶
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputClassification(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassification.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Classification output
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputRegression(output, output_truth, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics=metrics.default_regression_metrics(), loss_reduction=mean_all, weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., target_name=None, sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Regression output
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- class trw.train.OutputEmbedding(output, clean_loss_term_each_batch=False, sample_uid_name=default_sample_uid_name, functor=None)¶
Bases:
Output
Represent an embedding
This is only used to record a tensor that we consider an embedding (e.g., to be exported to tensorboard)
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- trw.train.default_sample_uid_name = sample_uid¶
- trw.train.segmentation_criteria_ce_dice(output, truth, per_voxel_weights=None, ce_weight=0.5, per_class_weights=None, power=1.0, smooth=1.0, focal_gamma=None)¶
loss combining cross entropy and multi-class dice
- Parameters
output – the output value, with shape [N, C, Dn…D0]
truth – the truth, with shape [N, 1, Dn..D0]
ce_weight – the weight of the cross entropy to use. This controls the importance of the cross entropy loss to the overall segmentation loss. Range in [0..1]
per_class_weights – the weight per class. A 1D vector of size C indicating the weight of the classes. This will be used for the cross-entropy loss
per_voxel_weights – the weight of each truth voxel. Must be of shape [N, Dn..D0]
- Returns
a torch tensor
- class trw.train.OutputTriplets(samples, positive_samples, negative_samples, criterion_fn=lambda : ..., metrics=metrics.default_generic_metrics(), loss_reduction=mean_all, weight_name=None, loss_scaling=1.0, sample_uid_name=default_sample_uid_name)¶
Bases:
Output
This is a tag name to find the output reference back from outputs
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- class trw.train.OutputLoss(losses, loss_reduction=torch.mean, metrics=metrics.default_generic_metrics(), sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Represent a given loss as an output.
This can be useful to add an additional regularizer to the training (e.g., trw.train.LossCenter).
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputSegmentation(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentation.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, sample_uid_name=default_sample_uid_name)¶
Bases:
OutputClassification
Classification output
- class trw.train.OutputSegmentationBinary(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentationBinary.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, sample_uid_name=default_sample_uid_name)¶
Bases:
OutputSegmentation
Output for binary segmentation.
- Parameters
output – shape N * 1 * X format, must be raw logits
output_truth – should have N * 1 * X format, with values 0 or 1
- class trw.train.OutputClassificationBinary(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassificationBinary.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)¶
Bases:
OutputClassification
Classification output for binary classification
- Parameters
output – the output with shape [N, 1, {X}], without any activation applied (i.e., logits)
output_truth – the truth with shape [N, 1, {X}]
- class trw.train.LossDiceMulticlass(normalization_fn: Callable[[torch.Tensor], torch.Tensor] = partial(nn.Softmax, dim=1), eps: float = 1e-05, return_dice_by_class: bool = False, smooth: float = 0.001, power: float = 1.0, per_class_weights: Sequence[float] = None, discard_background_loss: bool = True)¶
Bases:
torch.nn.Module
Implementation of the soft Dice Loss (multi-class) for N-d images
If multi-class, compute the loss for each class then average the losses
References
[1] “V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation” https://arxiv.org/pdf/1606.04797.pdf
- forward(self, output, target)¶
- Parameters
output – must have N x C x d0 x … x dn shape, where C is the total number of classes to predict
target – must have N x 1 x d0 x … x dn shape
- Returns
if return_dice_by_class is False, return 1 - dice score suitable for optimization. Else, return the (numerator, cardinality) by class and by sample
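A minimal sketch on a toy 2D segmentation batch (shapes follow the forward documentation above; the loss value itself is meaningless on random data):

    import torch
    from trw.train import LossDiceMulticlass

    loss_fn = LossDiceMulticlass()
    logits = torch.randn(2, 3, 8, 8)             # N x C x H x W raw scores, C = 3 classes
    targets = torch.randint(0, 3, (2, 1, 8, 8))  # N x 1 x H x W class indices
    loss = loss_fn(logits, targets)              # 1 - dice, suitable for optimization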
- class trw.train.LossFocalMulticlass(alpha=None, gamma=2, reduction='mean')¶
Bases:
torch.nn.Module
This criterion is an implementation of Focal Loss, which is proposed in Focal Loss for Dense Object Detection, https://arxiv.org/pdf/1708.02002.pdf
Loss(x, class) = - alpha (1-softmax(x)[class])^gamma log(softmax(x)[class])
- Parameters
alpha (1D Tensor, Variable) – the scalar factor for this criterion. One weight factor for each class.
gamma (float, double) – gamma > 0; reduces the relative loss for well-classified examples (p > .5), putting more focus on hard, misclassified examples
- forward(self, outputs, targets)¶
- class trw.train.LossTriplets(margin=1.0, distance=nn.PairwiseDistance(p=2))¶
Bases:
torch.nn.Module
Implement a triplet loss
The goal of the triplet loss is to make sure that:
Two examples with the same label have their embeddings close together in the embedding space
Two examples with different labels have their embeddings far away.
However, we don’t want to push the train embeddings of each label to collapse into very small clusters. The only requirement is that given two positive examples of the same class and one negative example, the negative should be farther away than the positive by some margin. This is very similar to the margin used in SVMs, and here we want the clusters of each class to be separated by the margin.
The loss implements the following equation:
L = max(d(a, p) - d(a, n) + margin, 0)
- forward(self, samples, positive_samples, negative_samples)¶
Calculate the triplet loss
- Parameters
samples – the samples
positive_samples – the samples that belong to the same group as samples
negative_samples – the samples that belong to a different group than samples
- Returns
a 1D tensor (N) representing the loss per sample
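A minimal sketch pairing this loss with trw.train.make_triplet_indices (documented above); it assumes the returned index arrays can be used directly to index the embedding tensor:

    import torch
    from trw.train import LossTriplets, make_triplet_indices

    embeddings = torch.randn(32, 16)          # 32 samples, 16-d embedding
    targets = torch.randint(0, 4, (32,))      # 4 classes
    anchor, positive, negative = make_triplet_indices(targets)
    loss_per_sample = LossTriplets(margin=1.0)(
        embeddings[anchor], embeddings[positive], embeddings[negative])
    loss = loss_per_sample.mean()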
- class trw.train.LossCenter(number_of_classes, number_of_features, alpha=1.0)¶
Bases:
torch.nn.Module
Center loss, penalize the features falling further from the feature class center.
In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this loss can be used as a new supervision signal. Specifically, the center loss simultaneously learns a center for deep features of each class and penalizes the distances between the deep features and their corresponding class centers.
An implementation of center loss: Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.
Note
This loss must be part of a parent module or explicitly optimized by an optimizer. If not, the centers will not be modified.
- forward(self, x, classes)¶
- Parameters
x – the features, an arbitrary n-d tensor (N * C * …). Features should ideally be in range [0..1]
classes – a 1D integral tensor (N) representing the class of each
x
- Returns
a 1D tensor (N) representing the loss per sample
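A minimal sketch, assuming the usual trw convention of a model returning a dictionary of outputs; registering the loss as a sub-module is what makes its centers trainable (see the note above), and the feature and class counts are illustrative:

    import torch.nn as nn
    from trw.train import LossCenter, OutputLoss

    class EmbeddingModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.embedding = nn.Linear(10, 64)
            # sub-module: its class centers are optimized along with the model parameters
            self.center_loss = LossCenter(number_of_classes=4, number_of_features=64)

        def forward(self, batch):
            # batch['x']: N x 10 features, batch['targets']: 1D class indices (assumed layout)
            features = self.embedding(batch['x'])
            # expose the regularizer as an output so it is added to the total loss
            return {'center': OutputLoss(self.center_loss(features, batch['targets']))}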
- class trw.train.LossContrastive(margin=1.0)¶
Bases:
torch.nn.Module
Implementation of the contrastive loss.
L(x0, x1, y) = 0.5 * (1 - y) * d(x0, x1)^2 + 0.5 * y * max(0, m - d(x0, x1))^2
with y = 0 for samples x0 and x1 deemed dissimilar and y = 1 for similar samples. Dissimilar pairs contribute to the loss function only if their distance is within the radius m, while the loss minimizes d(x0, x1) over the set of all similar pairs.
See Dimensionality Reduction by Learning an Invariant Mapping, Raia Hadsell, Sumit Chopra, Yann LeCun, 2006.
- forward(self, x0, x1, same_target)¶
- Parameters
x0 – N-D tensor
x1 – N-D tensor
same_target – a 1D tensor of 0 or 1. 1 means that x0 and x1 belong to the same class, while 0 means they are from different classes
- Returns
a 1D tensor (N) representing the loss per sample
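A minimal sketch pairing this loss with trw.train.make_pair_indices (documented above); converting same_target to a tensor is an assumption about the returned type:

    import torch
    from trw.train import LossContrastive, make_pair_indices

    embeddings = torch.randn(32, 16)
    targets = torch.randint(0, 4, (32,))
    i0, i1, same_target = make_pair_indices(targets, same_target_ratio=0.5)
    loss_per_pair = LossContrastive(margin=1.0)(
        embeddings[i0], embeddings[i1], torch.as_tensor(same_target))
    loss = loss_per_pair.mean()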
- trw.train.total_variation_norm(x, beta)¶
Calculate the total variation norm
- Parameters
x – a tensor with format (samples, components, dn, …, d0)
beta – the exponent
- Returns
a scalar
- class trw.train.LossCrossEntropyCsiMulticlass¶
Bases:
torch.nn.Module
Optimize a metric similar to the Critical Success Index (CSI) on the cross-entropy.
A loss for heavily unbalanced data (orders of magnitude more negative than positive samples). Calculate the cross-entropy and keep only the loss from the TP, FP and FN; the loss from TN is simply discarded.
- forward(self, outputs, targets, important_class=1)¶
- Parameters
outputs – a N x C tensor with N the number of samples and C the number of classes
targets – a N integral tensor
important_class – the class to keep the cross-entropy loss even if classification is correct
- Returns
a N floating tensor representing the loss of each sample
- class trw.train.LossBinaryF1(eps=0.0001)¶
Bases:
torch.nn.Module
- The macro F1-score is non-differentiable. Instead use a surrogate that is differentiable and correlates well with the macro F1 score by working on the class probabilities rather than the discrete classification.
- For example, if the ground truth is 1 and the model prediction is 0.8, we calculate it as 0.8 true positive and 0.2 false negative.
- forward(self, outputs, targets)¶
- trw.train.one_hot(targets: trw.basic_typing.TorchTensorNX, num_classes: int, dtype=torch.float32, device: Optional[torch.device] = None) trw.basic_typing.TorchTensorNCX ¶
Encode the targets (a tensor of integers representing a class) as one hot encoding.
Support target as N-dimensional data (e.g., 3D segmentation map).
Equivalent to torch.nn.functional.one_hot for backward compatibility with pytorch 1.0
- Parameters
num_classes – the total number of classes
targets – a N-dimensional integral tensor (e.g., 1D for classification, 2D for 2D segmentation map…)
dtype – the type of the output tensor
device – the device of the one-hot encoded tensor. If None, use the target’s device
- Returns
a one hot encoding of a N-dimensional integral tensor
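A quick sketch for 1D classification targets (per the signature above):

    import torch
    from trw.train import one_hot

    targets = torch.tensor([0, 2, 1])
    encoded = one_hot(targets, num_classes=3)
    # encoded is a float tensor of shape (3, 3); encoded[0] == [1., 0., 0.]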
- class trw.train.LossMsePacked(reduction: typing_extensions.Literal[mean, none] = 'mean')¶
Bases:
torch.nn.Module
Mean squared error loss with target packed as an integer (e.g., classification)
The packed_target will be one hot encoded and the mean squared error is applied with the tensor.
- forward(self, tensor, packed_target)¶
- Parameters
tensor – a NxCx… tensor
packed_target – a Nx1x… tensor
- trw.train.create_losses_fn(datasets, generic_loss)¶
Create a dictionary of loss functions for each of the dataset
- Parameters
datasets – the datasets
generic_loss – a loss function
- Returns
A dictionary of losses for each of the dataset
- trw.train.epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, per_step_schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, force_eval_mode, eval_loop_fn=eval_loop, train_loop_fn=train_loop)¶
- Parameters
options –
datasets –
optimizers –
model –
losses –
schedulers –
per_step_schedulers –
history –
callbacks_per_batch –
callbacks_per_batch_loss_terms –
run_eval –
force_eval_mode –
eval_loop_fn –
train_loop_fn –
Returns:
- trw.train.eval_loop(options, device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)¶
Run the eval loop (i.e., the model parameters will NOT be updated)
Note
If callback_per_batch or callbacks_per_batch_loss_terms raise StopIteration, the eval loop will be stopped
- Parameters
device –
dataset_name –
split_name –
split –
model –
loss_fn –
history –
callbacks_per_batch –
callbacks_per_batch_loss_terms –
- Returns
- trw.train.train_loop(options, device, dataset_name, split_name, split, optimizer, per_step_scheduler, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, gradient_scaler=None)¶
Run the train loop (i.e., the model parameters will be updated)
Note
If callbacks_per_batch or callbacks_per_batch_loss_terms raise an exception StopIteration, the train loop will be stopped
- Parameters
device – the device to be used to optimize the model
dataset_name – the name of the dataset
split_name – the name of the split
split – a dictionary of feature name and values
optimizer – an optimizer to optimize the model
per_step_scheduler – scheduler to be applied per-batch
model – the model to be optimized
loss_fn – the loss function
history – a list of history step
callbacks_per_batch – the callbacks to be performed on each batch. if None, no callbacks to be run
callbacks_per_batch_loss_terms – the callbacks to be performed on each loss term. if None, no callbacks to be run
gradient_scaler – if mixed precision is enabled, this is the scale to be used for the gradient update
Notes
If optimizer is None, there MUST be a .backward() to free graph and memory.
- trw.train.default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True, additional_callbacks=None)¶
Default callbacks to be performed after the model has been trained
- trw.train.default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None, additional_callbacks=None)¶
Default callbacks to be performed at the end of each epoch
- trw.train.default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True, with_reporting_server=True, with_profiler=False, additional_callbacks=None)¶
Default callbacks to be performed before the fitting of the model
- trw.train.default_sum_all_losses(dataset_name, batch, loss_terms)¶
Default loss is the sum of all loss terms
- class trw.train.TrainerV2(callbacks_per_batch=None, callbacks_per_batch_loss_terms=None, callbacks_per_epoch=default_per_epoch_callbacks(), callbacks_pre_training=default_pre_training_callbacks(), callbacks_post_training=default_post_training_callbacks(), trainer_callbacks_per_batch=trainer_callbacks_per_batch, run_epoch_fn=epoch_train_eval, logging_level=logging.DEBUG, skip_eval_epoch_0=True)¶
- static save_model(model, metadata: trw.train.utilities.RunMetadata, path, pickle_module=pickle)¶
Save a model to file
- Parameters
model – the model to serialize
metadata – an optional result file associated with the model
path – the base path to save the model
pickle_module – the serialization module that will be used to save the model and results
- static load_state(model: torch.nn.Module, path: str, device: torch.device = None, pickle_module: Any = pickle, strict: bool = True) None ¶
Load the state of a model
- Parameters
model – where to load the state
path – where the model’s state was saved
device – where to locate the model
pickle_module – how to read the model parameters and metadata
strict – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function
- static load_model(path: str, model_kwargs: Optional[Dict[Any, Any]] = None, with_result: bool = False, device: torch.device = None, pickle_module: Any = pickle) Tuple[torch.nn.Module, trw.train.utilities.RunMetadata] ¶
Load a previously saved model
Construct a model from the RunMetadata.class_name class and with arguments model_kwargs
- Parameters
path – where to store the model. result’s will be loaded from path + ‘.result’
model_kwargs – arguments used to instantiate the model stored in
RunMetadata.class_name
with_result – if True, the results of the model will be loaded
device – where to load the model. For example, models are typically trained on GPU, but for deployment, CPU might be good enough. If None, use the same device as when the model was exported
pickle_module – the de-serialization module to be used to load model and results
- Returns
a tuple model, metadata
- fit(self, options, datasets, model: torch.nn.Module, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, log_path=None, with_final_evaluation=True, history=None, erase_logging_folder=True, eval_every_X_epoch=1) trw.train.utilities.RunMetadata ¶
Fit the model
- Parameters
options –
datasets –
a functor returning a dictionary of datasets. Alternatively, datasets infos can be specified. inputs_fn must return one of:
datasets: dictionary of dataset
(datasets, datasets_infos): dictionary of dataset and additional infos
We define:
datasets: a dictionary of dataset. a dataset is a dictionary of splits. a split is a dictionary of batched features.
Datasets infos are additional infos useful for the debugging of the dataset (e.g., class mappings, sample UIDs). Datasets infos are typically much smaller than datasets and should be loadable in memory
model – a Module or a ModuleDict
optimizers_fn –
losses_fn –
loss_creator –
log_path – the path of the logs to be exported during the training of the model. if the log_path is not an absolute path, the options.workflow_options.logging_directory is used as root
with_final_evaluation –
history –
erase_logging_folder – if True, the logging will be erased when fitting starts
eval_every_X_epoch – evaluate the model every X epochs
Returns:
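A hedged end-to-end sketch assembled from the signatures documented on this page; the dataset construction, the truth tensor layout and the default criteria are assumptions that may differ between trw versions:

    import torch
    import torch.nn as nn
    import trw

    class TinyNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 2)

        def forward(self, batch):
            logits = self.fc(batch['x'])
            # wrap the raw logits in a trw output so the loss and metrics are handled
            return {'classification': trw.train.OutputClassification(logits, batch['y'])}

    datasets = {
        'toy': {
            'train': trw.train.SequenceArray({
                'x': torch.randn(128, 10),
                'y': torch.randint(0, 2, (128, 1)),   # assumed truth layout: N x 1
            }).batch(16)
        }
    }

    options = trw.train.Options(num_epochs=5)
    results = trw.train.TrainerV2().fit(
        options,
        datasets=datasets,
        model=TinyNet(),
        optimizers_fn=trw.train.OptimizerAdam(learning_rate=1e-3),
        log_path='toy_experiment')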
- trw.train.create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, nesterov=False, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
scheduler_fn – a scheduler, or None
momentum – the momentum of the SGD
weight_decay – the weight decay
nesterov – enables Nesterov momentum
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- Returns
An optimizer
- trw.train.create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9, nesterov=False)¶
Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
step_size – the number of epoch composing a step. Each step the learning rate will be multiplied by gamma
gamma – the factor to apply to the learning rate every step
weight_decay – the weight decay
nesterov – enables Nesterov momentum
momentum – the momentum of the SGD
- Returns
An optimizer with a step scheduler
- trw.train.create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)¶
Create a learning rate scheduler. Every step_size, the learning late will be multiplied by gamma
- Parameters
optimizer – the optimizer
step_size – every number of epochs composing one step. Each step the learning rate will be decreased
gamma – apply this factor to the learning rate every time it is adjusted
- Returns
a learning rate scheduler
- trw.train.create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create an ADAM optimizer for each of the dataset with optional scheduler
- Parameters
datasets – a dictionary of datasets
model – a model to optimize
learning_rate – the initial learning rate
weight_decay – the weight decay
scheduler_fn – a scheduler, or None
betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps – term to add to denominator to avoid division by zero
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- Returns
An optimizer
- trw.train.create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, betas=(0.9, 0.999))¶
Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
step_size – the number of epoch composing a step. Each step the learning rate will be multiplied by gamma
gamma – the factor to apply to the learning rate every step
weight_decay – the weight decay
betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
- Returns
An optimizer with a step scheduler
- trw.train.create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create an optimizer and scheduler
Note
if model is an instance of ModuleDict, then the optimizer will only consider the parameters model[dataset_name].parameters(), else model.parameters()
- Parameters
datasets – a dictionary of dataset
model – the model. Should be a Module or a ModuleDict
optimizer_fn – the functor to instantiate the optimizer
scheduler_fn – the functor to instantiate the scheduler to be run by epoch. May be None, in that case there will be no schedule
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- trw.train.create_sgd_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3, nesterov=False)¶
Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
max_learning_rate – the maximum learning rate
epochs – The number of epochs to train for
steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only
learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = max_learning_rate / learning_rate_start_div_factor
learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor
percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate
additional_scheduler_kwargs – additional arguments provided to the scheduler
weight_decay – the weight decay
nesterov – enables Nesterov momentum
momentum – the momentum of the SGD
- Returns
An optimizer with a step scheduler
- trw.train.create_adam_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3)¶
Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
max_learning_rate – the maximum learning rate
epochs – The number of epochs to train for
steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only
learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = learning_rate_start_multiplier * max_learning_rate
learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor
percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate
additional_scheduler_kwargs – additional arguments provided to the scheduler
weight_decay – the weight decay
betas – betas of the ADAM optimizer
eps – eps of the ADAM optimizer
- Returns
An optimizer with a step scheduler
- class trw.train.ClippingGradientNorm(optimizer_base: torch.optim.Optimizer, max_norm: float = 1.0, norm_type: float = 2.0)¶
Bases:
torch.optim.Optimizer
Clips the gradient norm during optimization
- step(self, closure=None)¶
Performs a single optimization step (parameter update).
- Parameters
closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
Note
Unless otherwise specified, this function should not modify the .grad field of the parameters.
- class trw.train.Optimizer(optimizer_fn: Callable[[Iterator[torch.nn.parameter.Parameter]], torch.optim.Optimizer], scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]] = None, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]] = None)¶
- set_scheduler_fn(self, scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]])¶
- set_step_scheduler_fn(self, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]])¶
- __call__(self, datasets: trw.basic_typing.Datasets, model: torch.nn.Module) Tuple[Dict[str, torch.optim.Optimizer], Optional[Dict[str, SchedulerType]], Optional[Dict[str, StepSchedulerType]]] ¶
- scheduler_step_lr(self, step_size: int, gamma: float = 0.1) Optimizer ¶
Apply a scheduler on the learning rate.
Decays the learning rate of each parameter group by gamma every step_size epochs.
- scheduler_cosine_annealing_warm_restart(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1) Optimizer ¶
Apply a scheduler on the learning rate.
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
References
- scheduler_cosine_annealing_warm_restart_decayed(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1, decay_factor=0.7) Optimizer ¶
Apply a scheduler on the learning rate. Each time the learning rate is restarted, the base learning rate is decayed
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
References
- scheduler_one_cycle(self, max_learning_rate: float, epochs: int, steps_per_epoch: int, learning_rate_start_div_factor: float = 25.0, learning_rate_end_div_factor: float = 10000.0, percentage_cycle_increase: float = 0.3, anneal_strategy: str = 'cos', cycle_momentum: bool = True, base_momentum: float = 0.85, max_momentum: float = 0.95)¶
This scheduler should not be used with another scheduler!
The learning rate or momentum provided by the Optimizer will be overridden by this scheduler.
- clip_gradient_norm(self, max_norm: float = 1.0, norm_type: float = 2.0)¶
Clips the gradient norm during optimization
- Parameters
max_norm – the maximum norm of the concatenated gradients of the optimizer. Note: the gradient is modulated by the learning rate
norm_type – type of the used p-norm. Can be 'inf' for infinity norm
- See:
torch.nn.utils.clip_grad_norm_()
- class trw.train.OptimizerAdam(learning_rate: float, weight_decay: float = 0, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)¶
Bases:
Optimizer
- class trw.train.OptimizerSGD(learning_rate: float, momentum: float = 0.9, weight_decay: float = 0, nesterov: bool = False)¶
Bases:
Optimizer
- class trw.train.OptimizerAdamW(learning_rate: float, weight_decay: float = 0.01, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)¶
Bases:
Optimizer
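A sketch of the fluent configuration style suggested by the methods above; chaining assumes each configuration method returns the optimizer, which is only documented explicitly for scheduler_step_lr:

    import trw

    optimizers_fn = trw.train.OptimizerSGD(learning_rate=0.1, momentum=0.9) \
        .scheduler_step_lr(step_size=30, gamma=0.1) \
        .clip_gradient_norm(max_norm=1.0)

    # per the __call__ signature above, calling optimizers_fn(datasets, model) returns
    # per-dataset optimizers and schedulers, so it can be passed as optimizers_fn to
    # TrainerV2.fit.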
- trw.train.plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) None ¶
Plot groups of histories :param root: the directory where the plot will be exported :param history_values: a map of list of list of (epoch, value) :param title: the title of the graph :param xlabel: the x label :param ylabel: the y label :param max_nb_plots_per_group: the maximum number of plots per group :param colors: the colors to be used
- trw.train.confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) None ¶
Plot the confusion matrix of a predicted class versus the true class
- Parameters
export_path – the folder where the confusion matrix will be exported
classes_predictions – the classes that were predicted by the classifier
classes_trues – the true classes
classes – a list of labels. Label 0 for class 0, label 1 for class 1…
normalize – if True, the confusion matrix will be normalized to 1.0 per row
title – the title of the plot
cmap – the color map to use
display_numbers – if True, display the numbers within each cell of the confusion matrix
maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues
rotate_x – if not None, indicates the rotation of the label on x axis
rotate_y – if not None, indicates the rotation of the label on y axis
display_names_x – if True, the class name, if specified, will also be displayed on the x axis
sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show if the errors may be due to low number of samples
excludes_classes_with_samples_less_than – if not None, the classes with less than excludes_classes_with_samples_less_than samples will be excluded
normalize_unit_percentage – if True, use 100% base as unit instead of 1.0
main_font_size – the font size of the text
sub_font_size – the font size of the sub-elements (e.g., ticks)
max_size_x_label – the maximum length of a label on the x-axis
- trw.train.classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)¶
Summarizes the important statistics for a classification problem :param predictions: the classes predicted :param prediction_scores: the scores for each, for each sample :param trues: the true class for each sample :param class_mapping: the class mapping (class id, class name) :return: a dictionary of statistics or sub-report
- trw.train.list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')¶
Create a contiguous list of label names ordered from 0..N from the class mapping
- Parameters
mappinginv – a dictionary like structure encoded as (class id, class_name)
default_name – if there is no class name, use this as default
- Returns
a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None
- trw.train.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)¶
Calculate the ROC and AUC of a binary classifier
Supports multiple ROC curves.
- Parameters
export_path – the folder where the plot will be exported
trues – the expected class. Can be a list for multiple ROC curves
found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves
title – the title of the ROC
label_name – the name of the ROC curve. Can be a list for multiple ROC curves
colors – if None use default colors. Else, a numpy array of dim (Nx3) where N is the number of colors. Must be in [0..1] range
- trw.train.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)¶
Compare different histories: e.g., compare 2 configuration, which one has the best results for a given measure?
- Parameters
export_path – where to export the figure
features_trials – a dictionary of list. Each list representing a feature
title – the title of the plot
ylabel – the label for axis y
xlabel – the label for axis x
meanline – if True, draw a line from the center of the plot for each history name to the next
maximum_chars_per_line – the maximum of characters allowed per line of title. If exceeded, newline will be created.
plot_trials – if True, each trial of a feature will be plotted
scale – the axis scale to be used
y_range – if not None, the (min, max) of the y-axis
rotate_x – if not None, the rotation of the x axis labels in degree
showfliers – if True, plot the outliers
maximum_chars_per_line – the maximum number of characters of the title per line
title_line_height – the height of the title lines
- trw.train.export_figure(path, name, maximum_length=259, dpi=None)¶
Export a figure
- Parameters
path – the folder where to export the figure
name – the name of the figure.
maximum_length – the maximum length of the full path of a figure. If the full path name is greater than maximum_length, the name will be sub-sampled to the maximum allowed length
dpi – Dots Per Inch: the density of the figure
- trw.train.auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) float ¶
Calculate the area under the curve of the ROC plot (AUROC)
- Parameters
trues – the expected class
found_1_scores – the score found for the class 1. Must be a numpy array of floats
- Returns
the AUROC
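A quick sketch on a toy example (the expected value follows from the definition of the ROC curve):

    import numpy as np
    from trw.train import auroc

    trues = np.asarray([0, 0, 1, 1])
    scores = np.asarray([0.1, 0.4, 0.35, 0.8])   # predicted score for class 1
    print(auroc(trues, scores))                   # 0.75 on this toy data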
- trw.train.find_tensor_leaves_with_grad(tensor: torch.Tensor) Sequence[torch.Tensor] ¶
Find the input leaves of a tensor.
Input leaves REQUIRE requires_grad=True, else they will not be found
- Parameters
tensor – a torch.Tensor
- Returns
a list of torch.Tensor with attribute requires_grad=True that is an input of tensor
- trw.train.find_last_forward_convolution(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Parameters
inputs – the input of the model so that we can call model(inputs)
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index (int) – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- trw.train.find_last_forward_types(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]], relative_index: int = 0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type
- Parameters
inputs – the input of the model so that we can call model(inputs)
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- trw.train.find_first_forward_convolution(model: torch.nn.Module, inputs: Any = None, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Parameters
inputs – NOT USED
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index (int) – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- class trw.train.GradCam(model: torch.nn.Module, find_convolution: Callable[[torch.nn.Module, Union[trw.basic_typing.Batch, torch.Tensor]], Optional[Mapping]] = graph_reflection.find_last_forward_convolution, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)¶
Gradient-weighted Class Activation Mapping
This is based on the paper “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”, Ramprasaath R et al.
- __call__(self, inputs: Union[trw.basic_typing.Batch, torch.Tensor], target_class_name: str = None, target_class: int = None) Optional[Tuple[str, Mapping]] ¶
- Parameters
inputs – the inputs to be fed to the model
target_class_name – the output node to be used. If None: if the model output is a single tensor then use this as the target output, else it will use the first OutputClassification output
target_class – the index of the class to explain the decision. If None, the class output will be used
- Returns
a tuple (output name, a dictionary (input_name, GradCAMs))
- class trw.train.GuidedBackprop(model: torch.nn.Module, unguided_gradient: bool = False, post_process_output: Callable[[Any], torch.Tensor] = post_process_output_id)¶
Produces gradients generated with guided back propagation from the given image
- update_relus(self) None ¶
- Updates relu activation functions so that:
1- stores the output in the forward pass
2- imputes zero for gradient values that are less than zero
- static get_floating_inputs_with_gradients(inputs)¶
Extract inputs that have a gradient
- Parameters
inputs – a tensor or dictionary of tensors
- Returns
Return a list of tuple (name, input) for the input that have a gradient
- __call__(self, inputs: Tuple[torch.Tensor, trw.basic_typing.Batch], target_class: int, target_class_name: str) Optional[Tuple[str, Mapping]] ¶
Generate the guided back-propagation gradient
- Parameters
inputs – a tensor or dictionary of tensors
target_class – the target class to be explained
target_class_name – the name of the output class if multiple outputs
- Returns
a tuple (output_name, dictionary (input, gradient))
- static get_positive_negative_saliency(gradient: torch.Tensor) Tuple[torch.Tensor, torch.Tensor] ¶
Generates positive and negative saliency maps based on the gradient
- Parameters
gradient (numpy arr) – Gradient of the operation to visualize
- Returns
a tuple (pos_saliency, neg_saliency)
- trw.train.post_process_output_for_gradient_attribution(output: trw.train.outputs_trw.Output)¶
Postprocess the output to be suitable for gradient attribution.
In particular, if we have a trw.train.OutputClassification, we need to apply a softmax operation so that we can backpropagate the loss of a particular class with the appropriate value (1.0).
- Parameters
output – a trw.train.OutputClassification
- Returns
a torch.Tensor
- class trw.train.IntegratedGradients(model: torch.nn.Module, steps: int = 100, baseline_inputs: Any = None, use_output_as_target: bool = False, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)¶
- Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features.
This is implementing the paper Axiomatic Attribution for Deep Networks, Mukund Sundararajan, Ankur Taly, Qiqi Yan as described in https://arxiv.org/abs/1703.01365
- __call__(self, inputs: Any, target_class_name: str, target_class: Optional[int] = None) Optional[Tuple[str, Mapping]] ¶
Generate the guided back-propagation gradient
- Parameters
inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad=True
target_class – the index of the class to explain the decision. If None, the class output will be used
target_class_name – the output node to be used. If None: if the model output is a single tensor then use this as the target output, else it will use the first OutputClassification output
- Returns
a tuple (output_name, dictionary (input, integrated gradient))
- trw.train.default_collate_fn(batch: Union[Sequence[Any], Mapping[str, Any]], device: torch.device, pin_memory: bool = False, non_blocking: bool = False)¶
- Parameters
batch – a dictionary of features or a list of dictionary of features
device – the device where to create the torch.Tensor
pin_memory – if True, pin the memory. Required to be a CUDA allocated torch.Tensor
non_blocking – if True, use non blocking memory transfer
- Returns
a dictionary of torch.Tensor
- class trw.train.Sequence(source_split)¶
A Sequence defines how to iterate the data as a sequence of small batches of data.
To train a deep learning model, it is often necessary to split our original data into small chunks, because storing the full forward pass of our model at once is memory hungry; instead, we calculate the forward and backward pass on a small chunk of data. This is the interface for batching a dataset.
Examples:
data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
- abstract __iter__(self)¶
- Returns
An iterator of batches
- collate(self, collate_fn=default_collate_fn, device=None)¶
Aggregate the input batch as a dictionary of torch.Tensor and move the data to the appropriate device
- Parameters
collate_fn – the function to collate the input batch
device – the device where to send the samples. If None, the default device is CPU
- Returns
a collated sequence of batches
- map(self, function_to_run, nb_workers=0, max_jobs_at_once=None, queue_timeout=default_queue_timeout, collate_fn=None, max_queue_size_pin=None)¶
Transform a sequence using a given function.
Note
The map may create more samples than the original sequence.
- Parameters
function_to_run – the mapping function
nb_workers – the number of workers that will process the split. If 0, no workers will be created.
max_jobs_at_once – the maximum number of results that can be pushed in the result queue at once. If 0, no limit. If None, it will be set equal to the number of workers
queue_timeout – the timeout used to pull results from the output queue
collate_fn – a function to collate each batch of data
max_queue_size_pin – defines the max number of batches prefetched. If None, defaults to a size based on the number of workers. This only controls the final queue size of the pin thread (the workers' queue can be set independently)
- Returns
a sequence of batches
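Illustrative usage (a minimal sketch; it assumes the mapping function receives each batch as a dictionary of arrays, and the feature name is made up):
import numpy as np
from trw.train import SequenceArray

def double_values(batch):
    # mapping function applied to every batch; may run in worker processes
    batch['data'] = batch['data'] * 2
    return batch

sequence = SequenceArray({'data': np.arange(100)}).map(double_values, nb_workers=0).batch(10)
for batch in sequence:
    pass  # each batch now contains the doubled 'data'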
- batch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)¶
Group several batches of samples into a single batch
- Parameters
batch_size – the number of samples of the batch
discard_batch_not_full – if True, discard batches that are not full
collate_fn – a function to collate the batches. If None, no collation performed
- Returns
a sequence of batches
- sub_batch(self, batch_size, discard_batch_not_full=False)¶
This sequence will split batches in smaller batches if the underlying sequence batch is too large.
This sequence can be useful to manage very large tensors. Unlike trw.train.SequenceReBatch, this class avoids concatenating tensors, an operation that can be costly since the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.
- Parameters
batch_size – the maximum size of a batch
discard_batch_not_full – if True, batches that do not have size batch_size will be discarded
- rebatch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)¶
Normalize a sequence to identical batch size given an input sequence with varying batch size
- Parameters
batch_size – the size of the batches created by this sequence
discard_batch_not_full – if True, the last batch will be discarded if not full
collate_fn – function to merge multiple batches
- max_samples(self, max_samples)¶
- Virtually resize the sequence. The sequence terminates once a certain number of produced samples has been reached; when iterated again, it restarts where it stopped.
- Parameters
max_samples – the number of samples this sequence will produce before stopping
- async_reservoir(self, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=sampler.SamplerSequential(), collate_fn=remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)¶
- Parameters
max_reservoir_samples – the maximum number of samples of the reservoir
function_to_run – the function to run asynchronously
min_reservoir_samples – the minimum of samples of the reservoir needed before an output sequence can be created
nb_workers – the number of workers that will process function_to_run to fill the reservoir. Must be >= 1
max_jobs_at_once – the maximum number of jobs that can be started and stored by epoch by the workers. If 0, no limit. If None: set to the number of workers
reservoir_sampler – a sampler that will be used to sample the reservoir or None for sequential sampling of the reservoir
collate_fn – a function to post-process the samples into a single batch, or None if not to be collated
maximum_number_of_samples_per_epoch – the maximum number of samples that will be generated per epoch. If we reach this maximum, the sequence will be interrupted
max_reservoir_replacement_size – Specify the maximum number of samples replaced in the reservoir by epoch. If None, we will use the whole result queue. This can be useful to control explicitly how the reservoir is updated and depend less on the speed of hardware. Note that to have an effect, max_jobs_at_once should be greater than max_reservoir_replacement_size.
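Illustrative usage (a minimal sketch; slow_load simulates an expensive loading function and is not part of the library, and the chosen sizes are arbitrary):
import time
import numpy as np
from trw.train import SequenceArray

def slow_load(batch):
    # simulate an expensive loading or preprocessing step
    time.sleep(0.01)
    return batch

sequence = SequenceArray({'data': np.random.randn(1000, 16)}).async_reservoir(
    max_reservoir_samples=100,
    function_to_run=slow_load,
    min_reservoir_samples=10,
    nb_workers=1).batch(20)

for epoch in range(2):
    for batch in sequence:
        pass  # the reservoir is refreshed in the background between epochs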
- fill_queue(self)¶
Fill the job queue of the current sequence
- fill_queue_all_sequences(self)¶
Go through all the sequences and fill their input queue
- has_background_jobs(self)¶
- Returns
True if this sequence has a background job to create the next element
- has_background_jobs_previous_sequences(self)¶
- Returns
the number of sequences that have background jobs currently running to create the next element
- abstract subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- abstract subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- abstract close(self)¶
- class trw.train.SequenceMap(source_split, nb_workers, function_to_run, max_jobs_at_once=None, queue_timeout=default_queue_timeout, debug_job_report_timeout=30.0, collate_fn=None, max_queue_size_pin=None)¶
Bases:
trw.train.sequence.Sequence
A Sequence defines how to iterate the data as a sequence of small batches of data.
To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the full forward pass of our model all at once is memory hungry; instead, we calculate the forward and backward passes on a small chunk of data. This is the interface for batching a dataset.
Examples:
data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- fill_queue(self)¶
Fill the job queue of the current sequence
- initializer(self)¶
Initialize the sequence to iterate through batches
- __next_local(self, next_fn)¶
Get the next elements.
Handles a single item or a list of items returned by next_fn.
- Parameters
next_fn – returns the next elements
- __next__(self)¶
- has_background_jobs(self)¶
- Returns
True if this sequence has a background job to create the next element
- next_item(self, blocking)¶
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Finish and join the existing pool processes
- class trw.train.SequenceArray(split, sampler=sampler_trw.SamplerRandom(), transforms=None, use_advanced_indexing=True, sample_uid_name=sample_uid_name)¶
Bases:
trw.train.sequence.Sequence
Create a sequence of batches from numpy arrays, lists and
torch.Tensor
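Illustrative usage (a minimal sketch with made-up feature names and shapes):
import numpy as np
from trw.train import SequenceArray

split = {
    'images': np.random.randn(100, 1, 28, 28).astype(np.float32),
    'targets': np.random.randint(0, 10, size=(100,)),
}
sequence = SequenceArray(split).batch(10)
for batch in sequence:
    pass  # each batch is a dictionary with 10 samples per feature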
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
- class trw.train.SequenceBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Group several batches into a single batch
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceAsyncReservoir(source_split, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=None, collate_fn=sequence.remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)¶
Bases:
trw.train.sequence.Sequence
This sequence will asynchronously process data and keep a reserve of loaded samples
The idea is to have long loading processes work in the background while using the data that is currently loaded as efficiently as possible. The data is slowly being replaced by freshly loaded data over time.
Jobs are started and results retrieved at the beginning of each epoch
This sequence can be interrupted (e.g., after a certain number of batches have been returned). When the sequence is restarted, the reservoir will not be emptied.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- reservoir_size(self)¶
- Returns
The current number of samples in the reservoir
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- initializer(self)¶
- fill_queue(self)¶
Fill the input queue of jobs to be completed
- _retrieve_results_and_fill_queue(self)¶
Retrieve results from the output queue
- _wait_for_job_completion(self)¶
Block the processing until we have enough results in the reservoir
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Finish and join the existing pool processes
- class trw.train.SequenceAdaptorTorch(torch_dataloader, features=None)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface
The main purpose is to enable compatibility with the torch data loader and any existing third party code.
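Illustrative usage (a minimal sketch; the 'features' names used to label the loader outputs are an assumption, for illustration only):
import torch
import torch.utils.data
from trw.train import SequenceAdaptorTorch

dataset = torch.utils.data.TensorDataset(
    torch.randn(100, 4), torch.randint(0, 2, (100,)))
loader = torch.utils.data.DataLoader(dataset, batch_size=10)

# the 'features' names are hypothetical keys for the loader outputs
sequence = SequenceAdaptorTorch(loader, features=['x', 'y'])
for batch in sequence:
    pass  # batches are now exposed through the trw.train.Sequence interface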
- __len__(self)¶
- __iter__(self)¶
- Returns
An iterator of batches
- __next__(self)¶
- Returns
The next batch of data
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceCollate(source_split, collate_fn=collate.default_collate_fn, device=None)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Group the data into a sequence of dictionary of torch.Tensor
This can be useful to combine batches of dictionaries into a single batch with all features concatenated on axis 0. Often used in conjunction with trw.train.SequenceAsyncReservoir and trw.train.SequenceMap.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceReBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
This sequence will normalize the batch size of an underlying sequence
If the underlying sequence batch is too large, it will be split into multiple batches. Conversely, if the size of the batch is too small, several batches will be merged until we reach the expected batch size.
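Illustrative usage (a minimal sketch; it assumes the underlying sequence produces batches of irregular size):
import numpy as np
from trw.train import SequenceArray

# normalize whatever batch size the underlying sequence produces to 32 samples
sequence = SequenceArray({'data': np.random.randn(105, 8)}).rebatch(32)
for batch in sequence:
    pass  # every batch has 32 samples, except possibly the last one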
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceSubBatch(source_split, batch_size, discard_batch_not_full=False)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
This sequence will split batches in smaller batches if the underlying sequence batch is too large.
This sequence can be useful to manage very large tensors. Unlike trw.train.SequenceReBatch, this class avoids concatenating tensors, an operation that can be costly since the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.Metric¶
Bases:
abc.ABC
A metric base class
Calculate interesting metric
- abstract __call__(self, outputs: Dict) Optional[Dict] ¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- abstract aggregate_metrics(self, metric_by_batch: List[Dict]) Dict[str, float] ¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
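A minimal sketch of a custom metric built on this base class (hypothetical; the 'output' key looked up in the batch outputs and the metric itself are assumptions, for illustration only):
import numpy as np
from trw.train import Metric

class MetricMeanOutput(Metric):
    # hypothetical metric: average value of a model output across batches
    def __call__(self, outputs):
        output = outputs.get('output')  # assumed key, for illustration only
        if output is None:
            return None
        return {'mean_output': float(np.mean(output))}

    def aggregate_metrics(self, metric_by_batch):
        values = [m['mean_output'] for m in metric_by_batch]
        return {'mean_output': float(np.mean(values))}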
- class trw.train.MetricClassificationError¶
Bases:
Metric
Calculate the 1 - accuracy using the output_truth and output
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationBinarySensitivitySpecificity¶
Bases:
Metric
Calculate the sensitivity and specificity for a binary classification using the output_truth and output
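For reference, a sketch of how sensitivity and specificity are typically computed from binary truth/prediction arrays (illustrative only; not necessarily the metric's internal code):
import numpy as np

def sensitivity_specificity(truth: np.ndarray, found: np.ndarray):
    # truth and found are binary arrays of the same shape
    tp = int(np.sum((truth == 1) & (found == 1)))
    tn = int(np.sum((truth == 0) & (found == 0)))
    fp = int(np.sum((truth == 0) & (found == 1)))
    fn = int(np.sum((truth == 1) & (found == 0)))
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else None
    specificity = tn / (tn + fp) if (tn + fp) > 0 else None
    return sensitivity, specificity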
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricLoss¶
Bases:
Metric
Extract the loss from the outputs
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationBinaryAUC¶
Bases:
Metric
Calculate the Area under the Receiver operating characteristic (ROC) curve.
For this, the output needs to provide an output_raw of shape [N, 2] (i.e., binary classification framed as a multi-class classification) or of shape [N, 1] (binary classification).
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationF1(average=None)¶
Bases:
Metric
A metric base class
Calculate interesting metric
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.SamplerRandom(replacement=False, nb_samples_to_generate=None, batch_size=1)¶
Bases:
Sampler
Samples elements randomly. If without replacement, then sample from a shuffled dataset. If with replacement, then the user can specify num_samples to draw.
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- __next__(self)¶
- class trw.train.SamplerSequential(batch_size=1)¶
Bases:
Sampler
Samples elements sequentially, always in the same order.
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerSubsetRandom(indices)¶
Bases:
Sampler
Samples elements randomly from a given list of indices, without replacement.
- Parameters
indices (sequence) – a sequence of indices
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerClassResampling(class_name, nb_samples_to_generate, reuse_class_frequencies_across_epochs=True, batch_size=1)¶
Bases:
Sampler
Resample the samples so that class_name classes have equal probability of being sampled.
Classification problems rarely have balanced classes, so it is often required to super-sample the minority class to avoid penalizing the under-represented classes and to help the classifier learn good features (as opposed to learning the class distribution).
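Illustrative usage (a minimal sketch with a made-up imbalanced split; the feature names are assumptions):
import numpy as np
from trw.train import SequenceArray, SamplerClassResampling

# imbalanced binary problem: 90 samples of class 0, 10 samples of class 1
split = {
    'x': np.random.randn(100, 4).astype(np.float32),
    'label': np.asarray([0] * 90 + [1] * 10),
}
sampler = SamplerClassResampling(class_name='label', nb_samples_to_generate=100)
sequence = SequenceArray(split, sampler=sampler).batch(20)
for batch in sequence:
    pass  # both classes are now drawn with roughly equal probability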
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- _fit(self, classes)¶
- __next__(self)¶
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.Sampler¶
Bases:
object
Base class for all Samplers.
Every Sampler subclass has to provide an __iter__ method, providing a way to iterate over indices of dataset elements, and a __len__ method that returns the length of the returned iterators.
- abstract initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- abstract __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerSubsetRandomByListInterleaved(indices: Sequence[Sequence[int]])¶
Bases:
Sampler
Elements from a given list of lists of indices are randomly drawn without replacement, one element per list at a time.
For sequences with different sizes, the longest of the sequences will be trimmed to the size of the shortest sequence.
This can be used for example to resample without replacement imbalanced classes in a classification task.
Examples:
>>> l1 = np.asarray([1, 2])
>>> l2 = np.asarray([3, 4, 5])
>>> sampler = trw.train.SamplerSubsetRandomByListInterleaved([l1, l2])
>>> sampler.initializer(None)
>>> indices = [i for i in sampler]  # indices could be [1, 5, 2, 4]
- Parameters
indices – a sequence of sequence of indices
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.FilterFixed(kernel: torch.Tensor, groups: int = 1, padding: int = 0)¶
Bases:
torch.nn.Module
Apply a fixed filter to n-dimensional images
- __call__(self, value: trw.basic_typing.TorchTensorNCX) trw.basic_typing.TorchTensorNCX ¶
- class trw.train.FilterGaussian(input_channels: int, nb_dims: int, sigma: Union[float, Sequence[float]], kernel_sizes: Optional[Union[int, Sequence[int]]] = None, padding: typing_extensions.Literal[same, none] = 'same', device: Optional[torch.device] = None)¶
Bases:
FilterFixed
Implement a gaussian filter as a
torch.nn.Module
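Illustrative usage (a minimal sketch applying a 2D gaussian blur to a batch of single-channel images; the sigma and image size are arbitrary):
import torch
from trw.train import FilterGaussian

gaussian = FilterGaussian(input_channels=1, nb_dims=2, sigma=2.0)
images = torch.randn(4, 1, 32, 32)  # NCHW layout
blurred = gaussian(images)          # same shape, blurred content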
- class trw.train.MeaningfulPerturbation(model, iterations=150, l1_coeff=0.1, tv_coeff=0.2, tv_beta=3, noise=0.2, model_output_postprocessing=functools.partial(F.softmax, dim=1), mask_reduction_factor=8, optimizer_fn=default_optimizer, information_removal_fn=default_information_removal_smoothing, export_fn=None)¶
Implementation of “Interpretable Explanations of Black Boxes by Meaningful Perturbation”, arXiv:1704.03296
Handle only 2D and 3D inputs. Other inputs will be discarded.
Deviations:
- use a global smoothed image to speed up the processing
- __call__(self, inputs, target_class_name, target_class=None)¶
- Parameters
inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad set
target_class – the index of the class to explain the decision. If None, the class output will be used
target_class_name – the output node to be used. If None:
* if the model output is a single tensor, use it as the target output
* else, use the first OutputClassification output
- Returns
a tuple (output_name, dictionary (input, explanation mask))
- static _get_output(target_class_name, outputs, postprocessing)¶
- trw.train.default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23, explanation_for=None)¶
Default information removal (smoothing).
- Parameters
image – an image
blurring_sigma – the sigma of the blurring kernel used to “remove” information from the image
blurring_kernel_size – the size of the kernel to be used. This is an internal parameter to approximate the gaussian kernel. It is exposed since, in the 3D case, the memory consumption may be high and a faithful gaussian blur is not crucial.
explanation_for – the class to explain
- Returns
a smoothed image
- class trw.train.DataParallelExtended(*arg, **argv)¶
Bases:
torch.nn.DataParallel
Customized version of torch.nn.DataParallel to support models with complex outputs such as trw.train.Output
- gather(self, outputs, output_device)¶
- trw.train.grid_sample(input: torch.Tensor, grid: torch.Tensor, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = None) torch.Tensor ¶
Compatibility layer for argument change between pytorch <= 1.2 and pytorch > 1.3
See
torch.nn.functional.grid_sample()