trw.train¶
Submodules¶
trw.train.analysis_plots
trw.train.collate
trw.train.compatibility
trw.train.data_parallel_extented
trw.train.filter_gaussian
trw.train.grad_cam
trw.train.graph_reflection
trw.train.guided_back_propagation
trw.train.integrated_gradients
trw.train.job_executor2
trw.train.losses
trw.train.meaningful_perturbation
trw.train.metrics
trw.train.optimizer_clipping
trw.train.optimizers
trw.train.optimizers_v2
trw.train.options
trw.train.outputs_trw
trw.train.sample_export
trw.train.sampler
trw.train.sequence
trw.train.sequence_adaptor
trw.train.sequence_array
trw.train.sequence_async_reservoir
trw.train.sequence_batch
trw.train.sequence_collate
trw.train.sequence_map
trw.train.sequence_max_samples
trw.train.sequence_rebatch
trw.train.sequence_sub_batch
trw.train.trainer
trw.train.trainer_v2
trw.train.utilities
Package Contents¶
Classes¶
- Create default options for the training and evaluation process.
- Context manager that automatically tracks added hooks on the model and removes them when the context is released.
- This is a tag name to find the output reference back from outputs
- Classification output
- Regression output
- Represent an embedding
- This is a tag name to find the output reference back from outputs
- Represent a given loss as an output.
- Classification output
- Output for binary segmentation.
- Classification output for binary classification
- Implementation of the soft Dice Loss (multi-class) for N-d images
- This criterion is an implementation of Focal Loss, which is proposed in Focal Loss for Dense Object Detection.
- Implement a triplet loss
- Center loss, penalize the features falling further from the feature class center.
- Implementation of the contrastive loss.
- Optimize a metric similar to the Critical Success Index (CSI) on the cross-entropy.
- The macro F1-score is non-differentiable. Instead use a surrogate that is differentiable.
- Mean squared error loss with target packed as an integer (e.g., classification)
- Clips the gradient norm during optimization
- Gradient-weighted Class Activation Mapping
- Produces gradients generated with guided back propagation from the given image
- Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features.
- A Sequence defines how to iterate the data as a sequence of small batches of data.
- A Sequence defines how to iterate the data as a sequence of small batches of data.
- Create a sequence of batches from numpy arrays, lists and
- Group several batches into a single batch
- This sequence will asynchronously process data and keep a reserve of loaded samples
- Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface
- Group the data into a sequence of dictionary of torch.Tensor
- This sequence will normalize the batch size of an underlying sequence
- This sequence will split batches in smaller batches if the underlying sequence batch is too large.
- A metric base class
- Calculate the
- Calculate the sensitivity and specificity for a binary classification using the output_truth and output
- Extract the loss from the outputs
- Calculate the Area under the Receiver operating characteristic (ROC) curve.
- A metric base class
- Samples elements randomly. If without replacement, then sample from a shuffled dataset.
- Samples elements sequentially, always in the same order.
- Samples elements randomly from a given list of indices, without replacement.
- Resample the samples so that class_name classes have equal probability of being sampled.
- Base class for all Samplers.
- Elements from a given list of list of indices are randomly drawn without replacement,
- Apply a fixed filter to n-dimensional images
- Implement a gaussian filter as a
- Implementation of "Interpretable Explanations of Black Boxes by Meaningful Perturbation", arXiv:1704.03296
- Customized version of
Functions¶
- Return the data root directory
- Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it
- Set the learning rate of the optimizer to a specific value
- Clean the filename so that it can be used as a valid filename
- Return the device of a module. This may be incorrect if we have a module split across different devices
- Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.
- Return a good choice of dataset name and split name, possibly not the train split.
- Return the output mappings of a classification output from the datasets infos
- Return the output mappings of a classification output from the datasets infos
- Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target
- Make random indices of pairs of samples that belong or not to the same target.
- Return a set of unique and easily distinguishable colors
- Return a set of unique and easily distinguishable colors
- Apply spectral norm on every sub-module
- Apply gradient clipping recursively on a module as callback.
- Loss combining cross entropy and multi-class dice
- Calculate the total variation norm
- Encode the targets (a tensor of integers representing a class)
- Create a dictionary of loss functions for each of the dataset
- Run the eval loop (i.e., the model parameters will NOT be updated)
- Run the train loop (i.e., the model parameters will be updated)
- Default callbacks to be performed after the model has been trained
- Default callbacks to be performed at the end of each epoch
- Default callbacks to be performed before the fitting of the model
- Default loss is the sum of all loss terms
- Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler
- Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Create a learning rate scheduler. Every step_size, the learning rate will be multiplied by gamma
- Create an ADAM optimizer for each of the dataset with optional scheduler
- Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Create an optimizer and scheduler
- Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Plot groups of histories
- Plot the confusion matrix of a predicted class versus the true class
- Summarizes the important statistics for a classification problem
- Create a contiguous list of label names ordered from 0..N from the class mapping
- Calculate the ROC and AUC of a binary classifier
- Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?
- Export a figure
- Calculate the area under the curve of the ROC plot (AUROC)
- Find the input leaves of a tensor.
- Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type
- Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Postprocess the output to be suitable for gradient attribution.
- Default information removal (smoothing).
- Compatibility layer for argument changes between pytorch <= 1.2 and pytorch > 1.3
Attributes¶
- class trw.train.Options(logging_directory: Optional[str] = None, num_epochs: int = 50, device: Optional[torch.device] = None, mixed_precision_enabled: bool = False, gradient_update_frequency: int = 1)¶
Create default options for the training and evaluation process.
- __repr__(self) str ¶
Return repr(self).
- trw.train.get_logging_root(logging_root: Optional[str] = None) str ¶
Return the data root directory
- trw.train.create_or_recreate_folder(path, nb_tries=3, wait_time_between_tries=2.0)¶
Check if the path exists. If it does, remove the folder and recreate it; otherwise, create it
- Parameters
path – the path to create or recreate
nb_tries – the number of tries to be performed before failure
wait_time_between_tries – the time to wait before the next try
- Returns
True if successful, False if failed.
- trw.train.set_optimizer_learning_rate(optimizer, learning_rate)¶
Set the learning rate of the optimizer to a specific value
- Parameters
optimizer – the optimizer to update
learning_rate – the learning rate to set
- Returns
None
- class trw.train.CleanAddedHooks(model)¶
Context manager that automatically tracks added hooks on the model and removes them when the context is released
- __enter__(self)¶
- __exit__(self, type, value, traceback)¶
- static record_hooks(module_source)¶
Record hooks :param module_source: the module to track the hooks
- Returns
a tuple (forward, backward). forward and backward are dictionaries of hook IDs by module
- trw.train.safe_filename(filename)¶
Clean the filename so that it can be used as a valid filename
- trw.train.get_device(module, batch=None)¶
Return the device of a module. This may be incorrect if the module is split across different devices
- trw.train.transfer_batch_to_device(batch, device, non_blocking=True)¶
Transfer the Tensors and numpy arrays to the specified device. Other types will not be moved.
- Parameters
batch – the batch of data to be transferred
device – the device to move the tensors to
non_blocking – non blocking memory transfer to GPU
- Returns
a batch of data on the specified device
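For illustration, a minimal sketch of moving a heterogeneous batch to a device, used exactly as documented above; the feature names are made up:

    import numpy as np
    import torch
    from trw.train import transfer_batch_to_device

    batch = {
        'images': torch.zeros(4, 1, 32, 32),   # torch.Tensor: transferred to the device
        'uids': np.arange(4),                   # numpy array: transferred to the device
        'names': ['a', 'b', 'c', 'd'],          # other types: left untouched
    }
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    batch_on_device = transfer_batch_to_device(batch, device)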
- trw.train.find_default_dataset_and_split_names(datasets, default_dataset_name=None, default_split_name=None, train_split_name=None)¶
Return a good choice of dataset name and split name, possibly not the train split.
- Parameters
datasets – the datasets
default_dataset_name – a possible dataset name. If None, find a suitable dataset; otherwise, the dataset must be present
default_split_name – a possible split name. If None, find a suitable split; otherwise, the split must be present. If train_split_name is specified, the selected split name will be different from train_split_name
train_split_name – if not None, exclude the train split
- Returns
a tuple (dataset_name, split_name)
- trw.train.get_class_name(mapping, classid)¶
- trw.train.get_classification_mapping(datasets_infos, dataset_name, split_name, output_name)¶
Return the output mappings of a classification output from the datasets infos
- Parameters
datasets_infos – the info of the datasets
dataset_name – the name of the dataset
split_name – the split name
output_name – the output name
- Returns
a dictionary {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}
- trw.train.get_classification_mappings(datasets_infos, dataset_name, split_name)¶
Return the output mappings of a classification output from the datasets infos
- Parameters
datasets_infos – the info of the datasets
dataset_name – the name of the dataset
split_name – the split name
- Returns
a dictionary {outputs: {‘mapping’: {name->ID}, ‘mappinginv’: {ID->name}}}
- trw.train.make_triplet_indices(targets)¶
- Make random index triplets (anchor, positive, negative) such that anchor and positive belong to the same target while negative belongs to a different target
- Parameters
targets – a 1D integral tensor in range [0..C]
- Returns
a tuple of indices (samples, samples_positive, samples_negative)
- trw.train.make_pair_indices(targets, same_target_ratio=0.5)¶
Make random indices of pairs of samples that belong or not to the same target.
- Parameters
same_target_ratio – specify the ratio of same target to be generated for sample pairs
targets – a 1D integral tensor in range [0..C] to be used to group the samples into same or different target
- Returns
a tuple with (samples_0 indices, samples_1 indices, same_target)
- trw.train.make_unique_colors()¶
Return a set of unique and easily distinguishable colors :return: a list of RGB colors
- trw.train.make_unique_colors_f()¶
Return a set of unique and easily distinguishable colors :return: a list of RGB colors
- trw.train.apply_spectral_norm(module, n_power_iterations=1, eps=1e-12, dim=None, name='weight', discard_layer_types=(torch.nn.InstanceNorm2d, torch.nn.InstanceNorm3d))¶
Apply spectral norm on every sub-modules
- Parameters
module – the parent module to apply spectral norm
discard_layer_types – layers of these types will not have spectral norm applied
n_power_iterations – number of power iterations to calculate spectral norm
eps – epsilon for numerical stability in calculating norms
dim – dimension corresponding to number of outputs; the default is 0, except for modules that are instances of ConvTranspose{1,2,3}d, where it is 1
name – name of the weight parameter
- Returns
the same module as input module
- trw.train.apply_gradient_clipping(module: torch.nn.Module, value)¶
Apply gradient clipping recursively on a module as callback.
Every time the gradient is calculated, it is intercepted and clipping applied.
- Parameters
module – a module where sub-modules will have their gradients clipped
value – the maximum value of the gradient
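A minimal sketch on a toy model; the clipping behaviour is as described above:

    import torch
    import torch.nn as nn
    from trw.train import apply_gradient_clipping

    model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))
    apply_gradient_clipping(model, value=1.0)

    # every subsequent backward pass has its gradients intercepted and clipped
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()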
- class trw.train.Output(metrics, output, criterion_fn, collect_output=False, sample_uid_name=None)¶
This is a tag name to find the output reference back from outputs
- output_ref_tag = output_ref¶
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputClassification(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassification.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Classification output
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputRegression(output, output_truth, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics=metrics.default_regression_metrics(), loss_reduction=mean_all, weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., target_name=None, sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Regression output
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- class trw.train.OutputEmbedding(output, clean_loss_term_each_batch=False, sample_uid_name=default_sample_uid_name, functor=None)¶
Bases:
Output
Represent an embedding
This is only used to record a tensor that we consider an embedding (e.g., to be exported to tensorboard)
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- trw.train.default_sample_uid_name = sample_uid¶
- trw.train.segmentation_criteria_ce_dice(output, truth, per_voxel_weights=None, ce_weight=0.5, per_class_weights=None, power=1.0, smooth=1.0, focal_gamma=None)¶
loss combining cross entropy and multi-class dice
- Parameters
output – the output value, with shape [N, C, Dn…D0]
truth – the truth, with shape [N, 1, Dn..D0]
ce_weight – the weight of the cross entropy to use. This controls the importance of the cross entropy loss to the overall segmentation loss. Range in [0..1]
per_class_weights – the weight per class. A 1D vector of size C indicating the weight of the classes. This will be used for the cross-entropy loss
per_voxel_weights – the weight of each truth voxel. Must be of shape [N, Dn..D0]
- Returns
a torch tensor
- class trw.train.OutputTriplets(samples, positive_samples, negative_samples, criterion_fn=lambda : ..., metrics=metrics.default_generic_metrics(), loss_reduction=mean_all, weight_name=None, loss_scaling=1.0, sample_uid_name=default_sample_uid_name)¶
Bases:
Output
This is a tag name to find the output reference back from outputs
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- class trw.train.OutputLoss(losses, loss_reduction=torch.mean, metrics=metrics.default_generic_metrics(), sample_uid_name=default_sample_uid_name)¶
Bases:
Output
Represent a given loss as an output.
This can be useful to add an additional regularizer to the training (e.g., trw.train.LossCenter).
- evaluate_batch(self, batch, is_training)¶
Evaluate a batch of data and extract important outputs :param batch: the batch of data :param is_training: if True, this was a training batch :return: tuple(a dictionary of values, dictionary of metrics)
- loss_term_cleanup(self, loss_term)¶
This function is called for each batch just before switching to another batch.
It can be used to clean up large arrays stored or release CUDA memory
- class trw.train.OutputSegmentation(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentation.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=functools.partial(torch.argmax, dim=1, keepdim=True), maybe_optional=False, sample_uid_name=default_sample_uid_name)¶
Bases:
OutputClassification
Classification output
- class trw.train.OutputSegmentationBinary(output: torch.Tensor, output_truth: torch.Tensor, criterion_fn: Callable[[], Any] = LossDiceMulticlass, collect_output: bool = False, collect_only_non_training_output: bool = False, metrics: List[OutputSegmentationBinary.__init__.metrics] = metrics.default_segmentation_metrics(), loss_reduction: Callable[[torch.Tensor], torch.Tensor] = torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, sample_uid_name=default_sample_uid_name)¶
Bases:
OutputSegmentation
Output for binary segmentation.
- Parameters
output – shape N * 1 * X format, must be raw logits
output_truth – should have N * 1 * X format, with values 0 or 1
- class trw.train.OutputClassificationBinary(output, output_truth, *, criterion_fn=lambda : ..., collect_output=True, collect_only_non_training_output=False, metrics: List[OutputClassificationBinary.__init__.metrics] = metrics.default_classification_metrics(), loss_reduction=torch.mean, weights=None, per_voxel_weights=None, loss_scaling=1.0, output_postprocessing=lambda x: ..., maybe_optional=False, classes_name='unknown', sample_uid_name=default_sample_uid_name)¶
Bases:
OutputClassification
Classification output for binary classification
- Parameters
output – the output with shape [N, 1, {X}], without any activation applied (i.e., logits)
output_truth – the truth with shape [N, 1, {X}]
- class trw.train.LossDiceMulticlass(normalization_fn: Callable[[torch.Tensor], torch.Tensor] = partial(nn.Softmax, dim=1), eps: float = 1e-05, return_dice_by_class: bool = False, smooth: float = 0.001, power: float = 1.0, per_class_weights: Sequence[float] = None, discard_background_loss: bool = True)¶
Bases:
torch.nn.Module
Implementation of the soft Dice Loss (multi-class) for N-d images
If multi-class, compute the loss for each class then average the losses
References
[1] “V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation” https://arxiv.org/pdf/1606.04797.pdf
- forward(self, output, target)¶
- Parameters
output – must have N x C x d0 x … x dn shape, where C is the total number of classes to predict
target – must have N x 1 x d0 x … x dn shape
- Returns
if return_dice_by_class is False, return 1 - dice score suitable for optimization. Else, return the (numerator, cardinality) by class and by sample
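A minimal sketch on a toy 2D segmentation batch (shapes follow the forward documentation above; the loss value itself is meaningless on random data):

    import torch
    from trw.train import LossDiceMulticlass

    loss_fn = LossDiceMulticlass()
    logits = torch.randn(2, 3, 8, 8)             # N x C x H x W raw scores, C = 3 classes
    targets = torch.randint(0, 3, (2, 1, 8, 8))  # N x 1 x H x W class indices
    loss = loss_fn(logits, targets)              # 1 - dice, suitable for optimization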
- class trw.train.LossFocalMulticlass(alpha=None, gamma=2, reduction='mean')¶
Bases:
torch.nn.Module
This criterion is an implementation of Focal Loss, which is proposed in Focal Loss for Dense Object Detection, https://arxiv.org/pdf/1708.02002.pdf
Loss(x, class) = - alpha (1-softmax(x)[class])^gamma log(softmax(x)[class])
- Parameters
alpha (1D Tensor, Variable) – the scalar factor for this criterion. One weight factor for each class.
gamma (float, double) – gamma > 0; reduces the relative loss for well-classified examples (p > .5), putting more focus on hard, misclassified examples
- forward(self, outputs, targets)¶
- class trw.train.LossTriplets(margin=1.0, distance=nn.PairwiseDistance(p=2))¶
Bases:
torch.nn.Module
Implement a triplet loss
The goal of the triplet loss is to make sure that:
Two examples with the same label have their embeddings close together in the embedding space
Two examples with different labels have their embeddings far away.
However, we don’t want to push the train embeddings of each label to collapse into very small clusters. The only requirement is that given two positive examples of the same class and one negative example, the negative should be farther away than the positive by some margin. This is very similar to the margin used in SVMs, and here we want the clusters of each class to be separated by the margin.
The loss implements the following equation:
L = max(d(a, p) - d(a, n) + margin, 0)
- forward(self, samples, positive_samples, negative_samples)¶
Calculate the triplet loss
- Parameters
samples – the samples
positive_samples – the samples that belong to the same group as samples
negative_samples – the samples that belong to a different group than samples
- Returns
a 1D tensor (N) representing the loss per sample
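A minimal sketch pairing this loss with trw.train.make_triplet_indices (documented above); it assumes the returned index arrays can be used directly to index the embedding tensor:

    import torch
    from trw.train import LossTriplets, make_triplet_indices

    embeddings = torch.randn(32, 16)          # 32 samples, 16-d embedding
    targets = torch.randint(0, 4, (32,))      # 4 classes
    anchor, positive, negative = make_triplet_indices(targets)
    loss_per_sample = LossTriplets(margin=1.0)(
        embeddings[anchor], embeddings[positive], embeddings[negative])
    loss = loss_per_sample.mean()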
- class trw.train.LossCenter(number_of_classes, number_of_features, alpha=1.0)¶
Bases:
torch.nn.Module
Center loss, penalize the features falling further from the feature class center.
In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this loss can be used as a new supervision signal. Specifically, the center loss simultaneously learns a center for deep features of each class and penalizes the distances between the deep features and their corresponding class centers.
An implementation of center loss: Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.
Note
This loss must be part of a parent module or explicitly optimized by an optimizer. If not, the centers will not be modified.
- forward(self, x, classes)¶
- Parameters
x – the features, an arbitrary n-d tensor (N * C * …). Features should ideally be in range [0..1]
classes – a 1D integral tensor (N) representing the class of each
x
- Returns
a 1D tensor (N) representing the loss per sample
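A minimal sketch, assuming the usual trw convention of a model returning a dictionary of outputs; registering the loss as a sub-module is what makes its centers trainable (see the note above), and the feature and class counts are illustrative:

    import torch.nn as nn
    from trw.train import LossCenter, OutputLoss

    class EmbeddingModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.embedding = nn.Linear(10, 64)
            # sub-module: its class centers are optimized along with the model parameters
            self.center_loss = LossCenter(number_of_classes=4, number_of_features=64)

        def forward(self, batch):
            # batch['x']: N x 10 features, batch['targets']: 1D class indices (assumed layout)
            features = self.embedding(batch['x'])
            # expose the regularizer as an output so it is added to the total loss
            return {'center': OutputLoss(self.center_loss(features, batch['targets']))}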
- class trw.train.LossContrastive(margin=1.0)¶
Bases:
torch.nn.Module
Implementation of the contrastive loss.
L(x0, x1, y) = 0.5 * (1 - y) * d(x0, x1)^2 + 0.5 * y * max(0, m - d(x0, x1))^2
with y = 0 for samples x0 and x1 deemed dissimilar and y = 1 for similar samples. Dissimilar pairs contribute to the loss function only if their distance is within the radius m, while the loss minimizes d(x0, x1) over the set of all similar pairs.
See Dimensionality Reduction by Learning an Invariant Mapping, Raia Hadsell, Sumit Chopra, Yann LeCun, 2006.
- forward(self, x0, x1, same_target)¶
- Parameters
x0 – N-D tensor
x1 – N-D tensor
same_target – a 1D tensor of 0 or 1. 1 means that x0 and x1 belong to the same class, while 0 means they are from different classes
- Returns
a 1D tensor (N) representing the loss per sample
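A minimal sketch pairing this loss with trw.train.make_pair_indices (documented above); converting same_target to a tensor is an assumption about the returned type:

    import torch
    from trw.train import LossContrastive, make_pair_indices

    embeddings = torch.randn(32, 16)
    targets = torch.randint(0, 4, (32,))
    i0, i1, same_target = make_pair_indices(targets, same_target_ratio=0.5)
    loss_per_pair = LossContrastive(margin=1.0)(
        embeddings[i0], embeddings[i1], torch.as_tensor(same_target))
    loss = loss_per_pair.mean()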
- trw.train.total_variation_norm(x, beta)¶
Calculate the total variation norm
- Parameters
x – a tensor with format (samples, components, dn, …, d0)
beta – the exponent
- Returns
a scalar
- class trw.train.LossCrossEntropyCsiMulticlass¶
Bases:
torch.nn.Module
Optimize a metric similar to the Critical Success Index (CSI) on the cross-entropy.
A loss for heavily unbalanced data (orders of magnitude more negative than positive samples). Calculate the cross-entropy and keep only the loss from the TP, FP and FN; the loss from TN is simply discarded.
- forward(self, outputs, targets, important_class=1)¶
- Parameters
outputs – a N x C tensor with N the number of samples and C the number of classes
targets – a N integral tensor
important_class – the class to keep the cross-entropy loss even if classification is correct
- Returns
a N floating tensor representing the loss of each sample
- class trw.train.LossBinaryF1(eps=0.0001)¶
Bases:
torch.nn.Module
- The macro F1-score is non-differentiable. Instead use a surrogate that is differentiable and correlates well with the macro F1 score by working on the class probabilities rather than the discrete classification.
- For example, if the ground truth is 1 and the model prediction is 0.8, we calculate it as 0.8 true positive and 0.2 false negative.
- forward(self, outputs, targets)¶
- trw.train.one_hot(targets: trw.basic_typing.TorchTensorNX, num_classes: int, dtype=torch.float32, device: Optional[torch.device] = None) trw.basic_typing.TorchTensorNCX ¶
Encode the targets (a tensor of integers representing a class) as one hot encoding.
Support target as N-dimensional data (e.g., 3D segmentation map).
Equivalent to torch.nn.functional.one_hot for backward compatibility with pytorch 1.0
- Parameters
num_classes – the total number of classes
targets – a N-dimensional integral tensor (e.g., 1D for classification, 2D for 2D segmentation map…)
dtype – the type of the output tensor
device – the device of the one-hot encoded tensor. If None, use the target’s device
- Returns
a one hot encoding of a N-dimensional integral tensor
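A quick sketch for 1D classification targets (per the signature above):

    import torch
    from trw.train import one_hot

    targets = torch.tensor([0, 2, 1])
    encoded = one_hot(targets, num_classes=3)
    # encoded is a float tensor of shape (3, 3); encoded[0] == [1., 0., 0.]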
- class trw.train.LossMsePacked(reduction: typing_extensions.Literal[mean, none] = 'mean')¶
Bases:
torch.nn.Module
Mean squared error loss with target packed as an integer (e.g., classification)
The packed_target will be one hot encoded and the mean squared error is applied with the tensor.
- forward(self, tensor, packed_target)¶
- Parameters
tensor – a NxCx… tensor
packed_target – a Nx1x… tensor
- trw.train.create_losses_fn(datasets, generic_loss)¶
Create a dictionary of loss functions for each of the dataset
- Parameters
datasets – the datasets
generic_loss – a loss function
- Returns
A dictionary of losses for each of the dataset
- trw.train.epoch_train_eval(options, datasets, optimizers, model, losses, schedulers, per_step_schedulers, history, callbacks_per_batch, callbacks_per_batch_loss_terms, run_eval, force_eval_mode, eval_loop_fn=eval_loop, train_loop_fn=train_loop)¶
- Parameters
options –
datasets –
optimizers –
model –
losses –
schedulers –
per_step_schedulers –
history –
callbacks_per_batch –
callbacks_per_batch_loss_terms –
run_eval –
force_eval_mode –
eval_loop_fn –
train_loop_fn –
Returns:
- trw.train.eval_loop(options, device, dataset_name, split_name, split, model, loss_fn, history, callbacks_per_batch=None, callbacks_per_batch_loss_terms=None)¶
Run the eval loop (i.e., the model parameters will NOT be updated)
Note
If callback_per_batch or callbacks_per_batch_loss_terms raise StopIteration, the eval loop will be stopped
- Parameters
device –
dataset_name –
split_name –
split –
model –
loss_fn –
history –
callbacks_per_batch –
callbacks_per_batch_loss_terms –
- Returns
- trw.train.train_loop(options, device, dataset_name, split_name, split, optimizer, per_step_scheduler, model, loss_fn, history, callbacks_per_batch, callbacks_per_batch_loss_terms, gradient_scaler=None)¶
Run the train loop (i.e., the model parameters will be updated)
Note
If callbacks_per_batch or callbacks_per_batch_loss_terms raise an exception StopIteration, the train loop will be stopped
- Parameters
device – the device to be used to optimize the model
dataset_name – the name of the dataset
split_name – the name of the split
split – a dictionary of feature name and values
optimizer – an optimizer to optimize the model
per_step_scheduler – scheduler to be applied per-batch
model – the model to be optimized
loss_fn – the loss function
history – a list of history step
callbacks_per_batch – the callbacks to be performed on each batch. if None, no callbacks to be run
callbacks_per_batch_loss_terms – the callbacks to be performed on each loss term. if None, no callbacks to be run
gradient_scaler – if mixed precision is enabled, this is the scale to be used for the gradient update
Notes
If optimizer is None, there MUST be a .backward() to free graph and memory.
- trw.train.default_post_training_callbacks(embedding_name='embedding', dataset_name=None, split_name=None, discard_train_error_export=False, export_errors=True, explain_decision=True, additional_callbacks=None)¶
Default callbacks to be performed after the model has been trained
- trw.train.default_per_epoch_callbacks(logger=default_logger, with_worst_samples_by_epoch=True, with_activation_statistics=False, convolutional_kernel_export_frequency=None, additional_callbacks=None)¶
Default callbacks to be performed at the end of each epoch
- trw.train.default_pre_training_callbacks(logger=default_logger, with_lr_finder=False, with_export_augmentations=True, with_reporting_server=True, with_profiler=False, additional_callbacks=None)¶
Default callbacks to be performed before the fitting of the model
- trw.train.default_sum_all_losses(dataset_name, batch, loss_terms)¶
Default loss is the sum of all loss terms
- class trw.train.TrainerV2(callbacks_per_batch=None, callbacks_per_batch_loss_terms=None, callbacks_per_epoch=default_per_epoch_callbacks(), callbacks_pre_training=default_pre_training_callbacks(), callbacks_post_training=default_post_training_callbacks(), trainer_callbacks_per_batch=trainer_callbacks_per_batch, run_epoch_fn=epoch_train_eval, logging_level=logging.DEBUG, skip_eval_epoch_0=True)¶
- static save_model(model, metadata: trw.train.utilities.RunMetadata, path, pickle_module=pickle)¶
Save a model to file
- Parameters
model – the model to serialize
metadata – an optional result file associated with the model
path – the base path to save the model
pickle_module – the serialization module that will be used to save the model and results
- static load_state(model: torch.nn.Module, path: str, device: torch.device = None, pickle_module: Any = pickle, strict: bool = True) None ¶
Load the state of a model
- Parameters
model – where to load the state
path – where the model’s state was saved
device – where to locate the model
pickle_module – how to read the model parameters and metadata
strict – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function
- static load_model(path: str, model_kwargs: Optional[Dict[Any, Any]] = None, with_result: bool = False, device: torch.device = None, pickle_module: Any = pickle) Tuple[torch.nn.Module, trw.train.utilities.RunMetadata] ¶
Load a previously saved model
Construct a model from the RunMetadata.class_name class and with arguments model_kwargs
- Parameters
path – where to store the model. result’s will be loaded from path + ‘.result’
model_kwargs – arguments used to instantiate the model stored in
RunMetadata.class_name
with_result – if True, the results of the model will be loaded
device – where to load the model. For example, models are typically trained on GPU, but for deployment, CPU might be good enough. If None, use the same device as when the model was exported
pickle_module – the de-serialization module to be used to load model and results
- Returns
a tuple model, metadata
- fit(self, options, datasets, model: torch.nn.Module, optimizers_fn, losses_fn=default_sum_all_losses, loss_creator=create_losses_fn, log_path=None, with_final_evaluation=True, history=None, erase_logging_folder=True, eval_every_X_epoch=1) trw.train.utilities.RunMetadata ¶
Fit the model
- Parameters
options –
datasets –
a functor returning a dictionary of datasets. Alternatively, datasets infos can be specified. inputs_fn must return one of:
datasets: dictionary of dataset
(datasets, datasets_infos): dictionary of dataset and additional infos
We define:
datasets: a dictionary of dataset. a dataset is a dictionary of splits. a split is a dictionary of batched features.
Datasets infos are additional infos useful for the debugging of the dataset (e.g., class mappings, sample UIDs). Datasets infos are typically much smaller than datasets and should be loadable in memory
model – a Module or a ModuleDict
optimizers_fn –
losses_fn –
loss_creator –
log_path – the path of the logs to be exported during the training of the model. if the log_path is not an absolute path, the options.workflow_options.logging_directory is used as root
with_final_evaluation –
history –
erase_logging_folder – if True, the logging will be erased when fitting starts
eval_every_X_epoch – evaluate the model every X epochs
Returns:
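A hedged end-to-end sketch assembled from the signatures documented on this page; the dataset construction, the truth tensor layout and the default criteria are assumptions that may differ between trw versions:

    import torch
    import torch.nn as nn
    import trw

    class TinyNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 2)

        def forward(self, batch):
            logits = self.fc(batch['x'])
            # wrap the raw logits in a trw output so the loss and metrics are handled
            return {'classification': trw.train.OutputClassification(logits, batch['y'])}

    datasets = {
        'toy': {
            'train': trw.train.SequenceArray({
                'x': torch.randn(128, 10),
                'y': torch.randint(0, 2, (128, 1)),   # assumed truth layout: N x 1
            }).batch(16)
        }
    }

    options = trw.train.Options(num_epochs=5)
    results = trw.train.TrainerV2().fit(
        options,
        datasets=datasets,
        model=TinyNet(),
        optimizers_fn=trw.train.OptimizerAdam(learning_rate=1e-3),
        log_path='toy_experiment')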
- trw.train.create_sgd_optimizers_fn(datasets, model, learning_rate, momentum=0.9, weight_decay=0, nesterov=False, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create a Stochastic gradient descent optimizer for each of the dataset with optional scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
scheduler_fn – a scheduler, or None
momentum – the momentum of the SGD
weight_decay – the weight decay
nesterov – enables Nesterov momentum
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- Returns
An optimizer
- trw.train.create_sgd_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, momentum=0.9, nesterov=False)¶
Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
step_size – the number of epoch composing a step. Each step the learning rate will be multiplied by gamma
gamma – the factor to apply to the learning rate every step
weight_decay – the weight decay
nesterov – enables Nesterov momentum
momentum – the momentum of the SGD
- Returns
An optimizer with a step scheduler
- trw.train.create_scheduler_step_lr(optimizer, step_size=30, gamma=0.1)¶
Create a learning rate scheduler. Every step_size, the learning late will be multiplied by gamma
- Parameters
optimizer – the optimizer
step_size – every number of epochs composing one step. Each step the learning rate will be decreased
gamma – apply this factor to the learning rate every time it is adjusted
- Returns
a learning rate scheduler
- trw.train.create_adam_optimizers_fn(datasets, model, learning_rate, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create an ADAM optimizer for each of the dataset with optional scheduler
- Parameters
datasets – a dictionary of datasets
model – a model to optimize
learning_rate – the initial learning rate
weight_decay – the weight decay
scheduler_fn – a scheduler, or None
betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps – term to add to denominator to avoid division by zero
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- Returns
An optimizer
- trw.train.create_adam_optimizers_scheduler_step_lr_fn(datasets, model, learning_rate, step_size, gamma, weight_decay=0, betas=(0.9, 0.999))¶
Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
learning_rate – the initial learning rate
step_size – the number of epoch composing a step. Each step the learning rate will be multiplied by gamma
gamma – the factor to apply to the learning rate every step
weight_decay – the weight decay
betas – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
- Returns
An optimizer with a step scheduler
- trw.train.create_optimizers_fn(datasets, model, optimizer_fn, scheduler_fn=None, per_step_scheduler_fn=None)¶
Create an optimizer and scheduler
Note
if model is an instance of ModuleDict, then the optimizer will only consider the parameters model[dataset_name].parameters(), else model.parameters()
- Parameters
datasets – a dictionary of dataset
model – the model. Should be a Module or a ModuleDict
optimizer_fn – the functor to instantiate the optimizer
scheduler_fn – the functor to instantiate the scheduler to be run by epoch. May be None, in that case there will be no schedule
per_step_scheduler_fn – the functor to instantiate scheduler to be run per-step (batch)
- trw.train.create_sgd_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3, nesterov=False)¶
Create a Stochastic gradient descent optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
max_learning_rate – the maximum learning rate
epochs – The number of epochs to train for
steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only
learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = max_learning_rate / learning_rate_start_div_factor
learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor
percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate
additional_scheduler_kwargs – additional arguments provided to the scheduler
weight_decay – the weight decay
nesterov – enables Nesterov momentum
momentum – the momentum of the SGD
- Returns
An optimizer with a step scheduler
- trw.train.create_adam_optimizers_scheduler_one_cycle_lr_fn(datasets, model, max_learning_rate, epochs, steps_per_epoch, additional_scheduler_kwargs=None, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, learning_rate_start_div_factor=25, learning_rate_end_div_factor=10000, percentage_cycle_increase=0.3)¶
Create an ADAM optimizer for each of the dataset with step learning rate scheduler
- Parameters
datasets – a dictionary of dataset
model – a model to optimize
max_learning_rate – the maximum learning rate
epochs – The number of epochs to train for
steps_per_epoch – The number of steps per epoch. If 0 or None, the schedule will be based on the number of epochs only
learning_rate_start_div_factor – defines the initial learning rate for the first step as initial_learning = learning_rate_start_multiplier * max_learning_rate
learning_rate_end_div_factor – defines the end learning rate for the last step as final_learning_rate = max_learning_rate / learning_rate_start_div_factor / learning_rate_end_div_factor
percentage_cycle_increase – The percentage of the cycle (in number of steps) spent increasing the learning rate
additional_scheduler_kwargs – additional arguments provided to the scheduler
weight_decay – the weight decay
betas – betas of the ADAM optimizer
eps – eps of the ADAM optimizer
- Returns
An optimizer with a step scheduler
- class trw.train.ClippingGradientNorm(optimizer_base: torch.optim.Optimizer, max_norm: float = 1.0, norm_type: float = 2.0)¶
Bases:
torch.optim.Optimizer
Clips the gradient norm during optimization
- step(self, closure=None)¶
Performs a single optimization step (parameter update).
- Parameters
closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
Note
Unless otherwise specified, this function should not modify the .grad field of the parameters.
- class trw.train.Optimizer(optimizer_fn: Callable[[Iterator[torch.nn.parameter.Parameter]], torch.optim.Optimizer], scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]] = None, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]] = None)¶
- set_scheduler_fn(self, scheduler_fn: Optional[Callable[[torch.optim.Optimizer], SchedulerType]])¶
- set_step_scheduler_fn(self, step_scheduler_fn: Optional[Callable[[torch.optim.Optimizer], StepSchedulerType]])¶
- __call__(self, datasets: trw.basic_typing.Datasets, model: torch.nn.Module) Tuple[Dict[str, torch.optim.Optimizer], Optional[Dict[str, SchedulerType]], Optional[Dict[str, StepSchedulerType]]] ¶
- scheduler_step_lr(self, step_size: int, gamma: float = 0.1) Optimizer ¶
Apply a scheduler on the learning rate.
Decays the learning rate of each parameter group by gamma every step_size epochs.
- scheduler_cosine_annealing_warm_restart(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1) Optimizer ¶
Apply a scheduler on the learning rate.
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
References
- scheduler_cosine_annealing_warm_restart_decayed(self, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch=-1, decay_factor=0.7) Optimizer ¶
Apply a scheduler on the learning rate. Each time the learning rate is restarted, the base learning rate is decayed
Restart the learning rate every T_0 * (T_mult)^(#restart) epochs.
References
- scheduler_one_cycle(self, max_learning_rate: float, epochs: int, steps_per_epoch: int, learning_rate_start_div_factor: float = 25.0, learning_rate_end_div_factor: float = 10000.0, percentage_cycle_increase: float = 0.3, anneal_strategy: str = 'cos', cycle_momentum: bool = True, base_momentum: float = 0.85, max_momentum: float = 0.95)¶
This scheduler should not be used with another scheduler!
The learning rate or momentum provided by the Optimizer will be overridden by this scheduler.
- clip_gradient_norm(self, max_norm: float = 1.0, norm_type: float = 2.0)¶
Clips the gradient norm during optimization
- Parameters
max_norm – the maximum norm of the concatenated gradients of the optimizer. Note: the gradient is modulated by the learning rate
norm_type – type of the used p-norm. Can be 'inf' for infinity norm
- See:
torch.nn.utils.clip_grad_norm_()
- class trw.train.OptimizerAdam(learning_rate: float, weight_decay: float = 0, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)¶
Bases:
Optimizer
- class trw.train.OptimizerSGD(learning_rate: float, momentum: float = 0.9, weight_decay: float = 0, nesterov: bool = False)¶
Bases:
Optimizer
- class trw.train.OptimizerAdamW(learning_rate: float, weight_decay: float = 0.01, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08)¶
Bases:
Optimizer
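A sketch of the fluent configuration style suggested by the methods above; chaining assumes each configuration method returns the optimizer, which is only documented explicitly for scheduler_step_lr:

    import trw

    optimizers_fn = trw.train.OptimizerSGD(learning_rate=0.1, momentum=0.9) \
        .scheduler_step_lr(step_size=30, gamma=0.1) \
        .clip_gradient_norm(max_norm=1.0)

    # per the __call__ signature above, calling optimizers_fn(datasets, model) returns
    # per-dataset optimizers and schedulers, so it can be passed as optimizers_fn to
    # TrainerV2.fit.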
- trw.train.plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) None ¶
Plot groups of histories :param root: the directory where the plot will be exported :param history_values: a map of list of list of (epoch, value) :param title: the title of the graph :param xlabel: the x label :param ylabel: the y label :param max_nb_plots_per_group: the maximum number of plots per group :param colors: the colors to be used
- trw.train.confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) None ¶
Plot the confusion matrix of a predicted class versus the true class
- Parameters
export_path – the folder where the confusion matrix will be exported
classes_predictions – the classes that were predicted by the classifier
classes_trues – the true classes
classes – a list of labels. Label 0 for class 0, label 1 for class 1…
normalize – if True, the confusion matrix will be normalized to 1.0 per row
title – the title of the plot
cmap – the color map to use
display_numbers – if True, display the numbers within each cell of the confusion matrix
maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues
rotate_x – if not None, indicates the rotation of the label on x axis
rotate_y – if not None, indicates the rotation of the label on y axis
display_names_x – if True, the class name, if specified, will also be displayed on the x axis
sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show if the errors may be due to low number of samples
excludes_classes_with_samples_less_than – if not None, the classes with less than excludes_classes_with_samples_less_than samples will be excluded
normalize_unit_percentage – if True, use 100% base as unit instead of 1.0
main_font_size – the font size of the text
sub_font_size – the font size of the sub-elements (e.g., ticks)
max_size_x_label – the maximum length of a label on the x-axis
- trw.train.classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)¶
Summarizes the important statistics for a classification problem :param predictions: the classes predicted :param prediction_scores: the scores for each, for each sample :param trues: the true class for each sample :param class_mapping: the class mapping (class id, class name) :return: a dictionary of statistics or sub-report
- trw.train.list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')¶
Create a contiguous list of label names ordered from 0..N from the class mapping
- Parameters
mappinginv – a dictionary like structure encoded as (class id, class_name)
default_name – if there is no class name, use this as default
- Returns
a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None
- trw.train.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)¶
Calculate the ROC and AUC of a binary classifier
Supports multiple ROC curves.
- Parameters
export_path – the folder where the plot will be exported
trues – the expected class. Can be a list for multiple ROC curves
found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves
title – the title of the ROC
label_name – the name of the ROC curve. Can be a list for multiple ROC curves
colors – if None use default colors. Else, a numpy array of dim (Nx3) where N is the number of colors. Must be in [0..1] range
- trw.train.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)¶
Compare different histories: e.g., compare 2 configuration, which one has the best results for a given measure?
- Parameters
export_path – where to export the figure
features_trials – a dictionary of list. Each list representing a feature
title – the title of the plot
ylabel – the label for axis y
xlabel – the label for axis x
meanline – if True, draw a line from the center of the plot for each history name to the next
maximum_chars_per_line – the maximum of characters allowed per line of title. If exceeded, newline will be created.
plot_trials – if True, each trial of a feature will be plotted
scale – the axis scale to be used
y_range – if not None, the (min, max) of the y-axis
rotate_x – if not None, the rotation of the x axis labels in degree
showfliers – if True, plot the outliers
maximum_chars_per_line – the maximum number of characters of the title per line
title_line_height – the height of the title lines
- trw.train.export_figure(path, name, maximum_length=259, dpi=None)¶
Export a figure
- Parameters
path – the folder where to export the figure
name – the name of the figure.
maximum_length – the maximum length of the full path of a figure. If the full path name is greater than maximum_length, the name will be sub-sampled to the maximum allowed length
dpi – Dots Per Inch: the density of the figure
- trw.train.auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) float ¶
Calculate the area under the curve of the ROC plot (AUROC)
- Parameters
trues – the expected class
found_1_scores – the score found for the class 1. Must be a numpy array of floats
- Returns
the AUROC
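A quick sketch on a toy example (the expected value follows from the definition of the ROC curve):

    import numpy as np
    from trw.train import auroc

    trues = np.asarray([0, 0, 1, 1])
    scores = np.asarray([0.1, 0.4, 0.35, 0.8])   # predicted score for class 1
    print(auroc(trues, scores))                   # 0.75 on this toy data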
- trw.train.find_tensor_leaves_with_grad(tensor: torch.Tensor) Sequence[torch.Tensor] ¶
Find the input leaves of a tensor.
Input leaves REQUIRE requires_grad=True, else they will not be found
- Parameters
tensor – a torch.Tensor
- Returns
a list of torch.Tensor with attribute requires_grad=True that is an input of tensor
- trw.train.find_last_forward_convolution(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Parameters
inputs – the input of the model so that we can call model(inputs)
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index (int) – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- trw.train.find_last_forward_types(model: torch.nn.Module, inputs: Any, types: Union[Any, Tuple[Any]], relative_index: int = 0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last layer of the specified type
- Parameters
inputs – the input of the model so that we can call model(inputs)
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- trw.train.find_first_forward_convolution(model: torch.nn.Module, inputs: Any = None, types: Union[Any, Tuple[Any]] = (nn.Conv2d, nn.Conv3d, nn.Conv1d), relative_index=0) Optional[Mapping] ¶
Perform a forward pass of the model with given inputs and retrieve the last convolutional layer
- Parameters
inputs – NOT USED
model – the model
types – the types to be captured. Can be a single type or a tuple of types
relative_index (int) – indicate which module to return from the last collected module
- Returns
None if no layer found or a dictionary of (outputs, matched_module, matched_module_input, matched_module_output) if found
- class trw.train.GradCam(model: torch.nn.Module, find_convolution: Callable[[torch.nn.Module, Union[trw.basic_typing.Batch, torch.Tensor]], Optional[Mapping]] = graph_reflection.find_last_forward_convolution, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)¶
Gradient-weighted Class Activation Mapping
This is based on the paper “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”, Ramprasaath R et al.
- __call__(self, inputs: Union[trw.basic_typing.Batch, torch.Tensor], target_class_name: str = None, target_class: int = None) Optional[Tuple[str, Mapping]] ¶
- Parameters
inputs – the inputs to be fed to the model
target_class_name – the output node to be used. If None: if the model output is a single tensor then use this as the target output, else it will use the first OutputClassification output
target_class – the index of the class to explain the decision. If None, the class output will be used
- Returns
a tuple (output name, a dictionary (input_name, GradCAMs))
- class trw.train.GuidedBackprop(model: torch.nn.Module, unguided_gradient: bool = False, post_process_output: Callable[[Any], torch.Tensor] = post_process_output_id)¶
Produces gradients generated with guided back propagation from the given image
- update_relus(self) None ¶
- Updates relu activation functions so that:
1- stores the output in the forward pass
2- imputes zero for gradient values that are less than zero
- static get_floating_inputs_with_gradients(inputs)¶
Extract inputs that have a gradient
- Parameters
inputs – a tensor or dictionary of tensors
- Returns
Return a list of tuple (name, input) for the input that have a gradient
- __call__(self, inputs: Tuple[torch.Tensor, trw.basic_typing.Batch], target_class: int, target_class_name: str) Optional[Tuple[str, Mapping]] ¶
Generate the guided back-propagation gradient
- Parameters
inputs – a tensor or dictionary of tensors
target_class – the target class to be explained
target_class_name – the name of the output class if multiple outputs
- Returns
a tuple (output_name, dictionary (input, gradient))
- static get_positive_negative_saliency(gradient: torch.Tensor) Tuple[torch.Tensor, torch.Tensor] ¶
Generates positive and negative saliency maps based on the gradient
- Parameters
gradient (numpy arr) – Gradient of the operation to visualize
- Returns
a tuple (pos_saliency, neg_saliency)
- trw.train.post_process_output_for_gradient_attribution(output: trw.train.outputs_trw.Output)¶
Postprocess the output to be suitable for gradient attribution.
In particular, if we have a trw.train.OutputClassification, we need to apply a softmax operation so that we can backpropagate the loss of a particular class with the appropriate value (1.0).
- Parameters
output – a trw.train.OutputClassification
- Returns
a torch.Tensor
- class trw.train.IntegratedGradients(model: torch.nn.Module, steps: int = 100, baseline_inputs: Any = None, use_output_as_target: bool = False, post_process_output: Callable[[Any], torch.Tensor] = guided_back_propagation.post_process_output_id)¶
- Implementation of Integrated gradients, a method of attributing the prediction of a deep network to its input features.
This is implementing the paper Axiomatic Attribution for Deep Networks, Mukund Sundararajan, Ankur Taly, Qiqi Yan as described in https://arxiv.org/abs/1703.01365
- __call__(self, inputs: Any, target_class_name: str, target_class: Optional[int] = None) Optional[Tuple[str, Mapping]] ¶
Generate the guided back-propagation gradient
- Parameters
inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad=True
target_class – the index of the class to explain the decision. If None, the class output will be used
target_class_name – the output node to be used. If None: if the model output is a single tensor then use this as the target output, else it will use the first OutputClassification output
- Returns
a tuple (output_name, dictionary (input, integrated gradient))
- trw.train.default_collate_fn(batch: Union[Sequence[Any], Mapping[str, Any]], device: torch.device, pin_memory: bool = False, non_blocking: bool = False)¶
- Parameters
batch – a dictionary of features or a list of dictionary of features
device – the device where to create the torch.Tensor
pin_memory – if True, pin the memory. Required to be a CUDA allocated torch.Tensor
non_blocking – if True, use non blocking memory transfer
- Returns
a dictionary of torch.Tensor
- class trw.train.Sequence(source_split)¶
A Sequence defines how to iterate the data as a sequence of small batches of data.
To train a deep learning model, it is often necessary to split our original data into small chunks, because storing the full forward pass of our model at once is memory hungry; instead, we calculate the forward and backward pass on a small chunk of data. This is the interface for batching a dataset.
Examples:
data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
- abstract __iter__(self)¶
- Returns
An iterator of batches
- collate(self, collate_fn=default_collate_fn, device=None)¶
Aggregate the input batch as a dictionary of torch.Tensor and move the data to the appropriate device
- Parameters
collate_fn – the function to collate the input batch
device – the device where to send the samples. If None, the default device is CPU
- Returns
a collated sequence of batches
- map(self, function_to_run, nb_workers=0, max_jobs_at_once=None, queue_timeout=default_queue_timeout, collate_fn=None, max_queue_size_pin=None)¶
Transform a sequence using a given function.
Note
The map may create more samples than the original sequence.
- Parameters
function_to_run – the mapping function
nb_workers – the number of workers that will process the split. If 0, no workers will be created.
max_jobs_at_once – the maximum number of results that can be pushed in the result queue at once. If 0, no limit. If None, it will be set equal to the number of workers
queue_timeout – the timeout used to pull results from the output queue
collate_fn – a function to collate each batch of data
max_queue_size_pin – defines the max number of batches prefetched. If None, defaults to a size based on the number of workers. This only controls the final queue size of the pin thread (the workers' queue can be set independently)
- Returns
a sequence of batches
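Illustrative usage (a minimal sketch; it assumes the mapping function receives each batch as a dictionary of arrays, and the feature name is made up):
import numpy as np
from trw.train import SequenceArray

def double_values(batch):
    # mapping function applied to every batch; may run in worker processes
    batch['data'] = batch['data'] * 2
    return batch

sequence = SequenceArray({'data': np.arange(100)}).map(double_values, nb_workers=0).batch(10)
for batch in sequence:
    pass  # each batch now contains the doubled 'data'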
- batch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)¶
Group several batches of samples into a single batch
- Parameters
batch_size – the number of samples of the batch
discard_batch_not_full – if True, discard batches that are not full
collate_fn – a function to collate the batches. If None, no collation performed
- Returns
a sequence of batches
- sub_batch(self, batch_size, discard_batch_not_full=False)¶
This sequence will split batches in smaller batches if the underlying sequence batch is too large.
This sequence can be useful to manage very large tensors. Unlike trw.train.SequenceReBatch, this class avoids concatenating tensors, an operation that can be costly since the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.
- Parameters
batch_size – the maximum size of a batch
discard_batch_not_full – if True, batches that do not have size batch_size will be discarded
- rebatch(self, batch_size, discard_batch_not_full=False, collate_fn=default_collate_list_of_dicts)¶
Normalize a sequence to identical batch size given an input sequence with varying batch size
- Parameters
batch_size – the size of the batches created by this sequence
discard_batch_not_full – if True, the last batch will be discarded if not full
collate_fn – function to merge multiple batches
- max_samples(self, max_samples)¶
- Virtually resize the sequence. The sequence terminates once a certain number of produced samples has been reached; when iterated again, it restarts where it stopped.
- Parameters
max_samples – the number of samples this sequence will produce before stopping
- async_reservoir(self, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=sampler.SamplerSequential(), collate_fn=remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)¶
- Parameters
max_reservoir_samples – the maximum number of samples of the reservoir
function_to_run – the function to run asynchronously
min_reservoir_samples – the minimum of samples of the reservoir needed before an output sequence can be created
nb_workers – the number of workers that will process function_to_run to fill the reservoir. Must be >= 1
max_jobs_at_once – the maximum number of jobs that can be started and stored by epoch by the workers. If 0, no limit. If None: set to the number of workers
reservoir_sampler – a sampler that will be used to sample the reservoir or None for sequential sampling of the reservoir
collate_fn – a function to post-process the samples into a single batch, or None if not to be collated
maximum_number_of_samples_per_epoch – the maximum number of samples that will be generated per epoch. If we reach this maximum, the sequence will be interrupted
max_reservoir_replacement_size – Specify the maximum number of samples replaced in the reservoir by epoch. If None, we will use the whole result queue. This can be useful to control explicitly how the reservoir is updated and depend less on the speed of hardware. Note that to have an effect, max_jobs_at_once should be greater than max_reservoir_replacement_size.
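Illustrative usage (a minimal sketch; slow_load simulates an expensive loading function and is not part of the library, and the chosen sizes are arbitrary):
import time
import numpy as np
from trw.train import SequenceArray

def slow_load(batch):
    # simulate an expensive loading or preprocessing step
    time.sleep(0.01)
    return batch

sequence = SequenceArray({'data': np.random.randn(1000, 16)}).async_reservoir(
    max_reservoir_samples=100,
    function_to_run=slow_load,
    min_reservoir_samples=10,
    nb_workers=1).batch(20)

for epoch in range(2):
    for batch in sequence:
        pass  # the reservoir is refreshed in the background between epochs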
- fill_queue(self)¶
Fill the job queue of the current sequence
- fill_queue_all_sequences(self)¶
Go through all the sequences and fill their input queue
- has_background_jobs(self)¶
- Returns
True if this sequence has a background job to create the next element
- has_background_jobs_previous_sequences(self)¶
- Returns
the number of sequences that have background jobs currently running to create the next element
- abstract subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- abstract subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- abstract close(self)¶
- class trw.train.SequenceMap(source_split, nb_workers, function_to_run, max_jobs_at_once=None, queue_timeout=default_queue_timeout, debug_job_report_timeout=30.0, collate_fn=None, max_queue_size_pin=None)¶
Bases:
trw.train.sequence.Sequence
A Sequence defines how to iterate the data as a sequence of small batches of data.
To train a deep learning model, it is often necessary to split our original data into small chunks. This is because storing the full forward pass of our model all at once is memory hungry; instead, we calculate the forward and backward passes on a small chunk of data. This is the interface for batching a dataset.
Examples:
data = list(range(100))
sequence = SequenceArray({'data': data}).batch(10)
for batch in sequence:
    # do something with our batch
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- fill_queue(self)¶
Fill the job queue of the current sequence
- initializer(self)¶
Initialize the sequence to iterate through batches
- __next_local(self, next_fn)¶
Get the next elements.
Handles a single item or a list of items returned by next_fn.
- Parameters
next_fn – returns the next elements
- __next__(self)¶
- has_background_jobs(self)¶
- Returns
True if this sequence has a background job to create the next element
- next_item(self, blocking)¶
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Finish and join the existing pool processes
- class trw.train.SequenceArray(split, sampler=sampler_trw.SamplerRandom(), transforms=None, use_advanced_indexing=True, sample_uid_name=sample_uid_name)¶
Bases:
trw.train.sequence.Sequence
Create a sequence of batches from numpy arrays, lists and
torch.Tensor
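Illustrative usage (a minimal sketch with made-up feature names and shapes):
import numpy as np
from trw.train import SequenceArray

split = {
    'images': np.random.randn(100, 1, 28, 28).astype(np.float32),
    'targets': np.random.randint(0, 10, size=(100,)),
}
sequence = SequenceArray(split).batch(10)
for batch in sequence:
    pass  # each batch is a dictionary with 10 samples per feature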
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
- class trw.train.SequenceBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Group several batches into a single batch
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceAsyncReservoir(source_split, max_reservoir_samples, function_to_run, *, min_reservoir_samples=1, nb_workers=1, max_jobs_at_once=None, reservoir_sampler=None, collate_fn=sequence.remove_nested_list, maximum_number_of_samples_per_epoch=None, max_reservoir_replacement_size=None)¶
Bases:
trw.train.sequence.Sequence
This sequence will asynchronously process data and keep a reserve of loaded samples
The idea is to have long loading processes work in the background while using the data that is currently loaded as efficiently as possible. The data is slowly being replaced by freshly loaded data over time.
Jobs are started and results retrieved at the beginning of each epoch
This sequence can be interrupted (e.g., after a certain number of batches have been returned). When the sequence is restarted, the reservoir will not be emptied.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- reservoir_size(self)¶
- Returns
The current number of samples in the reservoir
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- initializer(self)¶
- fill_queue(self)¶
Fill the input queue of jobs to be completed
- _retrieve_results_and_fill_queue(self)¶
Retrieve results from the output queue
- _wait_for_job_completion(self)¶
Block the processing until we have enough results in the reservoir
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Finish and join the existing pool processes
- class trw.train.SequenceAdaptorTorch(torch_dataloader, features=None)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Adapt a torch.utils.data.DataLoader to a trw.train.Sequence interface
The main purpose is to enable compatibility with the torch data loader and any existing third party code.
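Illustrative usage (a minimal sketch; the 'features' names used to label the loader outputs are an assumption, for illustration only):
import torch
import torch.utils.data
from trw.train import SequenceAdaptorTorch

dataset = torch.utils.data.TensorDataset(
    torch.randn(100, 4), torch.randint(0, 2, (100,)))
loader = torch.utils.data.DataLoader(dataset, batch_size=10)

# the 'features' names are hypothetical keys for the loader outputs
sequence = SequenceAdaptorTorch(loader, features=['x', 'y'])
for batch in sequence:
    pass  # batches are now exposed through the trw.train.Sequence interface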
- __len__(self)¶
- __iter__(self)¶
- Returns
An iterator of batches
- __next__(self)¶
- Returns
The next batch of data
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceCollate(source_split, collate_fn=collate.default_collate_fn, device=None)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
Group the data into a sequence of dictionary of torch.Tensor
This can be useful to combine batches of dictionaries into a single batch with all features concatenated on axis 0. Often used in conjunction with trw.train.SequenceAsyncReservoir and trw.train.SequenceMap.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceReBatch(source_split, batch_size, discard_batch_not_full=False, collate_fn=sequence.default_collate_list_of_dicts)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
This sequence will normalize the batch size of an underlying sequence
If the underlying sequence batch is too large, it will be split into multiple batches. Conversely, if the size of the batch is too small, several batches will be merged until we reach the expected batch size.
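Illustrative usage (a minimal sketch; it assumes the underlying sequence produces batches of irregular size):
import numpy as np
from trw.train import SequenceArray

# normalize whatever batch size the underlying sequence produces to 32 samples
sequence = SequenceArray({'data': np.random.randn(105, 8)}).rebatch(32)
for batch in sequence:
    pass  # every batch has 32 samples, except possibly the last one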
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.SequenceSubBatch(source_split, batch_size, discard_batch_not_full=False)¶
Bases:
trw.train.sequence.Sequence
,trw.train.sequence.SequenceIterator
This sequence will split batches in smaller batches if the underlying sequence batch is too large.
This sequence can be useful to manage very large tensors. Unlike trw.train.SequenceReBatch, this class avoids concatenating tensors, an operation that can be costly since the tensors must be reallocated. In this case, it may be faster to work on a smaller batch by avoiding the concatenation cost.
- subsample(self, nb_samples)¶
Sub-sample a sequence to a fixed number of samples.
The purpose is to obtain a smaller sequence; this is particularly useful for the export of augmentations or samples.
- Parameters
nb_samples – the number of samples desired in the original sequence
- Returns
a subsampled Sequence
- subsample_uids(self, uids, uids_name, new_sampler=None)¶
Sub-sample a sequence to samples with specified UIDs.
- Parameters
uids (list) – the uids. If new_sampler keeps the ordering, then the samples of the resampled sequence should follow uids ordering
uids_name (str) – the name of the UIDs
new_sampler (Sampler) – the sampler to be used for the subsampled sequence. If None, re-use the existing sampler
- Returns
a subsampled Sequence
- __next__(self)¶
- Returns
The next batch of data
- __iter__(self)¶
- Returns
An iterator of batches
- close(self)¶
Special method to close and clean the resources of the sequence
- class trw.train.Metric¶
Bases:
abc.ABC
A metric base class
Calculate interesting metric
- abstract __call__(self, outputs: Dict) Optional[Dict] ¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- abstract aggregate_metrics(self, metric_by_batch: List[Dict]) Dict[str, float] ¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
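A minimal sketch of a custom metric built on this base class (hypothetical; the 'output' key looked up in the batch outputs and the metric itself are assumptions, for illustration only):
import numpy as np
from trw.train import Metric

class MetricMeanOutput(Metric):
    # hypothetical metric: average value of a model output across batches
    def __call__(self, outputs):
        output = outputs.get('output')  # assumed key, for illustration only
        if output is None:
            return None
        return {'mean_output': float(np.mean(output))}

    def aggregate_metrics(self, metric_by_batch):
        values = [m['mean_output'] for m in metric_by_batch]
        return {'mean_output': float(np.mean(values))}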
- class trw.train.MetricClassificationError¶
Bases:
Metric
Calculate the 1 - accuracy using the output_truth and output
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationBinarySensitivitySpecificity¶
Bases:
Metric
Calculate the sensitivity and specificity for a binary classification using the output_truth and output
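For reference, a sketch of how sensitivity and specificity are typically computed from binary truth/prediction arrays (illustrative only; not necessarily the metric's internal code):
import numpy as np

def sensitivity_specificity(truth: np.ndarray, found: np.ndarray):
    # truth and found are binary arrays of the same shape
    tp = int(np.sum((truth == 1) & (found == 1)))
    tn = int(np.sum((truth == 0) & (found == 0)))
    fp = int(np.sum((truth == 0) & (found == 1)))
    fn = int(np.sum((truth == 1) & (found == 0)))
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else None
    specificity = tn / (tn + fp) if (tn + fp) > 0 else None
    return sensitivity, specificity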
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricLoss¶
Bases:
Metric
Extract the loss from the outputs
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationBinaryAUC¶
Bases:
Metric
Calculate the Area under the Receiver operating characteristic (ROC) curve.
For this, the output needs to provide an output_raw of shape [N, 2] (i.e., binary classification framed as a multi-class classification) or of shape [N, 1] (binary classification).
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.MetricClassificationF1(average=None)¶
Bases:
Metric
A metric base class
Calculate interesting metric
- __call__(self, outputs)¶
- Parameters
outputs – the outputs of a batch
- Returns
a dictionary of metric names/values or None
- aggregate_metrics(self, metric_by_batch)¶
Aggregate all the metrics into a consolidated metric.
- Parameters
metric_by_batch – a list of metrics, one for each batch
- Returns
a dictionary of result name and value
- class trw.train.SamplerRandom(replacement=False, nb_samples_to_generate=None, batch_size=1)¶
Bases:
Sampler
Samples elements randomly. If without replacement, then sample from a shuffled dataset. If with replacement, then the user can specify num_samples to draw.
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- __next__(self)¶
- class trw.train.SamplerSequential(batch_size=1)¶
Bases:
Sampler
Samples elements sequentially, always in the same order.
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerSubsetRandom(indices)¶
Bases:
Sampler
Samples elements randomly from a given list of indices, without replacement.
- Parameters
indices (sequence) – a sequence of indices
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerClassResampling(class_name, nb_samples_to_generate, reuse_class_frequencies_across_epochs=True, batch_size=1)¶
Bases:
Sampler
Resample the samples so that class_name classes have equal probability of being sampled.
Classification problems rarely have balanced classes, so it is often required to super-sample the minority class to avoid penalizing the under-represented classes and to help the classifier learn good features (as opposed to learning the class distribution).
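Illustrative usage (a minimal sketch with a made-up imbalanced split; the feature names are assumptions):
import numpy as np
from trw.train import SequenceArray, SamplerClassResampling

# imbalanced binary problem: 90 samples of class 0, 10 samples of class 1
split = {
    'x': np.random.randn(100, 4).astype(np.float32),
    'label': np.asarray([0] * 90 + [1] * 10),
}
sampler = SamplerClassResampling(class_name='label', nb_samples_to_generate=100)
sequence = SequenceArray(split, sampler=sampler).batch(20)
for batch in sequence:
    pass  # both classes are now drawn with roughly equal probability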
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- _fit(self, classes)¶
- __next__(self)¶
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.Sampler¶
Bases:
object
Base class for all Samplers.
Every Sampler subclass has to provide an __iter__ method, providing a way to iterate over indices of dataset elements, and a __len__ method that returns the length of the returned iterators.
- abstract initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- abstract __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.SamplerSubsetRandomByListInterleaved(indices: Sequence[Sequence[int]])¶
Bases:
Sampler
Elements from a given list of lists of indices are randomly drawn without replacement, one element per list at a time.
For sequences with different sizes, the longest of the sequences will be trimmed to the size of the shortest sequence.
This can be used for example to resample without replacement imbalanced classes in a classification task.
Examples:
>>> l1 = np.asarray([1, 2])
>>> l2 = np.asarray([3, 4, 5])
>>> sampler = trw.train.SamplerSubsetRandomByListInterleaved([l1, l2])
>>> sampler.initializer(None)
>>> indices = [i for i in sampler]  # indices could be [1, 5, 2, 4]
- Parameters
indices – a sequence of sequence of indices
- initializer(self, data_source)¶
Initialize the sequence iteration
- Parameters
data_source – the data source to iterate
- __iter__(self)¶
Returns: an iterator that returns the indices of the original data source
- class trw.train.FilterFixed(kernel: torch.Tensor, groups: int = 1, padding: int = 0)¶
Bases:
torch.nn.Module
Apply a fixed filter to n-dimensional images
- __call__(self, value: trw.basic_typing.TorchTensorNCX) trw.basic_typing.TorchTensorNCX ¶
- class trw.train.FilterGaussian(input_channels: int, nb_dims: int, sigma: Union[float, Sequence[float]], kernel_sizes: Optional[Union[int, Sequence[int]]] = None, padding: typing_extensions.Literal[same, none] = 'same', device: Optional[torch.device] = None)¶
Bases:
FilterFixed
Implement a gaussian filter as a
torch.nn.Module
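Illustrative usage (a minimal sketch applying a 2D gaussian blur to a batch of single-channel images; the sigma and image size are arbitrary):
import torch
from trw.train import FilterGaussian

gaussian = FilterGaussian(input_channels=1, nb_dims=2, sigma=2.0)
images = torch.randn(4, 1, 32, 32)  # NCHW layout
blurred = gaussian(images)          # same shape, blurred content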
- class trw.train.MeaningfulPerturbation(model, iterations=150, l1_coeff=0.1, tv_coeff=0.2, tv_beta=3, noise=0.2, model_output_postprocessing=functools.partial(F.softmax, dim=1), mask_reduction_factor=8, optimizer_fn=default_optimizer, information_removal_fn=default_information_removal_smoothing, export_fn=None)¶
Implementation of “Interpretable Explanations of Black Boxes by Meaningful Perturbation”, arXiv:1704.03296
Handle only 2D and 3D inputs. Other inputs will be discarded.
Deviations:
- use a global smoothed image to speed up the processing
- __call__(self, inputs, target_class_name, target_class=None)¶
- Parameters
inputs – a tensor or dictionary of tensors. The inputs to be explained must have requires_grad set
target_class – the index of the class to explain the decision. If None, the class output will be used
target_class_name – the output node to be used. If None:
* if the model output is a single tensor, use it as the target output
* else, use the first OutputClassification output
- Returns
a tuple (output_name, dictionary (input, explanation mask))
- static _get_output(target_class_name, outputs, postprocessing)¶
- trw.train.default_information_removal_smoothing(image, blurring_sigma=5, blurring_kernel_size=23, explanation_for=None)¶
Default information removal (smoothing).
- Parameters
image – an image
blurring_sigma – the sigma of the blurring kernel used to “remove” information from the image
blurring_kernel_size – the size of the kernel to be used. This is an internal parameter to approximate the gaussian kernel. It is exposed since, in the 3D case, the memory consumption may be high and a faithful gaussian blur is not crucial.
explanation_for – the class to explain
- Returns
a smoothed image
- class trw.train.DataParallelExtended(*arg, **argv)¶
Bases:
torch.nn.DataParallel
Customized version of torch.nn.DataParallel to support models with complex outputs such as trw.train.Output
- gather(self, outputs, output_device)¶
- trw.train.grid_sample(input: torch.Tensor, grid: torch.Tensor, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = None) torch.Tensor ¶
Compatibility layer for argument change between pytorch <= 1.2 and pytorch > 1.3
See
torch.nn.functional.grid_sample()