trw.callbacks

Package Contents

Classes

Callback

Defines a callback function that may be called before, during, or after training

CallbackDebugProcesses

Defines a callback function that may be called before, during, or after training

CallbackEpochSummary

Summarizes the last epoch and displays useful information such as metrics per dataset/split

CallbackExplainDecision

Explain the decision of a model

ExplainableAlgorithm

Enumeration of the supported explanation algorithms.

CallbackExportClassificationReport

Export the main classification measures for the classification outputs of the model

CallbackExportConvolutionKernel

Simply export the convolutional kernels.

CallbackExportHistory

Summarize the training history of a model (i.e., as a function of iteration)

CallbackLearningRateFinder

Identify a good range for the learning rate parameter.

CallbackStopEpoch

Utility callback that counts the number of samples. When the maximum is reached, the iteration is stopped

CallbackLearningRateRecorder

Record the learning rate of the optimizers.

CallbackReportingAugmentations

Export sample augmentations.

CallbackReportingBestMetrics

Report the best value of each metric in the history and the epoch at which it occurred

CallbackReportingClassificationErrors

Defines a callback function that may be called before, during, or after training

CallbackReportingDatasetSummary

Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset

CallbackReportingRecordHistory

This callback records the history to the reporting layer

CallbackReportingExportSamples

Defines a callback function that may be called before, during, or after training

CallbackReportingLayerStatistics

Report the activation and gradient statistics layer by layer

CallbackReportingLayerWeights

Report the weight statistics of each layer

CallbackReportingModelSummary

Defines a callback function that may be called before, during, or after training

CallbackReportingStartServer

Defines a callback function that may be called before, during, or after training

CallbackSaveLastModel

Save the current model to disk as well as metadata (history, outputs, infos).

ModelWithLowestMetric

CallbackSkipEpoch

Run its callbacks every few epochs

CallbackClearTensorboardLog

Remove any existing logger

CallbackTensorboardBased

Tensorboard based callback. Manages a single tensorboardX.SummaryWriter instance

CallbackTensorboardEmbedding

This callback records the embedding to be displayed with tensorboard

CallbackTensorboardRecordHistory

This callback records the history to a tensorboard readable log

CallbackTensorboardRecordModel

This callback will export the model to tensorboard

CallbackWorstSamplesByEpoch

The purpose of this callback is to track the samples with the worst loss during the training of the model

CallbackZipSources

Record important info relevant to the training, such as the sources & configuration info

CallbackEarlyStopping

Use historical runs to evaluate whether a run is promising. If not, early stopping raises ExceptionAbortRun

CallbackReportingLearningRateRecorder

Record the learning rate of the optimizers in the reporting layer

CallbackProfiler

Run the torch.profiler while training the model

Functions

default_identify_learning_rate_section(lines_x, lines_y, loss_ratio_to_discard=0.8)

Find a good section for the learning rate.

select_classification_errors(batch, loss_terms)

class trw.callbacks.Callback

Defines a callback function that may be called before, during, or after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
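
A minimal sketch of a custom callback built on the signature above; the assumption that history holds one entry per completed epoch is ours, not a documented contract:

    from trw.callbacks import Callback

    class CallbackPrintEpoch(Callback):
        """Print the current epoch each time the callback is invoked."""

        def __call__(self, options, history, model, losses, outputs,
                     datasets, datasets_infos, callbacks_per_batch, **kwargs):
            # assumption: `history` is a list with one step per completed epoch
            print(f'epoch={len(history)}')
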
class trw.callbacks.CallbackDebugProcesses(filename='process_stack_dumps', frequency_seconds=10.0, timeout=10.0, delayed_init=True)

Bases: trw.callbacks.callback.Callback

Defines a callback function that may be called before, during, or after training

_init(self, root='')
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
close(self)
__del__(self)
class trw.callbacks.CallbackEpochSummary(logger=log_and_print, track_best_so_far=True)

Bases: trw.callbacks.callback.Callback

Summarizes the last epoch and displays useful information such as metrics per dataset/split

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExplainDecision(max_samples=10, dirname='explained', dataset_name=None, split_name=None, algorithm=(ExplainableAlgorithm.MeaningfulPerturbations, ExplainableAlgorithm.GuidedBackPropagation, ExplainableAlgorithm.GradCAM, ExplainableAlgorithm.Gradient, ExplainableAlgorithm.IntegratedGradients), output_name=None, nb_explanations=1, algorithms_kwargs=default_algorithm_args(), average_filters=True)

Bases: trw.callbacks.callback.Callback

Explain the decision of a model

first_time(self, datasets, options)
static find_output_name(outputs, name)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.ExplainableAlgorithm

Bases: enum.Enum

Enumeration of the supported explanation algorithms.

GuidedBackPropagation
GradCAM
Gradient
IntegratedGradients
MeaningfulPerturbations
class trw.callbacks.CallbackExportClassificationReport(with_confusion_matrix=True, with_ROC=True, with_report=True)

Bases: trw.callbacks.callback.Callback

Export the main classification measures for the classification outputs of the model

This includes:

  • a text report (e.g., accuracy, sensitivity, specificity, F1, typical errors & confusion matrix)

  • a confusion matrix plot

  • ROC & AUC for binary classification problems

max_class_names = 40
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExportConvolutionKernel(export_frequency=500, dirname='convolution_kernels', find_convolution_fn=graph_reflection.find_first_forward_convolution, dataset_name=None, split_name=None, export_filter_fn=default_export_filter)

Bases: trw.callbacks.callback.Callback

Simply export the convolutional kernels.

This can be useful to check over time whether the kernel weights have converged.

first_time(self, options, datasets, model)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExportHistory(export_dirname='history', dicarded_metrics=default_dicarded_metrics())

Bases: trw.callbacks.callback.Callback

Summarize the training history of a model (i.e., as a function of iteration)

  • One plot per dataset

  • splits are plotted together

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackLearningRateFinder(nb_samples_per_learning_rate=1000, learning_rate_start=1e-06, learning_rate_stop=10.0, learning_rate_mul=1.2, learning_rate_final_multiplier=0.8, dataset_name=None, split_name=None, dirname='lr_finder', identify_learning_rate_section=default_identify_learning_rate_section, set_new_learning_rate=False, param_maximum_loss_ratio=0.8)

Bases: trw.callbacks.callback.Callback

Identify a good range for the learning rate parameter.

See “Cyclical Learning Rates for Training Neural Networks”, Leslie N. Smith. https://arxiv.org/abs/1506.01186

Start from a small learning rate and, at every iteration, increase it by a constant factor while recording the loss. Suitable learning rates make the loss decrease; we should select the highest learning rate that still decreases the loss.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)

Note

The model will be deep-copied so that the actual training is not affected

Parameters

**kwargs – required optimizers_fn
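
A usage sketch based on the constructor defaults above; the surrounding variables (options, history, model, ...) are placeholders provided by the training loop, and optimizers_fn must be forwarded through **kwargs as the Note states:

    # sweep learning rates from 1e-6 to 10, multiplying by 1.2 at each step
    lr_finder = CallbackLearningRateFinder(
        learning_rate_start=1e-6,
        learning_rate_stop=10.0,
        learning_rate_mul=1.2,
        set_new_learning_rate=True,  # apply the identified learning rate to the optimizers
    )
    # invoked like any Callback, with the required `optimizers_fn` keyword
    lr_finder(options, history, model, losses, outputs, datasets,
              datasets_infos, callbacks_per_batch, optimizers_fn=optimizers_fn)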

trw.callbacks.default_identify_learning_rate_section(lines_x, lines_y, loss_ratio_to_discard=0.8)

Find a good section for the learning rate.

Heuristic rules to find the best learning rate (a sketch follows the list):

  1. the worst loss is the loss at epoch 0

  2. initially, the loss may not decrease due to small random variations, especially with a small number of samples, so tolerate that the initial learning rates may not reduce the loss

  3. after some epochs, the loss decreases to a minimum, then increases significantly; discard anything after this point

  4. find the learning rate achieving the minimum loss: this is our optimal learning rate
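
A minimal numpy sketch of these rules, not the library's exact implementation; lines_x are the tried learning rates and lines_y the corresponding losses, per the signature above:

    import numpy as np

    def identify_learning_rate_section(lines_x, lines_y, loss_ratio_to_discard=0.8):
        lines_x, lines_y = np.asarray(lines_x), np.asarray(lines_y)
        worst_loss = lines_y[0]                  # rule 1: the worst loss is at epoch 0
        # rule 3: keep everything up to the minimum, discard the later blow-up
        last = int(np.argmin(lines_y)) + 1
        lines_x, lines_y = lines_x[:last], lines_y[:last]
        # rule 2: tolerate initial learning rates whose loss stays near the worst loss
        good = lines_y <= worst_loss * loss_ratio_to_discard
        first = int(np.argmax(good)) if good.any() else 0
        # rule 4: the learning rate achieving the minimum loss is the optimum
        best_lr = lines_x[int(np.argmin(lines_y))]
        return lines_x[first], lines_x[-1], best_lr  # section start, section end, optimum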

class trw.callbacks.CallbackStopEpoch(nb_samples)

Utility callback that counts the number of samples. When the maximum is reached, the iteration is stopped

reset(self)
__call__(self, dataset_name, split_name, batch)
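
A usage sketch; that the callback signals the stop by raising StopIteration is our assumption based on "stop the iteration" above, and the dataset/split names and helpers are hypothetical:

    # limit an epoch to roughly 1000 samples
    stop_epoch = CallbackStopEpoch(nb_samples=1000)
    for batch in train_split:  # hypothetical batch iterator
        try:
            stop_epoch('mnist', 'train', batch)  # per-batch callback signature
        except StopIteration:
            break  # assumed stop mechanism once `nb_samples` is reached
        train_on(batch)  # hypothetical training step
    stop_epoch.reset()  # reset the sample counter before the next epoch
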
class trw.callbacks.CallbackLearningRateRecorder(dirname='lr_recorder')

Bases: trw.callbacks.callback.Callback

Record the learning rate of the optimizers.

This is useful as a debugging tool.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
__del__(self)
class trw.callbacks.CallbackReportingAugmentations(nb_samples=10, nb_augmentation=5, table_name='augmentations', split_name=None, uid_name='sample_uid')

Bases: trw.callbacks.callback.Callback

Export sample augmentations.

Augmentations are detected using the uid_name of a sample: samples sharing the same uid over several epochs are considered augmentations of the same underlying sample

first_epoch(self, options)
create_or_recreate_table(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingBestMetrics(table_name='best_metrics', metric_to_discard=None, epoch_start=0)

Bases: trw.callbacks.callback.Callback

Report the best value of each metric in the history and the epoch at which it occurred

This can be useful to accurately get the best value of a metric and in particular at which step it occurred.

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingClassificationErrors(max_samples=10, table_name='errors', loss_terms_inclusion=None, feature_exclusions=None, dataset_exclusions=None, split_exclusions=None, clear_previously_exported_samples=True, format='{dataset_name}_{split_name}_s{id}_e{epoch}', reporting_config_keep_last_n_rows=None, reporting_config_subsampling_factor=1.0)

Bases: trw.callbacks.callback.Callback

Defines a callback function that may be called before, during, or after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
trw.callbacks.select_classification_errors(batch, loss_terms)
class trw.callbacks.CallbackReportingDatasetSummary(max_nb_samples=None, table_name='data_summary')

Bases: trw.callbacks.callback.Callback

Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingRecordHistory(table_name='history')

Bases: trw.callbacks.callback.Callback

This callback records the history to the reporting layer

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingExportSamples(max_samples=50, table_name='samples', loss_terms_inclusion=None, feature_exclusions=None, dataset_exclusions=None, split_exclusions=None, clear_previously_exported_samples=True, format='{dataset_name}_{split_name}_s{id}_e{epoch}', reporting_config_keep_last_n_rows=None, reporting_config_subsampling_factor=1.0, reporting_scatter_x='split_name', reporting_scatter_y='dataset_name', reporting_color_by=None, reporting_display_with=None, reporting_binning_x_axis=None, reporting_binning_selection=None, select_sample_to_export=select_all)

Bases: trw.callbacks.callback.Callback

Defines a callback function that may be called before, during, or after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingLayerStatistics(dataset_name=None, split_name=None, nb_samples=500, table_name='layer')

Bases: trw.callbacks.callback.Callback

Report the activation and gradient statistics layer by layer

first_time(self, options, datasets)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingLayerWeights(dataset_name=None, split_name=None, table_name='layer_weights')

Bases: trw.callbacks.callback.Callback

Report the weight statistics of each layer

first_time(self, options, datasets)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingModelSummary(dataset_name=None, split_name=None)

Bases: trw.callbacks.callback.Callback

Defines a callback function that may be called before, during, or after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingStartServer(reporting_options=create_default_reporting_options(embedded=True, config={}), show_app=True, port=0, keep_alive_until_client_disconnect=True)

Bases: trw.callbacks.callback.Callback

Defines a callback function that may be called before, during, or after training

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
__del__(self)
class trw.callbacks.CallbackSaveLastModel(model_name='last', with_outputs=False, is_versioned=False, rolling_size=None, keep_model_with_best_metric: ModelWithLowestMetric = None, revert_if_nan_metrics: Optional[Sequence[str]] = ('loss',), post_process_outputs: Optional[Callable[[trw.basic_typing.Datasets], trw.basic_typing.Datasets]] = exclude_large_embeddings)

Bases: trw.callbacks.callback.Callback

Save the current model to disk as well as metadata (history, outputs, infos).

This callback can be used during training (e.g., checkpoint) or at the end of the training.

Optionally, record the best model for a given dataset, split, output and metric.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.ModelWithLowestMetric(dataset_name, split_name, output_name, metric_name, minimum_metric=0.2)

Bases: ModelWithLowestMetricBase

update(self, metric_value, model, metadata, root_path)

Check the metrics and export the model if thresholds are satisfied
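
A usage sketch combining CallbackSaveLastModel with ModelWithLowestMetric; the dataset/split/output/metric names are hypothetical and the comments on the thresholds are inferred from the parameter names:

    # track the best model according to the validation loss
    best_model = ModelWithLowestMetric(
        dataset_name='mnist',
        split_name='valid',
        output_name='softmax',
        metric_name='loss',
        minimum_metric=0.2,  # inferred: only export once the metric is below this value
    )
    callback = CallbackSaveLastModel(
        model_name='last',
        keep_model_with_best_metric=best_model,
        revert_if_nan_metrics=('loss',),  # inferred: revert the model if the loss becomes NaN
    )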

class trw.callbacks.CallbackSkipEpoch(nb_epochs, callbacks, include_epoch_zero=False)

Bases: trw.callbacks.callback.Callback

Run its callbacks every few epochs

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
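
A usage sketch based on the constructor above, wrapping an expensive callback so that it runs only every 10 epochs:

    # export samples for reporting only every 10 epochs
    callback = CallbackSkipEpoch(
        nb_epochs=10,
        callbacks=[CallbackReportingExportSamples()],
        include_epoch_zero=False,  # skip epoch 0 as well
    )
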
class trw.callbacks.CallbackClearTensorboardLog

Bases: CallbackTensorboardBased

Remove any existing logger

This is useful when we train multiple models so that they have their own tensorboard log file

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardBased

Bases: trw.callbacks.callback.Callback

Tensorboard based callback. Manages a single tensorboardX.SummaryWriter instance

_tensorboard_logger
static create_logger(path)

Create a tensorboardX.SummaryWriter instance. If an instance already exists or tensorboardX could not be imported, no logger will be created.

Parameters

  • path – where to write the tensorboard log

Returns

a logger, or None if logger creation failed

static get_tensorboard_logger()
Returns

None if the tensorboard logger was not created, or a tensorboardX.SummaryWriter

static remove_tensorboard_logger()

Remove the current tensorboardX.SummaryWriter

class trw.callbacks.CallbackTensorboardEmbedding(embedding_name, dataset_name=None, split_name=None, image_name=None, maximum_samples=2000, keep_features_fn=keep_small_features)

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback records the embedding to be displayed with tensorboard

Note: we must recalculate the embedding as we need to associate it with a specific input (i.e., we can't store everything in memory, so we collect what we need batch by batch)

first_time(self, datasets, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardRecordHistory

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback records the history to a tensorboard readable log

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardRecordModel(dataset_name=None, split_name=None, onnx_folder='onnx')

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback will export the model to tensorboard

@TODO ONNX export is probably adding hooks that are not removed. To be investigated.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackWorstSamplesByEpoch(split_names=None, output_name=None, dataset_name=None, dirname='worst_samples_by_epoch', sort_samples_by_loss_error=True, worst_k_samples=1000, export_top_k_samples=50, uids_name=sequence_array.sample_uid_name, output_of_interest=(trw_outputs.OutputClassification, trw_outputs.OutputSegmentation, trw_outputs.OutputRegression))

Bases: trw.callbacks.callback.Callback

The purpose of this callback is to track the samples with the worst loss during the training of the model

It is interesting to understand which samples are difficult (in the train and test splits): are they consistently misclassified during the training, or is it random? Are they the same samples across different models (i.e., initialization or model dependent)?

first_time(self, datasets, outputs)
static sort_split_data(errors_by_sample, worst_k_samples, discard_first_n_epochs=0)

Helper function to sort the samples

Parameters
  • errors_by_sample – the data

  • worst_k_samples – the number of samples to select or None

  • discard_first_n_epochs – the first few epochs are typically very noisy, so don’t use these

Returns

sorted data

export_stats(self, model, losses, datasets, datasets_infos, options, callbacks_per_batch)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackZipSources(folders_to_record, extensions=('.py', '.sh', '.bat', '.json'), filename='sources.zip', max_width=200, exclusions=('.mypy_cache', '.svn', '.git', '__pycache__'))

Bases: trw.callbacks.callback.Callback

Record important info relevant to the training, such as the sources & configuration info

This is to make sure a result can always be easily reproduced. Any configuration option can be safely appended to options.runtime

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackEarlyStopping(store: trw.hparams.RunStore, loss_fn: Callable[[trw.basic_typing.HistoryStep], float], raise_stop_fn: Optional[Callable[[float, trw.basic_typing.History], Tuple[bool, str]]] = None, checkpoints: Sequence[float] = (0.1, 0.25, 0.5, 0.75), discard_if_among_worst_X_performers: float = 0.6, only_consider_full_run: bool = True, min_number_of_runs: int = 10)

Bases: trw.callbacks.callback.Callback

Use historical runs to evaluate whether a run is promising. If not, early stopping raises ExceptionAbortRun

_initialize(self, num_epochs: int) -> None
__call__(self, options, history: trw.basic_typing.History, model, **kwargs)
__del__(self)
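
A sketch of a loss_fn for the constructor above; the nested dataset → split → output → metric layout of a history step is our assumption, as are the names used:

    def validation_loss(history_step):
        # assumed layout: history_step[dataset][split][output][metric]
        return history_step['mnist']['valid']['softmax']['loss']

    callback = CallbackEarlyStopping(
        store=run_store,  # a trw.hparams.RunStore holding past runs
        loss_fn=validation_loss,
        discard_if_among_worst_X_performers=0.6,  # abort runs in the worst 60%
        min_number_of_runs=10,  # need enough history before early stopping
    )
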
class trw.callbacks.CallbackReportingLearningRateRecorder

Bases: trw.callbacks.callback.Callback

Record the learning rate of the optimizers in the reporting layer

first_time(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackProfiler(dataset_name=None, split_name=None, table_name='model_profiler', with_preprocessed_batch=False, schedule_kwargs=None)

Bases: trw.callbacks.callback.Callback

Run the torch.profiler while training the model

A profiler log will be created in the folder <output_root>/static/<table_name>

To visualize the output:

  • pip install torch_tb_profiler

  • tensorboard --logdir=<output_root>/static/model_profiler

  • in a browser: http://localhost:6006/#pytorch_profiler

Alternatively, traces can be partially loaded using Chrome:

  • open Chrome and go to chrome://tracing

  • load the trace chrome_trace.json

first_time(self, options, datasets)
__call__(self, options, history, model_orig, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)