Defines a callback function that may be called before, during, or after training


Summarizes the last epoch and displays useful information such as the metrics per dataset/split


Explain the decision of a model


Generic enumeration.


Export the main classification measures for the classification outputs of the model


Simply export convolutional kernel.


Summarize the training history of a model (i.e., as a function of iteration)


Identify a good range for the learning rate parameter.


Utility callback counting the number of samples. When the maximum is reached, the iteration is stopped


Record the learning rate of the optimizers.


Export sample augmentations.


Report the best value of the history and epoch for each metric


Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset


This callback records the history to the reporting layer


Report the activation and gradient statistics layer by layer


Report the weight statistics of each layer


Save the current model to disk as well as metadata (history, outputs, infos).



Run its callbacks every few epochs


Remove any existing logger


Tensorboard based callback. Manages a single tensorboardX.SummaryWriter instance


This callback records the embedding to be displayed with tensorboard


This callback records the history to a tensorboard readable log


This callback will export the model to tensorboard


The purpose of this callback is to track the samples with the worst loss during the training of the model


Record important info relative to the training such as the sources & configuration info


Use historical runs to evaluate if a run is promising. If not, early stop will raise ExceptionAbortRun


Report the weight statistics of each layer


Run the torch.profiler while training the model


default_identify_learning_rate_section(lines_x, lines_y, loss_ratio_to_discard=0.8)

Find a good section for the learning rate.

select_classification_errors(batch, loss_terms)

class trw.callbacks.Callback

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
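A custom callback can be sketched as a subclass that overrides `__call__` with the signature above. This is a minimal sketch, not the library's code: it uses a stand-in `Callback` base class so that it runs without the library installed, and both `CallbackPrintEpoch` and the assumption that `history` holds one entry per completed epoch are illustrative.

```python
class Callback:
    """Stand-in for trw.callbacks.Callback (assumption: lets the sketch run standalone)."""

    def __call__(self, options, history, model, losses, outputs,
                 datasets, datasets_infos, callbacks_per_batch, **kwargs):
        raise NotImplementedError()


class CallbackPrintEpoch(Callback):
    """Hypothetical callback that reports how many epochs were recorded."""

    def __init__(self):
        self.messages = []

    def __call__(self, options, history, model, losses, outputs,
                 datasets, datasets_infos, callbacks_per_batch, **kwargs):
        # Assumption: `history` is a sequence with one entry per completed epoch.
        message = f"epoch={len(history)}"
        self.messages.append(message)
        print(message)


callback = CallbackPrintEpoch()
callback(options=None, history=[{}, {}], model=None, losses=None, outputs=None,
         datasets=None, datasets_infos=None, callbacks_per_batch=None)
```

Keeping the full argument list plus `**kwargs` in the override lets the same object be called wherever the base signature is expected.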
class trw.callbacks.CallbackDebugProcesses(filename='process_stack_dumps', frequency_seconds=10.0, timeout=10.0, delayed_init=True)

Bases: trw.callbacks.callback.Callback

_init(self, root='')
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackEpochSummary(logger=log_and_print, track_best_so_far=True)

Bases: trw.callbacks.callback.Callback

Summarizes the last epoch and displays useful information such as the metrics per dataset/split

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExplainDecision(max_samples=10, dirname='explained', dataset_name=None, split_name=None, algorithm=(ExplainableAlgorithm.MeaningfulPerturbations, ExplainableAlgorithm.GuidedBackPropagation, ExplainableAlgorithm.GradCAM, ExplainableAlgorithm.Gradient, ExplainableAlgorithm.IntegratedGradients), output_name=None, nb_explanations=1, algorithms_kwargs=default_algorithm_args(), average_filters=True)

Bases: trw.callbacks.callback.Callback

Explain the decision of a model

first_time(self, datasets, options)
static find_output_name(outputs, name)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.ExplainableAlgorithm

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

class trw.callbacks.CallbackExportClassificationReport(with_confusion_matrix=True, with_ROC=True, with_report=True)

Bases: trw.callbacks.callback.Callback

Export the main classification measures for the classification outputs of the model

This includes:

  • text report (e.g., accuracy, sensitivity, specificity, F1, typical errors & confusion matrix)

  • confusion matrix plot

  • ROC & AUC for binary classification problems
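For reference, the measures named in the text report can be derived from the four cells of a binary confusion matrix. The function below is a minimal sketch with a hypothetical name (`binary_classification_measures`); it is not the callback's implementation and assumes labels encoded as 0 (negative) and 1 (positive).

```python
def binary_classification_measures(true_labels, predicted_labels):
    """Compute accuracy, sensitivity, specificity and F1 for 0/1 labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # Avoid a division by zero when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        'accuracy': ratio(tp + tn, len(pairs)),
        'sensitivity': sensitivity,
        'specificity': specificity,
        'f1': ratio(2 * precision * sensitivity, precision + sensitivity),
    }


measures = binary_classification_measures(
    true_labels=[1, 1, 1, 0, 0, 0, 0, 1],
    predicted_labels=[1, 0, 1, 0, 0, 1, 0, 1],
)
```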

max_class_names = 40
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExportConvolutionKernel(export_frequency=500, dirname='convolution_kernels', find_convolution_fn=graph_reflection.find_first_forward_convolution, dataset_name=None, split_name=None, export_filter_fn=default_export_filter)

Bases: trw.callbacks.callback.Callback

Simply export convolutional kernel.

This can be useful to check over time whether the weights have converged.

first_time(self, options, datasets, model)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackExportHistory(export_dirname='history', dicarded_metrics=default_dicarded_metrics())

Bases: trw.callbacks.callback.Callback

Summarize the training history of a model (i.e., as a function of iteration)

  • One plot per dataset

  • splits are plotted together

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackLearningRateFinder(nb_samples_per_learning_rate=1000, learning_rate_start=1e-06, learning_rate_stop=10.0, learning_rate_mul=1.2, learning_rate_final_multiplier=0.8, dataset_name=None, split_name=None, dirname='lr_finder', identify_learning_rate_section=default_identify_learning_rate_section, set_new_learning_rate=False, param_maximum_loss_ratio=0.8)

Bases: trw.callbacks.callback.Callback

Identify a good range for the learning rate parameter.

See “Cyclical Learning Rates for Training Neural Networks”, Leslie N. Smith.

Start from a small learning rate and, at every iteration, increase the learning rate by a constant factor. At the same time, record the loss per epoch. Suitable learning rates make the loss function decrease. We should select the highest learning rate that still decreases the loss function.
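The geometric sweep described above can be sketched with the constructor defaults (`learning_rate_start`, `learning_rate_stop`, `learning_rate_mul`). The generator name `learning_rate_sweep` is hypothetical, and the stopping condition (strictly below `learning_rate_stop`) is an assumption rather than the callback's documented behavior.

```python
def learning_rate_sweep(learning_rate_start=1e-6, learning_rate_stop=10.0, learning_rate_mul=1.2):
    """Yield learning rates, multiplying by a constant factor while below the stop value."""
    learning_rate = learning_rate_start
    while learning_rate < learning_rate_stop:
        yield learning_rate
        learning_rate *= learning_rate_mul


rates = list(learning_rate_sweep())
print(f"{len(rates)} learning rates from {rates[0]:.1e} to {rates[-1]:.3f}")
```

A multiplicative step spreads the candidates evenly on a logarithmic axis, which is why such sweeps are usually plotted with a log-scaled learning rate.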

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)


Note: the model will be deep copied so that we don’t influence the training

Parameters:

  • **kwargs – required optimizers_fn

trw.callbacks.default_identify_learning_rate_section(lines_x, lines_y, loss_ratio_to_discard=0.8)

Find a good section for the learning rate.

Heuristic rules to find the best learning rate:

  1. worst loss is loss at epoch 0

  2. initially, the loss may not decrease due to small random variation, especially with a small number of samples, so tolerate that the initial LR may not be good

  3. after some epochs, the loss decreases to reach some minimum, then increases significantly. Discard anything after this point

  4. find the LR achieving the minimum loss. This is our optimal LR
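The four rules above can be sketched as follows. This is a simplified illustration, not the library's implementation: the function name is hypothetical, `lines_x` is assumed to hold learning rates in increasing order with `lines_y` the matching losses, and the `loss_ratio_to_discard` parameter is left out because its exact semantics are not described here.

```python
def sketch_identify_learning_rate_section(lines_x, lines_y):
    """Return (best_learning_rate, kept_learning_rates) for a learning-rate sweep."""
    if len(lines_x) == 0 or len(lines_x) != len(lines_y):
        raise ValueError('lines_x and lines_y must be non-empty and of equal length')

    # Rule 4: the learning rate achieving the minimum loss is taken as optimal.
    best_index = min(range(len(lines_y)), key=lambda i: lines_y[i])

    # Rule 3: discard everything after the minimum, where the loss increases again.
    # Rules 1 and 2: early points are kept even if they do not improve on the
    # initial loss, since small random variations are expected at the start.
    section = list(lines_x[:best_index + 1])
    return lines_x[best_index], section


best_lr, section = sketch_identify_learning_rate_section(
    lines_x=[1e-4, 1e-3, 1e-2, 1e-1, 1.0],
    lines_y=[2.0, 2.1, 1.0, 0.5, 5.0],
)
```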

class trw.callbacks.CallbackStopEpoch(nb_samples)

Utility callback counting the number of samples. When the maximum is reached, the iteration is stopped

__call__(self, dataset_name, split_name, batch)
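A per-batch sample counter of this kind can be sketched as below. Everything here is an assumption made for illustration: the class name `StopAfterSamples`, the batch being a dictionary of equally sized sequences, and the use of `StopIteration` to signal the end of the iteration.

```python
class StopAfterSamples:
    """Hypothetical per-batch callback: stop once enough samples were seen."""

    def __init__(self, nb_samples):
        self.nb_samples = nb_samples
        self.current_samples = 0

    def __call__(self, dataset_name, split_name, batch):
        if self.current_samples >= self.nb_samples:
            # Assumption: the caller interprets this exception as "stop iterating".
            raise StopIteration()
        # Assumption: every feature of the batch has the same number of samples.
        self.current_samples += len(next(iter(batch.values())))


stop = StopAfterSamples(nb_samples=5)
processed = 0
try:
    for batch in [{'images': [0, 1, 2]}] * 4:
        stop('dataset', 'train', batch)
        processed += 1
except StopIteration:
    pass
print(processed)
```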
class trw.callbacks.CallbackLearningRateRecorder(dirname='lr_recorder')

Bases: trw.callbacks.callback.Callback

Record the learning rate of the optimizers.

This is useful as a debugging tool.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingAugmentations(nb_samples=10, nb_augmentation=5, table_name='augmentations', split_name=None, uid_name='sample_uid')

Bases: trw.callbacks.callback.Callback

Export sample augmentations.

Augmentations are detected using the uid_name of a sample: samples with the same uid over several epochs are treated as augmentations of the same sample.

first_epoch(self, options)
create_or_recreate_table(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingBestMetrics(table_name='best_metrics', metric_to_discard=None, epoch_start=0)

Bases: trw.callbacks.callback.Callback

Report the best value of the history and epoch for each metric

This can be useful to accurately get the best value of a metric and in particular at which step it occurred.
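The idea of tracking the best value of each metric together with the epoch where it occurred can be sketched on a simplified history. The flat structure used here (one dictionary of metric values per epoch), the helper name `best_metric_values` and the "lower is better" convention are assumptions for illustration; the real history layout may differ.

```python
def best_metric_values(history, epoch_start=0):
    """Return {metric_name: (best_value, epoch)} assuming lower values are better."""
    best = {}
    for epoch, metrics in enumerate(history):
        if epoch < epoch_start:
            continue  # ignore the first epochs, mirroring the `epoch_start` argument
        for name, value in metrics.items():
            if name not in best or value < best[name][0]:
                best[name] = (value, epoch)
    return best


history = [
    {'loss': 0.9, 'error_rate': 0.40},
    {'loss': 0.5, 'error_rate': 0.25},
    {'loss': 0.6, 'error_rate': 0.20},
]
best = best_metric_values(history)
```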

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingClassificationErrors(max_samples=10, table_name='errors', loss_terms_inclusion=None, feature_exclusions=None, dataset_exclusions=None, split_exclusions=None, clear_previously_exported_samples=True, format='{dataset_name}_{split_name}_s{id}_e{epoch}', reporting_config_keep_last_n_rows=None, reporting_config_subsampling_factor=1.0)

Bases: trw.callbacks.callback.Callback

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
trw.callbacks.select_classification_errors(batch, loss_terms)
class trw.callbacks.CallbackReportingDatasetSummary(max_nb_samples=None, table_name='data_summary')

Bases: trw.callbacks.callback.Callback

Summarizes the data (min value, max value, number of batches, shapes) for each split of each dataset

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingRecordHistory(table_name='history')

Bases: trw.callbacks.callback.Callback

This callback records the history to the reporting layer

first_epoch(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingExportSamples(max_samples=50, table_name='samples', loss_terms_inclusion=None, feature_exclusions=None, dataset_exclusions=None, split_exclusions=None, clear_previously_exported_samples=True, format='{dataset_name}_{split_name}_s{id}_e{epoch}', reporting_config_keep_last_n_rows=None, reporting_config_subsampling_factor=1.0, reporting_scatter_x='split_name', reporting_scatter_y='dataset_name', reporting_color_by=None, reporting_display_with=None, reporting_binning_x_axis=None, reporting_binning_selection=None, select_sample_to_export=select_all)

Bases: trw.callbacks.callback.Callback

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingLayerStatistics(dataset_name=None, split_name=None, nb_samples=500, table_name='layer')

Bases: trw.callbacks.callback.Callback

Report the activation and gradient statistics layer by layer

first_time(self, options, datasets)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingLayerWeights(dataset_name=None, split_name=None, table_name='layer_weights')

Bases: trw.callbacks.callback.Callback

Report the weight statistics of each layer

first_time(self, options, datasets)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingModelSummary(dataset_name=None, split_name=None)

Bases: trw.callbacks.callback.Callback

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackReportingStartServer(reporting_options=create_default_reporting_options(embedded=True, config={}), show_app=True, port=0, keep_alive_until_client_disconnect=True)

Bases: trw.callbacks.callback.Callback

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackSaveLastModel(model_name='last', with_outputs=False, is_versioned=False, rolling_size=None, keep_model_with_best_metric: ModelWithLowestMetric = None, revert_if_nan_metrics: Optional[Sequence[str]] = ('loss',), post_process_outputs: Optional[Callable[[trw.basic_typing.Datasets], trw.basic_typing.Datasets]] = exclude_large_embeddings)

Bases: trw.callbacks.callback.Callback

Save the current model to disk as well as metadata (history, outputs, infos).

This callback can be used during training (e.g., checkpoint) or at the end of the training.

Optionally, record the best model for a given dataset, split, output and metric.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.ModelWithLowestMetric(dataset_name, split_name, output_name, metric_name, minimum_metric=0.2)

Bases: ModelWithLowestMetricBase

update(self, metric_value, model, metadata, root_path)

Check the metrics and export the model if thresholds are satisfied
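One plausible reading of this check is sketched below: a model is kept only when its metric is below the `minimum_metric` threshold and better than anything kept so far. The class `LowestMetricTracker` is hypothetical and stores the model in memory instead of exporting it to disk; the actual threshold logic of the library may differ.

```python
class LowestMetricTracker:
    """Hypothetical stand-in illustrating a 'keep the model with the lowest metric' rule."""

    def __init__(self, minimum_metric=0.2):
        self.minimum_metric = minimum_metric
        self.best_metric = None
        self.best_model = None

    def update(self, metric_value, model):
        """Return True when `model` becomes the new best model."""
        below_threshold = metric_value < self.minimum_metric
        improved = self.best_metric is None or metric_value < self.best_metric
        if below_threshold and improved:
            self.best_metric = metric_value
            self.best_model = model
            return True
        return False


tracker = LowestMetricTracker(minimum_metric=0.2)
decisions = [tracker.update(value, f'model_{i}') for i, value in enumerate([0.5, 0.15, 0.18, 0.1])]
```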

class trw.callbacks.CallbackSkipEpoch(nb_epochs, callbacks, include_epoch_zero=False)

Bases: trw.callbacks.callback.Callback

Run its callbacks every few epochs

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
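The scheduling rule behind "every few epochs" can be sketched as a small predicate. The function name `should_run` and the exact convention (run when the epoch index is a multiple of `nb_epochs`, with epoch zero controlled by `include_epoch_zero`) are assumptions, not the documented behavior.

```python
def should_run(epoch, nb_epochs, include_epoch_zero=False):
    """Decide whether the wrapped callbacks run at this epoch (hypothetical rule)."""
    if epoch == 0:
        return include_epoch_zero
    return epoch % nb_epochs == 0


epochs_default = [epoch for epoch in range(7) if should_run(epoch, nb_epochs=3)]
epochs_with_zero = [epoch for epoch in range(7) if should_run(epoch, nb_epochs=3, include_epoch_zero=True)]
```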
class trw.callbacks.CallbackClearTensorboardLog

Bases: CallbackTensorboardBased

Remove any existing logger

This is useful when we train multiple models so that they have their own tensorboard log file

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardBased

Bases: trw.callbacks.callback.Callback

Tensorboard based callback. Manages a single tensorboardX.SummaryWriter instance

static create_logger(path)

Create a tensorboardX.SummaryWriter instance. If an instance already exists or tensorboardX could not be imported, no logger will be created

Parameters:

  • path – where to write the tensorboard log

Returns:

a logger or None if logger creation failed

static get_tensorboard_logger()

Returns None if the tensorboard logger was not created, otherwise a tensorboardX.SummaryWriter

static remove_tensorboard_logger()

Remove the current tensorboardX.SummaryWriter

class trw.callbacks.CallbackTensorboardEmbedding(embedding_name, dataset_name=None, split_name=None, image_name=None, maximum_samples=2000, keep_features_fn=keep_small_features)

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback records the embedding to be displayed with tensorboard

Note: we must recalculate the embedding as we need to associate a specific input (i.e., we can’t store everything in memory so we need to collect what we need batch by batch)

first_time(self, datasets, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardRecordHistory

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback records the history to a tensorboard readable log

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackTensorboardRecordModel(dataset_name=None, split_name=None, onnx_folder='onnx')

Bases: trw.callbacks.callback_tensorboard.CallbackTensorboardBased

This callback will export the model to tensorboard

@TODO ONNX is probably adding hooks that are not removed. To be investigated.

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackWorstSamplesByEpoch(split_names=None, output_name=None, dataset_name=None, dirname='worst_samples_by_epoch', sort_samples_by_loss_error=True, worst_k_samples=1000, export_top_k_samples=50, uids_name=sequence_array.sample_uid_name, output_of_interest=(trw_outputs.OutputClassification, trw_outputs.OutputSegmentation, trw_outputs.OutputRegression))

Bases: trw.callbacks.callback.Callback

The purpose of this callback is to track the samples with the worst loss during the training of the model

It is useful to understand which samples are difficult (train and test splits): are they always wrong during the training, or only randomly? Are they the same samples with different models (i.e., initialization or model dependent)?

first_time(self, datasets, outputs)
static sort_split_data(errors_by_sample, worst_k_samples, discard_first_n_epochs=0)

Helper function to sort the samples

Parameters:

  • errors_by_sample – the data

  • worst_k_samples – the number of samples to select or None

  • discard_first_n_epochs – the first few epochs are typically very noisy, so don’t use these

Returns:

sorted data
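The sorting step can be sketched as follows. The layout of `errors_by_sample` is not described here, so this sketch assumes a dictionary mapping a sample uid to its list of per-epoch errors, and ranks samples by their mean error; the function name `sort_samples_by_error` is hypothetical.

```python
def sort_samples_by_error(errors_by_sample, worst_k_samples=None, discard_first_n_epochs=0):
    """Return [(uid, mean_error)] sorted from the worst to the best sample."""
    scored = []
    for uid, errors in errors_by_sample.items():
        kept = errors[discard_first_n_epochs:]  # skip the noisy first epochs
        if kept:
            scored.append((uid, sum(kept) / len(kept)))
    scored.sort(key=lambda item: item[1], reverse=True)
    if worst_k_samples is not None:
        scored = scored[:worst_k_samples]
    return scored


errors_by_sample = {
    'a': [5.0, 0.25, 0.25],
    'b': [1.0, 1.0, 0.5],
    'c': [2.0, 0.5, 0.5],
}
worst = sort_samples_by_error(errors_by_sample, worst_k_samples=2, discard_first_n_epochs=1)
```

Discarding the first epoch changes the ranking here: sample `a` has the largest error at epoch 0 but the smallest afterwards.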

export_stats(self, model, losses, datasets, datasets_infos, options, callbacks_per_batch)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackZipSources(folders_to_record, extensions=('.py', '.sh', '.bat', '.json'), filename='', max_width=200, exclusions=('.mypy_cache', '.svn', '.git', '__pycache__'))

Bases: trw.callbacks.callback.Callback

Record important info relative to the training such as the sources & configuration info

This is to make sure a result can always be easily reproduced. Any configuration option can be safely appended in options.runtime
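Archiving sources with extension and folder filters can be sketched with the standard library alone. This is a minimal sketch, not the callback's implementation: the function name `zip_sources` and its return value are hypothetical, and only the default `extensions` and `exclusions` values are taken from the constructor signature above.

```python
import os
import zipfile


def zip_sources(folder, zip_path,
                extensions=('.py', '.sh', '.bat', '.json'),
                exclusions=('.mypy_cache', '.svn', '.git', '__pycache__')):
    """Archive the files of `folder` matching `extensions`, skipping excluded folders."""
    archived = []
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, dirnames, filenames in os.walk(folder):
            # Prune excluded directories in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if d not in exclusions]
            for filename in sorted(filenames):
                if filename.endswith(tuple(extensions)):
                    full_path = os.path.join(root, filename)
                    relative_path = os.path.relpath(full_path, folder)
                    archive.write(full_path, arcname=relative_path)
                    archived.append(relative_path)
    return archived
```

Storing paths relative to the recorded folder keeps the archive independent of where the run was executed.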

__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackEarlyStopping(store: trw.hparams.RunStore, loss_fn: Callable[[trw.basic_typing.HistoryStep], float], raise_stop_fn: Optional[Callable[[float, trw.basic_typing.History], Tuple[bool, str]]] = None, checkpoints: Sequence[float] = (0.1, 0.25, 0.5, 0.75), discard_if_among_worst_X_performers: float = 0.6, only_consider_full_run: bool = True, min_number_of_runs: int = 10)

Bases: trw.callbacks.callback.Callback

Use historical runs to evaluate if a run is promising. If not, early stop will raise ExceptionAbortRun
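The decision "is this run among the worst performers?" can be sketched as a comparison against the losses of previous runs at the same checkpoint. The function below is an interpretation, not the library's logic: its name is hypothetical, lower loss is assumed to be better, and a run is considered among the worst fraction X when at least (1 - X) of the historical runs did better.

```python
def is_among_worst_performers(current_loss, historical_losses,
                              discard_if_among_worst_X_performers=0.6,
                              min_number_of_runs=10):
    """Return True if the current run should be abandoned at this checkpoint."""
    if len(historical_losses) < min_number_of_runs:
        return False  # not enough historical runs to make a reliable decision
    nb_better = sum(1 for loss in historical_losses if loss < current_loss)
    fraction_better = nb_better / len(historical_losses)
    return fraction_better >= 1.0 - discard_if_among_worst_X_performers


historical_losses = [0.1 * i for i in range(1, 11)]  # losses of 10 previous runs
abandon_poor_run = is_among_worst_performers(0.95, historical_losses)
abandon_good_run = is_among_worst_performers(0.15, historical_losses)
abandon_without_history = is_among_worst_performers(0.95, historical_losses[:3])
```

In a callback, a positive decision would be turned into an exception such as the ExceptionAbortRun mentioned above so that the training loop can terminate the run.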

_initialize(self, num_epochs: int) None
__call__(self, options, history: trw.basic_typing.History, model, **kwargs)
class trw.callbacks.CallbackReportingLearningRateRecorder

Bases: trw.callbacks.callback.Callback

Report the weight statistics of each layer

first_time(self, options)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)
class trw.callbacks.CallbackProfiler(dataset_name=None, split_name=None, table_name='model_profiler', with_preprocessed_batch=False, schedule_kwargs=None)

Bases: trw.callbacks.callback.Callback

Run the torch.profiler while training the model

A profiler log will be created in the folder <output_root>/static/<table_name>

To visualize the output:

  • pip install torch_tb_profiler

  • tensorboard --logdir=<output_root>/static/model_profiler

  • in a browser: http://localhost:6006/#pytorch_profiler

Alternatively, traces can be partially loaded using Chrome:

  • open Chrome and open the page: chrome://tracing

  • load the trace chrome_trace.json

first_time(self, options, datasets)
__call__(self, options, history, model_orig, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)