trw.callbacks.callback_worst_samples_by_epoch

Module Contents

Classes

CallbackWorstSamplesByEpoch

The purpose of this callback is to track the samples with the worst loss during the training of the model

Functions

get_first_output_of_interest(outputs, dataset_name, split_name, output_of_interest)

Return the first output of interest of a given dataset name

export_samples_v2(dataset_name, split_name, device, split, model, losses, root, datasets_infos, max_samples, callbacks_per_batch)

Attributes

logger

trw.callbacks.callback_worst_samples_by_epoch.logger
trw.callbacks.callback_worst_samples_by_epoch.get_first_output_of_interest(outputs, dataset_name, split_name, output_of_interest)

Return the first output of interest of a given dataset name

Parameters
  • outputs – a dictionary (datasets) of dictionaries (splits) of dictionaries (outputs)

  • dataset_name – the dataset to consider. If None, the first dataset is considered

  • split_name – the split name to consider. If None, the first split is selected

  • output_of_interest – the output to consider

Returns:
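As an illustration of the lookup described by the parameters above, here is a minimal sketch. It is not the library's implementation: the function name first_matching_output, the placeholder classes, and the choice to return the output's name (the source leaves the return value undocumented) are all assumptions made for this example.

```python
def first_matching_output(outputs, dataset_name, split_name, types_of_interest):
    """Hypothetical stand-in for the documented lookup (not the trw code).

    `outputs` is a dictionary (datasets) of dictionaries (splits) of
    dictionaries (outputs). Returning the output *name* is an assumption.
    """
    if not outputs:
        return None

    # If no dataset is given, fall back to the first one.
    if dataset_name is None:
        dataset_name = next(iter(outputs))
    splits = outputs.get(dataset_name)
    if not splits:
        return None

    # If no split is given, fall back to the first one.
    if split_name is None:
        split_name = next(iter(splits))
    split_outputs = splits.get(split_name)
    if not split_outputs:
        return None

    # Return the first output whose type is one of the types of interest.
    for name, value in split_outputs.items():
        if isinstance(value, types_of_interest):
            return name
    return None


# Placeholder output types, standing in for classes such as
# trw_outputs.OutputClassification (assumption for illustration only).
class Classification:
    pass


class Embedding:
    pass


outputs = {
    'mnist': {
        'train': {'features': Embedding(), 'digit': Classification()},
    },
}
result = first_matching_output(outputs, None, None, (Classification,))
```

Passing None for both the dataset and the split selects the first entry at each level, so the sketch scans the outputs of 'mnist' / 'train' and stops at the first match.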

trw.callbacks.callback_worst_samples_by_epoch.export_samples_v2(dataset_name, split_name, device, split, model, losses, root, datasets_infos, max_samples, callbacks_per_batch)
class trw.callbacks.callback_worst_samples_by_epoch.CallbackWorstSamplesByEpoch(split_names=None, output_name=None, dataset_name=None, dirname='worst_samples_by_epoch', sort_samples_by_loss_error=True, worst_k_samples=1000, export_top_k_samples=50, uids_name=sequence_array.sample_uid_name, output_of_interest=(trw_outputs.OutputClassification, trw_outputs.OutputSegmentation, trw_outputs.OutputRegression))

Bases: trw.callbacks.callback.Callback

The purpose of this callback is to track the samples with the worst loss during the training of the model

It is useful to understand which samples are difficult (in both the train and test splits): are they consistently wrong throughout training, or does the error vary randomly? Are the same samples difficult for different models (i.e., is the difficulty initialization or model dependent)?
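One way to picture the "consistently difficult or random?" question is to count how often each sample lands among the worst k losses across epochs. The sketch below is only an illustration of that idea, not what the callback does internally; the function worst_k_frequency and the per-epoch dictionary layout (sample uid to loss) are hypothetical.

```python
from collections import defaultdict


def worst_k_frequency(losses_by_epoch, k):
    """Fraction of epochs in which each sample is among the k worst losses.

    `losses_by_epoch` is assumed to be a list with one dict per epoch,
    mapping a sample uid to its loss (hypothetical layout).
    """
    counts = defaultdict(int)
    for epoch_losses in losses_by_epoch:
        # Sort uids by decreasing loss and keep the k worst for this epoch.
        worst = sorted(epoch_losses, key=epoch_losses.get, reverse=True)[:k]
        for uid in worst:
            counts[uid] += 1
    nb_epochs = len(losses_by_epoch)
    return {uid: count / nb_epochs for uid, count in counts.items()}


losses_by_epoch = [
    {'a': 2.0, 'b': 0.1, 'c': 0.5},
    {'a': 1.8, 'b': 0.9, 'c': 0.2},
    {'a': 1.9, 'b': 0.1, 'c': 0.3},
]
frequency = worst_k_frequency(losses_by_epoch, k=1)
```

A frequency close to 1.0 suggests a sample that is difficult throughout training, while low and scattered frequencies suggest errors that vary from epoch to epoch. Running the same count for models with different initializations would address the second question.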

first_time(self, datasets, outputs)
static sort_split_data(errors_by_sample, worst_k_samples, discard_first_n_epochs=0)

Helper function to sort the samples

Parameters
  • errors_by_sample – the data

  • worst_k_samples – the number of samples to select or None

  • discard_first_n_epochs – the number of initial epochs to ignore; the first few epochs are typically very noisy, so they are not used

Returns

sorted data
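The sorting step can be sketched as follows. This is a hedged illustration rather than the actual sort_split_data code: the source only describes errors_by_sample as "the data", so the layout used here (sample uid mapped to a list of per-epoch losses), the ranking by summed loss, and the name rank_worst_samples are all assumptions.

```python
def rank_worst_samples(errors_by_sample, worst_k_samples=None, discard_first_n_epochs=0):
    """Rank samples by accumulated loss, worst first (illustrative only).

    `errors_by_sample` is assumed to map a sample uid to a list of losses,
    one per epoch. The real data layout in trw may differ.
    """
    totals = []
    for uid, losses in errors_by_sample.items():
        # Ignore the first (typically noisy) epochs.
        kept = losses[discard_first_n_epochs:]
        totals.append((uid, sum(kept)))

    # Largest accumulated loss first.
    totals.sort(key=lambda item: item[1], reverse=True)

    # None means "keep every sample" in this sketch.
    if worst_k_samples is not None:
        totals = totals[:worst_k_samples]
    return totals


errors_by_sample = {
    'uid_0': [5.0, 0.2, 0.1],
    'uid_1': [1.0, 0.9, 0.8],
    'uid_2': [0.5, 0.4, 0.3],
}
ranked = rank_worst_samples(errors_by_sample, worst_k_samples=2, discard_first_n_epochs=1)
```

In this toy data, 'uid_0' has a large loss only in the first epoch, so discarding that epoch removes it from the worst samples, which is the motivation given for discard_first_n_epochs.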

export_stats(self, model, losses, datasets, datasets_infos, options, callbacks_per_batch)
__call__(self, options, history, model, losses, outputs, datasets, datasets_infos, callbacks_per_batch, **kwargs)