trw.train.analysis_plots

Defines the main plots and reports used for the analysis of our models

Module Contents

Functions

fig_tight_layout(fig)

Adjust the figure layout so that its elements do not overlap

auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) → float

Calculate the area under the curve of the ROC plot (AUROC)

gallery(images_y_then_x: List[List[numpy.ndarray]], x_axis_text: List[str], y_axis_text: List[str], title: Optional[str] = None, save_path: Optional[str] = None, dpi: Optional[int] = None)

Create a gallery of images

export_figure(path, name, maximum_length=259, dpi=None)

Export a figure

boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories, e.g., compare two configurations: which one has the best results for a given measure?

plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)

Summarizes the important statistics for a classification problem

plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) → None

Plot groups of histories

confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) → None

Plot the confusion matrix of a predicted class versus the true class

Attributes

logger

trw.train.analysis_plots.logger

trw.train.analysis_plots.fig_tight_layout(fig)

Adjust the figure layout so that its elements do not overlap

trw.train.analysis_plots.auroc(trues: numpy.ndarray, found_1_scores: numpy.ndarray) → float

Calculate the area under the curve of the ROC plot (AUROC)

Parameters
  • trues – the expected class

  • found_1_scores – the score found for class 1. Must be a numpy array of floats

Returns

the AUROC
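
As an illustration of the quantity being returned, the AUROC can be read as the probability that a randomly chosen class 1 sample receives a higher score than a randomly chosen class 0 sample. The sketch below computes it that way in plain Python. The name `auroc_sketch` and the sample data are hypothetical, and this is not necessarily how `trw.train.analysis_plots.auroc` is implemented.

```python
from typing import Sequence


def auroc_sketch(trues: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(trues, scores) if t == 1]
    negatives = [s for t, s in zip(trues, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")

    # A pair is a win when the positive outscores the negative; ties count half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: expected classes and the scores found for class 1.
trues = [0, 0, 1, 1]
found_1_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc_sketch(trues, found_1_scores))  # 0.75
```

The pairwise loop is quadratic in the number of samples, so it is only meant to make the definition concrete on small inputs.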

trw.train.analysis_plots.gallery(images_y_then_x: List[List[numpy.ndarray]], x_axis_text: List[str], y_axis_text: List[str], title: Optional[str] = None, save_path: Optional[str] = None, dpi: Optional[int] = None)

Create a gallery of images

Parameters
  • images_y_then_x – an array of y * x images

  • x_axis_text – the text for each x

  • y_axis_text – the text for each y

  • title – the title of the gallery

  • save_path – where to save the figure

  • dpi – dpi of the figure

Returns

a figure

trw.train.analysis_plots.export_figure(path, name, maximum_length=259, dpi=None)

Export a figure

Parameters
  • path – the folder where to export the figure

  • name – the name of the figure.

  • maximum_length – the maximum length of the full path of a figure. If the full path name is longer than maximum_length, the name will be subsampled to the maximum allowed length

  • dpi – Dots Per Inch: the density of the figure

trw.train.analysis_plots.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)

Compare different histories, e.g., compare two configurations: which one has the best results for a given measure?

Parameters
  • export_path – where to export the figure

  • features_trials – a dictionary of lists, each list holding the trials of one feature

  • title – the title of the plot

  • ylabel – the label for axis y

  • xlabel – the label for axis x

  • meanline – if True, draw a line from the center of the plot for each history name to the next

  • maximum_chars_per_line – the maximum number of characters allowed per line of the title. If exceeded, a new line will be created

  • plot_trials – if True, each trial of a feature will be plotted

  • scale – the axis scale to be used

  • y_range – if not None, the (min, max) of the y-axis

  • rotate_x – if not None, the rotation of the x axis labels in degrees

  • showfliers – if True, plot the outliers

  • title_line_height – the height of the title lines
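
To make the expected input concrete, the sketch below builds a `features_trials` dictionary with one list of trial results per feature, and wraps a long title so that no line exceeds `maximum_chars_per_line`. The data, the helper name `wrap_title`, and the use of `textwrap` are assumptions for illustration. The library may break title lines differently.

```python
import statistics
import textwrap

# Hypothetical input: each key is a feature (e.g., a configuration),
# each list holds the results of its trials.
features_trials = {
    "baseline": [0.81, 0.79, 0.83],
    "augmented": [0.86, 0.88, 0.84],
}


def wrap_title(title: str, maximum_chars_per_line: int = 50) -> str:
    """Insert newlines so that no title line exceeds the given width."""
    return "\n".join(textwrap.wrap(title, width=maximum_chars_per_line))


# The median is the central line a box plot draws for each feature.
medians = {name: statistics.median(trials) for name, trials in features_trials.items()}
print(medians)
print(wrap_title("Validation accuracy of the baseline and augmented configurations", 30))
```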

trw.train.analysis_plots.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)

Calculate the ROC and AUC of a binary classifier

Supports multiple ROC curves.

Parameters
  • export_path – the folder where the plot will be exported

  • trues – the expected class. Can be a list for multiple ROC curves

  • found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves

  • title – the title of the ROC

  • label_name – the name of the ROC curve. Can be a list for multiple ROC curves

  • colors – if None use default colors. Else, a numpy array of dim (Nx3) where N is the number of colors. Must be in [0..1] range
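
The curve itself can be sketched without any plotting: sweep a decision threshold over the scores and record the false positive rate and true positive rate at each step. The code below is a minimal plain-Python illustration under that assumption. `roc_points` and `trapezoid_area` are hypothetical helper names, not part of `trw`.

```python
from typing import List, Sequence, Tuple


def roc_points(trues: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, starting at (0, 0)."""
    nb_pos = sum(1 for t in trues if t == 1)
    nb_neg = len(trues) - nb_pos
    if nb_pos == 0 or nb_neg == 0:
        raise ValueError("a ROC curve needs at least one sample of each class")

    points = [(0.0, 0.0)]
    # Lower the threshold step by step: samples scored >= threshold are called class 1.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(trues, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(trues, scores) if s >= threshold and t != 1)
        points.append((fp / nb_neg, tp / nb_pos))
    return points


def trapezoid_area(points: List[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data for a single curve.
points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(points)
print(trapezoid_area(points))
```

For multiple ROC curves, the same computation would be repeated once per (trues, found_scores_1) pair.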

trw.train.analysis_plots.list_classes_from_mapping(mappinginv: Optional[collections.Mapping], default_name: str = 'unknown')

Create a contiguous list of label names ordered from 0..N from the class mapping

Parameters
  • mappinginv – a dictionary-like structure encoded as (class id, class_name)

  • default_name – if there is no class name, use this as default

Returns

a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None
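
The behavior described above can be sketched as follows. This is a re-implementation for illustration only, assuming integer class ids and that N is the largest id present in the mapping. The name `list_classes_sketch` is hypothetical.

```python
from typing import List, Mapping, Optional


def list_classes_sketch(
    mappinginv: Optional[Mapping[int, str]],
    default_name: str = "unknown",
) -> Optional[List[str]]:
    """Build a contiguous list of class names indexed by class id."""
    if mappinginv is None:
        return None
    # Assumption: the list covers ids 0..max(id); missing ids get the default name.
    nb_classes = max(mappinginv.keys(), default=-1) + 1
    return [mappinginv.get(class_id, default_name) for class_id in range(nb_classes)]


# Class id 1 has no name, so the gap is filled with the default.
print(list_classes_sketch({0: "cat", 2: "dog"}))  # ['cat', 'unknown', 'dog']
print(list_classes_sketch(None))  # None
```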

trw.train.analysis_plots.classification_report(predictions: numpy.ndarray, prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: Optional[collections.Mapping] = None)

Summarizes the important statistics for a classification problem

Parameters
  • predictions – the classes predicted

  • prediction_scores – the scores for each class, for each sample

  • trues – the true class for each sample

  • class_mapping – the class mapping (class id, class name)

Returns

a dictionary of statistics or sub-report

trw.train.analysis_plots.plot_group_histories(root: str, history_values: List[List[Tuple[int, numbers.Number]]], title: str, xlabel: str, ylabel: str, max_nb_plots_per_group: int = 5, colors: Sequence[tuple] = utilities.make_unique_colors_f()) → None

Plot groups of histories

Parameters
  • root – the directory where the plot will be exported

  • history_values – a map of list of list of (epoch, value)

  • title – the title of the graph

  • xlabel – the x label

  • ylabel – the y label

  • max_nb_plots_per_group – the maximum number of plots per group

  • colors – the colors to be used
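
The shape of a history is easiest to see on a small example. The sketch below follows the type annotation in the signature (a list of histories, each a list of (epoch, value) pairs) and separates one history into the x and y series a plot would need. The data and the helper `split_history` are hypothetical and not part of `trw`.

```python
from typing import List, Tuple

History = List[Tuple[int, float]]

# Hypothetical data: two runs, each recorded as (epoch, value) pairs.
history_values: List[History] = [
    [(0, 0.90), (1, 0.60), (2, 0.45)],
    [(0, 0.95), (1, 0.70), (2, 0.50)],
]


def split_history(history: History) -> Tuple[List[int], List[float]]:
    """Separate a history into its epochs (x axis) and values (y axis)."""
    epochs = [epoch for epoch, _ in history]
    values = [value for _, value in history]
    return epochs, values


for run_index, history in enumerate(history_values):
    epochs, values = split_history(history)
    print(f"run {run_index}: epochs={epochs}, values={values}")
```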

trw.train.analysis_plots.confusion_matrix(export_path: str, classes_predictions: numpy.ndarray, classes_trues: numpy.ndarray, classes: Sequence[str] = None, normalize: bool = False, title: str = 'Confusion matrix', cmap=plt.cm.Greens, display_numbers: bool = True, maximum_chars_per_line: int = 50, rotate_x: Optional[int] = None, rotate_y: Optional[int] = None, display_names_x: bool = True, sort_by_decreasing_sample_size: bool = True, excludes_classes_with_samples_less_than: bool = None, main_font_size: int = 16, sub_font_size: int = 8, normalize_unit_percentage: bool = False, max_size_x_label: int = 10) → None

Plot the confusion matrix of a predicted class versus the true class

Parameters
  • export_path – the folder where the confusion matrix will be exported

  • classes_predictions – the classes that were predicted by the classifier

  • classes_trues – the true classes

  • classes – a list of labels. Label 0 for class 0, label 1 for class 1…

  • normalize – if True, the confusion matrix will be normalized to 1.0 per row

  • title – the title of the plot

  • cmap – the color map to use

  • display_numbers – if True, display the numbers within each cell of the confusion matrix

  • maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues

  • rotate_x – if not None, indicates the rotation of the label on x axis

  • rotate_y – if not None, indicates the rotation of the label on y axis

  • display_names_x – if True, the class name, if specified, will also be displayed on the x axis

  • sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show if the errors may be due to low number of samples

  • excludes_classes_with_samples_less_than – if not None, the classes with less than excludes_classes_with_samples_less_than samples will be excluded

  • normalize_unit_percentage – if True, use 100% base as unit instead of 1.0

  • main_font_size – the font size of the text

  • sub_font_size – the font size of the sub-elements (e.g., ticks)

  • max_size_x_label – the maximum length of a label on the x-axis
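
The effect of `normalize`, `normalize_unit_percentage` and `sort_by_decreasing_sample_size` can be illustrated on raw counts. The sketch below is a plain-Python approximation, assuming rows index the true class and columns the predicted class. That orientation, the helper names, and the data are assumptions, not the library's documented behavior.

```python
from typing import List, Sequence


def confusion_counts(predictions: Sequence[int], trues: Sequence[int], nb_classes: int) -> List[List[int]]:
    """Count samples per (true class, predicted class) cell."""
    matrix = [[0] * nb_classes for _ in range(nb_classes)]
    for predicted, true in zip(predictions, trues):
        matrix[true][predicted] += 1
    return matrix


def normalize_rows(matrix: List[List[int]], unit: float = 1.0) -> List[List[float]]:
    """Scale each row so that it sums to `unit` (1.0, or 100.0 for percentages)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([unit * value / total if total else 0.0 for value in row])
    return normalized


# Hypothetical predictions and true classes for a 3-class problem.
classes_predictions = [0, 0, 1, 1, 1, 2]
classes_trues = [0, 1, 1, 1, 2, 2]
counts = confusion_counts(classes_predictions, classes_trues, nb_classes=3)

# Class ids ordered by decreasing number of true samples.
order = sorted(range(len(counts)), key=lambda c: sum(counts[c]), reverse=True)

print(counts)
print(normalize_rows(counts, unit=100.0))
print(order)
```

Sorting by sample size places the best represented classes first, which makes it easier to check whether errors concentrate in classes with few samples.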