trw.train.analysis_plots
Defines the main plots and reports used for the analysis of our models
Module Contents
Functions
fig_tight_layout | Adjust the figure layout so that its elements do not overlap
auroc | Calculate the area under the curve of the ROC plot (AUROC)
export_figure | Export a figure
boxplots | Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?
plot_roc | Calculate the ROC and AUC of a binary classifier
list_classes_from_mapping | Create a contiguous list of label names ordered from 0..N from the class mapping
classification_report | Summarizes the important statistics for a classification problem
plot_group_histories | Plot groups of histories
confusion_matrix | Plot the confusion matrix of a predicted class versus the true class
Attributes
- trw.train.analysis_plots.logger
- trw.train.analysis_plots.fig_tight_layout(fig)
Adjust the figure layout so that its elements do not overlap
- trw.train.analysis_plots.auroc(trues, found_1_scores)
Calculate the area under the curve of the ROC plot (AUROC)
- Parameters
trues – the expected class
found_1_scores – the score found for the class 1. Must be a numpy array of floats
- Returns
the AUROC
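As an illustration of what the returned value measures: the AUROC equals the probability that a randomly chosen sample of class 1 receives a higher score than a randomly chosen sample of class 0, with ties counting as one half. The sketch below computes this in plain Python. The name `auroc_sketch` is hypothetical, and this is not the implementation used by trw.

```python
def auroc_sketch(trues, found_1_scores):
    """Pairwise (rank-based) AUROC. Hypothetical helper, O(P * N) for clarity."""
    positives = [s for t, s in zip(trues, found_1_scores) if t == 1]
    negatives = [s for t, s in zip(trues, found_1_scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


trues = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
# 3 of the 4 (positive, negative) pairs are ranked correctly
print(auroc_sketch(trues, scores))
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect separation of the two classes.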
- trw.train.analysis_plots.export_figure(path, name, maximum_length=259, dpi=300)
Export a figure
- Parameters
path – the folder where to export the figure
name – the name of the figure.
maximum_length – the maximum length of the full path of a figure. If the full path is longer than maximum_length, the name will be subsampled to the maximum allowed length
dpi – Dots Per Inch: the density of the figure
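The effect of maximum_length can be pictured with a small sketch. How trw actually shortens the name is not described here, so the version below simply truncates the end of the name. The helper name `shorten_figure_name` and the `.png` extension are assumptions made for illustration only.

```python
import os


def shorten_figure_name(path, name, maximum_length=259, extension=".png"):
    """Hypothetical helper: trim `name` so the full path fits in `maximum_length`."""
    full_path = os.path.join(path, name + extension)
    overflow = len(full_path) - maximum_length
    if overflow <= 0:
        return name                      # already short enough
    if overflow >= len(name):
        raise ValueError("the folder path alone exceeds maximum_length")
    return name[:len(name) - overflow]   # assumption: drop trailing characters


long_name = "a" * 300
short_name = shorten_figure_name("out", long_name)
print(len(os.path.join("out", short_name + ".png")))
```

The default of 259 characters plausibly reflects the classic Windows path length limit, although the source does not state this.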
- trw.train.analysis_plots.boxplots(export_path, features_trials, title, xlabel, ylabel, meanline=False, plot_trials=True, scale='linear', y_range=None, rotate_x=None, showfliers=False, maximum_chars_per_line=50, title_line_height=0.055)
Compare different histories: e.g., compare 2 configurations, which one has the best results for a given measure?
- Parameters
export_path – where to export the figure
features_trials – a dictionary of lists, each list representing a feature
title – the title of the plot
ylabel – the label for axis y
xlabel – the label for axis x
meanline – if True, draw a line from the center of the plot for each history name to the next
plot_trials – if True, each trial of a feature will be plotted
scale – the axis scale to be used
y_range – if not None, the (min, max) of the y-axis
rotate_x – if not None, the rotation of the x axis labels in degrees
showfliers – if True, plot the outliers
maximum_chars_per_line – the maximum number of characters per line of the title. If exceeded, the title continues on a new line
title_line_height – the height of the title lines
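To make the expected input more concrete, here is a minimal sketch of a features_trials dictionary together with the summary statistics a box plot typically displays. The layout (name mapped to a list of trial values), the configuration names, and the helper `five_number_summary` are assumptions for illustration. The quartile convention used by the actual plot may differ.

```python
import statistics

# Assumed layout: feature (or configuration) name -> list of trial values.
features_trials = {
    "lr=0.1": [0.81, 0.79, 0.84, 0.80, 0.83],
    "lr=0.01": [0.88, 0.86, 0.90, 0.87, 0.89],
}


def five_number_summary(values):
    """Hypothetical helper: min, quartiles and max of one feature's trials."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


summaries = {name: five_number_summary(v) for name, v in features_trials.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Comparing the medians and the spread of each list is the question the plot is meant to answer visually.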
- trw.train.analysis_plots.plot_roc(export_path, trues, found_scores_1, title, label_name=None, colors=None)
Calculate the ROC and AUC of a binary classifier
Supports multiple ROC curves.
- Parameters
export_path – the folder where the plot will be exported
trues – the expected class. Can be a list for multiple ROC curves
found_scores_1 – the score found for the prediction of class 1. Must be a numpy array of floats. Can be a list for multiple ROC curves
title – the title of the ROC
label_name – the name of the ROC curve. Can be a list for multiple ROC curves
colors – if None, use the default colors. Otherwise, a numpy array of shape (N, 3), where N is the number of colors. Values must be in the [0..1] range
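For readers unfamiliar with how a ROC curve is obtained from trues and found_scores_1, the following plain-Python sketch sweeps a decision threshold from the highest score to the lowest and records the (false positive rate, true positive rate) pairs. The names `roc_points` and `trapezoid_auc` are hypothetical. This is one common construction, not necessarily the one trw uses internally.

```python
def roc_points(trues, scores):
    """Hypothetical helper: (FPR, TPR) points of the ROC curve, threshold high to low."""
    n_pos = sum(1 for t in trues if t == 1)
    n_neg = len(trues) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    pairs = sorted(zip(scores, trues), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(points)
print(trapezoid_auc(points))
```

For multiple ROC curves, the same computation would be repeated for each (trues, scores) pair in the lists.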
- trw.train.analysis_plots.list_classes_from_mapping(mappinginv: collections.Mapping, default_name='unknown')
Create a contiguous list of label names ordered from 0..N from the class mapping
- Parameters
mappinginv – a dictionary-like structure encoded as (class id, class_name)
default_name – if there is no class name, use this as default
- Returns
a list of class names ordered from class id = 0 to class id = N. If mappinginv is None, returns None
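The documented behavior can be sketched in a few lines. The function below is a hypothetical re-implementation written from the description above (`list_classes_sketch` is not part of trw). The handling of an empty mapping is an assumption.

```python
def list_classes_sketch(mappinginv, default_name="unknown"):
    """Hypothetical sketch: class names ordered by class id, gaps filled with a default."""
    if mappinginv is None:
        return None
    if not mappinginv:
        return []  # assumption: an empty mapping yields an empty list
    nb_classes = max(mappinginv.keys()) + 1
    return [mappinginv.get(class_id, default_name) for class_id in range(nb_classes)]


# Class id 1 has no name, so the default is used to keep the list contiguous.
print(list_classes_sketch({0: "cat", 2: "dog"}))
```

Keeping the list contiguous means that the class id can be used directly as an index into it, for example when labelling the axes of a confusion matrix.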
- trw.train.analysis_plots.classification_report(prediction_scores: numpy.ndarray, trues: collections.Sequence, class_mapping: collections.Mapping = None)
Summarizes the important statistics for a classification problem
- Parameters
prediction_scores – the scores for each class, for each sample
trues – the true class for each sample
class_mapping – the class mapping (class id, class name)
- Returns
a dictionary of statistics or sub-reports
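As a rough picture of the kind of statistics such a report can contain, the sketch below derives the predicted class of each sample from its scores and computes the accuracy. It assumes prediction_scores has one row per sample and one column per class. The function name and the dictionary keys are hypothetical, since the actual keys returned by trw are not listed here.

```python
def classification_summary_sketch(prediction_scores, trues, class_mapping=None):
    """Hypothetical sketch of a minimal classification report."""
    # Predicted class = index of the highest score in each row.
    predictions = [max(range(len(row)), key=row.__getitem__) for row in prediction_scores]
    correct = sum(1 for p, t in zip(predictions, trues) if p == t)
    report = {
        "accuracy": correct / len(trues),
        "predictions": predictions,
    }
    if class_mapping is not None:
        # Optional (class id, class name) mapping for readable output.
        report["predicted_names"] = [class_mapping.get(p, "unknown") for p in predictions]
    return report


scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
trues = [0, 1, 1]
print(classification_summary_sketch(scores, trues, {0: "healthy", 1: "diseased"}))
```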
- trw.train.analysis_plots.plot_group_histories(root, history_values, title, xlabel, ylabel, max_nb_plots_per_group=5, colors=utilities.make_unique_colors_f())
Plot groups of histories
- Parameters
root – the directory where the plot will be exported
history_values – a map of lists of lists of (epoch, value)
title – the title of the graph
xlabel – the x label
ylabel – the y label
max_nb_plots_per_group – the maximum number of plots per group
colors – the colors to be used
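The nested structure expected for history_values can be hard to picture, so here is a minimal sketch of one plausible layout: group name, then a list of histories, each history being a list of (epoch, value) pairs. The group name, the values, and the helper `split_history` are assumptions for illustration only.

```python
# Assumed layout: group name -> list of histories -> list of (epoch, value).
history_values = {
    "loss": [
        [(0, 1.00), (1, 0.62), (2, 0.41)],  # first run
        [(0, 1.10), (1, 0.70), (2, 0.45)],  # second run
    ],
}


def split_history(history):
    """Hypothetical helper: separate a history into x (epochs) and y (values) series."""
    epochs = [epoch for epoch, _ in history]
    values = [value for _, value in history]
    return epochs, values


for group_name, histories in history_values.items():
    for run_index, history in enumerate(histories):
        epochs, values = split_history(history)
        print(group_name, run_index, epochs, values)
```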
- trw.train.analysis_plots.confusion_matrix(export_path, classes_predictions, classes_trues, classes: list = None, normalize=False, title='Confusion matrix', cmap=plt.cm.plasma, display_numbers=True, maximum_chars_per_line=50, rotate_x=None, rotate_y=None, display_names_x=True, sort_by_decreasing_sample_size=True, excludes_classes_with_samples_less_than=None, main_font_size=16, sub_font_size=8, normalize_unit_percentage=False)
Plot the confusion matrix of a predicted class versus the true class
- Parameters
export_path – the folder where the confusion matrix will be exported
classes_predictions – the classes that were predicted by the classifier
classes_trues – the true classes
classes – a list of labels. Label 0 for class 0, label 1 for class 1…
normalize – if True, the confusion matrix will be normalized to 1.0 per row
title – the title of the plot
cmap – the color map to use
display_numbers – if True, display the numbers within each cell of the confusion matrix
maximum_chars_per_line – the title will be split every maximum_chars_per_line characters to avoid display issues
rotate_x – if not None, the rotation of the labels on the x axis
rotate_y – if not None, the rotation of the labels on the y axis
display_names_x – if True, the class name, if specified, will also be displayed on the x axis
sort_by_decreasing_sample_size – if True, the confusion matrix will be sorted by decreasing number of samples. This can be useful to show whether the errors may be due to a low number of samples
excludes_classes_with_samples_less_than – if not None, the classes with fewer than excludes_classes_with_samples_less_than samples will be excluded
normalize_unit_percentage – if True, use 100% base as unit instead of 1.0
main_font_size – the font size of the text
sub_font_size – the font size of the sub-elements (e.g., ticks)
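The quantities displayed by the plot can be illustrated without any plotting code. The sketch below counts (true class, predicted class) pairs and then applies the per-row normalization described by normalize and normalize_unit_percentage. The helper names are hypothetical, and the orientation (rows for true classes, columns for predicted classes) is an assumption.

```python
def confusion_counts(classes_predictions, classes_trues, nb_classes):
    """Hypothetical helper: matrix[true][predicted] = number of samples."""
    matrix = [[0] * nb_classes for _ in range(nb_classes)]
    for predicted, true in zip(classes_predictions, classes_trues):
        matrix[true][predicted] += 1
    return matrix


def normalize_rows(matrix, unit=1.0):
    """Scale each row so that it sums to `unit` (1.0, or 100.0 for percentages)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([unit * v / total if total else 0.0 for v in row])
    return normalized


predictions = [0, 1, 1, 1]
trues = [0, 0, 1, 1]
counts = confusion_counts(predictions, trues, nb_classes=2)
print(counts)
print(normalize_rows(counts))
print(normalize_rows(counts, unit=100.0))
```

With per-row normalization, each row reads as the distribution of predictions for one true class, which makes classes with very different sample counts comparable.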