`trw.hparams.interpret_params`

Module Contents

Functions

`is_discrete`(values)	Test if a list of values is discrete or continuous
`median_by_category`(categories, values)	Calculate the median for each categorical attribute
`_plot_scatter`(plot_name, x_values, x_name, y_values, y_name, discrete_random_jitter=0.2, x_ticks=None, y_ticks=None, median_max_num=20)	scatter plot with optional named ticks (x, y) and median display (x, y)
`_plot_importance`(plot_name, x_names, y_values, y_name, y_errors=None, x_name='hyper-parameters')
`_plot_param_covariance`(plot_name, x_name, x_values, y_name, y_values, xy_values, discrete_random_jitter=0.2, x_ticks=None, y_ticks=None)
`discretize`(values)	Map strings to integers and record the mapping
`analyse_hyperparameters`(hprams_path_pattern, output_path, hparams_to_visualize=None, params_forest_n_estimators=5000, params_forest_max_features_ratio=0.6, top_k_covariance=5, create_graphs=True, verbose=True, dpi=300)	Hyper-parameter importance estimation using random forest regressors

Attributes

logger

trw.hparams.interpret_params.logger

trw.hparams.interpret_params.is_discrete(values)

Test if a list of values is discrete or continuous

Parameters

values – the list to test

Returns

True if the values are discrete, False otherwise

trw.hparams.interpret_params.median_by_category(categories, values)

Calculate the median for each categorical attribute

Parameters

categories – the categories
values – the values

Returns

list of tuple (category, median value)
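The grouping described above can be sketched in a few lines of standard-library Python. This is only an illustration of the documented behaviour, not the library's implementation: the name `median_by_category_sketch`, the length check, and the sorted output order are assumptions.

```python
from collections import defaultdict
from statistics import median


def median_by_category_sketch(categories, values):
    """Group values by category and return a list of (category, median) tuples."""
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")

    grouped = defaultdict(list)
    for category, value in zip(categories, values):
        grouped[category].append(value)

    # Sorting by category makes the output deterministic. This ordering is an
    # assumption; the ordering used by the real function is not documented.
    return [(category, median(grouped[category])) for category in sorted(grouped)]


# Hypothetical data: one optimizer name and one loss per run
optimizers = ["adam", "sgd", "adam", "sgd", "adam"]
losses = [0.30, 0.50, 0.20, 0.70, 0.40]
print(median_by_category_sketch(optimizers, losses))
```

Each category appears once in the result, paired with the median of the values recorded for it.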

trw.hparams.interpret_params._plot_scatter(plot_name, x_values, x_name, y_values, y_name, discrete_random_jitter=0.2, x_ticks=None, y_ticks=None, median_max_num=20)

scatter plot with optional named ticks (x, y) and median display (x, y)

trw.hparams.interpret_params._plot_importance(plot_name, x_names, y_values, y_name, y_errors=None, x_name='hyper-parameters')

trw.hparams.interpret_params._plot_param_covariance(plot_name, x_name, x_values, y_name, y_values, xy_values, discrete_random_jitter=0.2, x_ticks=None, y_ticks=None)

trw.hparams.interpret_params.discretize(values)

Map strings to integers and record the mapping

Returns

(values, mapping)
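As a rough illustration of what mapping strings to integers while recording the mapping can look like, here is a minimal sketch. The name `discretize_sketch`, the direction of the mapping (string to integer), and the assignment of codes in order of first appearance are all assumptions, not the library's documented behaviour.

```python
def discretize_sketch(values):
    """Replace each distinct string with an integer code; return (codes, mapping)."""
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            # Assumption: codes are assigned in order of first appearance
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


# Hypothetical categorical hyper-parameter values
codes, mapping = discretize_sketch(["relu", "tanh", "relu", "elu"])
print(codes)
print(mapping)
```

Keeping the mapping alongside the integer codes makes it possible to label plot ticks with the original names later on.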

trw.hparams.interpret_params.analyse_hyperparameters(hprams_path_pattern, output_path, hparams_to_visualize=None, params_forest_n_estimators=5000, params_forest_max_features_ratio=0.6, top_k_covariance=5, create_graphs=True, verbose=True, dpi=300)

Hyper-parameter importance estimation using random forest regressors

In simulations, the ordering of hyper-parameter importance is correct, but the importance values themselves may be overestimated (for the best hyper-parameter) and underestimated (for the others).

The scatter plot for each hyper-parameter is useful for understanding in which direction the hyper-parameter should be modified.

The covariance plot can be used to understand the relationship between the most important hyper-parameters.

WARNING: [1] With correlated features, strong features can end up with low scores, and the method can be biased towards variables with many categories. For more details, see http://blog.datadive.net/selecting-good-features-part-iii-random-forests/ and https://link.springer.com/article/10.1186%2F1471-2105-8-25
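The general idea of ranking hyper-parameters with a random forest regressor can be sketched with scikit-learn as follows. This is a hedged illustration on synthetic data, not the code used by `analyse_hyperparameters`: the hyper-parameter names, the synthetic loss, the use of `feature_importances_`, and the per-tree standard deviation as an error estimate are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_runs = 400
# Hypothetical hyper-parameter names, one column per hyper-parameter
hparam_names = ["learning_rate", "dropout", "batch_size", "weight_decay"]

# Synthetic hyper-parameter samples, one row per run (made up for illustration)
X = rng.uniform(0.0, 1.0, size=(n_runs, len(hparam_names)))
# Synthetic loss: dominated by learning_rate, slightly affected by dropout
loss = 10.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=n_runs)

forest = RandomForestRegressor(
    n_estimators=200,  # far fewer trees than the documented default of 5000, to keep this quick
    max_features=0.6,  # plays the role of params_forest_max_features_ratio
    random_state=0,
)
forest.fit(X, loss)

# Impurity-based importances, with the spread across trees as a rough error bar
importances = forest.feature_importances_
errors = np.std([tree.feature_importances_ for tree in forest.estimators_], axis=0)

order = np.argsort(importances)[::-1]
ranking = [hparam_names[i] for i in order]
for i in order:
    print(f"{hparam_names[i]}: {importances[i]:.3f} +/- {errors[i]:.3f}")
```

Because impurity-based importances are normalized to sum to one, a dominant hyper-parameter necessarily pushes the scores of the others down, which is consistent with the caveat above about treating the ordering as more reliable than the values.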

Parameters

params_forest_n_estimators – number of trees used to estimate the loss from the hyperparameters
params_forest_max_features_ratio – the maximum ratio of features to be used. Not all features are selected, in order to limit the importance decrease caused by correlated features [1]
hprams_path_pattern – a glob pattern used to select the hyper-parameter files
hparams_to_visualize – a list of hparam names to visualize or None. If None, display from the most important (i.e., causing the most loss variation) to the least
create_graphs – if True, export matplotlib visualizations
top_k_covariance – export the parameter covariance for the most important k hyper-parameters
output_path – where to export the graph
dpi – the resolution of the exported graph
verbose – if True, display additional information

Returns