Binary Black Hole Algorithm
This algorithm was implemented following the article:
Elnaz Pashaei, Nizamettin Aydin, Binary black hole algorithm for feature selection and classification on biological data, Applied Soft Computing, Volume 56, 2017, Pages 94-106, ISSN 1568-4946,
(http://www.sciencedirect.com/science/article/pii/S1568494617301242)
-
class
feature_selection.binary_black_hole.
BinaryBlackHole
(classifier=None, number_gen=10, size_pop=40, verbose=False, repeat=1, make_logbook=False, random_state=None, parallel=False, cv_metric_fuction=None, features_metric_function=None)
Implementation of the Binary Black Hole algorithm for feature selection.
Parameters: - classifier : sklearn classifier, (default=SVM)
Any classifier that adheres to the scikit-learn API
- number_gen : positive integer, (default=10)
Number of generations
- size_pop : positive integer, (default=40)
Number of individuals in the population
- verbose : boolean, (default=False)
If True, prints information at every generation
- repeat : positive int, (default=1)
Number of times to repeat the fitting process
- make_logbook : boolean, (default=False)
If True, a logbook from DEAP will be made
- parallel : boolean, (default=False)
Set to True to use multiple processors
- cv_metric_fuction : callable, (default=matthews_corrcoef)
A metric score function, as described in the scikit-learn model evaluation guide: http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter
- features_metric_function : callable, (default=pow(sum(mask)/(len(mask)*5), 2))
A function that returns a float computed from the binary mask of features
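The default listed for features_metric_function can be read as a value that grows with the fraction of selected features. The following is a minimal sketch, assuming the mask is a sequence of 0/1 values; the name default_features_metric is hypothetical and is not taken from the library, and how the library combines this value with the cross-validation score is not stated here.

```python
def default_features_metric(mask):
    """Hypothetical helper mirroring the documented default:
    pow(sum(mask) / (len(mask) * 5), 2)."""
    # sum(mask) counts the selected features; len(mask) is the total count.
    return pow(sum(mask) / (len(mask) * 5), 2)


# Selecting 2 of 4 features gives (2 / 20) ** 2, roughly 0.01.
example_value = default_features_metric([1, 0, 1, 0])

# Selecting every feature gives the largest value, (1 / 5) ** 2.
full_value = default_features_metric([1, 1, 1, 1])
```

A custom callable with the same shape (binary mask in, float out) could presumably be passed in its place.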
Methods
- fit([X, y, normalize]) : Fit method.
- fit_transform(X, y[, normalize]) : Fit to data, then transform it.
- get_params([deep]) : Get parameters for this estimator.
- get_support([indices]) : Get a mask, or integer index, of the features selected.
- plot_results() : This method plots all the statistics for each repetition in a graph.
- safe_mask(x, mask) : Return a mask which is safe to use on X.
- score(X, y[, sample_weight]) : Returns the mean accuracy on the given test data and labels.
- score_func_to_gridsearch(estimator[, …]) : Function to be given as a scorer function to Grid Search Method.
- set_params(**params) : Set the parameters of this estimator.
- transform(X[, mask]) : Reduce X to the selected features.
- adaptative_binary_mutation
- predict
-
fit
(X=None, y=None, normalize=False, **arg)
Fit method.
Parameters: - X : array of shape [n_samples, n_features]
The input samples
- y : array of shape [n_samples, 1]
The input labels
- normalize : boolean, (default=False)
If True, StandardScaler will be applied to X
- **arg : parameters
Set parameters
-
fit_transform
(X, y, normalize=False, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters: - X : numpy array of shape [n_samples, n_features]
Training set.
- y : numpy array of shape [n_samples]
Target values.
Returns: - X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
-
get_params
(deep=True)
Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
-
get_support
(indices=False)
Get a mask, or integer index, of the features selected.
Parameters: - indices : boolean, (default=False)
If True, the return value will be an array of integers, rather than a boolean mask.
Returns: - support : array
An index that selects the retained features from a feature vector. If indices is False, this is a boolean array of shape [# input features], in which an element is True if and only if its corresponding feature is selected for retention. If indices is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
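The two return forms described for get_support can be illustrated without the library. This is a sketch only: support_from_mask is a hypothetical stand-in written with plain Python lists, not the class's own method, which returns arrays.

```python
def support_from_mask(mask, indices=False):
    """Hypothetical stand-in showing the boolean-mask and integer-index forms."""
    if indices:
        # One entry per selected feature: its position in the input vector.
        return [i for i, keep in enumerate(mask) if keep]
    # One entry per input feature: True where the feature is retained.
    return [bool(keep) for keep in mask]


selected = [True, False, True, True, False]
as_mask = support_from_mask(selected)
as_indices = support_from_mask(selected, indices=True)
```

Here as_mask has one element per input feature, while as_indices has one element per retained feature.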
-
plot_results
()
This method plots all the statistics for each repetition in a graph.
The curves are the minimum, average and maximum accuracy.
-
static
safe_mask
(x, mask)
Return a mask which is safe to use on X.
Parameters: - X : {array-like, sparse matrix}
Data on which to apply mask.
- mask : array
Mask to be used on X.
Returns: - mask
-
score
(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
Parameters: - X : array-like, shape = (n_samples, n_features)
Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True labels for X.
- sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Returns: - score : float
Mean accuracy of self.predict(X) wrt. y.
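To make the subset-accuracy remark concrete, here is a minimal sketch of mean accuracy in plain Python. The helper mean_accuracy is hypothetical, ignores sample_weight, and is not the estimator's score method; it only shows why a multi-label sample counts as correct solely when its whole label set matches.

```python
def mean_accuracy(y_true, y_pred):
    """Hypothetical helper: fraction of samples predicted exactly right."""
    matches = [t == p for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Single-label case: 3 of 4 predictions match.
single = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])

# Multi-label case: each sample is a tuple of labels, and a tuple only
# matches when every label in it is correct (subset accuracy).
multi_true = [(1, 0), (0, 1), (1, 1)]
multi_pred = [(1, 0), (0, 0), (1, 1)]
multi = mean_accuracy(multi_true, multi_pred)
```

The second sample in the multi-label case has one of its two labels right, yet it still counts as a miss.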
-
static
score_func_to_gridsearch
(estimator, X_test=None, y_test=None)
Function to be passed as the scorer function to a grid search. It transforms the matrix of predictions generated by the ‘all’ option into a final accuracy score. Use a high value for CV.
-
set_params
(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
Returns: - self
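The <component>__<parameter> naming convention can be sketched as a split on the first double underscore. The helper split_nested_param and the example names below are hypothetical illustrations, not part of this class or of scikit-learn.

```python
def split_nested_param(name):
    """Hypothetical helper: split 'component__parameter' at the first '__'."""
    component, separator, parameter = name.partition("__")
    if not separator:
        # No double underscore: the name refers to the estimator itself.
        return None, name
    return component, parameter


# A nested name addresses a parameter of a contained object.
nested = split_nested_param("classifier__C")

# A plain name addresses a parameter of the estimator itself.
plain = split_nested_param("number_gen")
```

With this convention, a call such as set_params(classifier__C=...) would be expected to reach the contained classifier, assuming the classifier exposes a parameter with that name.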
-
transform
(X, mask=None)
Reduce X to the selected features.
Parameters: - X : array of shape [n_samples, n_features]
The input samples.
Returns: - X_r : array of shape [n_samples, n_selected_features]
The input samples with only the selected features.
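The effect of transform on the shape of X can be sketched with nested lists. The helper reduce_columns is hypothetical and stands in for the column selection only; the actual method operates on arrays and uses the mask found during fitting when none is given.

```python
def reduce_columns(X, mask):
    """Hypothetical helper: keep only the columns where mask is True."""
    return [[value for value, keep in zip(row, mask) if keep] for row in X]


# Two samples with three features each.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

# Keep the first and third features.
feature_mask = [True, False, True]
X_r = reduce_columns(X, feature_mask)
```

The result keeps n_samples rows and has one column per selected feature.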