ddl.deep.DeepDestructorCV

class ddl.deep.DeepDestructorCV(canonical_destructor=None, init_destructor=None, cv=None, stop_tol=0.001, max_canonical_destructors=None, n_extend=1, refit=True, silent=False, log_prefix='', random_state=None)[source]

Bases: ddl.deep.DeepDestructor

Deep destructor whose number of destructors/layers is determined by CV.

Nearly the same as DeepDestructor except that the number of canonical destructors (i.e. the number of layers) is automatically determined using cross validation. The log likelihood of held-out data in each CV fold is used to select the number of layers.

This destructor is computationally more efficient than using sklearn.model_selection.GridSearchCV because the deep destructor can be built one layer at a time and the test likelihood can be accumulated one layer at a time.
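
As a rough usage sketch (the specific canonical and initial destructors below, built from ddl.independent and ddl.univariate, and their constructor arguments are illustrative assumptions; any destructor whose domain is the canonical unit hypercube could be substituted as the canonical destructor):

>>> import numpy as np
>>> from ddl.deep import DeepDestructorCV
>>> from ddl.independent import IndependentDensity, IndependentDestructor
>>> from ddl.univariate import HistogramUnivariateDensity
>>> X = np.random.RandomState(0).randn(500, 2)
>>> # Canonical destructor that is cloned for each new layer:
>>> # an independent histogram destructor on the unit hypercube.
>>> canonical = IndependentDestructor(
...     independent_density=IndependentDensity(
...         univariate_estimators=HistogramUnivariateDensity(bins=20, bounds=[0, 1])))
>>> deep_cv = DeepDestructorCV(
...     canonical_destructor=canonical,
...     init_destructor=IndependentDestructor(),  # maps the data to the unit hypercube
...     cv=5, max_canonical_destructors=20, random_state=0)
>>> deep_cv = deep_cv.fit(X)
>>> n_layers = deep_cv.best_n_layers_  # number of layers selected by CV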

Parameters:
canonical_destructor : estimator or list

The canonical destructor(s) that will be cloned to build up the deep destructor. canonical_destructor can also be a list of canonical destructors; the list is cycled through to obtain as many canonical destructors as needed.

init_destructor : estimator, optional

Initial destructor (e.g. preprocessing or just to project to canonical domain).

cv : int, cross-validation generator or an iterable, default=None

Determines the cross-validation splitting strategy. Possible inputs for cv are:

  • None, to use the default 3-fold cross validation,
  • integer, to specify the number of folds in a (Stratified)KFold,
  • an object to be used as a cross-validation generator,
  • an iterable yielding (train, test) splits.
stop_tol : float, default=1e-3

Relative difference at which to stop adding destructors. For example, if set to 0.0, then the algorithm will stop if the test log likelihood ever decreases.

max_canonical_destructors : int or None, default=None

The maximum number of destructors (including the initial destructor) to add to the deep destructor. If set to None, then the number of destructors is unbounded.

n_extend : int, default=1

The number of destructors/layers to keep extending even after the stopping tolerance defined by stop_tol has been reached. This can be useful if the destructors are random or not guaranteed to always increase the likelihood. If n_extend is 1, the optimization stops as soon as the test log likelihood decreases.

refit : bool, default=True

Whether to refit the entire deep destructor with the selected number of layers or just extract the fit from the first fold.

silent : bool, default=False

If True, suppress debug messages emitted via the logging module. Note that logging messages are not output to standard out automatically; see the Python logging module for more information.

log_prefix : str, default=''

Prefix prepended to debug logging messages. See the silent parameter.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by numpy.random.

Attributes:
fitted_destructors_ : array, shape = [n_layers]

Array of fitted destructors. See fitted_destructors_ of base.CompositeDestructor.

density_ : estimator

Implicit density of deep destructor.

cv_train_scores_ : array, shape = [n_layers, n_splits]

Cross validation train scores (mean log-likelihood).

cv_test_scores_ : array, shape = [n_layers, n_splits]

Cross validation test scores (mean log-likelihood).

best_n_layers_ : int

Best number of layers as selected by cross validation.
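
Continuing the usage sketch from the class description above, these attributes can be inspected after fitting, e.g.:

>>> import numpy as np
>>> # Train/test log-likelihood per layer, averaged over the CV splits
>>> mean_train = np.mean(deep_cv.cv_train_scores_, axis=1)
>>> mean_test = np.mean(deep_cv.cv_test_scores_, axis=1)
>>> best = deep_cv.best_n_layers_  # number of layers selected by CV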

See also

DeepDestructor

Methods

create_fitted(fitted_destructors, **kwargs) Create fitted destructor.
fit(self, X[, y, X_test, first_score_zero]) Fit the deep destructor, selecting the number of layers via cross validation.
fit_transform(self, X[, y]) Fit estimator to X and then transform X.
get_domain(self) Get the domain of this destructor.
get_params(self[, deep]) Get parameters for this estimator.
inverse_transform(self, X[, y, partial_idx]) Apply inverse destructive transformation to X.
sample(self[, n_samples, y, random_state]) Sample from composite destructor.
score(self, X[, y, partial_idx]) Override super class to allow for partial_idx.
score_layers(self, X[, y, partial_idx]) Override super class to allow for partial_idx.
score_samples(self, X[, y, partial_idx]) Compute log-likelihood (or log(det(Jacobian))) for each sample.
score_samples_layers(self, X[, y, partial_idx]) Compute log-likelihood (or log(det(Jacobian))) for each sample at each layer.
set_params(self, **params) Set the parameters of this estimator.
transform(self, X[, y, partial_idx]) Apply destructive transformation to X.
__init__(self, canonical_destructor=None, init_destructor=None, cv=None, stop_tol=0.001, max_canonical_destructors=None, n_extend=1, refit=True, silent=False, log_prefix='', random_state=None)[source]

Initialize self. See help(type(self)) for accurate signature.

classmethod create_fitted(fitted_destructors, **kwargs)

Create fitted destructor.

Parameters:
fitted_destructors : array-like of Destructor

Fitted destructors.

**kwargs

Other parameters to pass to constructor.

Returns:
fitted_transformer : Transformer

Fitted transformer.
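
A hedged sketch, where d1 and d2 stand for destructors that have already been fitted elsewhere (e.g. entries of another destructor's fitted_destructors_ array):

>>> fitted = DeepDestructorCV.create_fitted([d1, d2])
>>> Z = fitted.transform(X)  # behaves like an already-fitted destructor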

fit(self, X, y=None, X_test=None, first_score_zero=False, **fit_params)[source]

Fit the deep destructor, selecting the number of layers via cross validation.

Parameters:
X : array-like, shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used in the fitting process but kept for compatibility.

X_test :
fit_params :
first_score_zero : bool

If True, treat the score of the initial destructor as zero so that the initial destructor is not taken into account when determining when to stop (intended for classifier destructors).

Returns:
obj : object
fit_transform(self, X, y=None, **fit_params)[source]

Fit estimator to X and then transform X.

Parameters:
X : array-like, shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used in the fitting process but kept for compatibility.

fit_params : dict, optional

Parameters to pass to the fit method.

Returns:
X_new : array-like, shape (n_samples, n_features)

Transformed data.

get_domain(self)

Get the domain of this destructor.

Returns:
domain : array-like, shape (2,) or shape (n_features, 2)

If shape is (2,), then domain[0] is the minimum and domain[1] is the maximum for all features. If shape is (n_features, 2), then each feature’s domain (which could be different for each feature) is given similarly to the first case.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

inverse_transform(self, X, y=None, partial_idx=None)

Apply inverse destructive transformation to X.

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used in the transformation but kept for compatibility.

partial_idx : list or None, default=None

List of indices of the fitted destructors to use in the transformation. The default of None uses all the fitted destructors. Mainly used for visualization or debugging.

Returns:
X_new : array-like, shape (n_samples, n_features)

Transformed data (possibly only partial transformation).

sample(self, n_samples=1, y=None, random_state=None)

Sample from composite destructor.

Nearly the same as DestructorMixin.sample but the number of features is found from the first fitted destructor to avoid recursion.
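
For example, with deep_cv fitted as in the class-level sketch:

>>> X_sampled = deep_cv.sample(n_samples=100, random_state=0)  # shape (100, n_features)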

score(self, X, y=None, partial_idx=None)

Override super class to allow for partial_idx.

score_layers(self, X, y=None, partial_idx=None)

Override super class to allow for partial_idx.

score_samples(self, X, y=None, partial_idx=None)

Compute log-likelihood (or log(det(Jacobian))) for each sample.

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used but kept for compatibility.

partial_idx : list or None, default=None

List of indices of the fitted destructors to use in computing the log likelihood. The default of None uses all the fitted destructors. Mainly used for visualization or debugging.

Returns:
log_likelihood : array, shape (n_samples,)

Log likelihood of each data point in X.

score_samples_layers(self, X, y=None, partial_idx=None)

Compute log-likelihood (or log(det(Jacobian))) for each sample at each layer.

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used but kept for compatibility.

partial_idx : list or None, default=None

List of indices of the fitted destructors to use. The default of None uses all the fitted destructors.

Returns:
obj : object
set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
self
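
For example (the nested parameter name below is illustrative and depends on the parameters exposed by the chosen canonical destructor; IndependentDensity is imported in the class-level sketch):

>>> deep_cv = deep_cv.set_params(stop_tol=1e-4)  # top-level parameter
>>> # Nested <component>__<parameter> form for a sub-estimator's parameter
>>> deep_cv = deep_cv.set_params(
...     canonical_destructor__independent_density=IndependentDensity())
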
transform(self, X, y=None, partial_idx=None)

Apply destructive transformation to X.

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used in the transformation but kept for compatibility.

partial_idx : list or None, default=None

List of indices of the fitted destructors to use in the transformation. The default of None uses all the fitted destructors. Mainly used for visualization or debugging.

Returns:
X_new : array-like, shape (n_samples, n_features)

Transformed data (possibly only partial transformation).
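
For example, partial_idx can be used to inspect intermediate representations layer by layer (continuing the fitted deep_cv from the class-level sketch):

>>> Z_first_two = deep_cv.transform(X, partial_idx=[0, 1])  # only the first two fitted destructors
>>> Z_all = deep_cv.transform(X)  # all fitted destructors
>>> X_back = deep_cv.inverse_transform(Z_all)  # round trip back to the data space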