ddl.externals.mlpack.MlpackDensityTreeEstimator

class ddl.externals.mlpack.MlpackDensityTreeEstimator(max_leaf_nodes=None, max_depth=None, min_samples_leaf=1)

Bases: sklearn.base.BaseEstimator

Density tree estimator via mlpack (mlpack.org).

This estimator uses the Density Estimation Tree (DET) methods implemented in mlpack (see the Ram & Gray 2011 paper below and the DET method in mlpack’s documentation at mlpack.org). The class is a thin wrapper around mlpack's C++ functions, so it must be compiled against the mlpack source code.
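As a rough illustration of what a fitted density tree represents, the sketch below evaluates the piecewise-constant estimate described by Ram & Gray, where each leaf contributes density n_leaf / (n_total * volume). The leaf triples and the function names are hypothetical and are not part of the ddl or mlpack API; a 1-D domain is assumed for brevity.

```python
# Minimal sketch of a density-tree estimate over a 1-D domain.
# Each leaf is a hypothetical (lower, upper, n_samples) triple;
# this layout is an assumption for illustration, not ddl's tree_ format.

def leaf_density(n_leaf, n_total, volume):
    """Piecewise-constant density of a leaf: fraction of samples per unit volume."""
    return n_leaf / (n_total * volume)


def predict_density(leaves, x):
    """Return the density at x, or 0.0 if x falls outside every leaf."""
    n_total = sum(n for _, _, n in leaves)
    for lower, upper, n in leaves:
        if lower <= x < upper:
            return leaf_density(n, n_total, upper - lower)
    return 0.0


# Two leaves covering [0, 1) and [1, 3) with 6 and 4 training samples.
leaves = [(0.0, 1.0, 6), (1.0, 3.0, 4)]
dense = predict_density(leaves, 0.5)    # leaf with more samples per unit length
sparse = predict_density(leaves, 2.0)   # wider leaf with fewer samples
outside = predict_density(leaves, 5.0)  # no leaf contains this point
```

Because each leaf's density is its sample fraction divided by its volume, the estimate integrates to one over the region covered by the leaves.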

Parameters:
max_leaf_nodes : int or None, default=None

Maximum number of leaf nodes in the final tree. The tree is first fully grown subject to min_samples_leaf and then pruned until the number of leaf nodes is less than max_leaf_nodes. If None, the number of leaf nodes is unbounded. Useful as a simple form of regularization for the density tree.

max_depth : int or None, default=None

Maximum depth of the final tree. The tree is first fully grown subject to min_samples_leaf and then pruned until its depth is less than max_depth. If None, the depth is unbounded. Useful as a simple form of regularization for the density tree.

min_samples_leaf : int, default=1

Minimum number of samples required at every leaf node. This is the main parameter controlling how the tree is grown before pruning. It exists mainly to limit computation on large datasets, but it can also serve as regularization.

Attributes:
tree_ : arrayed_tree

The tree structure represented using arrays similar to the trees used in sklearn (e.g. sklearn.tree.DecisionTreeClassifier).

References

Ram, P. and Gray, A. G. Density Estimation Trees. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011.

Methods

fit(self, X[, y]) Fit estimator to X.
get_params(self[, deep]) Get parameters for this estimator.
set_params(self, **params) Set the parameters of this estimator.

__init__(self, max_leaf_nodes=None, max_depth=None, min_samples_leaf=1)

Initialize the estimator with the given parameters; see the class description above for their meaning.

fit(self, X, y=None)

Fit estimator to X.

Parameters:
X : array-like, shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used in the fitting process; kept for compatibility with the scikit-learn estimator interface.

Returns:
self : estimator

Returns the instance itself.
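A minimal usage sketch, assuming ddl has been built with its mlpack extension. The helper name fit_density_tree and the parameter values are arbitrary choices for illustration; only the constructor arguments and fit come from the documentation above. The import is done inside the function so that the module loads even where the compiled wrapper is unavailable.

```python
def fit_density_tree(X, max_leaf_nodes=10, min_samples_leaf=5):
    """Fit a density tree to X and return the fitted estimator.

    X is an array-like of shape (n_samples, n_features).
    The default values here are illustrative, not recommendations.
    """
    # Lazy import: the wrapper must be compiled against mlpack.
    from ddl.externals.mlpack import MlpackDensityTreeEstimator

    estimator = MlpackDensityTreeEstimator(
        max_leaf_nodes=max_leaf_nodes,
        min_samples_leaf=min_samples_leaf,
    )
    # y is not used by fit, so only X is passed.
    estimator.fit(X)
    return estimator
```

Since fit returns the instance itself, the fitted tree structure should then be available through the tree_ attribute of the returned estimator.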

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
self
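The <component>__<parameter> naming convention can be sketched as follows. This is a simplified illustration of how such names separate into a component and a parameter, not the scikit-learn implementation; the function name split_nested_params and the component name "tree" are hypothetical.

```python
def split_nested_params(params):
    """Separate an estimator's own parameters from nested component parameters.

    Keys of the form "<component>__<parameter>" are grouped by component;
    keys without a double underscore belong to the estimator itself.
    """
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"max_depth": 5, "tree__min_samples_leaf": 10}
)
```

Here max_depth would be set on the estimator itself, while min_samples_leaf would be forwarded to the component named "tree".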