ddl.tree.TreeDensity¶

class ddl.tree.TreeDensity(tree_estimator=None, get_tree=None, node_destructor=None, uniform_weight=1e-06)[source]¶

Bases: sklearn.base.BaseEstimator, ddl.base.ScoreMixin
Tree density estimator defined on the unit hypercube.
This density estimator first estimates the tree structure via the tree_estimator parameter. Then it constructs a density tree by counting the number of training data points that fall into each leaf. The empirical counts are regularized by uniform_weight, which pulls the estimate towards the uniform density, yielding a mixture between the empirical tree density and the uniform density. Optionally, a node destructor can be specified to be estimated and applied at each leaf node.
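The counting step described above can be sketched with scikit-learn alone: fit an extremely randomized tree on random targets (a rough stand-in for the default random tree estimator, not ddl's actual implementation) and count how many training points land in each leaf.

```python
import numpy as np
from sklearn.tree import ExtraTreeRegressor

rng = np.random.RandomState(0)
X = rng.rand(200, 2)  # training data on the unit hypercube

# Random regression targets make the split criterion uninformative, so the
# splits are essentially data-independent, mimicking a random tree estimator.
tree = ExtraTreeRegressor(max_leaf_nodes=8, random_state=0)
tree.fit(X, rng.rand(len(X)))

# Count training points per leaf -> empirical leaf probabilities.
leaf_ids = tree.apply(X)
leaves, counts = np.unique(leaf_ids, return_counts=True)
empirical_probs = counts / counts.sum()
```

The empirical probabilities sum to one by construction; turning them into a proper density additionally requires dividing by each leaf's volume, which is where the uniform regularization below comes in.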
Parameters:
- tree_estimator : estimator, defaults to RandomTreeEstimator
    Tree estimator. Defaults to RandomTreeEstimator, but other estimators could be used or developed, such as the mlpack density estimation tree (DET) estimator ddl.externals.mlpack.MlpackDensityTreeEstimator.
- get_tree : func
    Function that extracts the tree structure from the fitted tree estimator: tree = get_tree(fitted_tree_estimator). The default extracts an sklearn.tree arrayed tree from the estimator, such as from sklearn.tree.ExtraTreeRegressor.
- node_destructor : estimator, optional
    Optional destructor that can be fitted and applied at each leaf node of the tree. For example, this could be an independent histogram density (via IndependentDestructor with HistogramUnivariateDensity densities). With a node destructor, the tree density is no longer piecewise uniform but rather a more general piecewise density.
- uniform_weight : float, between 0 and 1
    The mixture weight of the uniform density used to regularize the empirical tree density. For example, if uniform_weight=1, the density estimate trivially reduces to the uniform density. On the other hand, if uniform_weight=0, no regularization is performed on the empirical density estimate. Anything between 0 and 1 regularizes the density partially.
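The role of uniform_weight can be made concrete with a small sketch (the helper name regularize_leaf_densities is illustrative, not part of the ddl API). On the unit hypercube the uniform density is identically 1, so the regularization is a simple convex combination:

```python
import numpy as np

def regularize_leaf_densities(counts, volumes, uniform_weight=1e-6):
    """Convex combination of the empirical tree density and the uniform density.

    counts  : number of training points in each leaf
    volumes : volume of each leaf cell (the volumes sum to 1 on the unit hypercube)
    """
    counts = np.asarray(counts, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    empirical = (counts / counts.sum()) / volumes  # density value in each leaf
    uniform = 1.0  # the uniform density on the unit hypercube
    return (1.0 - uniform_weight) * empirical + uniform_weight * uniform
```

For any uniform_weight the result still integrates to 1, since both mixture components do; uniform_weight=1 returns the constant density 1 and uniform_weight=0 returns the raw empirical density.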
Methods

- fit(self, X[, y, fitted_tree_estimator]): Fit estimator to X.
- get_params(self[, deep]): Get parameters for this estimator.
- get_support(self): Get the support of this density (i.e. the positive density region).
- sample(self[, n_samples, random_state, shuffle]): Generate random samples from this density.
- score(self, X[, y]): Return the mean log likelihood (or log(det(Jacobian))).
- score_samples(self, X[, y]): Compute log-likelihood (or log(det(Jacobian))) for each sample.
- set_params(self, \*\*params): Set the parameters of this estimator.
- create_fitted
__init__(self, tree_estimator=None, get_tree=None, node_destructor=None, uniform_weight=1e-06)[source]¶
Initialize self. See help(type(self)) for accurate signature.
fit(self, X, y=None, fitted_tree_estimator=None)[source]¶
Fit estimator to X.

Parameters:
- X : array-like, shape (n_samples, n_features)
    Training data, where n_samples is the number of samples and n_features is the number of features.
- y : None, default=None
    Not used in the fitting process but kept for compatibility.
- fitted_tree_estimator : estimator, optional
    If provided, this already-fitted tree estimator is used directly instead of fitting tree_estimator to X.

Returns:
- self : estimator
    Returns the instance itself.
get_params(self, deep=True)¶
Get parameters for this estimator.

Parameters:
- deep : boolean, optional
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
- params : mapping of string to any
    Parameter names mapped to their values.
get_support(self)[source]¶
Get the support of this density (i.e. the positive density region).

Returns:
- support : array-like, shape (2,) or shape (n_features, 2)
    If shape is (2,), then support[0] is the minimum and support[1] is the maximum for all features. If shape is (n_features, 2), then each feature's support (which could be different for each feature) is given similarly to the first case.
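The two return shapes can be handled uniformly by broadcasting, as in this hypothetical helper (in_support is not part of the ddl API; it only illustrates how to consume both shapes):

```python
import numpy as np

support_shared = np.array([0.0, 1.0])        # shape (2,): one range for all features
support_per_dim = np.array([[0.0, 1.0],
                            [0.0, 1.0]])     # shape (n_features, 2)

def in_support(x, support):
    """Check whether a single point x lies inside the reported support."""
    support = np.atleast_2d(support)  # (2,) -> (1, 2), which then broadcasts
    return bool(np.all((x >= support[:, 0]) & (x <= support[:, 1])))
```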
sample(self, n_samples=1, random_state=None, shuffle=True)[source]¶
Generate random samples from this density.

Parameters:
- n_samples : int, default=1
    Number of samples to generate.
- random_state : int, RandomState instance or None, optional
    Seed or random number generator used for sampling.
- shuffle : bool, default=True
    Whether to shuffle the samples before returning them.

Returns:
- X : array, shape (n_samples, n_features)
    Randomly generated samples.
score(self, X, y=None)¶
Return the mean log likelihood (or log(det(Jacobian))).

Parameters:
- X : array-like, shape (n_samples, n_features)
    New data, where n_samples is the number of samples and n_features is the number of features.
- y : None, default=None
    Not used but kept for compatibility.

Returns:
- log_likelihood : float
    Mean log likelihood of the data points in X.
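The relationship between score and score_samples can be sketched with a toy two-leaf piecewise-constant density (the names, the split point, and the leaf values are made up for illustration and are not ddl internals):

```python
import numpy as np

# Toy tree density on [0, 1]: one split at 0.5, leaf densities 1.5 and 0.5.
LEAF_LOG_DENSITY = np.log(np.array([1.5, 0.5]))

def score_samples_toy(X):
    """Per-sample log-likelihood: log density of the leaf each point falls in."""
    leaves = (np.asarray(X)[:, 0] >= 0.5).astype(int)
    return LEAF_LOG_DENSITY[leaves]

def score_toy(X):
    """Mean log-likelihood, matching the score/score_samples contract above."""
    return float(np.mean(score_samples_toy(X)))
```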
score_samples(self, X, y=None)[source]¶
Compute log-likelihood (or log(det(Jacobian))) for each sample.

Parameters:
- X : array-like, shape (n_samples, n_features)
    New data, where n_samples is the number of samples and n_features is the number of features.
- y : None, default=None
    Not used but kept for compatibility.

Returns:
- log_likelihood : array, shape (n_samples,)
    Log likelihood of each data point in X.
set_params(self, **params)¶
Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Returns:
- self