ddl.univariate.ScipyUnivariateDensity

class ddl.univariate.ScipyUnivariateDensity(scipy_rv=None, scipy_fit_kwargs=None)[source]

Bases: sklearn.base.BaseEstimator, ddl.base.ScoreMixin

Density estimator via random variables defined in scipy.stats.

A univariate density estimator that can fit any distribution defined in scipy.stats. This includes common distributions such as the Gaussian, Laplace, beta, gamma and log-normal distributions, as well as many others.

Note that this density estimator is strictly univariate and therefore expects the input data to be a single array with shape (n_samples, 1).

Parameters:
scipy_rv : object or None, default=None

Default random variable is a Gaussian (i.e. scipy.stats.norm) if scipy_rv=None. Other examples include scipy.stats.gamma or scipy.stats.beta.

scipy_fit_kwargs : dict or None, optional

Keyword arguments, given as a dictionary, for the fit function of the scipy random variable (e.g. dict(floc=0, fscale=1) to fix the location and scale parameters to 0 and 1 respectively). The defaults depend on the scipy_rv parameter. For example, for scipy.stats.beta the defaults are floc=0 and fscale=1, which fix the location and scale of the beta distribution.
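
As a rough illustration of what fixing the location and scale means, the sketch below calls scipy's own fit method directly with floc=0 and fscale=1. It uses only scipy.stats and NumPy; the synthetic data, seed and variable names are assumptions made for the example and are not part of ddl.

```python
import numpy as np
from scipy import stats

# Synthetic data in (0, 1); the seed and shape parameters are arbitrary choices.
rng = np.random.RandomState(0)
x = rng.beta(2.0, 5.0, size=500)

# Keyword arguments of the kind described for scipy_fit_kwargs.
fit_kwargs = dict(floc=0, fscale=1)

# With loc and scale fixed, only the two shape parameters are estimated.
a, b, loc, scale = stats.beta.fit(x, **fit_kwargs)
print(a, b, loc, scale)
```

The returned loc and scale are exactly the fixed values, while a and b are estimated from the data.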

Attributes:
rv_ : object

Frozen scipy.stats random variable object. The fitted distribution parameters can be accessed via its args property.
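
To make the idea of a frozen random variable concrete, here is a minimal sketch using scipy.stats alone. It assumes, without confirmation from this page, that the estimator builds rv_ in a similar way: fit the distribution, then freeze it with the estimated parameters. The data and names are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
x = rng.normal(loc=3.0, scale=2.0, size=1000)

# Estimate (loc, scale) by maximum likelihood, then freeze the distribution.
params = stats.norm.fit(x)
rv = stats.norm(*params)

# A frozen random variable stores the positional parameters it was built with.
print(rv.args)
```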

See also

scipy.stats

Methods

cdf(self, X[, y]) [Placeholder].
create_fitted([scipy_rv_params]) Create fitted density.
fit(self, X[, y]) Fit estimator to X.
get_params(self[, deep]) Get parameters for this estimator.
get_support(self) Get the support of this density (i.e. the positive density region).
inverse_cdf(self, X[, y]) [Placeholder].
sample(self[, n_samples, random_state]) Generate random samples from this density/destructor.
score(self, X[, y]) Return the mean log likelihood (or log(det(Jacobian))).
score_samples(self, X[, y]) Compute log-likelihood (or log(det(Jacobian))) for each sample.
set_params(self, **params) Set the parameters of this estimator.
__init__(self, scipy_rv=None, scipy_fit_kwargs=None)[source]

Initialize self. See help(type(self)) for accurate signature.

cdf(self, X, y=None)[source]

[Placeholder].

Parameters:
X :
y :
Returns:
obj : object
classmethod create_fitted(scipy_rv_params=None, **kwargs)[source]

Create fitted density.

Parameters:
scipy_rv : object or None, default=None

Default random variable is a Gaussian (i.e. scipy.stats.norm) if scipy_rv=None. Other examples include scipy.stats.gamma or scipy.stats.beta.

scipy_rv_params : dict, optional

Parameters to pass to scipy_rv when creating frozen random variable. Default parameters have been set for various distributions.

**kwargs

Other parameters to pass to object constructor.

Returns:
fitted_density : Density

Fitted density.
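
The sketch below shows, with scipy.stats only, how a frozen random variable can be created from a parameter dictionary without any training data. Whether create_fitted does exactly this internally is an assumption; the gamma distribution and the parameter values are hypothetical choices for illustration.

```python
from scipy import stats

# Hypothetical parameter dictionary, in the spirit of scipy_rv_params.
scipy_rv = stats.gamma
scipy_rv_params = dict(a=2.0, scale=1.5)

# Calling a scipy distribution with parameters returns a frozen random variable.
rv = scipy_rv(**scipy_rv_params)
print(rv.kwds, rv.mean())
```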

fit(self, X, y=None, **fit_params)[source]

Fit estimator to X.

Parameters:
X : array-like, shape (n_samples, 1)

Training data, where n_samples is the number of samples. Note that the shape must have a second dimension of 1 since this is a univariate density estimator.

y : None, default=None

Not used in the fitting process but kept for compatibility.

fit_params : dict, optional

Optional extra fit parameters.

Returns:
self : estimator

Returns the instance itself.
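
Since the input must have shape (n_samples, 1), a 1-D array of observations needs to be reshaped first. The following sketch shows the reshape with NumPy and, as a stand-in for the estimator's own fit, passes the flattened column to scipy's fit; the data and names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# A 1-D array of observations (arbitrary synthetic data).
x = np.random.RandomState(0).normal(size=200)

# Reshape to the (n_samples, 1) layout that the estimator expects.
X = x.reshape(-1, 1)
print(X.shape)

# Flatten the column again before handing it to scipy's fit.
loc, scale = stats.norm.fit(X.ravel())
```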

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

get_support(self)[source]

Get the support of this density (i.e. the positive density region).

Returns:
support : array-like, shape (2,) or shape (n_features, 2)

If shape is (2,), then support[0] is the minimum and support[1] is the maximum for all features. If shape is (n_features, 2), then each feature’s support (which could be different for each feature) is given in the same form as in the first case.
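
For a univariate density the (2,) case applies. As a sketch of what such a support looks like, the example below reads the end points from frozen scipy random variables via their support() method (available in reasonably recent scipy releases). That get_support returns the same values is an assumption, not something this page confirms.

```python
import numpy as np
from scipy import stats

# Frozen random variables with bounded and unbounded support.
beta_rv = stats.beta(2.0, 5.0)
norm_rv = stats.norm(0.0, 1.0)

# support() returns the (lower, upper) end points; wrap them in a (2,) array.
beta_support = np.array(beta_rv.support())
norm_support = np.array(norm_rv.support())
print(beta_support, norm_support)
```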

inverse_cdf(self, X, y=None)[source]

[Placeholder].

Parameters:
X :
y :
Returns:
obj : object
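
The descriptions of cdf and inverse_cdf are placeholders on this page. Assuming they correspond to the cumulative distribution function and its inverse, the scipy-only sketch below shows the relationship between the two; in scipy the inverse CDF is called ppf. The distribution and data are arbitrary.

```python
import numpy as np
from scipy import stats

rv = stats.norm(loc=1.0, scale=2.0)
X = np.array([[-1.0], [1.0], [4.0]])  # shape (n_samples, 1)

# The CDF maps data to probabilities in [0, 1]; ppf is scipy's inverse CDF.
U = rv.cdf(X)
X_back = rv.ppf(U)
print(U.ravel(), X_back.ravel())
```
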
sample(self, n_samples=1, random_state=None)[source]

Generate random samples from this density/destructor.

Parameters:
n_samples : int, default=1

Number of samples to generate. Defaults to 1.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by numpy.random.

Returns:
X : array, shape (n_samples, n_features)

Randomly generated sample.
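
The effect of an integer random_state can be sketched with a frozen scipy random variable, whose rvs method accepts the same kind of argument. The reshape to (n_samples, 1) mirrors the documented return shape for the univariate case; that sample delegates to rvs in this way is an assumption.

```python
import numpy as np
from scipy import stats

rv = stats.norm(loc=0.0, scale=1.0)
n_samples = 5

# An integer random_state seeds the generator, so repeated calls agree.
X1 = rv.rvs(size=n_samples, random_state=0).reshape(-1, 1)
X2 = rv.rvs(size=n_samples, random_state=0).reshape(-1, 1)
print(X1.shape, np.array_equal(X1, X2))
```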

score(self, X, y=None)

Return the mean log likelihood (or log(det(Jacobian))).

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used but kept for compatibility.

Returns:
log_likelihood : float

Mean log likelihood of the data points in X.

score_samples(self, X, y=None)[source]

Compute log-likelihood (or log(det(Jacobian))) for each sample.

Parameters:
X : array-like, shape (n_samples, n_features)

New data, where n_samples is the number of samples and n_features is the number of features.

y : None, default=None

Not used but kept for compatibility.

Returns:
log_likelihood : array, shape (n_samples,)

Log likelihood of each data point in X.
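
The relationship between score_samples and score can be sketched with scipy alone: the per-sample values are log-densities, and the score is their mean. The example assumes a density (not a destructor) and uses a standard normal with hand-picked points.

```python
import numpy as np
from scipy import stats

rv = stats.norm(loc=0.0, scale=1.0)
X = np.array([[0.0], [1.0], [-2.0]])  # shape (n_samples, 1)

# Per-sample log-likelihood, flattened to shape (n_samples,).
log_likelihood = rv.logpdf(X).ravel()

# The mean over samples is the scalar score.
mean_log_likelihood = float(np.mean(log_likelihood))
print(log_likelihood, mean_log_likelihood)
```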

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
self
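
The <component>__<parameter> convention comes from sklearn.base.BaseEstimator. The sketch below demonstrates it with two hypothetical stand-in classes (ToyDensity and ToyWrapper are not part of ddl); only the constructor argument names are borrowed from this page.

```python
from sklearn.base import BaseEstimator


class ToyDensity(BaseEstimator):
    """Hypothetical stand-in with the same constructor arguments."""

    def __init__(self, scipy_rv=None, scipy_fit_kwargs=None):
        self.scipy_rv = scipy_rv
        self.scipy_fit_kwargs = scipy_fit_kwargs


class ToyWrapper(BaseEstimator):
    """Hypothetical container that holds another estimator."""

    def __init__(self, density=None):
        self.density = density


wrapper = ToyWrapper(density=ToyDensity())

# Nested parameters use the <component>__<parameter> form.
returned = wrapper.set_params(density__scipy_fit_kwargs=dict(floc=0))
params = wrapper.get_params(deep=True)
print(sorted(params))
```

set_params returns the estimator itself, so calls can be chained, and get_params(deep=True) lists the nested parameters under prefixed keys.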