ddl.validation.check_destructor

ddl.validation.check_destructor(destructor, fitted_density=None, is_canonical=True, properties_to_skip=None, random_state=0)

Check for the required interface and properties of a destructor.

First, this checks the interface of the destructor (see below) and then checks the four properties of a destructor (see the Inouye & Ravikumar 2018 paper). A destructor estimator should have the following interface (a minimal skeleton illustrating it is sketched after the lists below):

Required methods (error if fails or does not exist):

  • fit
  • transform
  • inverse_transform

Primary optional methods (warn if the method does not exist, error if it fails):

  • sample (mixin, uniform->inverse; i.e., draw uniform samples and apply inverse_transform)
  • score_samples

Secondary optional methods (warn if the method does not exist, warn if it fails):

  • fit_from_density (better for uniformability test than fit)
  • get_domain (mixin with density_, required to pass tests if domain is not real-valued unbounded)
  • get_support (required to pass tests if support is not real-valued unbounded)
  • score (mixin)

Secondary optional attributes (warn if the attribute does not exist, warn if it fails):

  • density_ attribute (provides default for uniformability test if fitted_density not given)
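For orientation, a minimal skeleton realizing this interface might look like the following sketch. The class name IdentityDestructor and its trivial identity transformation are hypothetical illustrations of the method signatures listed above (not part of ddl); get_domain and get_support are omitted, and the score_samples comment assumes the sklearn log-density convention:

    import numpy as np
    from sklearn.base import BaseEstimator
    from sklearn.utils.validation import check_array

    class IdentityDestructor(BaseEstimator):
        """Hypothetical destructor that assumes the data already lies on [0, 1]^d."""

        def fit(self, X, y=None):
            X = check_array(X)
            self.n_features_ = X.shape[1]
            return self

        def transform(self, X, y=None):
            # Identity transformation: the data is treated as already uniform.
            return check_array(X, copy=True)

        def inverse_transform(self, X, y=None):
            return check_array(X, copy=True)

        def score_samples(self, X, y=None):
            # log|det J(x)| = 0 for the identity map (assuming the sklearn
            # log-density convention for score_samples).
            X = check_array(X)
            return np.zeros(X.shape[0])

        def sample(self, n_samples=1, random_state=None):
            rng = np.random.RandomState(random_state)
            return rng.rand(n_samples, self.n_features_)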

In addition to checking the interface, this function empirically checks whether the destructor seems to have the following 4 properties (see the Inouye & Ravikumar 2018 paper). Note that these are only empirical sanity checks based on sampling and numerical approximation, and thus they do not guarantee that the estimator always has these properties.

  1. Uniformability (required)
  2. Invertibility (required)
  3. Canonical domain (checked only if is_canonical=True)
  4. Identity element (checked only if is_canonical=True)
Parameters:
destructor : estimator

Destructor estimator to check.

fitted_density : estimator, optional

A fitted density estimator to use in some checks. If None (default), simple but interesting distributions based on the domain of the destructor are used. This can be used to check a specific density and destructor pair.

is_canonical : bool, default=True

Whether this destructor should be checked for canonical destructor properties including canonical domain and identity element.

properties_to_skip : list, optional

Optional list of property checks to skip, designated by the following strings: ['uniformability', 'invertibility', 'canonical-domain', 'identity-element'].

random_state : int, RandomState instance or None, optional (default=0)

Note that random_state defaults to 0 so that checks are reproducible by default.

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by numpy.random.
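
Putting these parameters together, a typical invocation might look like the following (using the hypothetical IdentityDestructor sketched earlier; any estimator with the interface listed above could be substituted, and the checks may still warn or error if the estimator does not actually satisfy the properties):

    from ddl.validation import check_destructor

    destructor = IdentityDestructor()

    # Run all checks with the default reproducible random_state=0.
    check_destructor(destructor)

    # Relax the canonical-destructor requirements for a non-canonical destructor.
    check_destructor(destructor, is_canonical=False)

    # Or skip individual property checks by name.
    check_destructor(destructor, properties_to_skip=['identity-element'])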

See also

check_density

Notes

Uniformability check - Numerically and approximately checks one density/destructor pair to determine if the destructor appropriately destroys the density (i.e. transforms the samples to be uniform). Note that this does not mean it will work for all cases but it does provide a basic sanity check on at least one case. This will also run sklearn’s estimator checks, i.e. sklearn.utils.estimator_checks.check_estimator.
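As a hand-rolled illustration of what destroying a density means (not the library's actual check, which also uses the fitted density and sample-based comparisons), consider an independent Beta density whose exact destructor is the coordinate-wise marginal CDF:

    import numpy as np
    from scipy.stats import beta

    rng = np.random.RandomState(0)

    # Samples from a non-uniform density on [0, 1]^2: independent Beta(2, 5) marginals.
    X = rng.beta(a=2.0, b=5.0, size=(1000, 2))

    # The exact destructor applies the marginal CDF to each coordinate
    # (probability integral transform), mapping the samples to Uniform([0, 1]^2).
    U = beta(a=2.0, b=5.0).cdf(X)

    # The transformed samples should look uniform, e.g. marginal means near 1/2
    # and marginal variances near 1/12.
    assert np.allclose(U.mean(axis=0), 0.5, atol=0.05)
    assert np.allclose(U.var(axis=0), 1.0 / 12.0, atol=0.01)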

Invertibility check - Simple numerical check for invertibility: apply the transformation followed by the inverse transformation (and vice versa) and verify that the original data is recovered.
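The round-trip idea can be sketched with the same Beta CDF/PPF pair standing in for transform and inverse_transform (an illustration, not the check's implementation):

    import numpy as np
    from scipy.stats import beta

    dist = beta(a=2.0, b=5.0)
    X = np.random.RandomState(0).beta(a=2.0, b=5.0, size=(200, 2))

    # Forward then inverse should recover the original data ...
    assert np.allclose(dist.ppf(dist.cdf(X)), X, atol=1e-6)

    # ... and inverse then forward should recover uniform inputs.
    U = np.random.RandomState(1).rand(200, 2)
    assert np.allclose(dist.cdf(dist.ppf(U)), U, atol=1e-6)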

Canonical domain check - Simply checks whether the domain is the unit hypercube and whether warnings are issued when data that is not on the unit hypercube is passed.
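The warning behavior this check looks for can be mimicked by a hypothetical helper (the actual check exercises the destructor itself with out-of-domain data):

    import warnings
    import numpy as np

    def _warn_if_outside_unit_hypercube(X):
        # Hypothetical helper mimicking what a canonical destructor should do.
        if np.any(X < 0) or np.any(X > 1):
            warnings.warn('Data is outside the canonical unit hypercube [0, 1]^d.')
        return X

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        _warn_if_outside_unit_hypercube(np.array([[1.5, -0.2]]))
    assert len(caught) == 1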

Identity element check - Numerical approximation for checking whether the destructor class includes an identity element. Note that this check fits the destructor on uniform samples and then checks whether the learned transformation is the identity. Thus, if the test fails, there are two possible causes: 1) the fitting procedure overfitted the uniform samples such that the implied density is far from uniform, or 2) the transformation does not appropriately produce an identity transformation. This is a bit stricter than the official property since we train on uniform samples. However, this is probably a better check because we want the destructor to fit an identity transformation if the distribution is truly uniform.
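The flavor of this check can be reproduced with a crude one-dimensional stand-in: fit a piecewise-linear empirical CDF (a simple histogram-style destructor) on truly uniform data and verify that the learned map is close to the identity (a sketch under these assumptions, not the library's procedure):

    import numpy as np

    rng = np.random.RandomState(0)
    U_train = rng.rand(2000)          # training samples that are truly uniform

    # Fit a piecewise-linear marginal CDF on a coarse grid (a crude 1-D destructor).
    grid = np.linspace(0.0, 1.0, 21)
    cdf_values = np.searchsorted(np.sort(U_train), grid) / len(U_train)

    # On uniform training data the learned CDF should be close to the identity map,
    # which is what the identity-element check verifies.
    U_test = rng.rand(500)
    learned = np.interp(U_test, grid, cdf_values)
    assert np.max(np.abs(learned - U_test)) < 0.05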

destructor.score_samples(x) = abs(det(J(x))) should be close to density.score_samples(x) and equal if no approximation (such as a piecewise linear function) is made.
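For the exact Beta CDF destructor used above, this relationship holds with equality: the Jacobian of the coordinate-wise CDF is diagonal with the marginal pdfs on the diagonal, so abs(det(J(x))) coincides with the joint density (a worked check of the stated identity, not library code):

    import numpy as np
    from scipy.stats import beta

    dist = beta(a=2.0, b=5.0)
    X = np.random.RandomState(0).beta(a=2.0, b=5.0, size=(5, 2))

    # The Jacobian of the coordinate-wise CDF is diagonal with the marginal pdfs,
    # so its absolute determinant is the product of the marginal pdfs ...
    abs_det_jacobian = np.prod(dist.pdf(X), axis=1)

    # ... which equals the joint density of independent Beta(2, 5) coordinates.
    joint_density = dist.pdf(X[:, 0]) * dist.pdf(X[:, 1])
    assert np.allclose(abs_det_jacobian, joint_density)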

destructor.sample(n_samples) will produce slightly different samples than destructor.density_.sample(n_samples) unless no approximation is used.

Fitted parameters other than density_ should be strictly parameters of the destructive transformation.

Some of the property tests rely on calculating the Wasserstein distance between two sets of samples, and thus the Python Optimal Transport package (pot) is used.
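For reference, an empirical optimal-transport cost between two sample sets can be computed with POT roughly as follows (a sketch of the kind of comparison involved, not the checks' exact code):

    import numpy as np
    import ot  # Python Optimal Transport (POT): pip install pot

    def empirical_ot_cost(X, Y):
        # Uniform weights on each empirical distribution.
        a = np.full(len(X), 1.0 / len(X))
        b = np.full(len(Y), 1.0 / len(Y))
        M = ot.dist(X, Y)            # pairwise squared Euclidean costs by default
        return ot.emd2(a, b, M)      # exact optimal transport cost

    rng = np.random.RandomState(0)
    print(empirical_ot_cost(rng.rand(200, 2), rng.rand(200, 2)))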

References

D. I. Inouye, P. Ravikumar. Deep Density Destructors. In International Conference on Machine Learning (ICML), 2018.