KernelHerdingTensorized

class otkerneldesign.KernelHerdingTensorized(kernel=None, distribution=None, candidate_set_size=None, candidate_set=None, initial_design=None, is_greedy=False)

Incrementally select new design points with tensorized kernel herding. The main difference from the KernelHerding class lies in the compute_target_potential() method, which requires the kernel to be a product of one-dimensional kernels and the input random variables to be independent. Exploiting these properties, it computes the target potential as a product of univariate potentials, which is much faster.

Parameters:

kernel (openturns.CovarianceModel): Covariance kernel used to define potentials. Must be a product of one-dimensional kernels. By default, a product of Matern kernels with smoothness 5/2.
distribution (openturns.Distribution): Distribution the design points must represent. Must have an independent copula. If not specified, then candidate_set must be specified instead. Even when candidate_set is specified, the distribution can still be useful if it allows the use of tensorized formulas.
candidate_set_size (positive int): Size of the set of all candidate points. Unnecessary if candidate_set is specified. Otherwise, $2^{12}$ by default.
candidate_set (2-d list of float): Large sample that empirically represents a distribution. If not specified, then distribution and candidate_set_size must be specified so that it can be generated automatically.
initial_design (2-d list of float): Sample of points that must be included in the design. Empty by default.
is_greedy (Boolean): False by default, in which case the criterion is the difference between the current and target potentials. When set to True, the MMD minimization is strictly greedy. In practice the two criteria are very close: the only difference is that, for the greedy one, the current potential is multiplied by $\frac{m}{m+1}$.
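
To make the effect of is_greedy concrete, here is a minimal, dependency-free sketch of the two criteria described above. The helper name herding_criterion is hypothetical, and reading $m$ as the number of points already in the design is an assumption; this is not the library's implementation.

```python
def herding_criterion(current_potential, target_potential, m, is_greedy=False):
    """Criterion values on the candidate points (hypothetical helper).

    current_potential, target_potential: lists of floats, one per candidate.
    m: number of points already in the design (assumed meaning of m).
    """
    # The greedy variant only rescales the current potential by m / (m + 1).
    factor = m / (m + 1) if is_greedy else 1.0
    return [factor * c - t for c, t in zip(current_potential, target_potential)]


current = [0.8, 0.5, 0.2]
target = [0.6, 0.6, 0.6]
plain = herding_criterion(current, target, m=9)
greedy = herding_criterion(current, target, m=9, is_greedy=True)
```

With $m = 9$ the factor is 0.9, so the two criteria differ only slightly, and the gap shrinks as the design grows.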

Examples

>>> import openturns as ot
>>> import otkerneldesign as otkd
>>> distribution = ot.ComposedDistribution([ot.Normal(0.5, 0.1)] * 2)
>>> dimension = distribution.getDimension()
>>> # Kernel definition
>>> ker_list = [ot.MaternModel([0.1], [1.0], 2.5)] * dimension
>>> kernel = ot.ProductCovarianceModel(ker_list)
>>> # Tensorized kernel herding design
>>> kht = otkd.KernelHerdingTensorized(kernel=kernel, distribution=distribution)
>>> kht_design, _ = kht.select_design(20)

Methods

compute_criterion(design_indices)

Compute the criterion on a design. At any point of the candidate set, this criterion is simply the difference between the potential of the discrete measure defined by the given design and the target potential.

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

current_potential - target_potential (numpy.array): Vector of the values taken by the criterion on all candidate points

compute_current_energy(design_indices)

Compute the energy of the discrete measure defined by the design $\mat{X}_n$. Considering the discrete measure $\zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)})$, its energy is defined as

$E_{\zeta_n} := \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k(\vect{x}^{(i)}, \vect{x}^{(j)}).$

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

potential (float): Energy of the discrete measure defined by the design
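
The double sum above can be evaluated directly. The following is a minimal sketch in plain Python, assuming a product of one-dimensional squared-exponential kernels as a stand-in for $k$ (the class itself defaults to Matern kernels); the names kernel and discrete_energy are illustrative and not part of otkerneldesign.

```python
import math


def kernel(x, y, scale=0.5):
    """Product of 1-D squared-exponential kernels (illustrative choice)."""
    return math.prod(math.exp(-0.5 * ((a - b) / scale) ** 2) for a, b in zip(x, y))


def discrete_energy(design):
    """E = (1 / n^2) * sum_i sum_j k(x_i, x_j) over the design points."""
    n = len(design)
    return sum(kernel(xi, xj) for xi in design for xj in design) / n**2


design = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
energy = discrete_energy(design)
```

For a kernel with $k(\vect{x}, \vect{x}) = 1$, the energy of a one-point design is exactly 1, and it decreases as the points spread apart.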

compute_current_potential(design_indices)

Compute the potential of the discrete measure (a.k.a. kernel mean embedding) defined by the design $\mat{X}_n$. Considering the discrete measure $\zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)})$, its potential is defined as

$P_{\zeta_n}(\vect{x}) = \frac{1}{n} \sum_{i=1}^{n} k(\vect{x}, \vect{x}^{(i)}).$

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

potential: Potential of the measure defined by the design $\mat{X}_n$.
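
As an illustration of the formula above, this sketch evaluates the potential of a two-point design at a few candidate points. The squared-exponential product kernel and the names kernel and discrete_potential are assumptions made for the example, not the library's code.

```python
import math


def kernel(x, y, scale=0.5):
    """Product of 1-D squared-exponential kernels (illustrative choice)."""
    return math.prod(math.exp(-0.5 * ((a - b) / scale) ** 2) for a, b in zip(x, y))


def discrete_potential(x, design):
    """P(x) = (1 / n) * sum_i k(x, x_i) over the design points x_i."""
    return sum(kernel(x, xi) for xi in design) / len(design)


design = [(0.2, 0.2), (0.8, 0.8)]
candidates = [(0.2, 0.2), (0.5, 0.5), (0.8, 0.8)]
# One potential value per candidate point.
potential = [discrete_potential(x, design) for x in candidates]
```

The result has one value per candidate point, which matches the vector form the criterion needs.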

compute_mmd(design_indices)

Compute the Maximum Mean Discrepancy between $\mu$ and $\zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)})$.

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

mmd (float): Maximum Mean Discrepancy between the target and current measures.
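
For intuition, the squared MMD between two measures can be written with the energies and potentials defined on this page: the target energy, minus twice the mean of the target potential over the design, plus the energy of the design. The sketch below applies this identity with $\mu$ replaced by the empirical measure of a candidate set. That substitution, the kernel, and the function names are assumptions for illustration; whether the library returns the MMD or its square is not stated here.

```python
import math


def kernel(x, y, scale=0.3):
    """Product of 1-D squared-exponential kernels (illustrative choice)."""
    return math.prod(math.exp(-0.5 * ((a - b) / scale) ** 2) for a, b in zip(x, y))


def mean_kernel(points_a, points_b):
    """Average of k(a, b) over all pairs (a, b)."""
    total = sum(kernel(a, b) for a in points_a for b in points_b)
    return total / (len(points_a) * len(points_b))


def mmd(candidates, design):
    """MMD between the empirical measures of the candidates and of the design."""
    squared = (
        mean_kernel(candidates, candidates)   # energy of the (empirical) target
        - 2.0 * mean_kernel(candidates, design)  # cross term
        + mean_kernel(design, design)         # energy of the design
    )
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off


candidates = [(i / 10,) for i in range(11)]
spread = [candidates[i] for i in (1, 5, 9)]
clustered = [candidates[i] for i in (0, 1, 2)]
mmd_spread = mmd(candidates, spread)
mmd_clustered = mmd(candidates, clustered)
```

A design spread over the candidate set yields a smaller MMD than one clustered at one end, which is the behavior kernel herding aims for.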

compute_target_energy()

Compute the energy of the target probability measure $\mu$.

Returns:

potential (float): Energy of the measure $\mu$, defined by

$E_{\mu} := \int \int k(\vect{x}, \vect{x}') d \mu(\vect{x}) d \mu(\vect{x}').$

compute_target_potential()

Compute the potential of the target probability measure $\mu$. In the case of independent input variables, this implementation is more efficient than the one offered by the KernelHerding class.

Let $\cX$ be a Cartesian product of one-dimensional sets $\cX_{[i]}$, $\cX=\cX_{[1]}\times\cdots\times\cX_{[d]}$, and let the measure $\mu$ be the product of its marginals $\mu_{[i]}$ on the $\cX_{[i]}$. When the kernel $k$ is the product of one-dimensional kernels $k_{[i]}$, then for all $\vect{x}=(x_1,\ldots,x_d)\in\cX$, the potential $P_{k,\mu}(\vect{x})$ can be expressed as

$P_{k,\mu}(\vect{x}) := \int_\cX k(\vect{x}, \vect{x}') d \mu(\vect{x}') = \prod_{i=1}^d \int_{\cX_{[i]}} k_{[i]}(x_i, x_i') d \mu_{[i]}(x_i') = \prod_{i=1}^d P_{k_{[i]},\mu_{[i]}}(x_i),$

where for each $i\in\{1,\ldots,d\}$, $P_{k_{[i]},\mu_{[i]}}$ is the one-dimensional potential with respect to the distribution $\mu_{[i]}$ and the kernel $k_{[i]}$.

This method exploits this property by computing the potential as a product of univariate potentials, each estimated on a regular grid.

Returns:

potential (numpy.array): Potential of the measure $\mu$, computed as

$P_{k,\mu}(\vect{x}) = \prod_{i=1}^d P_{k_{[i]},\mu_{[i]}}(x_i).$
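
The factorization above can be checked numerically. The sketch below assumes uniform marginals on $[0, 1]$, a midpoint rule on a regular grid, and squared-exponential one-dimensional kernels; all names are illustrative, and the grid scheme used by the library may differ. It compares the product of univariate quadratures with the brute-force sum over the full tensor grid.

```python
import itertools
import math


def k1d(x, y, scale=0.5):
    """One-dimensional squared-exponential kernel (illustrative choice)."""
    return math.exp(-0.5 * ((x - y) / scale) ** 2)


def univariate_potential(x, nodes, weights):
    """Quadrature estimate of the 1-D potential: sum_j w_j * k(x, t_j)."""
    return sum(w * k1d(x, t) for t, w in zip(nodes, weights))


def tensorized_potential(x, nodes, weights):
    """Product of d univariate potentials: d * m kernel evaluations."""
    return math.prod(univariate_potential(xi, nodes, weights) for xi in x)


def full_potential(x, nodes, weights):
    """Same quadrature on the full tensor grid: m ** d grid points."""
    total = 0.0
    for idx in itertools.product(range(len(nodes)), repeat=len(x)):
        weight = math.prod(weights[j] for j in idx)
        value = math.prod(k1d(xi, nodes[j]) for xi, j in zip(x, idx))
        total += weight * value
    return total


# Midpoint rule with m nodes for a uniform marginal on [0, 1] (assumption).
m = 20
nodes = [(j + 0.5) / m for j in range(m)]
weights = [1.0 / m] * m

x = (0.2, 0.7, 0.5)
fast = tensorized_potential(x, nodes, weights)
slow = full_potential(x, nodes, weights)
```

Both values agree up to round-off, but the tensorized version needs $d \times m$ one-dimensional kernel evaluations per point instead of visiting $m^d$ grid points, which is where the speed-up comes from.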

draw_energy_convergence(design_indices)

Draw the convergence of the energy for a set of points selected from the candidate set.

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

fig (matplotlib.Figure): Energy convergence of the design of experiments
plot_data: Data used to plot the figure

draw_mmd_convergence(design_indices)

Draw the convergence of the MMD between a discrete measure and the target measure.

Parameters:

design_indices (list of positive int): List of the indices of the selected points in the Sample of candidate points

Returns:

fig (matplotlib.Figure): MMD convergence of the design of experiments
plot_data: Data used to plot the figure

get_candidate_set()

Accessor to the candidate set.

Returns:

candidate_set (openturns.Sample): A deep copy of the candidate set.

get_indices(sample)

Given a subsample of the candidate set, return the indices of its points in the candidate set.

Parameters:

sample (2-d list of float): A subsample of the candidate set.

Returns:

indices (list of int): Indices of the points of the sample within the candidate set.
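
The lookup this method performs can be pictured with the following sketch. It assumes exact coordinate matches and, when a point occurs several times in the candidate set, returns its first occurrence; both choices, and the name find_indices, are assumptions rather than documented behavior.

```python
def find_indices(sample, candidate_set):
    """Index of each sample point within the candidate set (hypothetical helper)."""
    first_index = {}
    for i, point in enumerate(candidate_set):
        # Keep the first occurrence if a point appears more than once.
        first_index.setdefault(tuple(point), i)
    # Raises KeyError if a sample point is not in the candidate set.
    return [first_index[tuple(point)] for point in sample]


candidate_set = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.3, 0.4]]
sample = [[0.5, 0.6], [0.3, 0.4]]
indices = find_indices(sample, candidate_set)  # [2, 1]
```

Building the dictionary once keeps the lookup linear in the size of the candidate set, rather than scanning it for every sample point.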

select_design(size)

Select a design with tensorized kernel herding.

Parameters:

size (positive int): Number of points to be selected
design_indices (list of positive int): List of the indices, in the Sample of candidate points, of already selected points (empty by default)

Returns:

design (openturns.Sample): Sample of all selected points
design_indices (list of positive int or None): List of the indices of the selected points in the Sample of candidate points
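
The selection loop can be sketched as follows: at each step, the candidate minimizing the criterion (current potential minus target potential) is added to the design. This dependency-free version estimates the target potential from the candidate set itself and uses a squared-exponential kernel. Both are assumptions for illustration, and details such as tie-breaking or the handling of already selected points may differ in the library.

```python
import math


def kernel(x, y, scale=0.3):
    """Product of 1-D squared-exponential kernels (illustrative choice)."""
    return math.prod(math.exp(-0.5 * ((a - b) / scale) ** 2) for a, b in zip(x, y))


def herding(candidates, size):
    """Select `size` points from `candidates` by kernel herding (sketch)."""
    n_cand = len(candidates)
    # Target potential, estimated with the empirical measure of the candidates.
    target = [sum(kernel(x, c) for c in candidates) / n_cand for x in candidates]
    kernel_sums = [0.0] * n_cand  # sum of k(x, x_i) over the selected x_i
    indices = []
    for _ in range(size):
        m = len(indices)
        current = [s / m if m else 0.0 for s in kernel_sums]
        criterion = [c - t for c, t in zip(current, target)]
        best = min(range(n_cand), key=criterion.__getitem__)
        indices.append(best)
        # Update the running kernel sums with the newly selected point.
        for j, x in enumerate(candidates):
            kernel_sums[j] += kernel(x, candidates[best])
    return [candidates[i] for i in indices], indices


candidates = [(i / 20,) for i in range(21)]
design, design_indices = herding(candidates, 5)
```

With an empty initial design, the first point selected is the one where the target potential is largest, here the center of the grid; later points are pushed away from those already chosen.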
