KernelHerding

class otkerneldesign.KernelHerding(kernel=None, distribution=None, candidate_set_size=None, candidate_set=None, initial_design=None, is_greedy=False)

Incrementally select new design points with kernel herding.

Parameters:
kernel : openturns.CovarianceModel

Covariance kernel used to define potentials. By default, a product of Matérn kernels with smoothness 5/2.

distribution : openturns.Distribution

Distribution the design points must represent. If not specified, then candidate_set must be specified instead. Even when candidate_set is specified, providing the distribution can be useful if it allows the use of analytical formulas.

candidate_set_size : positive int

Size of the set of all candidate points. Unnecessary if candidate_set is specified. Otherwise, 2^{12} by default.

candidate_set : 2-d list of float

Large sample that empirically represents a distribution. If not specified, then distribution and candidate_set_size must be specified so that it can be generated automatically.

initial_design : 2-d list of float

Sample of points that must be included in the design. Empty by default.

is_greedy : Boolean

False by default, in which case the criterion is the difference between the current and target potentials. When set to True, the MMD minimization is strictly greedy. In practice the two criteria are very close: the greedy one differs only in that the current potential is multiplied by \frac{m}{m+1}.

Examples

>>> import openturns as ot
>>> import otkerneldesign as otkd
>>> distribution = ot.ComposedDistribution([ot.Normal(0.5, 0.1)] * 2)
>>> dimension = distribution.getDimension()
>>> # Kernel definition
>>> ker_list = [ot.MaternModel([0.1], [1.0], 2.5)] * dimension
>>> kernel = ot.ProductCovarianceModel(ker_list)
>>> # Kernel herding design
>>> kh = otkd.KernelHerding(kernel=kernel, distribution=distribution)
>>> kh_design, _ = kh.select_design(size=20)

Methods

compute_criterion(design_indices)

Compute the criterion on a design \mat{X}_n. At any point of the candidate set, this criterion is given by the difference between the potential of the discrete measure defined by the given design and the target potential.

Parameters:
design_indices : list of positive int

List of the indices of the selected points \mat{X}_n in the Sample of candidate points

Returns:
current_potential - target_potential : numpy.array

Vector of the values taken by the criterion on all candidate points
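The criterion described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper names `matern52` and `criterion` are hypothetical, the target potential is approximated by averaging the kernel over the candidate set, and m is assumed to be the current design size.

```python
import math

def matern52(x, y, scale=0.1):
    """Product of 1-d Matern 5/2 kernels with unit amplitude (stand-in kernel)."""
    value = 1.0
    for xi, yi in zip(x, y):
        r = math.sqrt(5.0) * abs(xi - yi) / scale
        value *= (1.0 + r + r * r / 3.0) * math.exp(-r)
    return value

def criterion(candidates, design_indices, target_potential, is_greedy=False):
    """Current potential minus target potential at every candidate point."""
    design = [candidates[i] for i in design_indices]
    m = len(design)  # assumption: m is the current design size
    current = [sum(matern52(x, xi) for xi in design) / m for x in candidates]
    if is_greedy:
        # Greedy variant: the current potential is scaled by m / (m + 1)
        current = [p * m / (m + 1) for p in current]
    return [c - t for c, t in zip(current, target_potential)]

candidates = [(0.40, 0.50), (0.50, 0.50), (0.55, 0.45), (0.60, 0.60)]
# Target potential approximated by averaging the kernel over the candidate set
target = [sum(matern52(x, c) for c in candidates) / len(candidates)
          for x in candidates]
standard = criterion(candidates, [1], target)
greedy = criterion(candidates, [1], target, is_greedy=True)
```

With a positive kernel, the greedy values are never larger than the standard ones, since only the current potential is scaled down.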

compute_current_energy(design_indices)

Compute the energy of the discrete measure defined by the design \mat{X}_n. Considering the discrete measure \zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)}), its energy is defined as

E_{\zeta_n} := \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k(\vect{x}^{(i)}, \vect{x}^{(j)}).

Parameters:
design_indices : list of positive int

List of the indices of the selected points in the Sample of candidate points

Returns:
potential : float

Energy of the discrete measure defined by the design
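As a minimal sketch of the double sum defining E_{\zeta_n}, assuming a hand-written Matérn 5/2 product kernel in place of the library's kernel object (both `matern52` and `current_energy` are hypothetical names):

```python
import math

def matern52(x, y, scale=0.1):
    """Product of 1-d Matern 5/2 kernels with unit amplitude (stand-in kernel)."""
    value = 1.0
    for xi, yi in zip(x, y):
        r = math.sqrt(5.0) * abs(xi - yi) / scale
        value *= (1.0 + r + r * r / 3.0) * math.exp(-r)
    return value

def current_energy(candidates, design_indices, kernel=matern52):
    """Energy of the discrete measure: (1 / n^2) * sum_i sum_j k(x_i, x_j)."""
    design = [candidates[i] for i in design_indices]
    n = len(design)
    return sum(kernel(xi, xj) for xi in design for xj in design) / n ** 2

candidates = [(0.40, 0.50), (0.50, 0.50), (0.55, 0.45), (0.60, 0.60)]
energy_one_point = current_energy(candidates, [1])
energy_two_points = current_energy(candidates, [1, 3])
```

For a single design point the energy reduces to k(x, x), which is 1 for this unit-amplitude kernel.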

compute_current_potential(design_indices)

Compute the potential of the discrete measure (a.k.a. kernel mean embedding) defined by the design \mat{X}_n. Considering the discrete measure \zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)}), its potential is defined as

P_{\zeta_n}(\vect{x}) := \frac{1}{n} \sum_{i=1}^{n} k(\vect{x}, \vect{x}^{(i)}).

Parameters:
design_indices : list of positive int

List of the indices of the selected points in the Sample of candidate points

Returns:
potential : numpy.array

Potential of the discrete measure defined by the design (a.k.a. kernel mean embedding)
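The formula for P_{\zeta_n} can be illustrated with a short sketch that evaluates it at every candidate point. The names `matern52` and `current_potential` are hypothetical, and plain lists stand in for the numpy.array the method returns.

```python
import math

def matern52(x, y, scale=0.1):
    """Product of 1-d Matern 5/2 kernels with unit amplitude (stand-in kernel)."""
    value = 1.0
    for xi, yi in zip(x, y):
        r = math.sqrt(5.0) * abs(xi - yi) / scale
        value *= (1.0 + r + r * r / 3.0) * math.exp(-r)
    return value

def current_potential(candidates, design_indices, kernel=matern52):
    """P_{zeta_n}(x) = (1 / n) * sum_i k(x, x_i), for every candidate x."""
    design = [candidates[i] for i in design_indices]
    n = len(design)
    return [sum(kernel(x, xi) for xi in design) / n for x in candidates]

candidates = [(0.40, 0.50), (0.50, 0.50), (0.55, 0.45), (0.60, 0.60)]
potential = current_potential(candidates, design_indices=[1, 3])
```

The result has one entry per candidate point, not per design point.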

compute_mmd(design_indices)

Compute the Maximum Mean Discrepancy (MMD) between \mu and \zeta_n = \frac{1}{n} \sum_{i=1}^{n} \delta(\vect{x}^{(i)}).

Parameters:
design_indices : list of positive int

List of the indices of the selected points in the Sample of candidate points

Returns:
mmd : float

Maximum Mean Discrepancy between target and current measure.
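For reference, the squared MMD can be written with the energies and potentials defined on this page. This is the standard kernel identity; whether the method returns the MMD or its square is not stated here and should be checked against the implementation.

```latex
\mathrm{MMD}(\mu, \zeta_n)^2
  = E_{\mu}
  - \frac{2}{n} \sum_{i=1}^{n} P_{\mu}(\vect{x}^{(i)})
  + E_{\zeta_n}
```

The middle term is the target potential P_{\mu} averaged over the design points, so all three terms are available from the other compute_* methods.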

compute_target_energy()

Compute the energy of the target probability measure \mu.

Returns:
potential : float

Energy of the measure \mu defined by

E_{\mu} := \int \int k(\vect{x}, \vect{x}') d \mu(\vect{x}) d \mu(\vect{x}').

compute_target_potential()

Compute the potential of the target probability measure \mu.

Returns:
potential : numpy.array

Potential of the measure \mu defined by

P_{\mu}(\vect{x}) := \int k(\vect{x}, \vect{x}') d \mu(\vect{x}').
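When no analytical formula is available, both integrals can be approximated by empirical averages over a large sample of \mu. The sketch below assumes that approach and is not taken from the library; `matern52`, the sample size of 200 and the Gaussian sample are illustrative choices.

```python
import math
import random

def matern52(x, y, scale=0.1):
    """Product of 1-d Matern 5/2 kernels with unit amplitude (stand-in kernel)."""
    value = 1.0
    for xi, yi in zip(x, y):
        r = math.sqrt(5.0) * abs(xi - yi) / scale
        value *= (1.0 + r + r * r / 3.0) * math.exp(-r)
    return value

random.seed(0)
# Sample standing in for the target measure mu (two independent normals)
sample = [(random.gauss(0.5, 0.1), random.gauss(0.5, 0.1)) for _ in range(200)]

# P_mu(x) ~ average of k(x, x') over the sample, evaluated at each sample point
target_potential = [sum(matern52(x, y) for y in sample) / len(sample)
                    for x in sample]
# E_mu ~ average of the target potential over the same sample
target_energy = sum(target_potential) / len(sample)
```

The energy estimate is simply the mean of the potential estimates, which mirrors the double integral defining E_{\mu}.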

draw_energy_convergence(design_indices)

Draws the convergence of the energy for a set of points selected among the candidate set.

Parameters:
design_indices : list of positive int

List of the indices of the selected points in the Sample of candidate points

Returns:
fig : matplotlib.Figure

Energy convergence of the design of experiments

plot_data

Data used to plot the figure

draw_mmd_convergence(design_indices)

Draws the convergence of the MMD between a discrete measure and the target measure.

Parameters:
design_indices : list of positive int

List of the indices of the selected points in the Sample of candidate points

Returns:
fig : matplotlib.Figure

MMD convergence of the design of experiments

plot_data

Data used to plot the figure

get_candidate_set()

Accessor to the candidate set.

Returns:
candidate_set : openturns.Sample

A deep copy of the candidate set.

get_indices(sample)

When provided a subsample of the candidate set, returns the indices of its points in the candidate set.

Parameters:
sample : 2-d list of float

A subsample of the candidate set.

Returns:
indices : list of int

Indices of the points of the sample within the candidate set.

select_design(size, initial_design_indices=[])

Select a design with kernel herding.

Parameters:
size : positive int

Number of points to be selected

initial_design_indices : list of positive int

List of the indices, in the Sample of candidate points, of the points already selected (empty by default)

Returns:
design : openturns.Sample

Sample of all selected points

design_indices : list of positive int or None

List of the indices of the selected points in the Sample of candidate points
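To make the selection loop concrete, here is a rough pure-Python sketch of kernel herding over a finite candidate set. It is written under stated assumptions and is not the library's code: the next point is taken as the candidate minimising the current potential minus the target potential (the non-greedy criterion), the target potential is an empirical average over the candidates, and `matern52` and `herding` are hypothetical names.

```python
import math
import random

def matern52(x, y, scale=0.1):
    """Product of 1-d Matern 5/2 kernels with unit amplitude (stand-in kernel)."""
    value = 1.0
    for xi, yi in zip(x, y):
        r = math.sqrt(5.0) * abs(xi - yi) / scale
        value *= (1.0 + r + r * r / 3.0) * math.exp(-r)
    return value

def herding(candidates, size, kernel=matern52):
    """Select `size` candidates one at a time; return (design, design_indices)."""
    n_cand = len(candidates)
    gram = [[kernel(x, y) for y in candidates] for x in candidates]
    # Target potential approximated by averaging the kernel over the candidates
    target = [sum(row) / n_cand for row in gram]
    kernel_sums = [0.0] * n_cand  # running sum of k(x, x_i) over selected x_i
    indices = []
    for _ in range(size):
        m = len(indices)
        current = [s / m for s in kernel_sums] if m else [0.0] * n_cand
        crit = [c - t for c, t in zip(current, target)]
        # Assumed selection rule: pick the candidate with the smallest criterion
        best = min(range(n_cand), key=crit.__getitem__)
        indices.append(best)
        kernel_sums = [s + gram[j][best] for j, s in enumerate(kernel_sums)]
    return [candidates[i] for i in indices], indices

random.seed(0)
candidates = [(random.gauss(0.5, 0.1), random.gauss(0.5, 0.1)) for _ in range(100)]
design, design_indices = herding(candidates, size=5)
```

Like select_design, the sketch returns both the selected points and their indices in the candidate set, so the indices can be fed to index-based helpers such as the energy or potential computations.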