gemclus.gemini.WassersteinOvA

class gemclus.gemini.WassersteinOvA(metric='euclidean', epsilon=1e-12)[source]

Implements the one-vs-all Wasserstein GEMINI, which measures the Wasserstein distance between each cluster distribution and the data distribution.

\mathcal{I} = \mathbb{E}_{y \sim p(y)}[\mathcal{W}_\delta(p(x|y)\|p(x))]

where \delta is a metric defined between the samples of the data space.
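To make the objective concrete, here is a minimal sketch restricted to 1-D samples with the absolute difference as \delta. It is an illustration under stated assumptions, not the library's implementation: the helper name ova_wasserstein_1d is hypothetical, p(x|y) is approximated by weighting each sample with its prediction p(y|x), p(x) is taken as the uniform empirical distribution, and SciPy's wasserstein_distance stands in for the Wasserstein computation.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def ova_wasserstein_1d(x, y_pred, epsilon=1e-12):
    """Illustrative one-vs-all Wasserstein score for 1-D samples only.

    Hypothetical helper, not part of gemclus.
    """
    x = np.asarray(x, dtype=float)
    # Clip predictions away from 0 and 1, mirroring the role of `epsilon`.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), epsilon, 1 - epsilon)
    n_clusters = y_pred.shape[1]
    # Cluster proportions p(y), estimated as the mean prediction per cluster.
    p_y = y_pred.mean(axis=0)
    score = 0.0
    for k in range(n_clusters):
        # p(x|y=k) puts weight proportional to p(y=k|x_i) on sample x_i;
        # p(x) is the uniform empirical distribution (v_weights left as None).
        score += p_y[k] * wasserstein_distance(x, x, u_weights=y_pred[:, k])
    return score


x = np.array([0.0, 1.0, 10.0, 11.0])
# Hard assignment that separates the two groups of points.
hard = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
# Uninformative assignment: every cluster distribution equals p(x).
uniform = np.full((4, 2), 0.5)

score_hard = ova_wasserstein_1d(x, hard)
score_uniform = ova_wasserstein_1d(x, uniform)
```

In this sketch the separating assignment scores higher than the uninformative one, whose cluster distributions coincide with the data distribution.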

Parameters:
metric: {'cosine', 'euclidean', 'l2', 'l1', 'manhattan', 'cityblock', 'precomputed'}, default='euclidean'

The metric to use in combination with the Wasserstein objective. It corresponds to one value of PAIRED_DISTANCES. Currently, every metric is used with its default parameters. If the metric is set to 'precomputed', then a custom distance matrix must be passed to the affinity argument of the evaluate method.

epsilon: float, default=1e-12

The precision for clipping the prediction values in order to avoid numerical instabilities.

__init__(metric='euclidean', epsilon=1e-12)[source]
compute_affinity(X, y=None)

Compute the distance between all samples of X.

Parameters:
X: ndarray of shape (n_samples, n_features)

The samples between which all affinities must be computed.

y: ndarray of shape (n_samples, n_samples), default=None

Values of the affinity between samples in case of a “precomputed” affinity. Ignored if None and the affinity is not precomputed.

Returns:
affinity: ndarray of shape (n_samples, n_samples)

The distance between all samples if it is needed for the GEMINI computations, None otherwise.
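As a rough picture of what such a distance matrix looks like for the 'euclidean' metric, the sketch below builds one with plain NumPy. The function name euclidean_distance_matrix is hypothetical, and how the library computes the matrix internally is an assumption not confirmed here. A matrix built this way could also serve as a custom input when the metric is 'precomputed'.

```python
import numpy as np


def euclidean_distance_matrix(X):
    """Pairwise Euclidean distances between the rows of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Broadcast to shape (n_samples, n_samples, n_features), then reduce
    # over the feature axis.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = euclidean_distance_matrix(X)

# The result has shape (n_samples, n_samples), a zero diagonal, and is
# symmetric, which is what the evaluate method expects of an affinity.
is_symmetric = np.allclose(D, D.T)
```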

evaluate(y_pred, affinity, return_grad=False)[source]

Compute the GEMINI objective given the predictions $p(y|x)$ and an affinity matrix. If return_grad is True, the gradients of the GEMINI w.r.t. the predictions are returned as well. Depending on the context, the affinity matrix can be either a kernel matrix or a distance matrix resulting from the compute_affinity method.

Parameters:
y_pred: ndarray of shape (n_samples, n_clusters)

The conditional distribution (prediction) of clustering assignment per sample.

affinity: ndarray of shape (n_samples, n_samples)

The affinity matrix resulting from the compute_affinity method. The matrix must be symmetric.

return_grad: bool, default=False

If True, the method also returns the gradient of the GEMINI w.r.t. the predictions $p(y|x)$.

Returns:
gemini: float

The GEMINI score of the model given the predictions and affinities.

gradients: ndarray of shape (n_samples, n_clusters)

The derivative w.r.t. the predictions y_pred: $\nabla_{p(y|x)} \mathcal{I}$. Only returned if return_grad is True.

Examples using gemclus.gemini.WassersteinOvA

Scoring any model with GEMINI
