gemclus.gemini.MI
- class gemclus.gemini.MI(epsilon=1e-12)
Implements the classical mutual information between the cluster-conditional distributions \(p(x|y)\) and the distribution of the complete data \(p(x)\):
\[\mathcal{I} = \mathbb{E}_{y \sim p(y)}[\text{KL}(p(x|y)\|p(x))]\]
This class is a simplified shortcut for KLGEMINI(ovo=False).
- Parameters:
- epsilon: float, default=1e-12
The clipping threshold applied to the prediction values to avoid numerical instabilities.
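The objective above is written in terms of \(p(x|y)\), but by Bayes' rule \(p(x|y)/p(x) = p(y|x)/p(y)\), so it can be estimated from the predictions alone. The following is a minimal NumPy sketch of that estimate, not the library's implementation: the function name `mutual_information` is hypothetical, and the way `epsilon` is used for clipping is an assumption based on the parameter description.

```python
import numpy as np

def mutual_information(y_pred, epsilon=1e-12):
    """Sample estimate of the mutual information from soft assignments.

    Hypothetical helper for illustration; not part of gemclus.
    """
    # Assumed clipping scheme: keep probabilities away from 0 and 1 so the
    # logarithms stay finite.
    p = np.clip(np.asarray(y_pred, dtype=float), epsilon, 1.0 - epsilon)
    # Cluster proportions p(y), estimated as the mean prediction over samples.
    pi = p.mean(axis=0, keepdims=True)
    # Average over samples of sum_k p(y=k|x) * log(p(y=k|x) / p(y=k)).
    return float(np.mean(np.sum(p * (np.log(p) - np.log(pi)), axis=1)))

# Uninformative predictions: every sample is spread evenly over 2 clusters.
uniform = np.full((4, 2), 0.5)
# Confident, balanced predictions: half of the samples in each cluster.
one_hot = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

mi_uniform = mutual_information(uniform)
mi_one_hot = mutual_information(one_hot)
```

Under these assumptions, uninformative predictions give a value of zero, while confident and balanced assignments over two clusters approach \(\log 2\), the largest value attainable with two clusters.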
- compute_affinity(X, y=None)
Unused for f-divergences.
- Returns:
- None
- evaluate(y_pred, affinity, return_grad=False)
Compute the GEMINI objective given the predictions \(p(y|x)\) and an affinity matrix. When return_grad is True, the gradients of the GEMINI w.r.t. the predictions are returned as well. Depending on the context, the affinity matrix can be either a kernel matrix or a distance matrix resulting from the compute_affinity method.
- Parameters:
- y_pred: ndarray of shape (n_samples, n_clusters)
The conditional distribution (prediction) of clustering assignment per sample.
- affinity: ndarray of shape (n_samples, n_samples)
The affinity matrix resulting from the compute_affinity method. The matrix must be symmetric.
- return_grad: bool, default=False
If True, the method also returns the gradient of the GEMINI w.r.t. the predictions \(p(y|x)\).
- Returns:
- gemini: float
The GEMINI score of the model given the predictions and affinities.
- gradients: ndarray of shape (n_samples, n_clusters)
The derivative w.r.t. the predictions y_pred: \(\nabla_{p(y|x)} \mathcal{I}\)
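To illustrate what a gradient of shape (n_samples, n_clusters) looks like for this objective, here is a hedged NumPy sketch. It assumes the sample estimate \(\hat{\mathcal{I}} = \frac{1}{n}\sum_i \sum_k p_{ik} \log(p_{ik}/\pi_k)\) with \(\pi_k = \frac{1}{n}\sum_i p_{ik}\), and it treats each entry of the prediction matrix as a free variable. Differentiating gives \(\partial \hat{\mathcal{I}} / \partial p_{ik} = \frac{1}{n}\log(p_{ik}/\pi_k)\). The function `mi_and_grad` is hypothetical, and the library may use different sign or scaling conventions.

```python
import numpy as np

def mi_and_grad(y_pred, epsilon=1e-12):
    """Sample estimate of the mutual information and its gradient.

    Hypothetical helper for illustration; not part of gemclus.
    """
    # Assumed clipping scheme to keep the logarithms finite.
    p = np.clip(np.asarray(y_pred, dtype=float), epsilon, 1.0 - epsilon)
    n = p.shape[0]
    # Cluster proportions estimated from the predictions.
    pi = p.mean(axis=0, keepdims=True)
    log_ratio = np.log(p) - np.log(pi)
    mi = float(np.sum(p * log_ratio) / n)
    # The +1 from d(p log p)/dp cancels with the derivative of the
    # -p log(pi) term through pi, leaving log(p / pi) / n.
    grad = log_ratio / n
    return mi, grad

# Small fixed example: 4 samples, 3 clusters, rows sum to 1.
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
    [0.5, 0.3, 0.2],
])
mi, grad = mi_and_grad(y_pred)

# Finite-difference check of a single entry of the gradient.
h = 1e-6
bumped = y_pred.copy()
bumped[1, 2] += h
mi_bumped, _ = mi_and_grad(bumped)
numeric = (mi_bumped - mi) / h
```

The finite-difference value `numeric` should agree closely with `grad[1, 2]`, which is a quick way to sanity-check an analytic gradient before using it in an optimizer.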