Nonparametric clustering

This example illustrates how we can run nonparametric clustering using GEMINI.

import numpy as np
from matplotlib import pyplot as plt
from sklearn import metrics

from gemclus import data
from gemclus.nonparametric import CategoricalMMD

Draw samples from a GMM

# Generate samples that are simple to separate
N = 100  # Number of samples to draw
# GMM parameters
means = np.array([[1, -1], [1, 1], [-1, -1], [-1, 1]])*2
covariances = [np.eye(2)*0.5]*4
X, y = data.draw_gmm(N, means, covariances, np.ones(4) / 4, random_state=1789)
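For readers unfamiliar with mixture sampling, the sketch below draws from a Gaussian mixture using plain NumPy. It illustrates the general technique only: it is not gemclus's implementation of `data.draw_gmm`, the helper name `sample_gmm` is hypothetical, and it will not reproduce the samples above for the same seed.

```python
import numpy as np


def sample_gmm(n, means, covariances, weights, seed=None):
    """Hypothetical helper: draw n points from a Gaussian mixture."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    # Pick a component index for every sample according to the mixture weights.
    labels = rng.choice(len(weights), size=n, p=weights)
    samples = np.empty((n, means.shape[1]))
    for k in range(len(weights)):
        mask = labels == k
        # Draw all points assigned to component k in one call.
        samples[mask] = rng.multivariate_normal(
            means[k], covariances[k], size=int(mask.sum())
        )
    return samples, labels


means = np.array([[1, -1], [1, 1], [-1, -1], [-1, 1]]) * 2
covariances = [np.eye(2) * 0.5] * 4
X_demo, y_demo = sample_gmm(100, means, covariances, np.ones(4) / 4, seed=0)
print(X_demo.shape, np.bincount(y_demo, minlength=4))
```

With means at distance 4 from their nearest neighbours and a variance of 0.5 per axis, the four components overlap very little, which is why this dataset is easy to separate.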

Train the model

Create the nonparametric GEMINI clustering model and call its .fit_predict method to optimise the cluster assignment of each sample and return the predicted labels.

model = CategoricalMMD(n_clusters=4, ovo=True, random_state=0, learning_rate=1e-2)
y_pred = model.fit_predict(X)
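The MMD in `CategoricalMMD` refers to the maximum mean discrepancy, a kernel-based distance between distributions. The sketch below only illustrates that distance on two sample sets with a linear kernel, under the assumption that a plain biased estimate is enough for intuition. It is not gemclus's objective or code, and `squared_mmd_linear` is a hypothetical helper.

```python
import numpy as np


def squared_mmd_linear(a, b):
    """Hypothetical helper: biased squared-MMD estimate with a linear kernel."""
    k_aa = a @ a.T  # kernel values within the first sample
    k_bb = b @ b.T  # kernel values within the second sample
    k_ab = a @ b.T  # kernel values across the two samples
    return k_aa.mean() + k_bb.mean() - 2 * k_ab.mean()


rng = np.random.default_rng(0)
near = rng.normal(loc=0.0, size=(200, 2))
same = rng.normal(loc=0.0, size=(200, 2))  # same distribution as `near`
far = rng.normal(loc=4.0, size=(200, 2))   # shifted distribution

mmd_same = squared_mmd_linear(near, same)
mmd_far = squared_mmd_linear(near, far)
print(f"same distribution: {mmd_same:.3f}, shifted distribution: {mmd_far:.3f}")
```

With a linear kernel this estimate reduces to the squared distance between the two sample means, so well-separated groups of points yield a large value while samples from the same distribution yield a value close to zero.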

Final clustering

plt.scatter(X[:, 0], X[:, 1], c=y_pred)
plt.show()

ari_score = metrics.adjusted_rand_score(y, y_pred)
gemini_score = model.score(X)
print(f"Final ARI score: {ari_score:.3f}")
print(f"GEMINI score is {gemini_score:.3f}")
Final ARI score: 0.975
GEMINI score is 3.328
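The ARI is a suitable metric here because cluster indices are arbitrary: a prediction that matches the true groups under different label names still scores 1. The standard-library sketch below computes the index from pair counts to make this visible. The function name `adjusted_rand_index` is hypothetical, and this is an illustration rather than scikit-learn's implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Hypothetical helper: adjusted Rand index from pair counts."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare
    # Pairs of samples that share both their true and predicted cluster.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs that share a true cluster, and pairs that share a predicted one.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_cells - expected) / (max_index - expected)


truth = [0, 0, 1, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1, 1]   # same grouping, different label names
one_mistake = [2, 2, 0, 0, 1, 0]  # last sample put in the wrong group

score_relabelled = adjusted_rand_index(truth, relabelled)
score_mistake = adjusted_rand_index(truth, one_mistake)
print(score_relabelled, score_mistake)
```

The relabelled prediction reaches the maximum score, while a single misassigned sample lowers it, which is the behaviour relied on when comparing `y` and `y_pred` above.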

Total running time of the script: (0 minutes 0.393 seconds)