.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/scoring/plot_gemini_scoring.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_scoring_plot_gemini_scoring.py:

==============================
Scoring any model with GEMINI
==============================

This example shows how to score the predictions of another model with GEMINI.
The goal is not to perform clustering but to evaluate predictions that already exist.

.. GENERATED FROM PYTHON SOURCE LINES 9-13

.. code-block:: Python

    import numpy as np
    from sklearn import datasets, preprocessing, linear_model, naive_bayes
    from gemclus import gemini

.. GENERATED FROM PYTHON SOURCE LINES 14-16

Load a simple real dataset
--------------------------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 18-23

.. code-block:: Python

    X, y = datasets.load_breast_cancer(return_X_y=True)

    # Preprocess this dataset
    X = preprocessing.RobustScaler().fit_transform(X)

.. GENERATED FROM PYTHON SOURCE LINES 24-27

Train two supervised models
--------------------------------------------------------------
We train two different models on the breast cancer dataset.

.. GENERATED FROM PYTHON SOURCE LINES 29-38

.. code-block:: Python

    # The first model is a simple logistic regression with l2 penalty
    clf1 = linear_model.LogisticRegression(random_state=0).fit(X, y)
    p_y_given_x_1 = clf1.predict_proba(X)

    # The second model is naive Bayes using Gaussian hypotheses on the data
    clf2 = naive_bayes.GaussianNB().fit(X, y)
    p_y_given_x_2 = clf2.predict_proba(X)

.. GENERATED FROM PYTHON SOURCE LINES 39-42

Scoring with GEMINI
-------------------
We can now score the clustering performance of both models with GEMINI.

.. GENERATED FROM PYTHON SOURCE LINES 44-57

.. code-block:: Python

    # Let's start with the WassersteinGEMINI (one-vs-all) and the Euclidean distance
    wasserstein_scoring = gemini.WassersteinGEMINI(metric="euclidean")

    # Precompute the affinity matching this Wasserstein GEMINI (the Euclidean metric here)
    affinity = wasserstein_scoring.compute_affinity(X)

    clf1_score = wasserstein_scoring.evaluate(p_y_given_x_1, affinity)
    clf2_score = wasserstein_scoring.evaluate(p_y_given_x_2, affinity)

    print("Wasserstein OvA (Euclidean):")
    print(f"\t=>Logistic regression: {clf1_score:.3f}")
    print(f"\t=>Naive Bayes: {clf2_score:.3f}")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Wasserstein OvA (Euclidean):
            =>Logistic regression: 2.878
            =>Naive Bayes: 3.005

.. GENERATED FROM PYTHON SOURCE LINES 58-62

Supervised Scoring with GEMINI
------------------------------
Replacing the Euclidean distance with a label-informed distance yields a supervised metric.

.. GENERATED FROM PYTHON SOURCE LINES 64-65

We now specify that the metric is precomputed instead.

.. GENERATED FROM PYTHON SOURCE LINES 65-76

.. code-block:: Python

    wasserstein_scoring = gemini.WassersteinGEMINI(metric="precomputed")

    # Precompute a distance that is 0 for samples sharing the same label and 1 otherwise
    y_one_hot = np.eye(2)[y]
    precomputed_distance = 1 - np.matmul(y_one_hot, y_one_hot.T)

    clf1_score = wasserstein_scoring.evaluate(p_y_given_x_1, precomputed_distance)
    clf2_score = wasserstein_scoring.evaluate(p_y_given_x_2, precomputed_distance)

    print("Wasserstein OvA (Supervised):")
    print(f"\t=>Logistic regression: {clf1_score:.3f}")
    print(f"\t=>Naive Bayes: {clf2_score:.3f}")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Wasserstein OvA (Supervised):
            =>Logistic regression: 0.431
            =>Naive Bayes: 0.403

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.534 seconds)
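
As a small sanity check of the label-informed distance used in the supervised scoring section, the sketch below applies the same one-hot construction to a toy label vector and compares it with the direct definition (0 when two labels match, 1 otherwise). The helper name ``label_distance`` and the toy labels are illustrative assumptions and are not part of gemclus; only NumPy is needed.

```python
import numpy as np


def label_distance(y, n_classes):
    """Return D with D[i, j] = 0 if y[i] == y[j] and 1 otherwise.

    Hypothetical helper mirroring the one-hot construction of the example.
    """
    y = np.asarray(y)
    # Row i of the one-hot matrix has a single 1 in column y[i]
    y_one_hot = np.eye(n_classes)[y]
    # The inner product of two one-hot rows is 1 only when the labels match
    return 1 - y_one_hot @ y_one_hot.T


# Toy labels (an assumption for illustration, not the breast cancer targets)
y_toy = np.array([0, 1, 1, 0])
distance = label_distance(y_toy, n_classes=2)

# Direct definition of the same distance, used as a cross-check
direct = (y_toy[:, None] != y_toy[None, :]).astype(float)

print(distance)
print(np.array_equal(distance, direct))
```

The resulting matrix is symmetric with a zero diagonal, which is the form passed as the precomputed distance in the example above.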