Welcome to the GemClus documentation!

Welcome, and thank you for checking out GemClus. We are delighted to have you here.

About the package

What is GemClus?

GemClus is a Python package for discriminative clustering. It aims to provide a range of clustering models that share the same discriminative nature, specifically in the sense of Minka [1].

Why GemClus?

GemClus originates from our work on the generalised mutual information (GEMINI). GEMINI is an objective function dedicated to clustering and derived from information theory; it makes it possible to cluster data without any hypothesis on the data distribution. This work led us to realise that many discriminative models lacked a Python implementation. We try to bridge this gap with a tool that offers the whole GEMINI spectrum together with implementations of other discriminative clustering methods. These methods include small neural networks, logistic regression, decision trees, and work from other papers that we find relevant to the discriminative clustering field in the GEMINI spirit.
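To make the idea of a distribution-free, information-theoretic clustering objective concrete, here is a minimal sketch of the classical mutual information between the data and the cluster assignments, estimated from the soft assignments p(y|x) of a discriminative model. This is the standard quantity that GEMINI generalises, not the GEMINI objective itself and not the GemClus API; the function name and the list-of-rows input format are assumptions made for illustration.

```python
import math


def mutual_information(probs):
    """Estimate I(X; Y) from soft cluster assignments.

    `probs` is a list of rows; row i holds p(y | x_i) and sums to 1.
    Hypothetical helper for illustration, not part of GemClus.
    """
    n = len(probs)
    k = len(probs[0])

    def entropy(p):
        # Shannon entropy in nats, with the convention 0 * log(0) = 0.
        return -sum(q * math.log(q) for q in p if q > 0)

    # Cluster proportions: average of the conditional distributions.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    # Average uncertainty of the individual predictions.
    conditional = sum(entropy(row) for row in probs) / n
    # I(X; Y) = H(Y) - H(Y | X)
    return entropy(marginal) - conditional
```

The estimate is high when each sample is assigned confidently and the clusters are balanced, and it drops to zero when every sample receives the same assignment distribution. Note that it only requires the conditional distribution p(y|x), never a model of the data distribution itself.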

Scope of GemClus

This package focuses on small-scale models: we provide implementations of linear models, trees and small neural networks using only NumPy. We also try to provide synthetic datasets that may be of interest to the scientific community. We welcome any contribution, whether a novel method, a missing discriminative model or a dataset that is not yet implemented.

Contents

GEMCLUS - A package for discriminative clustering using GEMINI

The gemclus package provides simple tools to perform discriminative clustering using the generalised mutual information (GEMINI). The package was written to be a scikit-learn compatible extension.

You can find the complete documentation of the package here: https://gemini-clustering.github.io/

The documentation for the latest updates is at: https://gemini-clustering.github.io/main

The official source code can be found here: https://github.com/gemini-clustering/GemClus

Installation

Official package

Use the following command to install the package:

pip install gemclus

The library requires a few scientific packages to run:

  • NumPy

  • SciPy

  • POT

  • Scikit-learn
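As a quick sanity check before or after installation, the short sketch below reports which of the packages listed above cannot be imported in the current environment. The helper name is hypothetical; the import names are the usual ones for these projects (numpy, scipy, ot for POT, sklearn for scikit-learn).

```python
import importlib.util

# Display name -> import name for the dependencies listed above.
REQUIRED = {
    "NumPy": "numpy",
    "SciPy": "scipy",
    "POT": "ot",
    "scikit-learn": "sklearn",
}


def missing_dependencies(modules=REQUIRED):
    """Return the display names of the modules that cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Installing with pip normally pulls in these dependencies automatically, so a non-empty result usually points to a broken or mixed-up environment.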

Latest version

You can get the latest version of the package by cloning the repository and installing it from source:

git clone https://github.com/gemini-clustering/GemClus
cd GemClus
pip install .

Contributing

We welcome suggestions for models that fit the discriminative clustering spirit of GemClus.

Acknowledgements

This work has been supported by the French government, through the 3IA Côte d’Azur Investments in the Future project managed by the National Research Agency (ANR) with the reference number ANR-19-P3IA-0002. We would also like to thank the France Canada Research Fund (FFCR) for their contribution to the project. This work was partly supported by the EU Horizon 2020 project AI4Media, under contract no. 951911.

Many thanks also to Pierre-Alexandre Mattei, Frederic Precioso and Charles Bouveyron for their contributions to the GEMINI project, as well as to Mickaël Leclercq and Arnaud Droit. Special thanks go to Jhonatan Torres for his insights on the development.

References

3IA; Université Côte d'Azur; Université Laval; INRIA; Laboratoire d'Informatique, Signaux et Systèmes