gemclus.data.gstm
- gemclus.data.gstm(n=500, alpha=2, df=1, random_state=None)
Reproduces the Gaussian-Student Mixture dataset from the GEMINI article.
- Parameters:
- n: int, default=500
The number of samples to draw from the dataset.
- alpha: float, default=2
Controls the distance between the means of the Gaussian components and the location of the Student-t component.
- df: float, default=1
The degrees of freedom for the Student-t distribution.
- random_state: int, RandomState instance or None, default=None
Determines random number generation for dataset creation. Pass an int for reproducible output across multiple runs.
- Returns:
- X: ndarray of shape (n, 2)
The samples of the dataset, in an array of shape n_samples x n_features.
- y: ndarray of shape (n,)
The index of the mixture component from which each sample was drawn.
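To make the parameters and return values above concrete, here is a minimal NumPy sketch of how a Gaussian-Student mixture could be sampled. It is not the gemclus implementation: the function name `sample_gaussian_student_mixture`, the choice of four equally likely components, their placement at the corners of a square of half-width `alpha`, and the identity scale matrices are all assumptions made for illustration. Only the signature and the shapes of `X` and `y` mirror the documentation above.

```python
import numpy as np


def sample_gaussian_student_mixture(n=500, alpha=2.0, df=1.0, random_state=None):
    """Hypothetical stand-in for a Gaussian-Student mixture sampler (not gemclus code)."""
    # random_state is assumed to be an int seed or None in this sketch.
    rng = np.random.default_rng(random_state)

    # Assumed layout: four equally likely components located at the corners of
    # a square of half-width alpha; the last one plays the Student-t role.
    locations = alpha * np.array(
        [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
    )

    # y records which component generated each sample.
    y = rng.integers(0, len(locations), size=n)

    # Start from standard normal noise for every sample.
    noise = rng.standard_normal(size=(n, 2))

    # A multivariate Student-t draw is a Gaussian vector divided by
    # sqrt(chi2(df) / df); apply that rescaling to the last component only.
    is_student = y == len(locations) - 1
    scale = np.sqrt(rng.chisquare(df, size=is_student.sum()) / df)
    noise[is_student] /= scale[:, None]

    X = locations[y] + noise
    return X, y


X, y = sample_gaussian_student_mixture(n=500, alpha=2, df=1, random_state=0)
```

With `df=1` the Student-t component in this sketch has Cauchy-like heavy tails, so a few of its samples can land very far from the other clusters. Passing the same integer as `random_state` yields the same `X` and `y` on every call.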
References
- GEMINI - Ohl, L., Mattei, P. A., Bouveyron, C., Harchaoui, W., Leclercq, M., Droit, A., & Precioso, F. (2022, October). Generalised Mutual Information for Discriminative Clustering. In Advances in Neural Information Processing Systems.
Examples using gemclus.data.gstm
Example of decision boundary map for a mixture of Gaussian and low-degree Student distributions