VBGMM
class ibex.sklearn.mixture.VBGMM(*args, **kwargs)
Bases: sklearn.mixture.dpgmm.VBGMM, ibex._base.FrameMixin
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Variational Inference for the Gaussian Mixture Model
Deprecated since version 0.18: This class will be removed in 0.20. Use sklearn.mixture.BayesianGaussianMixture with parameter weight_concentration_prior_type='dirichlet_distribution' instead.
Variational inference for a Gaussian mixture model probability distribution. This class allows for easy and efficient inference of an approximate posterior distribution over the parameters of a Gaussian mixture model with a fixed number of components.
Initialization is with normally-distributed means and identity covariance, for proper convergence.
Read more in the User Guide.
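As a minimal sketch of the replacement named in the deprecation notice, the snippet below fits the plain scikit-learn BayesianGaussianMixture on synthetic data. The data, the random seed, and the choice of two components are illustrative assumptions, and the input is a NumPy array rather than the pandas.DataFrame that the ibex wrapper expects.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Synthetic, well-separated two-cluster data (illustrative assumption).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-5.0, scale=1.0, size=(100, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(100, 2)),
])

# The replacement estimator suggested by the deprecation notice.
model = BayesianGaussianMixture(
    n_components=2,
    covariance_type="diag",
    weight_concentration_prior_type="dirichlet_distribution",
    random_state=0,
)

labels = model.fit(X).predict(X)            # one component index per row
responsibilities = model.predict_proba(X)   # one row of posteriors per sample
```

Note that the replacement class is not a drop-in copy of VBGMM: some parameter names and return values differ, so existing code may need adjusting.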
- n_components : int, default 1
- Number of mixture components.
- covariance_type : string, default ‘diag’
- String describing the type of covariance parameters to use. Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’.
- alpha : float, default 1
- Real number representing the concentration parameter of the Dirichlet distribution. Intuitively, the higher the value of alpha, the more likely the variational mixture of Gaussians model is to use all the components available to it.
- tol : float, default 1e-3
- Convergence threshold.
- n_iter : int, default 10
- Maximum number of iterations to perform before convergence.
- params : string, default ‘wmc’
- Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.
- init_params : string, default ‘wmc’
- Controls which parameters are updated in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.
- verbose : int, default 0
- Controls output verbosity.
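The params and init_params strings act as sets of single-letter flags. The sketch below shows one way to read such a string; update_flags is a hypothetical helper written for illustration and is not part of ibex or scikit-learn.

```python
def update_flags(params):
    """Map a flag string such as 'wmc' to per-parameter booleans.

    Hypothetical helper: 'w' stands for weights, 'm' for means and
    'c' for covars, as described in the parameter list above.
    """
    names = {"w": "weights", "m": "means", "c": "covars"}
    return {name: flag in params for flag, name in names.items()}


# Update weights and covars but keep the means fixed.
flags = update_flags("wc")
```

An empty string switches every flag off, which matches the documented use of init_params='' to skip initialization.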
- covariance_type : string
- String describing the type of covariance parameters used by the DP-GMM. Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’.
- n_features : int
- Dimensionality of the Gaussians.
- n_components : int (read-only)
- Number of mixture components.
- weights_ : array, shape (n_components,)
- Mixing weights for each mixture component.
- means_ : array, shape (n_components, n_features)
- Mean parameters for each mixture component.
- precs_ : array
- Precision (inverse covariance) parameters for each mixture component. The shape depends on covariance_type:
(n_components, n_features) if 'spherical',
(n_features, n_features) if 'tied',
(n_components, n_features) if 'diag',
(n_components, n_features, n_features) if 'full'.
- converged_ : bool
- True when convergence was reached in fit(), False otherwise.
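The shapes listed for precs_ can be written as a small lookup. This is a sketch that simply restates the list above; precs_shape is a hypothetical helper, not a function offered by the library.

```python
def precs_shape(covariance_type, n_components, n_features):
    """Return the documented shape of precs_ for a covariance type.

    Hypothetical helper that mirrors the shapes listed in this page.
    """
    shapes = {
        "spherical": (n_components, n_features),
        "tied": (n_features, n_features),
        "diag": (n_components, n_features),
        "full": (n_components, n_features, n_features),
    }
    if covariance_type not in shapes:
        raise ValueError("unknown covariance_type: %r" % (covariance_type,))
    return shapes[covariance_type]


full_shape = precs_shape("full", n_components=3, n_features=2)
```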
See also
GMM : Finite Gaussian mixture model fit with EM.
DPGMM : Infinite Gaussian mixture model, using the Dirichlet process, fit with a variational algorithm.
aic(X)
Akaike information criterion for the current model fit and the proposed data.
X : array of shape (n_samples, n_dimensions)
aic : float (the lower the better)
bic(X)
Bayesian information criterion for the current model fit and the proposed data.
X : array of shape (n_samples, n_dimensions)
bic : float (the lower the better)
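For reference, the standard textbook definitions of the two criteria can be sketched as follows. The function names and arguments are illustrative assumptions: the total log-likelihood and the number of free parameters are taken as inputs here, whereas the estimator derives them from the fitted model and X.

```python
import math


def akaike_ic(total_log_likelihood, n_parameters):
    """AIC = -2 * log-likelihood + 2 * (number of free parameters)."""
    return -2.0 * total_log_likelihood + 2.0 * n_parameters


def bayesian_ic(total_log_likelihood, n_parameters, n_samples):
    """BIC = -2 * log-likelihood + (number of free parameters) * ln(n_samples)."""
    return -2.0 * total_log_likelihood + n_parameters * math.log(n_samples)


# Illustrative numbers only: log-likelihood -100 with 5 free parameters.
aic_value = akaike_ic(-100.0, 5)
bic_value = bayesian_ic(-100.0, 5, n_samples=100)
```

Both criteria reward a higher likelihood and penalize parameter count, which is why lower values are better. BIC penalizes extra parameters more heavily once n_samples exceeds about 7, since ln(n_samples) is then greater than 2.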
fit(X, y=None)
Estimate model parameters with the EM algorithm.
An initialization step is performed before entering the expectation-maximization (EM) algorithm. If you want to avoid this step, set the keyword argument init_params to the empty string ‘’ when creating the GMM object. Likewise, if you would like just to do an initialization, set n_iter=0.
- X : array_like, shape (n, n_features)
- List of n_features-dimensional data points. Each row corresponds to a single data point.
self
fit_predict(X, y=None)
Fit and then predict labels for data.
Warning: Because of the final maximization step in the EM algorithm, the prediction may not be fully accurate when the number of iterations is low.
New in version 0.17: fit_predict method in Gaussian Mixture Model.
X : array-like, shape = [n_samples, n_features]
C : array, shape = (n_samples,) component memberships
predict(X)
Predict label for data.
X : array-like, shape = [n_samples, n_features]
C : array, shape = (n_samples,) component memberships
predict_proba(X)
Predict posterior probability of data under each Gaussian in the model.
X : array-like, shape = [n_samples, n_features]
- responsibilities : array-like, shape = (n_samples, n_components)
- Returns the probability of the sample for each Gaussian (state) in the model.
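The relation between responsibilities and component memberships can be sketched as follows, assuming that each label is simply the most probable component for its sample. labels_from_responsibilities is a hypothetical helper in plain Python, not a library function.

```python
def labels_from_responsibilities(responsibilities):
    """Pick, for each sample, the index of its most probable component.

    Hypothetical helper: `responsibilities` is a sequence of rows with
    one posterior probability per mixture component.
    """
    return [
        max(range(len(row)), key=row.__getitem__)
        for row in responsibilities
    ]


# Two samples, two components (illustrative numbers).
labels = labels_from_responsibilities([[0.9, 0.1], [0.2, 0.8]])
```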
score(X, y=None)
Compute the log probability under the model.
- X : array_like, shape (n_samples, n_features)
- List of n_features-dimensional data points. Each row corresponds to a single data point.
- logprob : array_like, shape (n_samples,)
- Log probabilities of each data point in X
score_samples(X)
Return the likelihood of the data under the model.
Compute the bound on log probability of X under the model and return the posterior distribution (responsibilities) of each mixture component for each element of X.
This is done by computing the parameters for the mean-field of z for each observation.
- X : array_like, shape (n_samples, n_features)
- List of n_features-dimensional data points. Each row corresponds to a single data point.
- logprob : array_like, shape (n_samples,)
- Log probabilities of each data point in X
- responsibilities : array_like, shape (n_samples, n_components)
- Posterior probabilities of each mixture component for each observation