GaussianRandomProjection
class ibex.sklearn.random_projection.GaussianRandomProjection(n_components='auto', eps=0.1, random_state=None)
Bases: sklearn.random_projection.GaussianRandomProjection, ibex._base.FrameMixin
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Reduce dimensionality through Gaussian random projection.
The components of the random matrix are drawn from N(0, 1 / n_components).
Read more in the User Guide.
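As a rough illustration of the sentence above, the sketch below draws a matrix whose entries follow N(0, 1 / n_components), that is, mean 0 and standard deviation 1 / sqrt(n_components). It uses only the Python standard library. The helper name gaussian_random_matrix and its seed argument are assumptions made for this example; they are not part of the ibex or scikit-learn API.

```python
import math
import random


def gaussian_random_matrix(n_components, n_features, seed=None):
    """Return an n_components x n_features matrix as a list of lists.

    Each entry is drawn from N(0, 1 / n_components): mean 0 and
    standard deviation 1 / sqrt(n_components).
    """
    # Hypothetical stand-in for random_state; this is not NumPy's RandomState.
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, sigma) for _ in range(n_features)]
            for _ in range(n_components)]


components = gaussian_random_matrix(n_components=50, n_features=200, seed=0)
n_rows, n_cols = len(components), len(components[0])

# The empirical variance of the entries should be close to 1 / 50 = 0.02.
entries = [value for row in components for value in row]
mean = sum(entries) / len(entries)
variance = sum((value - mean) ** 2 for value in entries) / len(entries)
```

Reusing the same seed reproduces the same matrix, which mirrors the role random_state plays at fit time.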
Parameters
- n_components : int or 'auto', optional (default = 'auto')
Dimensionality of the target projection space.
n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.
It should be noted that the Johnson-Lindenstrauss lemma can yield very conservative estimates of the required number of components, as it makes no assumption about the structure of the dataset.
- eps : strictly positive float, optional (default=0.1)
Parameter to control the quality of the embedding according to the Johnson-Lindenstrauss lemma when n_components is set to 'auto'.
Smaller values lead to a better embedding and a higher number of dimensions (n_components) in the target projection space.
- random_state : int, RandomState instance or None, optional (default=None)
Controls the pseudo-random number generator used to generate the matrix at fit time. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
Attributes
- n_components_ : int
Concrete number of components computed when n_components='auto'.
- components_ : numpy array of shape [n_components, n_features]
Random matrix used for the projection.
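To make the interplay between n_components='auto' and eps more concrete, here is a minimal sketch of the commonly stated Johnson-Lindenstrauss bound, n_components >= 4 ln(n_samples) / (eps^2 / 2 - eps^3 / 3). The function name jl_min_dim is hypothetical, and truncating the result to an integer is an assumption of this sketch. scikit-learn exposes its own helper, sklearn.random_projection.johnson_lindenstrauss_min_dim, which should be preferred in practice.

```python
import math


def jl_min_dim(n_samples, eps=0.1):
    """Estimate the number of components suggested by the JL lemma.

    Uses 4 * ln(n_samples) / (eps**2 / 2 - eps**3 / 3), truncated to an int.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    denominator = eps ** 2 / 2 - eps ** 3 / 3
    return int(4 * math.log(n_samples) / denominator)


# Smaller eps means a tighter distortion bound and more components.
loose = jl_min_dim(1_000_000, eps=0.5)
tight = jl_min_dim(1_000_000, eps=0.1)
```

Note that the bound depends only on the number of samples and eps, not on the number of features, which is one reason it can be very conservative for structured data.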
See also
SparseRandomProjection
fit(X, y=None)
Generate a random projection matrix.
Parameters
- X : numpy array or scipy.sparse of shape [n_samples, n_features]
Training set: only the shape is used to find optimal random matrix dimensions based on the theory referenced in the aforementioned papers.
- y : ignored
Not used; placeholder to allow for usage in a Pipeline.
Returns
- self
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
- X : numpy array of shape [n_samples, n_features]
Training set.
- y : numpy array of shape [n_samples]
Target values.
Returns
- X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
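The relationship between fit, transform and fit_transform can be sketched with a small stand-in class. TinyGaussianProjection below is hypothetical and written with the standard library only; it is not the ibex or scikit-learn implementation. It is meant to show that fitting looks only at the shape of X, that y is ignored, and that fit_transform gives the same result as calling fit and then transform.

```python
import math
import random


class TinyGaussianProjection:
    """Hypothetical stand-in for a Gaussian random projection transformer."""

    def __init__(self, n_components, seed=None):
        self.n_components = n_components
        self.seed = seed  # illustrative substitute for random_state

    def fit(self, X, y=None):
        # Only the number of features (the shape of X) is used; y is ignored.
        n_features = len(X[0])
        rng = random.Random(self.seed)
        sigma = 1.0 / math.sqrt(self.n_components)
        self.components_ = [
            [rng.gauss(0.0, sigma) for _ in range(n_features)]
            for _ in range(self.n_components)
        ]
        return self

    def transform(self, X):
        # Each output row holds the dot products of a sample with every component.
        return [
            [sum(x * c for x, c in zip(row, comp)) for comp in self.components_]
            for row in X
        ]

    def fit_transform(self, X, y=None, **fit_params):
        # fit_params are accepted for signature parity but unused in this sketch.
        return self.fit(X, y).transform(X)


X = [[1.0, 2.0, 3.0, 4.0],
     [0.0, -1.0, 0.5, 2.0],
     [3.0, 3.0, 3.0, 3.0]]
one_step = TinyGaussianProjection(n_components=2, seed=42).fit_transform(X)
two_step = TinyGaussianProjection(n_components=2, seed=42).fit(X).transform(X)
```

With the same seed, one_step and two_step are identical, and each has shape [n_samples, n_components].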
transform(X)
Project the data using a matrix product with the random matrix.
Parameters
- X : numpy array or scipy.sparse of shape [n_samples, n_features]
The input data to project into a lower-dimensional space.
Returns
- X_new : numpy array or scipy sparse of shape [n_samples, n_components]
Projected array.
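The projection step amounts to multiplying X by the transpose of the random matrix. The sketch below spells that product out on plain lists so that the shapes are easy to follow. The function name project and the hand-picked matrix are assumptions for illustration; a fitted estimator would use its own components_ attribute instead.

```python
def project(X, components):
    """Return the product of X with the transpose of components.

    X has shape [n_samples, n_features], components has shape
    [n_components, n_features], and the result has shape
    [n_samples, n_components].
    """
    n_features = len(components[0])
    for row in X:
        if len(row) != n_features:
            raise ValueError("X and components disagree on n_features")
    return [
        [sum(x * c for x, c in zip(row, comp)) for comp in components]
        for row in X
    ]


# Hand-picked 2 x 3 matrix standing in for a fitted components_ attribute.
components = [[1.0, 0.0, -1.0],
              [0.5, 0.5, 0.5]]
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
X_new = project(X, components)
# X_new == [[-2.0, 3.0], [-2.0, 7.5]]
```

Here three features are reduced to two components, so X_new has one row per sample and one column per component.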