RidgeClassifier

class ibex.sklearn.linear_model.RidgeClassifier(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, class_weight=None, solver='auto', random_state=None)

Bases: sklearn.linear_model.ridge.RidgeClassifier, ibex._base.FrameMixin

Note

The documentation following is of the class wrapped by this class. There are some changes, in particular:

Note

The documentation following is of the original class wrapped by this class. This class wraps the attribute coef_.

Example:

>>> import numpy as np
>>> from sklearn import datasets
>>> import pandas as pd
>>>
>>> iris = datasets.load_iris()
>>> features, targets, iris = iris['feature_names'], iris['target_names'], pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=iris['feature_names']+['class'])
>>> iris['class'] = iris['class'].map(pd.Series(targets))
>>>
>>> iris.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

    class
0  setosa
1  setosa
2  setosa
3  setosa
4  setosa
>>>
>>> from ibex.sklearn import linear_model as pd_linear_model
>>>
>>> clf = pd_linear_model.RidgeClassifier().fit(iris[features], iris['class'])
>>>
>>> clf.coef_
sepal length (cm)   ...
sepal width (cm)    ...
petal length (cm)   ...
petal width (cm)    ...
dtype: float64

Note

The documentation following is of the original class wrapped by this class. This class wraps the attribute intercept_.

Example:

>>> import numpy as np
>>> from sklearn import datasets
>>> import pandas as pd
>>>
>>> iris = datasets.load_iris()
>>> features, targets, iris = iris['feature_names'], iris['target_names'], pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=iris['feature_names']+['class'])
>>> iris['class'] = iris['class'].map(pd.Series(targets))
>>>
>>> iris.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

    class
0  setosa
1  setosa
2  setosa
3  setosa
4  setosa
>>> from ibex.sklearn import linear_model as pd_linear_model
>>>
>>> clf = pd_linear_model.RidgeClassifier().fit(iris[features], iris['class'])
>>>
>>> clf.intercept_
sepal length (cm)   ...
sepal width (cm)    ...
petal length (cm)   ...
petal width (cm)    ...
dtype: float64

Classifier using Ridge regression.

Read more in the User Guide.

alpha : float
Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC.
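The shrinking effect of alpha can be illustrated with a minimal NumPy sketch. It assumes the textbook closed-form ridge solution without an intercept, not the library's solver code; the helper ridge_weights and the synthetic data are hypothetical and exist only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
# Targets coded as -1/+1, the kind of encoding a ridge classifier regresses on.
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)


def ridge_weights(X, y, alpha):
    """Closed-form ridge solution without intercept: (X'X + alpha*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


# The norm of the weight vector decreases as alpha grows.
alphas = (0.1, 1.0, 10.0, 100.0)
norms = [np.linalg.norm(ridge_weights(X, y, a)) for a in alphas]
for a, n in zip(alphas, norms):
    print(f"alpha={a:>6}: ||w|| = {n:.4f}")
```

Larger alpha values pull the weights toward zero, which is the variance reduction described above.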
fit_intercept : boolean
Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. the data is expected to be already centered).
normalize : boolean, optional, default False
This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
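A minimal sketch of the StandardScaler route mentioned above, assuming plain scikit-learn estimators rather than the ibex wrappers. The normalize argument is deliberately left at its default, and the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Standardize each feature (zero mean, unit variance), then fit the classifier.
model = make_pipeline(StandardScaler(), RidgeClassifier(alpha=1.0))
model.fit(X, y)

# Inspect the standardized features the classifier actually saw.
scaled = model.named_steps["standardscaler"].transform(X)
print(np.round(scaled.std(axis=0), 6))
```

Putting the scaler in a pipeline means the same scaling is reapplied automatically at prediction time.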
copy_X : boolean, optional, default True
If True, X will be copied; else, it may be overwritten.
max_iter : int, optional
Maximum number of iterations for conjugate gradient solver. The default value is determined by scipy.sparse.linalg.
tol : float
Precision of the solution.
class_weight : dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
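As a worked instance of the formula above, the following sketch computes the balanced weights for a small, made-up label vector. It uses only NumPy and assumes integer class labels starting at 0, which is what np.bincount requires.

```python
import numpy as np

y = np.array([0, 0, 0, 1])  # three samples of class 0, one of class 1
n_samples = y.shape[0]
n_classes = np.unique(y).shape[0]

# One weight per class: the rarer class receives the larger weight.
balanced = n_samples / (n_classes * np.bincount(y))
print(balanced)  # class 0 -> 4 / (2 * 3), class 1 -> 4 / (2 * 1)
```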

solver : {‘auto’, ‘svd’, ‘cholesky’, ‘lsqr’, ‘sparse_cg’, ‘sag’, ‘saga’}

Solver to use in the computational routines:

  • ‘auto’ chooses the solver automatically based on the type of data.

  • ‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than ‘cholesky’.

  • ‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution.

  • ‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter).

  • ‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest but may not be available in old scipy versions. It also uses an iterative procedure.

  • ‘sag’ uses a Stochastic Average Gradient descent, and ‘saga’ uses its unbiased and more flexible version named SAGA. Both methods use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that fast convergence of ‘sag’ and ‘saga’ is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    New in version 0.17: Stochastic Average Gradient descent solver.

    New in version 0.19: SAGA solver.
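The solvers listed above minimize the same objective, so they should agree up to numerical tolerance. The sketch below checks this for three of them on the iris data, assuming plain scikit-learn (not the ibex wrapper) and a tight tol so the iterative ‘lsqr’ solver converges closely.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier

X, y = load_iris(return_X_y=True)

# Fit the same model with two direct solvers and one iterative solver.
coefs = {}
for solver in ("svd", "cholesky", "lsqr"):
    clf = RidgeClassifier(alpha=1.0, solver=solver, tol=1e-10).fit(X, y)
    coefs[solver] = clf.coef_

# Largest absolute coefficient difference relative to the 'svd' solution.
max_gap = max(np.abs(coefs["svd"] - c).max() for c in coefs.values())
print(max_gap)
```

The choice of solver is then mostly a matter of speed and stability on the data at hand rather than of the fitted result.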

random_state : int, RandomState instance or None, optional, default None
The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random. Used when solver == ‘sag’.
coef_ : array, shape (n_features,) or (n_classes, n_features)
Weight vector(s).
intercept_ : float | array, shape = (n_targets,)
Independent term in decision function. Set to 0.0 if fit_intercept = False.
n_iter_ : array or None, shape (n_targets,)
Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.

See also: Ridge, RidgeClassifierCV

For multi-class classification, n_class classifiers are trained in a one-versus-all approach. Concretely, this is implemented by taking advantage of the multi-variate response support in Ridge.
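The one-versus-all construction can be reproduced by hand as a rough check. The sketch below assumes plain scikit-learn and encodes each class as its own -1/+1 target column with LabelBinarizer before fitting Ridge; it is an illustration of the idea, not the library's internal code.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Ridge, RidgeClassifier
from sklearn.preprocessing import LabelBinarizer

X, y = load_iris(return_X_y=True)

# One -1/+1 target column per class (one-versus-all encoding).
Y = LabelBinarizer(pos_label=1, neg_label=-1).fit_transform(y)

clf = RidgeClassifier(alpha=1.0).fit(X, y)
reg = Ridge(alpha=1.0).fit(X, Y)  # multi-variate response regression

same_coef = np.allclose(clf.coef_, reg.coef_)
# The predicted class is the column with the largest regression output.
manual_pred = clf.classes_[reg.predict(X).argmax(axis=1)]
same_pred = np.array_equal(clf.predict(X), manual_pred)
print(same_coef, same_pred)
```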

decision_function(X)

Note

The documentation following is of the class wrapped by this class. There are some changes, in particular:

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.

X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes)
Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.
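The two output shapes can be seen in a short sketch, assuming plain scikit-learn on the iris data. The binary case is made up by keeping only the first two classes, and the manual score X @ coef_.T + intercept_ is the usual linear decision function.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier

X, y = load_iris(return_X_y=True)

# Multi-class case: one score column per class.
multi = RidgeClassifier().fit(X, y)
scores = multi.decision_function(X)  # shape (n_samples, n_classes)
manual = X @ multi.coef_.T + multi.intercept_

# Binary case: keep only classes 0 and 1, giving one score per sample.
mask = y < 2
binary = RidgeClassifier().fit(X[mask], y[mask])
bin_scores = binary.decision_function(X[mask])  # shape (n_samples,)
# A positive score selects classes_[1], otherwise classes_[0].
bin_pred = binary.classes_[(bin_scores > 0).astype(int)]
print(scores.shape, bin_scores.shape)
```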
fit(X, y, sample_weight=None)

Note

The documentation following is of the class wrapped by this class. There are some changes, in particular:

Fit Ridge regression model.

X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Training data
y : array-like, shape = [n_samples]
Target values
sample_weight : float or numpy array of shape (n_samples,)

Sample weight.

New in version 0.17: sample_weight support to Classifier.

self : returns an instance of self.
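One way to read sample_weight is as a repetition count: an integer weight of 2 should behave like including that sample twice. The sketch below checks this with plain scikit-learn and the default solver; the weights are arbitrary and chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier

X, y = load_iris(return_X_y=True)

# Give every third sample a weight of 2, all others a weight of 1.
weights = np.ones(len(y))
weights[::3] = 2.0
weighted = RidgeClassifier().fit(X, y, sample_weight=weights)

# Unweighted counterpart: physically repeat the doubled samples.
repeats = weights.astype(int)
duplicated = RidgeClassifier().fit(
    np.repeat(X, repeats, axis=0), np.repeat(y, repeats)
)

coef_gap = np.abs(weighted.coef_ - duplicated.coef_).max()
print(coef_gap)
```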

predict(X)

Note

The documentation following is of the class wrapped by this class. There are some changes, in particular:

Predict class labels for samples in X.

X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Samples.
C : array, shape = [n_samples]
Predicted class label per sample.
score(X, y, sample_weight=None)

Note

The documentation following is of the class wrapped by this class. There are some changes, in particular:

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

X : array-like, shape = (n_samples, n_features)
Test samples.
y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True labels for X.
sample_weight : array-like, shape = [n_samples], optional
Sample weights.
score : float
Mean accuracy of self.predict(X) wrt. y.
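The returned value can be recomputed by hand, as in this sketch with plain scikit-learn. It assumes the score is the (optionally weighted) fraction of correct predictions; the weights are made up for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier

X, y = load_iris(return_X_y=True)
clf = RidgeClassifier().fit(X, y)

# Unweighted: fraction of samples whose predicted label matches y.
correct = clf.predict(X) == y
accuracy = clf.score(X, y)
manual = correct.mean()

# Weighted: each sample counts in proportion to its weight.
w = np.linspace(1.0, 2.0, len(y))
weighted_accuracy = clf.score(X, y, sample_weight=w)
weighted_manual = np.average(correct, weights=w)
print(accuracy, weighted_accuracy)
```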