NuSVC
class ibex.sklearn.svm.NuSVC(nu=0.5, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)
Bases: sklearn.svm.classes.NuSVC, ibex._base.FrameMixin
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Note
The documentation following is of the original class wrapped by this class. This class wraps the attribute coef_.
Example:
>>> import numpy as np
>>> from sklearn import datasets
>>> import pandas as pd
>>>
>>> iris = datasets.load_iris()
>>> features, targets, iris = iris['feature_names'], iris['target_names'], pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=iris['feature_names']+['class'])
>>> iris['class'] = iris['class'].map(pd.Series(targets))
>>>
>>> iris.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

    class
0  setosa
1  setosa
2  setosa
3  setosa
4  setosa
>>>
>>> from ibex.sklearn import svm as pd_svm
>>>
>>> clf = pd_svm.NuSVC(kernel='linear').fit(iris[features], iris['class'])
>>>
>>> clf.coef_
sepal length (cm)   ...
sepal width (cm)    ...
petal length (cm)   ...
petal width (cm)    ...
dtype: float64
Note
The documentation following is of the original class wrapped by this class. This class wraps the attribute intercept_.
Example:
>>> import numpy as np
>>> from sklearn import datasets
>>> import pandas as pd
>>>
>>> iris = datasets.load_iris()
>>> features, targets, iris = iris['feature_names'], iris['target_names'], pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=iris['feature_names']+['class'])
>>> iris['class'] = iris['class'].map(pd.Series(targets))
>>>
>>> iris.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

    class
0  setosa
1  setosa
2  setosa
3  setosa
4  setosa
>>> from ibex.sklearn import svm as pd_svm
>>>
>>> clf = pd_svm.NuSVC(kernel='linear').fit(iris[features], iris['class'])
>>>
>>> clf.intercept_
sepal length (cm)   ...
sepal width (cm)    ...
petal length (cm)   ...
petal width (cm)    ...
dtype: float64
Nu-Support Vector Classification.
Similar to SVC but uses a parameter to control the number of support vectors.
The implementation is based on libsvm.
Read more in the User Guide.
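As a rough illustration of how nu relates to the number of support vectors, the sketch below fits the wrapped scikit-learn class directly (not the ibex wrapper) on synthetic data and records the fraction of training samples that end up as support vectors. The data, the seed, and the chosen nu values are assumptions made for illustration only.

```python
import numpy as np
from sklearn.svm import NuSVC

# Synthetic, balanced two-class data (illustrative only).
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(50, 2)),
               rng.normal(+1.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

sv_fraction = {}
for nu in (0.1, 0.5, 0.9):
    clf = NuSVC(nu=nu, kernel='rbf', gamma='auto').fit(X, y)
    # n_support_ holds the number of support vectors for each class.
    sv_fraction[nu] = clf.n_support_.sum() / len(X)
    print(nu, sv_fraction[nu])
```

Since nu is described as a lower bound on the fraction of support vectors, each printed fraction should come out at or above its nu value.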
- nu : float, optional (default=0.5)
- An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Should be in the interval (0, 1].
- kernel : string, optional (default=’rbf’)
- Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to precompute the kernel matrix.
- degree : int, optional (default=3)
- Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
- gamma : float, optional (default=’auto’)
- Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
- coef0 : float, optional (default=0.0)
- Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
- probability : boolean, optional (default=False)
- Whether to enable probability estimates. This must be enabled prior to calling fit, and will slow down that method.
- shrinking : boolean, optional (default=True)
- Whether to use the shrinking heuristic.
- tol : float, optional (default=1e-3)
- Tolerance for stopping criterion.
- cache_size : float, optional
- Specify the size of the kernel cache (in MB).
- class_weight : {dict, ‘balanced’}, optional
- Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies as n_samples / (n_classes * np.bincount(y)).
- verbose : bool, default: False
- Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
- max_iter : int, optional (default=-1)
- Hard limit on iterations within solver, or -1 for no limit.
- decision_function_shape : ‘ovo’, ‘ovr’, default=’ovr’
Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2).
Changed in version 0.19: decision_function_shape is ‘ovr’ by default.
New in version 0.17: decision_function_shape=’ovr’ is recommended.
Changed in version 0.17: Deprecated decision_function_shape=’ovo’ and None.
- random_state : int, RandomState instance or None, optional (default=None)
- The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
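The “balanced” formula quoted under class_weight can be evaluated by hand with NumPy. This is a minimal sketch assuming integer-encoded labels; the label vector is hypothetical.

```python
import numpy as np

# Hypothetical integer-encoded labels: three samples of class 0, one of class 1.
y = np.array([0, 0, 0, 1])

n_samples = len(y)
n_classes = len(np.unique(y))

# Weights inversely proportional to class frequencies.
balanced = n_samples / (n_classes * np.bincount(y))
print(balanced)
```

The rarer class receives the larger weight: here 4 / (2 * 3) for class 0 and 4 / (2 * 1) for class 1.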
- support_ : array-like, shape = [n_SV]
- Indices of support vectors.
- support_vectors_ : array-like, shape = [n_SV, n_features]
- Support vectors.
- n_support_ : array-like, dtype=int32, shape = [n_class]
- Number of support vectors for each class.
- dual_coef_ : array, shape = [n_class-1, n_SV]
- Coefficients of the support vector in the decision function. For multiclass, coefficient for all 1-vs-1 classifiers. The layout of the coefficients in the multiclass case is somewhat non-trivial. See the section about multi-class classification in the SVM section of the User Guide for details.
- coef_ : array, shape = [n_class-1, n_features]
Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.
coef_ is a read-only property derived from dual_coef_ and support_vectors_.
- intercept_ : array, shape = [n_class * (n_class-1) / 2]
- Constants in decision function.
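The fitted attributes listed above can be inspected as in the following sketch. It uses the wrapped scikit-learn class with NumPy arrays, a linear kernel, and a two-class problem. In that binary case, recent scikit-learn releases compute coef_ as the product of dual_coef_ and support_vectors_; the sign convention is an assumption that may differ between versions.

```python
import numpy as np
from sklearn.svm import NuSVC

X = np.array([[-1.0, -1.0], [-2.0, -1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1, 1, 2, 2])

# coef_ is only available with a linear kernel.
clf = NuSVC(kernel='linear').fit(X, y)

# Recompute the primal weights from the dual solution (binary case).
manual_coef = clf.dual_coef_ @ clf.support_vectors_

print(clf.support_, clf.n_support_)   # indices of, and counts of, support vectors
print(clf.coef_.shape, clf.intercept_.shape)
print(clf.coef_, manual_coef)
```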
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> y = np.array([1, 1, 2, 2])
>>> from sklearn.svm import NuSVC
>>> clf = NuSVC()
>>> clf.fit(X, y)
NuSVC(cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
      max_iter=-1, nu=0.5, probability=False, random_state=None,
      shrinking=True, tol=0.001, verbose=False)
>>> print(clf.predict([[-0.8, -1]]))
[1]
- SVC
- Support Vector Machine for classification using libsvm.
- LinearSVC
- Scalable linear Support Vector Machine for classification using liblinear.
-
decision_function(X)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Distance of the samples X to the separating hyperplane.
- X : array-like, shape (n_samples, n_features)
- X : array-like, shape (n_samples, n_classes * (n_classes-1) / 2)
- Returns the decision function of the sample for each class in the model. If decision_function_shape=’ovr’, the shape is (n_samples, n_classes)
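The two output shapes can be compared on a four-class problem, where n_classes * (n_classes - 1) / 2 is 6. The sketch below uses the wrapped scikit-learn class with NumPy arrays; the cluster centres and the seed are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import NuSVC

# Four synthetic, well-separated classes with 20 samples each.
rng = np.random.RandomState(0)
centers = np.array([[0, 0], [4, 0], [0, 4], [4, 4]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])
y = np.repeat(np.arange(4), 20)

ovr = NuSVC(gamma='auto', decision_function_shape='ovr').fit(X, y)
ovo = NuSVC(gamma='auto', decision_function_shape='ovo').fit(X, y)

shape_ovr = ovr.decision_function(X).shape   # (n_samples, n_classes)
shape_ovo = ovo.decision_function(X).shape   # (n_samples, n_classes * (n_classes - 1) / 2)
print(shape_ovr, shape_ovo)
```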
-
fit(X, y, sample_weight=None)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Fit the SVM model according to the given training data.
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
- Training vectors, where n_samples is the number of samples and n_features is the number of features. For kernel=”precomputed”, the expected shape of X is (n_samples, n_samples).
- y : array-like, shape (n_samples,)
- Target values (class labels in classification, real numbers in regression).
- sample_weight : array-like, shape (n_samples,)
- Per-sample weights. Rescale C per sample. Higher weights force the classifier to put more emphasis on these points.
- self : object
- Returns self.
If X and y are not C-ordered and contiguous arrays of np.float64 and X is not a scipy.sparse.csr_matrix, X and/or y may be copied.
If X is a dense array, then the other methods will not support sparse matrices as input.
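The copying note above refers to memory layout and dtype. One way to check, and if desired convert, an array before fitting is sketched below with plain NumPy. Whether this avoids an internal copy in a given version is an assumption based on the note, not something verified here.

```python
import numpy as np

# A Fortran-ordered float32 array: neither C-contiguous nor float64.
X = np.asfortranarray(np.arange(12, dtype=np.float32).reshape(4, 3))
print(X.flags['C_CONTIGUOUS'], X.dtype)

# Convert once, explicitly, to a C-ordered contiguous float64 array.
X_ready = np.ascontiguousarray(X, dtype=np.float64)
print(X_ready.flags['C_CONTIGUOUS'], X_ready.dtype)
```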
-
predict(X)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Perform classification on samples in X.
For a one-class model, +1 or -1 is returned.
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
- For kernel=”precomputed”, the expected shape of X is [n_samples_test, n_samples_train]
- y_pred : array, shape (n_samples,)
- Class labels for samples in X.
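For kernel="precomputed", the shapes mentioned above can be illustrated with a hand-built linear Gram matrix. This is a minimal sketch using the wrapped scikit-learn class with NumPy arrays; the test points are hypothetical.

```python
import numpy as np
from sklearn.svm import NuSVC

X_train = np.array([[-1.0, -1.0], [-2.0, -1.0], [1.0, 1.0], [2.0, 1.0]])
y_train = np.array([1, 1, 2, 2])
X_test = np.array([[-0.8, -1.0], [1.5, 1.0]])   # hypothetical test points

# Linear kernel computed by hand.
gram_train = X_train @ X_train.T   # shape (n_samples_train, n_samples_train)
gram_test = X_test @ X_train.T     # shape (n_samples_test, n_samples_train)

clf = NuSVC(kernel='precomputed').fit(gram_train, y_train)
y_pred = clf.predict(gram_test)
print(gram_test.shape, y_pred)
```

Each row of gram_test holds the kernel values between one test sample and every training sample, which is why its second dimension is n_samples_train.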
-
predict_log_proba()
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Compute log probabilities of possible outcomes for samples in X.
The model needs to have probability information computed at training time: fit with the parameter probability set to True.
- X : array-like, shape (n_samples, n_features)
- For kernel=”precomputed”, the expected shape of X is [n_samples_test, n_samples_train]
- T : array-like, shape (n_samples, n_classes)
- Returns the log-probabilities of the sample for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
The probability model is created using cross validation, so the results can be slightly different than those obtained by predict. Also, it will produce meaningless results on very small datasets.
-
predict_proba()
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Compute probabilities of possible outcomes for samples in X.
The model needs to have probability information computed at training time: fit with the parameter probability set to True.
- X : array-like, shape (n_samples, n_features)
- For kernel=”precomputed”, the expected shape of X is [n_samples_test, n_samples_train]
- T : array-like, shape (n_samples, n_classes)
- Returns the probability of the sample for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
The probability model is created using cross validation, so the results can be slightly different than those obtained by predict. Also, it will produce meaningless results on very small datasets.
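A short sketch of the probability outputs follows, using the wrapped scikit-learn class with NumPy arrays. The synthetic data, the string labels, and the seed are illustrative assumptions. The columns follow classes_, each row sums to one, and predict_log_proba returns the logarithm of the same values.

```python
import numpy as np
from sklearn.svm import NuSVC

# Synthetic, balanced two-class data with string labels (illustrative only).
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(40, 2)),
               rng.normal(+1.0, 1.0, size=(40, 2))])
y = np.array(['a'] * 40 + ['b'] * 40)

# probability=True must be set before fit.
clf = NuSVC(probability=True, gamma='auto', random_state=0).fit(X, y)

proba = clf.predict_proba(X[:5])
log_proba = clf.predict_log_proba(X[:5])

print(clf.classes_)   # column order of proba and log_proba
print(proba)
```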
-
score(X, y, sample_weight=None)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- X : array-like, shape = (n_samples, n_features)
- Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
- True labels for X.
- sample_weight : array-like, shape = [n_samples], optional
- Sample weights.
- score : float
- Mean accuracy of self.predict(X) wrt. y.
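Mean accuracy can be reproduced by hand, which may help when interpreting the returned score. The sketch below uses NumPy only, with hypothetical labels and predictions, and assumes that sample_weight enters as a weighted average of the per-sample correctness.

```python
import numpy as np

y_true = np.array([1, 1, 2, 2])                   # hypothetical true labels
y_pred = np.array([1, 2, 2, 2])                   # hypothetical predictions
sample_weight = np.array([1.0, 3.0, 1.0, 1.0])    # hypothetical weights

correct = (y_pred == y_true)

# Unweighted mean accuracy: fraction of correctly classified samples.
accuracy = correct.mean()

# Weighted variant: misclassifying a heavily weighted sample costs more.
weighted_accuracy = np.average(correct, weights=sample_weight)

print(accuracy, weighted_accuracy)
```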