MultiLabelBinarizer
class ibex.sklearn.preprocessing.MultiLabelBinarizer(classes=None, sparse_output=False)
Bases: sklearn.preprocessing.label.MultiLabelBinarizer, ibex._base.FrameMixin
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Transform between iterable of iterables and a multilabel format
Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.
Parameters:
- classes : array-like of shape [n_classes] (optional)
  Indicates an ordering for the class labels.
- sparse_output : boolean (default: False)
  Set to True if the output binary array is desired in CSR sparse format.
Attributes:
- classes_ : array of labels
  A copy of the classes parameter where provided, or otherwise, the sorted set of classes found when fitting.
>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> mlb.classes_
array([1, 2, 3])
>>> mlb.fit_transform([set(['sci-fi', 'thriller']), set(['comedy'])])
array([[0, 1, 1],
       [1, 0, 0]])
>>> list(mlb.classes_)
['comedy', 'sci-fi', 'thriller']
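The two constructor parameters can be illustrated with a minimal sketch. It uses the wrapped scikit-learn class directly, as the examples above do; the label values and variable names are made up for illustration, and it is an assumption that the ibex wrapper handles these parameters the same way.

```python
from sklearn.preprocessing import MultiLabelBinarizer

label_sets = [{"sci-fi", "thriller"}, {"comedy"}]

# Fix the column order explicitly instead of relying on sorted order.
mlb = MultiLabelBinarizer(classes=["thriller", "comedy", "sci-fi"])
dense = mlb.fit_transform(label_sets)
print(dense)  # columns follow the order given in `classes`

# Same data and ordering, returned as a SciPy CSR sparse matrix.
mlb_sparse = MultiLabelBinarizer(
    classes=["thriller", "comedy", "sci-fi"], sparse_output=True
)
sparse = mlb_sparse.fit_transform(label_sets)
print(sparse.toarray())  # densify only for display
```

The sparse form is mainly useful when there are many classes and each sample carries only a few of them.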
See also:
- sklearn.preprocessing.OneHotEncoder : encode categorical integer features using a one-hot (aka one-of-K) scheme.
fit(y)
Fit the label sets binarizer, storing classes_.
Parameters:
- y : iterable of iterables
  A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.
Returns:
- self : returns this MultiLabelBinarizer instance
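As a rough sketch of what fitting does, the snippet below checks that fit returns the instance itself and that classes_ holds the sorted labels. It also shows the case where classes is given, so y is not consulted. It uses the wrapped scikit-learn class; the labels are illustrative.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Without `classes`, fit collects and sorts the labels found in y.
mlb = MultiLabelBinarizer()
returned = mlb.fit([("b", "a"), ("c",)])
print(returned is mlb)
print(list(mlb.classes_))

# With `classes` set, y is not iterated, so even an empty list works.
fixed = MultiLabelBinarizer(classes=["c", "a", "b"]).fit([])
print(list(fixed.classes_))
```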
fit_transform(y)
Fit the label sets binarizer and transform the given label sets.
Parameters:
- y : iterable of iterables
  A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.
Returns:
- y_indicator : array or CSR matrix, shape (n_samples, n_classes)
  A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise.
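The defining property of y_indicator can be checked entry by entry. The following is a small sketch using the wrapped scikit-learn class; the colour labels are invented for illustration.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Three samples; the last one carries no labels at all.
y = [("red", "green"), ("blue",), ()]

mlb = MultiLabelBinarizer()
y_indicator = mlb.fit_transform(y)

# y_indicator[i, j] is 1 exactly when classes_[j] is among the labels of sample i.
for i, labels in enumerate(y):
    for j, cls in enumerate(mlb.classes_):
        assert y_indicator[i, j] == int(cls in labels)

print(y_indicator.shape)  # (n_samples, n_classes)
```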
inverse_transform(yt)
Transform the given indicator matrix into label sets.
Parameters:
- yt : array or sparse matrix of shape (n_samples, n_classes)
  A matrix containing only 1s and 0s.
Returns:
- y : list of tuples
  The set of labels for each sample such that y[i] consists of classes_[j] for each yt[i, j] == 1.
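A short sketch of the reverse direction, again with the wrapped scikit-learn class: each row of the indicator matrix becomes a tuple of the classes whose column is 1. The second call is included on the assumption that the installed scikit-learn version rejects values other than 0 and 1 with a ValueError.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer().fit([(1, 2), (3,)])

# One row per sample, one column per entry of mlb.classes_.
yt = np.array([[1, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])
labels = mlb.inverse_transform(yt)
print(labels)  # a list with one tuple per row; an all-zero row gives an empty tuple

# Entries other than 0 and 1 are not a valid indicator matrix.
rejected = False
try:
    mlb.inverse_transform(np.array([[2, 0, 0]]))
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```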
transform(y)
Transform the given label sets.
Parameters:
- y : iterable of iterables
  A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.
Returns:
- y_indicator : array or CSR matrix, shape (n_samples, n_classes)
  A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise.
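A typical use of transform is to fit on one collection of label sets and then encode another with the same column layout. The sketch below assumes the new data contains only labels seen during fitting; the topic labels and variable names are illustrative, and the wrapped scikit-learn class is used directly.

```python
from sklearn.preprocessing import MultiLabelBinarizer

train = [("news", "sports"), ("tech",)]
new = [("tech", "news"), ("sports",)]

# Learn the class ordering once, from the training label sets.
mlb = MultiLabelBinarizer().fit(train)

# Encode unseen samples using the columns learned above.
new_indicator = mlb.transform(new)
print(list(mlb.classes_))
print(new_indicator)
```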