ibex.FrameMixin
class ibex.FrameMixin
A base class for steps taking pandas entities, not numpy entities.
Subclass this class to indicate that a step takes pandas entities.
Example
This is a simple, illustrative “identity” transformer, which relays its input unchanged.
>>> import pandas as pd
>>> from sklearn import base
>>> import ibex
>>>
>>> class Id(
...         base.BaseEstimator, # (1)
...         base.TransformerMixin, # (2)
...         ibex.FrameMixin): # (3)
...
...     def fit(self, X, y=None):
...         self.x_columns = X.columns # (4)
...         if y is not None and isinstance(y, pd.DataFrame):
...             self.y_columns = y.columns
...         return self
...
...     def transform(self, X, *args, **kwargs):
...         return X[self.x_columns] # (5)
Note the following general points:
1. We subclass sklearn.base.BaseEstimator, as this is an estimator.
2. We subclass sklearn.base.TransformerMixin, as, in this case, this is specifically a transformer.
3. We subclass ibex.FrameMixin, as this estimator deals with pandas entities.
4. In fit, we make sure to set ibex.FrameMixin.x_columns, and, if relevant, ibex.FrameMixin.y_columns (if y is a pandas.DataFrame); this ensures that the transformer will “remember” the columns it should see in further calls.
5. In transform, we first use x_columns. This verifies the columns of X, and also reorders them according to the original order seen in fit (if needed).

Suppose we define two pandas.DataFrame objects, X_1 and X_2, with different columns:
>>> import pandas as pd
>>>
>>> X_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})
>>> X_2 = X_1.rename(columns={'b': 'd'})
The following fit-transform combination will work:
>>> Id().fit(X_1).transform(X_1)
   a  b
0  1  3
1  2  4
2  3  5
The following fit-transform combination will fail:
>>> try:
...     Id().fit(X_1).transform(X_2)
... except KeyError:
...     print('caught')
caught
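The column check and reordering in the Id example rest on ordinary pandas column selection, which can be tried without ibex. The following is a minimal sketch using pandas only; the name fit_columns is an illustrative stand-in for what the step stores in x_columns at fit time, not an ibex attribute.

```python
import pandas as pd

X_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

# Illustrative stand-in for what a step would remember at fit time.
fit_columns = X_1.columns

# Same columns in a different order: selecting by the remembered
# columns restores the order seen at fit time.
shuffled = X_1[['b', 'a']]
reordered = shuffled[fit_columns]

# A renamed column is missing from the remembered set, so the same
# selection raises KeyError.
X_2 = X_1.rename(columns={'b': 'd'})
try:
    X_2[fit_columns]
    missing_detected = False
except KeyError:
    missing_detected = True
```

Here reordered has its columns back in the order a, b, and missing_detected is True, which matches the KeyError caught in the example above.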
The following transform will fail, as the estimator was not fitted:
>>> try:
...     from sklearn.exceptions import NotFittedError
... except ImportError:
...     from sklearn.utils.validation import NotFittedError # Older Versions
>>> try:
...     Id().transform(X_2)
... except NotFittedError:
...     print('caught')
caught
Steps can be piped into each other:
>>> (Id() | Id()).fit(X_1).transform(X_1)
   a  b
0  1  3
1  2  4
2  3  5
Steps can be added:
>>> (Id() + Id()).fit(X_1).transform(X_1)
  id_0    id_1
     a  b    a  b
0    1  3    1  3
1    2  4    2  4
2    3  5    3  5
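The two-level column header in the added result has the same shape that pandas produces when frames are concatenated side by side under keys. The sketch below uses pandas only and is not a claim about how ibex builds its output; the keys 'id_0' and 'id_1' are copied from the output shown above.

```python
import pandas as pd

X_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

# Concatenate two copies column-wise; `keys` adds an outer column level
# naming the frame each block of columns came from.
combined = pd.concat([X_1, X_1], axis=1, keys=['id_0', 'id_1'])

# Each block can then be selected by its outer key.
first_block = combined['id_0']
```

Selecting combined['id_0'] returns a frame with the original columns a and b.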
__add__(other)
Returns: ibex.sklearn.pipeline.FeatureUnion
__or__(other)
Pipes the result of this step to other.
Parameters: other – A different step object whose class subclasses this one.
Returns: ibex.sklearn.pipeline.Pipeline
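The | and + syntax relies on Python's __or__ and __add__ special methods. As a rough illustration of that mechanism only, here is a toy sketch in pure Python. ToyStep is a hypothetical class invented for this example; the ibex methods documented here return ibex.sklearn.pipeline.Pipeline and ibex.sklearn.pipeline.FeatureUnion objects rather than anything like this.

```python
class ToyStep:
    """Hypothetical stand-in for a step; not an ibex class."""

    def __init__(self, func):
        self.func = func

    def transform(self, x):
        return self.func(x)

    def __or__(self, other):
        # `a | b`: feed the output of a into b.
        return ToyStep(lambda x: other.transform(self.transform(x)))

    def __add__(self, other):
        # `a + b`: apply both to the same input and keep both results.
        return ToyStep(lambda x: (self.transform(x), other.transform(x)))


double = ToyStep(lambda x: 2 * x)
increment = ToyStep(lambda x: x + 1)

piped = (double | increment).transform(5)  # increment applied to double(5)
added = (double + increment).transform(5)  # both results, side by side
```

In this toy, piping chains the two functions, while adding keeps their outputs next to each other, which mirrors the pipeline and feature-union results shown in the examples.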
__weakref__
list of weak references to the object (if defined)
x_columns
The X columns set in the last call to fit.
Set this property at fit, and call it in other methods:
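The set-at-fit, read-later pattern can be sketched with a plain Python property. This is an assumption-laden illustration, not ibex's implementation: RemembersColumns and NotFittedYet are hypothetical names, and the choice to raise when the property is read before fit is modelled on the NotFittedError behaviour shown in the class example.

```python
import pandas as pd


class NotFittedYet(Exception):
    """Hypothetical stand-in for a 'not fitted' error."""


class RemembersColumns:
    """Hypothetical helper showing the pattern; not part of ibex."""

    @property
    def x_columns(self):
        try:
            return self._x_columns
        except AttributeError:
            # Nothing was stored yet, so fit has not been called.
            raise NotFittedYet('call fit first') from None

    @x_columns.setter
    def x_columns(self, columns):
        self._x_columns = columns

    def fit(self, X, y=None):
        self.x_columns = X.columns  # set at fit
        return self

    def transform(self, X):
        return X[self.x_columns]  # read in other methods


X = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
step = RemembersColumns()

try:
    step.transform(X)
    raised = False
except NotFittedYet:
    raised = True

out = step.fit(X).transform(X[['b', 'a']])
```

Reading the property before fit raises, and after fit the stored columns select and reorder the input.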
y_columns
The y columns set in the last call to fit.
Set this property at fit, and call it in other methods:
New in version 0.1.2.