ibex.FrameMixin
-
class ibex.FrameMixin
A base class for steps taking pandas entities, not numpy entities.
Subclass this class to indicate that a step takes pandas entities.
Example
This is a simple, illustrative “identity” transformer, which simply relays its input.
>>> import pandas as pd
>>> from sklearn import base
>>> import ibex
>>>
>>> class Id(
...         base.BaseEstimator, # (1)
...         base.TransformerMixin, # (2)
...         ibex.FrameMixin): # (3)
...
...     def fit(self, X, y=None):
...         self.x_columns = X.columns # (4)
...         if y is not None and isinstance(y, pd.DataFrame):
...             self.y_columns = y.columns
...         return self
...
...     def transform(self, X, *args, **kwargs):
...         return X[self.x_columns] # (5)
Note the following general points:
1. We subclass sklearn.base.BaseEstimator, as this is an estimator.
2. We subclass sklearn.base.TransformerMixin, as, in this case, this is specifically a transformer.
3. We subclass ibex.FrameMixin, as this estimator deals with pandas entities.
4. In fit, we make sure to set ibex.FrameMixin.x_columns and, if relevant, ibex.FrameMixin.y_columns (if y is a pandas.DataFrame); this ensures that the transformer “remembers” the columns it should see in further calls.
5. In transform, we first use x_columns. This verifies the columns of X, and also reorders them according to the original order seen in fit (if needed).
Suppose we define two pandas.DataFrame objects, X_1 and X_2, with different columns:
>>> import pandas as pd
>>>
>>> X_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})
>>> X_2 = X_1.rename(columns={'b': 'd'})
The following fit-transform combination will work:
>>> Id().fit(X_1).transform(X_1)
   a  b
0  1  3
1  2  4
2  3  5
The following fit-transform combination will fail:
>>> try:
...     Id().fit(X_1).transform(X_2)
... except KeyError:
...     print('caught')
caught
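The verification and reordering that X[self.x_columns] performs in transform are ordinary pandas column selection. The following is a minimal sketch using pandas alone, without ibex; X_3 and fitted_columns are hypothetical names introduced only for this illustration.

```python
import pandas as pd

X_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})
X_2 = X_1.rename(columns={'b': 'd'})

# The columns an estimator would store in fit.
fitted_columns = X_1.columns

# X_3 is a hypothetical frame with the same columns in a different order.
X_3 = X_1[['b', 'a']]

# Selecting by the stored columns restores the order seen at fit time.
reordered = X_3[fitted_columns]
print(list(reordered.columns))

# Selecting a column that is absent from the frame raises KeyError.
try:
    X_2[fitted_columns]
    missing_raises = False
except KeyError:
    missing_raises = True
print(missing_raises)
```

This is why the failing combination above surfaces as a KeyError: X_2 has no column named 'b'.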
The following transform will fail, as the estimator was not fitted:
>>> try:
...     from sklearn.exceptions import NotFittedError
... except ImportError:
...     from sklearn.utils.validation import NotFittedError # Older versions
>>> try:
...     Id().transform(X_2)
... except NotFittedError:
...     print('caught')
caught
Steps can be piped into each other:
>>> (Id() | Id()).fit(X_1).transform(X_1)
   a  b
0  1  3
1  2  4
2  3  5
Steps can be added:
>>> (Id() + Id()).fit(X_1).transform(X_1)
  id_0    id_1
     a  b    a  b
0    1  3    1  3
1    2  4    2  4
2    3  5    3  5
-
__add__(other)
Returns: ibex.sklearn.pipeline.FeatureUnion
-
__or__(other)
Pipes the result of this step to other.
Parameters: other – A different step object whose class subclasses this one.
Returns: ibex.sklearn.pipeline.Pipeline
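The | and + operators rely on Python's __or__ and __add__ special methods. The following is a rough, dependency-free sketch of that dispatch pattern. Step, Pipe and Union are hypothetical stand-ins invented for this illustration; they are not ibex classes and do not reproduce ibex's implementation, which returns the ibex.sklearn.pipeline objects named above.

```python
class Step:
    """Hypothetical stand-in for a step; not an ibex class."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # `a | b` builds a sequential composition of the two steps.
        return Pipe([self, other])

    def __add__(self, other):
        # `a + b` builds a side-by-side composition of the two steps.
        return Union([self, other])

    def describe(self):
        return self.name


class Pipe(Step):
    """Hypothetical sequential container; it is itself a Step."""

    def __init__(self, steps):
        self.steps = steps

    def describe(self):
        return ' | '.join(step.describe() for step in self.steps)


class Union(Step):
    """Hypothetical side-by-side container; it is itself a Step."""

    def __init__(self, steps):
        self.steps = steps

    def describe(self):
        return '(' + ' + '.join(step.describe() for step in self.steps) + ')'


piped = Step('a') | Step('b')
added = Step('a') + Step('b')
mixed = (Step('a') + Step('b')) | Step('c')

print(piped.describe())
print(added.describe())
print(mixed.describe())
```

Because each container is itself a step in this sketch, the result of one operator can be fed to another, which is what allows compositions to nest.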
-
x_columns
The X columns set in the last call to fit.
Set this property in fit, and read it in other methods.
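The set-in-fit, read-elsewhere pattern can be sketched without any dependencies. The following is an assumption about how such a property could behave, consistent with the unfitted transform example above, and is not taken from ibex's source. ColumnsMixin and Selector are hypothetical names, NotFittedError here is a local stand-in for the sklearn exception, and plain dicts stand in for DataFrames.

```python
class NotFittedError(Exception):
    """Local stand-in for sklearn's NotFittedError (keeps the sketch dependency-free)."""


class ColumnsMixin:
    """Hypothetical mixin illustrating the pattern; not ibex's implementation."""

    @property
    def x_columns(self):
        # Reading before fit has stored anything signals an unfitted estimator.
        try:
            return self._x_columns
        except AttributeError:
            raise NotFittedError('x_columns is not set; call fit first') from None

    @x_columns.setter
    def x_columns(self, columns):
        self._x_columns = list(columns)


class Selector(ColumnsMixin):
    """Hypothetical step that relays its input, column by column."""

    def fit(self, X, y=None):
        # Set the property in fit.
        self.x_columns = X.keys()
        return self

    def transform(self, X):
        # Read the property elsewhere: select columns in the order seen in fit.
        return {name: X[name] for name in self.x_columns}


fitted = Selector().fit({'a': [1, 2], 'b': [3, 4]})
result = fitted.transform({'b': [3, 4], 'a': [1, 2]})
print(list(result))

try:
    Selector().transform({'a': [1, 2]})
    unfitted_raises = False
except NotFittedError:
    unfitted_raises = True
print(unfitted_raises)
```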
-
y_columns
The y columns set in the last call to fit.
Set this property in fit, and read it in other methods.
New in version 0.1.2.