The Ibex library has two (somewhat independent) goals:

The first and primary goal is to provide pandas adapters for estimators conforming to the scikit-learn protocol, in particular those of scikit-learn itself.

Relation of Ibex to some other packages in the scientific Python stack.

Consider the preceding UML figure. numpy is a (highly efficient) low-level data structure (strictly speaking, it is more of a buffer interface). Both matplotlib and sklearn provide a numpy interface. Subsequently, pandas provided a higher-level interface to numpy, and some plotting libraries, e.g., seaborn, provide a pandas interface to plotting while being implemented by matplotlib. Similarly, the first aim of Ibex is to provide a pandas interface to machine learning while being implemented by sklearn.
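
To make "a pandas interface implemented by sklearn" concrete, here is a minimal hand-written sketch of the idea. It is not Ibex's implementation: the class name FrameScaler is hypothetical, and the sketch only assumes that pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


class FrameScaler:
    """Hypothetical adapter (not part of Ibex): delegates to sklearn, speaks pandas."""

    def __init__(self):
        self._scaler = StandardScaler()

    def fit(self, X, y=None):
        # sklearn works on the underlying numpy buffer.
        self._scaler.fit(X.values)
        return self

    def transform(self, X):
        # sklearn returns a bare numpy array; restore the row index and column labels.
        scaled = self._scaler.transform(X.values)
        return pd.DataFrame(scaled, index=X.index, columns=X.columns)


df = pd.DataFrame(
    {'a': [1.0, 2.0, 3.0], 'b': [10.0, 20.0, 30.0]},
    index=['x', 'y', 'z'])
scaled = FrameScaler().fit(df).transform(df)
```

The result keeps the labels of the input frame, which is the information a plain numpy round trip would lose.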

The second goal is to provide easier and more succinct ways of combining estimators, features, and pipelines.
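
One way such combinations can be made succinct is operator overloading. The following is a rough sketch under stated assumptions, not Ibex's code: the Step class is hypothetical, and the choice of `|` for chaining and `+` for placing features side by side is an assumption made for this illustration.

```python
import pandas as pd


class Step:
    """Hypothetical stand-in for a pandas-aware transformer."""

    def __init__(self, func):
        self.func = func

    def transform(self, X):
        return self.func(X)

    def __or__(self, other):
        # Chain: feed this step's output into the next step.
        return Step(lambda X: other.transform(self.transform(X)))

    def __add__(self, other):
        # Union: compute both outputs and place their columns side by side.
        return Step(
            lambda X: pd.concat([self.transform(X), other.transform(X)], axis=1))


double = Step(lambda X: X * 2)
identity = Step(lambda X: X)
squares = Step(lambda X: (X ** 2).add_suffix('_sq'))

# Double the input, then keep it alongside its squares.
combined = double | (identity + squares)

df = pd.DataFrame({'a': [1, 2]}, index=['r1', 'r2'])
out = combined.transform(df)
```

Here `out` has the columns `a` and `a_sq`, both computed from the doubled input, and the original row index.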



Ibex has a very small interface. The core library has a single public class and two functions. The rest of the library is a (mainly auto-generated) wrapper for sklearn, with nearly all of the classes and functions having a straightforward correspondence to sklearn.

ibex.FrameMixin is a mixin class providing both utilities for pandas support for higher-up classes and pipeline and feature operators. It is described in Adapting Estimators.

ibex.frame() is a function taking an estimator conforming to the scikit-learn protocol (either an object or a class) and returning a pandas-aware estimator (correspondingly, an object or a class). If the estimators you need are already wrapped (which is the case for all of sklearn), you do not need to be concerned with either FrameMixin or frame().
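
The "object in, object out; class in, class out" contract can be sketched as follows. This is a simplified illustration, not the code of ibex.frame(): the names frame_like and _FramePredictor are hypothetical, and only fit and predict are handled.

```python
import inspect

import pandas as pd
from sklearn.linear_model import LinearRegression


class _FramePredictor:
    """Hypothetical pandas-aware wrapper around an sklearn-style predictor."""

    def __init__(self, est):
        self._est = est

    def fit(self, X, y):
        self._est.fit(X.values, y.values)
        return self

    def predict(self, X):
        # Return a Series that carries the index of the input frame.
        return pd.Series(self._est.predict(X.values), index=X.index)


def frame_like(step):
    """Hypothetical helper: wrap an estimator object or an estimator class."""
    if inspect.isclass(step):
        # Class in, class out: instantiate the original class lazily.
        class Wrapped(_FramePredictor):
            def __init__(self, *args, **kwargs):
                super().__init__(step(*args, **kwargs))

        Wrapped.__name__ = step.__name__
        return Wrapped
    # Object in, object out.
    return _FramePredictor(step)


X = pd.DataFrame({'x': [0.0, 1.0, 2.0, 3.0]}, index=list('abcd'))
y = pd.Series([1.0, 3.0, 5.0, 7.0], index=X.index)

from_obj = frame_like(LinearRegression()).fit(X, y).predict(X)
from_cls = frame_like(LinearRegression)().fit(X, y).predict(X)
```

Both calls yield a pandas.Series indexed like `X`, whether the wrapper was given an instance or a class.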

ibex.trans() is a utility function that creates an estimator applying a regular Python function, or a different estimator, to a pandas.DataFrame, optionally specifying the input and output columns. Again, you do not need to use it if you are just planning on using sklearn estimators.
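
As a rough sketch of what such a function-applying step does, consider the hand-written class below. It is an illustration only and not the implementation of ibex.trans(): the class name FunctionStep is hypothetical, its parameter names are chosen for this example, and it handles only the plain-function case, not the case of wrapping another estimator.

```python
import numpy as np
import pandas as pd


class FunctionStep:
    """Hypothetical transformer applying a function to selected columns."""

    def __init__(self, func, in_cols=None, out_cols=None):
        self.func = func
        self.in_cols = in_cols
        self.out_cols = out_cols

    def fit(self, X, y=None):
        # Nothing to learn; present only to follow the estimator protocol.
        return self

    def transform(self, X):
        # Optionally restrict the input to the requested columns.
        selected = X if self.in_cols is None else X[self.in_cols]
        result = pd.DataFrame(self.func(selected), index=X.index)
        # Optionally rename the output columns.
        if self.out_cols is not None:
            result.columns = self.out_cols
        return result


df = pd.DataFrame({'area': [4.0, 9.0], 'label': ['p', 'q']})
step = FunctionStep(np.sqrt, in_cols=['area'], out_cols=['side'])
out = step.fit(df).transform(df)
```

The non-numeric `label` column never reaches `np.sqrt`, because only the columns named in `in_cols` are passed to the function.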

Ibex (mostly automatically) wraps all of sklearn in ibex.sklearn. In almost all cases (except those noted explicitly), the wrapping has a direct correspondence with sklearn.