Overview ========= Goals ----- Ibex library aims for two (somewhat independent) goals: The first, primary goal, is providing `pandas `_ adapters for `estimators conforming to the sickit-learn protocol `_, in particular those of `scikit-learn `_ itself .. uml:: :caption: Relation of Ibex to some other packages in the scientific python stack. skinparam monochrome true skinparam shadowing false skinparam package { FontColor #777777 BorderColor lightgrey } package "Plotting" { [seaborn] [bokeh] [matplotlib] } package "Machine Learning" { [sklearn] [**ibex**] } package "Data Structures" { [numpy] [pandas] } [sklearn] -> [numpy] : interfaced by [matplotlib] -> [numpy] : interfaced by [pandas] ..> [numpy] : implemented over [seaborn] -> [pandas] : interfaced by [bokeh] -> [pandas] : interfaced by [seaborn] ..-> [matplotlib] : implemented over [**ibex**] -> [pandas] : interfaced by [**ibex**] ..-> [sklearn] : implemented over Consider the preceding UML figure. :mod:`numpy` is a (highly efficient) low-level data structure (strictly speaking, it is more of a buffer interface). both :mod:`matplotlib` and :mod:`sklearn` provide a ``numpy`` interface. Subsequently, :mod:`pandas` provided a higher-level interface to ``numpy``, and some plotting libraries, e.g., :mod:`seaborn` provide a ``pandas`` interface to plotting, while being implemented by ``matplotlib``, but . Similarly, the first aim of Ibex is to provide a ``pandas`` interface to machine learning, while being implemented by ``sklearn``. The second goal is providing easier, and more succinct ways of combining estimators, features, and pipelines. Motivation ---------- Interface --------- Ibex has a very small interface. The core library has a single public class and two functions. The rest of the library is a (mainly auto-generated) wrapper for :mod:`sklearn`, with nearly all of the classes and functions having a straightforward correspondence to ``sklearn``. :py:class:`ibex.FrameMixin` is a mixin class providing both some utilities for :mod:`pandas` support for higher-up classes, as well as pipeline and feature operators. It is described in :ref:`adapting`. :py:func:`ibex.frame` is a function taking an `estimator conforming to the sickit-learn protocol `_ (either an object or a class), and returning a ``pandas``-aware estimator (correspondingly, an object or a class). If estimators are already wrapped (which is the case for all of ``sklearn``), it is not necessary to be concerned with these at all. :py:func:`ibex.trans` is a utility function that creates an estimator applying a regular Python function, or a different estimator, to a :class:`pandas.DataFrame`, optionally specifying the input and output columns. Again, you do not need to use it if you are just planning on using ``sklearn`` estimators. Ibex (mostly automatically) wraps all of :py:mod:`sklearn` in :py:mod:`ibex.sklearn`. In almost all cases (except those noted explicitly), the wrapping has a direct correspondence with ``sklearn``. Documentation Structure ----------------------- :py:mod:`sklearn.preprocessing` :py:mod:`ibex.sklearn.preprocessing` :py:class:`sklearn.preprocessing.FunctionTransformer` :py:class:`ibex.sklearn.preprocessing.FunctionTransformer` :py:class:`ibex.sklearn.pipeline.FeatureUnion`