ARDRegression
class ibex.sklearn.linear_model.ARDRegression(n_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, compute_score=False, threshold_lambda=10000.0, fit_intercept=True, normalize=False, copy_X=True, verbose=False)
Bases: sklearn.linear_model.bayes.ARDRegression, ibex._base.FrameMixin
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Note
The documentation following is of the original class wrapped by this class. This class wraps the attribute coef_.
Example:
>>> import pandas as pd
>>> import numpy as np
>>> from ibex.sklearn import datasets
>>> from ibex.sklearn.linear_model import LinearRegression as PdLinearRegression
>>> iris = datasets.load_iris()
>>> features = iris['feature_names']
>>> iris = pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=features+['class'])
>>> iris[features]
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
...
>>> from ibex.sklearn import linear_model as pd_linear_model
>>> prd = pd_linear_model.ARDRegression().fit(iris[features], iris['class'])
>>> prd.coef_
sepal length (cm)   ...
sepal width (cm)    ...
petal length (cm)   ...
petal width (cm)    ...
dtype: float64
Note
The documentation following is of the original class wrapped by this class. This class wraps the attribute intercept_.
Example:
>>> import pandas as pd
>>> import numpy as np
>>> from ibex.sklearn import datasets
>>> from ibex.sklearn.linear_model import LinearRegression as PdLinearRegression
>>> iris = datasets.load_iris()
>>> features = iris['feature_names']
>>> iris = pd.DataFrame(
...     np.c_[iris['data'], iris['target']],
...     columns=features+['class'])
>>> iris[features]
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
...
>>> from ibex.sklearn import linear_model as pd_linear_model
>>> prd = pd_linear_model.ARDRegression().fit(iris[features], iris['class'])
>>> # scalar intercept
>>> type(prd.intercept_)
<class 'numpy.float64'>
Bayesian ARD regression.
Fit the weights of a regression model, using an ARD prior. The weights of the regression model are assumed to follow Gaussian distributions. Also estimate the parameters lambda (precisions of the distributions of the weights) and alpha (precision of the distribution of the noise). The estimation is done by an iterative procedure (evidence maximization).
Read more in the User Guide.
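The model described above can be written out as follows. This is only a sketch in the vocabulary of this page: w are the weights, alpha is the noise precision, lambda_i is the precision of weight i, and alpha_1, alpha_2, lambda_1, lambda_2 are the hyper-parameters listed below. The symbols N and Gamma for the Gaussian and Gamma densities are notation introduced here, and the intercept is left out for brevity.

```latex
\begin{aligned}
y \mid X, w, \alpha &\sim \mathcal{N}\left(Xw,\ \alpha^{-1} I\right) \\
w_i \mid \lambda_i &\sim \mathcal{N}\left(0,\ \lambda_i^{-1}\right), \quad i = 1, \dots, n_{\text{features}} \\
\alpha &\sim \mathrm{Gamma}(\alpha_1, \alpha_2) \\
\lambda_i &\sim \mathrm{Gamma}(\lambda_1, \lambda_2)
\end{aligned}
```

A large lambda_i concentrates the prior of w_i tightly around zero, which is what allows individual features to be switched off.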
- n_iter : int, optional
- Maximum number of iterations. Default is 300
- tol : float, optional
- Stop the algorithm if w has converged. Default is 1.e-3.
- alpha_1 : float, optional
- Hyper-parameter : shape parameter for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
- alpha_2 : float, optional
- Hyper-parameter : inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
- lambda_1 : float, optional
- Hyper-parameter : shape parameter for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
- lambda_2 : float, optional
- Hyper-parameter : inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
- compute_score : boolean, optional
- If True, compute the objective function at each step of the model. Default is False.
- threshold_lambda : float, optional
- threshold for removing (pruning) weights with high precision from the computation. Default is 1.e+4.
- fit_intercept : boolean, optional
- whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered). Default is True.
- normalize : boolean, optional, default False
- This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
- copy_X : boolean, optional, default True
- If True, X will be copied; else, it may be overwritten.
- verbose : boolean, optional, default False
- Verbose mode when fitting the model.
- coef_ : array, shape = (n_features)
- Coefficients of the regression model (mean of distribution)
- alpha_ : float
- estimated precision of the noise.
- lambda_ : array, shape = (n_features)
- estimated precisions of the weights.
- sigma_ : array, shape = (n_features, n_features)
- estimated variance-covariance matrix of the weights
- scores_ : float
- if computed, value of the objective function (to be maximized)
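The note on the normalize parameter recommends standardizing with sklearn.preprocessing.StandardScaler before fitting. A minimal sketch of that, assuming plain scikit-learn rather than the ibex wrapper and a made-up synthetic data set, could look like the following. The estimator is built with its default arguments, and the fitted attributes listed above are read back at the end.

```python
import numpy as np
from sklearn.linear_model import ARDRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (made up for illustration): two features on very different scales.
rng = np.random.RandomState(0)
X = np.c_[rng.normal(0.0, 1.0, 100), rng.normal(0.0, 1000.0, 100)]
y = 2.0 * X[:, 0] + 0.003 * X[:, 1] + rng.normal(0.0, 0.1, 100)

# Standardize each column (zero mean, unit variance), then fit ARD regression.
model = make_pipeline(StandardScaler(), ARDRegression())
model.fit(X, y)

# make_pipeline names each step after its lowercased class name.
X_scaled = model.named_steps["standardscaler"].transform(X)
ard = model.named_steps["ardregression"]

print("column means after scaling:", X_scaled.mean(axis=0))
print("column stds after scaling: ", X_scaled.std(axis=0))
print("coef_:  ", ard.coef_)    # one coefficient per feature
print("alpha_: ", ard.alpha_)   # estimated noise precision
print("lambda_:", ard.lambda_)  # one estimated precision per weight
```

Putting the scaler inside the pipeline means the same scaling learned during fit is reused automatically by predict and score.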
>>> from sklearn import linear_model
>>> clf = linear_model.ARDRegression()
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
...
ARDRegression(alpha_1=1e-06, alpha_2=1e-06, compute_score=False,
        copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06,
        n_iter=300, normalize=False, threshold_lambda=10000.0, tol=0.001,
        verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.])
For an example, see examples/linear_model/plot_ard.py.
D. J. C. MacKay, Bayesian nonlinear modeling for the prediction competition, ASHRAE Transactions, 1994.
R. Salakhutdinov, Lecture notes on Statistical Machine Learning, http://www.utstat.toronto.edu/~rsalakhu/sta4273/notes/Lecture2.pdf#page=15
Their beta is our self.alpha_. Their alpha is our self.lambda_. ARD is a little different than the slide: only dimensions/features for which self.lambda_ < self.threshold_lambda are kept and the rest are discarded.
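The pruning rule in the note above can be checked on a fitted model by rebuilding the same mask from lambda_ and threshold_lambda. The sketch below assumes plain scikit-learn with default arguments and a made-up data set in which only the first two of ten features carry signal; which of the remaining features end up discarded depends on the data and is not asserted here.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Synthetic data (made up for illustration): 10 features, only the first two matter.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.1, 200)

model = ARDRegression().fit(X, y)

# A feature is kept only while its estimated precision is below the threshold.
keep = model.lambda_ < model.threshold_lambda

print("kept features:     ", np.flatnonzero(keep))
print("discarded features:", np.flatnonzero(~keep))
# Discarded features contribute nothing to the prediction.
print("their coefficients:", model.coef_[~keep])
```

Lowering threshold_lambda makes pruning more aggressive, since fewer precisions stay below it.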
fit(X, y)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Fit the ARDRegression model according to the given training data and parameters.
Iterative procedure to maximize the evidence.
- X : array-like, shape = [n_samples, n_features]
- Training vector, where n_samples is the number of samples and n_features is the number of features.
- y : array, shape = [n_samples]
- Target values (integers). Will be cast to X’s dtype if necessary
self : returns an instance of self.
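As a rough illustration of fitting on pandas objects, the sketch below uses plain scikit-learn (not the ibex wrapper) on a made-up DataFrame with hypothetical column names "a" and "b". Plain scikit-learn stores coef_ as a NumPy array, so the last step relabels it by column name by hand, which approximates the labelled coef_ shown in the example near the top of this page.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ARDRegression

# Small made-up data set held in pandas objects.
rng = np.random.RandomState(0)
X = pd.DataFrame(rng.normal(size=(50, 2)), columns=["a", "b"])
noise = rng.normal(0.0, 0.1, 50)
y = (1.5 * X["a"] - 0.5 * X["b"] + noise).rename("target")

model = ARDRegression()
fitted = model.fit(X, y)  # fit returns the estimator itself

# Relabel the coefficient array with the DataFrame's column names.
coef = pd.Series(model.coef_, index=X.columns)
print(coef)
```

Because fit returns the estimator, calls can be chained, for example ARDRegression().fit(X, y).predict(X).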
predict(X, return_std=False)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Predict using the linear model.
In addition to the mean of the predictive distribution, its standard deviation can also be returned.
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
- Samples.
- return_std : boolean, optional
- Whether to return the standard deviation of posterior prediction.
- y_mean : array, shape = (n_samples,)
- Mean of predictive distribution of query points.
- y_std : array, shape = (n_samples,)
- Standard deviation of predictive distribution of query points.
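The two return forms of predict can be sketched as follows, assuming plain scikit-learn with default arguments and a made-up data set. With return_std=True the call returns a pair (y_mean, y_std) instead of a single array.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Synthetic training data (made up for illustration).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.2, 100)

model = ARDRegression().fit(X, y)

# Five new query points.
X_new = rng.normal(size=(5, 3))

# Mean of the predictive distribution only.
y_mean = model.predict(X_new)

# Mean and standard deviation of the predictive distribution.
y_mean_again, y_std = model.predict(X_new, return_std=True)

for mean, std in zip(y_mean_again, y_std):
    print("prediction %.3f +/- %.3f" % (mean, std))
```

The standard deviation gives a per-sample uncertainty that can be used, for instance, to draw error bars around the predicted mean.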
score(X, y, sample_weight=None)
Note
The documentation following is of the class wrapped by this class. There are some changes, in particular:
- A parameter X denotes a pandas.DataFrame.
- A parameter y denotes a pandas.Series.
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
- X : array-like, shape = (n_samples, n_features)
- Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
- True values for X.
- sample_weight : array-like, shape = [n_samples], optional
- Sample weights.
- score : float
- R^2 of self.predict(X) wrt. y.
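The definition of R^2 given above can be worked through by hand. The helper r2 below is a hypothetical name introduced only for this illustration (it is not part of ibex or scikit-learn) and ignores sample weights; it applies the (1 - u/v) formula directly to three small cases.

```python
import numpy as np

def r2(y_true, y_pred):
    """Unweighted coefficient of determination, 1 - u/v."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1.0 - u / v

y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2(y_true, y_true)                                 # u = 0
constant = r2(y_true, np.full(4, y_true.mean()))             # u = v
reversed_order = r2(y_true, np.array([4.0, 3.0, 2.0, 1.0]))  # u = 20, v = 5

print(perfect, constant, reversed_order)  # 1.0 0.0 -3.0
```

The third case shows how a model that is worse than predicting the mean produces a negative score.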