Hyperparameter optimization with approximate gradient

Category: optimization
#machine learning #hyperparameters #HOAG

TL;DR: I describe a method for hyperparameter optimization by gradient descent.

Most machine learning models rely on at least one hyperparameter to control for model complexity. For example, logistic regression commonly relies on a regularization parameter that controls the amount of $\ell_2$ regularization. Similarly, kernel methods also have hyperparameters …

Lightning v0.1

Category: misc
#Python #scikit-learn #machine learning #lightning

Announce: first public release of lightning!, a library for large-scale linear classification, regression and ranking in Python. The library was started a couple of years ago by Mathieu Blondel who also contributed the vast majority of source code. I joined recently its development and decided it was about time for …

scikit-learn-contrib, an umbrella for scikit-learn related projects.

Category: misc
#Python #scikit-learn #machine learning #lightning

Together with other scikit-learn developers we've created an umbrella organization for scikit-learn-related projects named scikit-learn-contrib. The idea is for this organization to host projects that are deemed too specific or too experimental to be included in the scikit-learn codebase but still offer an API which is compatible with scikit-learn and …

SAGA algorithm in the lightning library

Category: misc
#Python #scikit-learn #machine learning #lightning

Recently I've implemented, together with Arnaud Rachez, the SAGA[1] algorithm in the lightning machine learning library (which by the way, has been recently moved to the new scikit-learn-contrib project). The lightning library uses the same API as scikit-learn but is particularly adapted to online learning. As for the SAGA …

On the consistency of ordinal regression methods

Category: misc
#consistency #machine learning

My latests work (with Francis Bach and Alexandre Gramfort) is on the consistency of ordinal regression methods. It has the wildly imaginative title of "On the Consistency of Ordinal Regression Methods" and is currently under review but you can read the draft of it on ArXiv. If you have any …

Holdout cross-validation generator

Category: misc
#Python #scikit-learn #machine learning #model selection

Cross-validation iterators in scikit-learn are simply generator objects, that is, Python objects that implement the __iter__ method and that for each call to this method return (or more precisely, yield) the indices or a boolean mask for the train and test set. Hence, implementing new cross-validation iterators that behave as …

Surrogate Loss Functions in Machine Learning

Category: misc
#machine learning #consistency #calibration

TL; DR These are some notes on calibration of surrogate loss functions in the context of machine learning. But mostly it is an excuse to post some images I made.

In the binary-class classification setting we are given $n$ training samples $\{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, where $X_i$ belongs to …

Numerical optimizers for Logistic Regression

Category: misc
#machine learning #logistic regression #Python #SciPy

In this post I compar several implementations of Logistic Regression. The task was to implement a Logistic Regression model using standard optimization tools from scipy.optimize and compare them against state of the art implementations such as LIBLINEAR.

In this blog post I'll write down all the implementation details of …

Logistic Ordinal Regression

Category: misc
#machine learning #ordinal regression #Python #ranking

TL;DR: I've implemented a logistic ordinal regression or proportional odds model. Here is the Python code

The logistic ordinal regression model, also known as the proportional odds was introduced in the early 80s by McCullagh [1, 2] and is a generalized linear model specially tailored for the case of …

Isotonic Regression

Category: misc
#isotonic regression #machine learning #Python #scikit-learn

My latest contribution for scikit-learn is an implementation of the isotonic regression model that I coded with Nelle Varoquaux and Alexandre Gramfort. This model finds the best least squares fit to a set of points, given the constraint that the fit must be a non-decreasing function. The example on the …