On the Link Between Polynomials and Optimization

Part I: Residual Polynomials and the Chebyshev method.

There's a fascinating link between minimization of quadratic functions and polynomials. A link that goes deep and allows to phrase optimization problems in the language of polynomials and vice versa. Using this connection, we can tap into centuries of research in the theory of polynomials and shed new light on …

How to Evaluate the Logistic Loss and not NaN trying

A naive implementation of the logistic regression loss can results in numerical indeterminacy even for moderate values. This post takes a closer look into the source of these instabilities and discusses more robust Python implementations.

Notes on the Frank-Wolfe Algorithm, Part II: A Primal-dual Analysis

This blog post extends the convergence theory from the first part of these notes on the Frank-Wolfe (FW) algorithm with convergence guarantees on the primal-dual gap which generalize and strengthen the convergence guarantees obtained in the first part.

Three Operator Splitting

I discuss a recently proposed optimization algorithm: the Davis-Yin three operator splitting.

Notes on the Frank-Wolfe Algorithm, Part I

This blog post is the first in a series discussing different theoretical and practical aspects of the Frank-Wolfe algorithm.

$$ \def\xx{\boldsymbol x} \def\yy{\boldsymbol y} \def\ss{\boldsymbol s} \def\dd …

Optimization inequalities cheatsheet

Most proofs in optimization consist in using inequalities for a particular function class in some creative way. This is a cheatsheet with inequalities that I use most often. It considers …

A fully asynchronous variant of the SAGA algorithm

My friend Rémi Leblond has recently uploaded to ArXiv our preprint on an asynchronous version of the SAGA optimization algorithm.

The main contribution is to develop a parallel (fully asynchronous, no locks) variant of the SAGA algorighm. This is a stochastic variance-reduced method for general optimization, specially adapted for problems …

Hyperparameter optimization with approximate gradient

TL;DR: I describe a method for hyperparameter optimization by gradient descent.

Most machine …

Lightning v0.1

Announce: first public release of lightning!, a library for large-scale linear classification, regression and ranking in Python. The library was started a couple of years ago by Mathieu Blondel who also contributed the vast majority of source code. I joined recently its development and decided it was about time for …

scikit-learn-contrib, an umbrella for scikit-learn related projects.

Together with other scikit-learn developers we've created an umbrella organization for scikit-learn-related projects named scikit-learn-contrib. The idea is for this organization to host projects that are deemed too specific or too experimental to be included in the scikit-learn codebase but still offer an API which is compatible with scikit-learn and …