LARS algorithm

By Fabian Pedregosa.

Category: misc
#scikit-learn #sparse

Thu 30 September 2010

I've been working lately with Alexandre Gramfort on coding the LARS algorithm in scikits.learn. This algorithm computes the solution to several linear models used in machine learning: LAR, Lasso, Elastic Net and Forward Stagewise. Unlike the coordinate descent implementation, the LARS algorithm gives the full coefficient path along the regularization parameter, so it is especially well suited for model selection.
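To make "full coefficient path" concrete, here is a minimal sketch. It assumes a current scikit-learn install, where LARS is exposed as `sklearn.linear_model.lars_path`; the scikits.learn interface at the time of writing may have differed. The synthetic data and the variable names are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import lars_path

# Synthetic regression problem (illustrative data, not from the post):
# only features 0 and 2 carry signal.
rng = np.random.RandomState(0)
X = rng.randn(50, 5)
true_coef = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_coef + 0.1 * rng.randn(50)

# One call returns every breakpoint of the Lasso path:
#   alphas: regularization values at the breakpoints, in decreasing order
#   active: indices of the features in the active set at the end of the path
#   coefs:  array of shape (n_features, n_alphas), one column per breakpoint
alphas, active, coefs = lars_path(X, y, method="lasso")

for alpha, coef in zip(alphas, coefs.T):
    print(f"alpha={alpha:.4f}  nonzero={np.flatnonzero(coef).tolist()}")
```

Each column of `coefs` is the solution at one value of the regularization parameter, so model selection can compare all of them without refitting. A coordinate descent solver would need one fit per value.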

The algorithm is coded mostly in Python, with some small parts in C (because I already had the code for Cholesky deletes in C) and a Cython interface for the BLAS function dtrsv, which will be proposed to SciPy once I stabilize this code. The implementation is mostly complete and supports some optimizations, such as using a precomputed Gram matrix or specifying a maximum number of features/iterations, but it could still be extended to compute other models, like Elastic Net or Forward Stagewise. I haven't run any benchmarks myself yet, but preliminary ones by Alexandre Gramfort showed that it is roughly equivalent to this Matlab implementation. Using PyMVPA, it shouldn't be difficult to benchmark it against the R implementation as well.
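For readers who don't know dtrsv: it solves a linear system whose matrix is triangular. The pure-Python sketch below shows that operation and how two such solves, applied to a Cholesky factor, solve a system with a Gram matrix. This is only an illustration of the idea; the function names are hypothetical, and it is not the Cython/BLAS code described above.

```python
def solve_lower_triangular(L, b):
    """Solve L x = b by forward substitution (L lower triangular, nonzero diagonal)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contribution of the unknowns already solved.
        s = sum(L[i][k] * x[k] for k in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def solve_upper_triangular(U, b):
    """Solve U x = b by back substitution (U upper triangular, nonzero diagonal)."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x


def cholesky_solve(L, b):
    """Solve (L L^T) x = b given the lower Cholesky factor L."""
    n = len(L)
    Lt = [[L[j][i] for j in range(n)] for i in range(n)]  # transpose of L
    z = solve_lower_triangular(L, b)      # first solve L z = b
    return solve_upper_triangular(Lt, z)  # then solve L^T x = z


# Small example: G = L L^T plays the role of a 2x2 Gram matrix.
L = [[2.0, 0.0],
     [1.0, 3.0]]
G = [[4.0, 2.0],
     [2.0, 10.0]]
x = cholesky_solve(L, [8.0, 22.0])
print(x)  # solution of G x = [8, 22]
```

Each triangular solve costs O(n^2) operations, compared with O(n^3) for factorizing the matrix from scratch, which is why keeping a Cholesky factor up to date as features are added or deleted pays off.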