fa.bianp.net

Smells like hacker spirit

I was last weekend in FOSDEM presenting scikits.learn (here are the slides I used at the Data Analytics Devroom). Kudos to Olivier Grisel and all the people who organized such a fun and authentic meeting!

image0

image1

New examples in scikits.learn 0.6

Latest release of scikits.learn comes with an awesome collection of examples. These are some of my favorites:

Faces recognition

This example by Olivier Grisel, downloads a 58MB faces dataset from Labeled Faces in the Wild, and is able to perform PCA for feature extraction and SVC for classification, yielding …

Weighted samples for SVMs

Based on the work of libsvm-dense by Ming-Wei Chang, Hsuan-Tien Lin, Ming-Hen Tsai, Chia-Hua Ho and Hsiang-Fu Yu I patched the libsvm distribution shipped with scikits.learn to allow setting weights for individual instances. The motivation behind this is to be able force a classifier to focus its attention in …

Coming soon ...

Highlights for this release: * New stochastic gradient descent module by Peter Prettenhofer * Improved svm module: memory efficiency, automatic class weights. * Wrap for liblinear's Multi-class SVC (option multi_class in LinearSVC) * New features and performance improvements of text feature extraction. * Improved sparse matrix support, both in main classes (GridSearch) as in sparse …

memory efficient bindigs for libsvm

scikits.learn.svm now uses LibSVM-dense instead of LibSVM for some support vector machine related algorithms when input is a dense matrix. As a result most of the copies associated with argument passing are avoided, giving 50% less memory footprint and several times less than the python bindings that ship …

solve triangular matrices using scipy.linalg

For some time now I've been missing a function in scipy that exploits the triangular structure of a matrix to efficiently solve the associated system, so I decided to implement it by binding the LAPACK method "trtrs", which also checks for singularities and is capable handling several right-hand sides. Contrary …

LARS algorithm

I've been working lately with Alexandre Gramfort coding the LARS algorithm in scikits.learn. This algorithm computes the solution to several general linear models used in machine learning: LAR, Lasso, Elasticnet and Forward Stagewise. Unlike the implementation by coordinate descent, the LARS algorithm gives the full coefficient path along the …

Second scikits.learn coding sprint

Las week took place in Paris the second scikits.learn sprint. It was two days of insane activity (115 commits, 6 branches, 33 coffees) in which we did a lot of work, both implementing new algorithms and fixing or improving old ones. This includes: * sparse version of Lasso by coordinate …

Support for sparse matrices in scikits.learn

I recently added support for sparse matrices (as defined in scipy.sparse) in some classifiers of scikits.learn. In those classes, the fit method will perform the algorithm without converting to a dense representation and will also store parameters in an efficient format. Right now, the only classese that implements …

Flags to debug python C extensions.

I often find myself debugging python C extensions from gdb, but usually some variables are hidden because aggressive optimizations that distutils sets by default. What I did not know, is that you can prevent those optimizations by passing flags -O0 -fno-inline to gcc in keyword extra_compile_args (note: this will only …