A profiler for Python extensions

Category: General, Python

Profiling Python extensions has not been a pleasant experience for me, so I made my own package to do the job. Existing alternatives were either hard to use, forcing you to recompile with custom flags (like gprofile), or desperately slow (like valgrind/callgrind). The package I'll talk about is called …

scikit-learn coding sprint in Paris

Category: General, scikit-learn

Yesterday was the scikit-learn coding sprint in Paris. It was great to meet with old developers (Vincent Michel) and new ones, some of whom I already knew from the mailing list, while others came just to say hi and get familiar with the code. It was really great …

py3k in scikit-learn

Category: General

One thing I'd really like to see done in this Friday's scikit-learn sprint is to have full support for Python 3. There's a branch where the hard work has been done (porting C extensions, automatic 2to3 conversion, etc.), although joblib still has some bugs and no one has attempted to …

Computing the vector norm

Category: misc
#linear algebra #norm #scipy

Update: a fast and stable norm was added to scipy.linalg in August 2011 and will be available in scipy 0.10.

Last week I discussed with Gael how we should compute the Euclidean norm of a vector a using SciPy. Two approaches suggest themselves, either calling scipy.linalg.norm …
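As a rough illustration of the comparison, the sketch below computes the Euclidean norm with `scipy.linalg.norm` and with a hand-written square root of a dot product. The second form is an assumption made for this example (the excerpt is cut off before naming the alternative), and the vectors `a` and `big` are made up.

```python
import numpy as np
from scipy import linalg

a = np.array([3.0, 4.0])

# Library routine mentioned in the post.
norm_lib = linalg.norm(a)

# Hand-written alternative (an assumption, not taken from the post):
# square root of the dot product of the vector with itself.
norm_dot = np.sqrt(np.dot(a, a))

print(norm_lib, norm_dot)

# The dot-product form squares every entry, so it can overflow in
# float64 even when the norm itself is representable.
big = np.array([1e200, 1e200])
with np.errstate(over="ignore"):
    naive_big = np.sqrt(np.dot(big, big))

print(naive_big)
```

The overflow case is one reason numerical stability matters here, not only speed.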

Smells like hacker spirit

Category: misc
#python #sklearn

Last weekend I was at FOSDEM presenting scikits.learn (here are the slides I used at the Data Analytics Devroom). Kudos to Olivier Grisel and all the people who organized such a fun and authentic meeting!



New examples in scikits.learn 0.6

Category: General, scikit-learn, Tecnología

Latest release of scikits.learn comes with an awesome collection of examples. These are some of my favorites:

Faces recognition

This example, by Olivier Grisel, downloads a 58MB faces dataset from Labeled Faces in the Wild and performs PCA for feature extraction and SVC for classification, yielding …

Weighted samples for SVMs

Category: sklearn, python

Based on the work of libsvm-dense by Ming-Wei Chang, Hsuan-Tien Lin, Ming-Hen Tsai, Chia-Hua Ho and Hsiang-Fu Yu, I patched the libsvm distribution shipped with scikits.learn to allow setting weights for individual instances. The motivation behind this is to be able to force a classifier to focus its attention on …

Coming soon ...

Category: scikit-learn, Tecnología

Highlights for this release:

* New stochastic gradient descent module by Peter Prettenhofer
* Improved svm module: memory efficiency, automatic class weights.
* Wrap for liblinear's Multi-class SVC (option multi_class in LinearSVC)
* New features and performance improvements of text feature extraction.
* Improved sparse matrix support, both in main classes (GridSearch) as in sparse …

memory efficient bindings for libsvm

Category: General, scikit-learn

scikits.learn.svm now uses LibSVM-dense instead of LibSVM for some support vector machine related algorithms when the input is a dense matrix. As a result, most of the copies associated with argument passing are avoided, giving a 50% smaller memory footprint, several times smaller than that of the Python bindings that ship …

solve triangular matrices using scipy.linalg

Category: scipy, Tecnología

For some time now I've been missing a function in scipy that exploits the triangular structure of a matrix to efficiently solve the associated system, so I decided to implement it by binding the LAPACK routine "trtrs", which also checks for singularities and is capable of handling several right-hand sides. Contrary …
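A minimal sketch of what such a solver looks like in use, assuming the function in question corresponds to today's `scipy.linalg.solve_triangular` (the excerpt does not name it). The matrices `A`, `B` and `S` are made-up data for illustration.

```python
import numpy as np
from scipy import linalg

# Lower-triangular coefficient matrix (illustrative values).
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])

# Two right-hand sides, one per column, solved in a single call.
B = np.array([[2.0, 4.0],
              [7.0, 5.0],
              [32.0, 19.0]])

X = linalg.solve_triangular(A, B, lower=True)
print(X)

# A zero on the diagonal makes the system singular; current SciPy
# reports this by raising LinAlgError.
S = np.array([[1.0, 0.0],
              [1.0, 0.0]])
singular_detected = False
try:
    linalg.solve_triangular(S, np.array([1.0, 1.0]), lower=True)
except linalg.LinAlgError:
    singular_detected = True
print(singular_detected)
```

Passing `lower=True` tells the solver which triangle to read, so the other triangle of the array is never used.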