PyData Paris - April 2015

Category: misc
#Python #Paris #NumPy #Numba

Last Friday was PyData Paris, in words of the organizers, ''a gathering of users and developers of data analysis tools in Python''.

The organizers did a great job in putting together and the event started already with a full room for Gael's keynote

Gael's keynote

My take-away message from the talks is …

Data-driven hemodynamic response function estimation

Category: misc
#fMRI #GLM #python

My latest research paper[1] deals with the estimation of the hemodynamic response function (HRF) from fMRI data.

This is an important topic since the knowledge of a hemodynamic response function is what makes it possible to extract the brain activation maps that are used in most of the impressive …

Plot memory usage as a function of time

Category: misc
#memory_profiler #mprof #profile

One of the lesser known features of the memory_profiler package is its ability to plot memory consumption as a function of time. This was implemented by my friend Philippe Gervais, previously a colleague at INRIA and now at Google.

With this feature it is possible to generate very easily a …

Surrogate Loss Functions in Machine Learning

Category: misc
#machine learning #consistency #calibration

TL; DR These are some notes on calibration of surrogate loss functions in the context of machine learning. But mostly it is an excuse to post some images I made.

In the binary-class classification setting we are given $n$ training samples $\{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, where $X_i$ belongs to …

Different ways to get memory consumption or lessons learned from ``memory_profiler``

Category: misc
#Python #memory #memory_profiler

As part of the development of memory_profiler I've tried several ways to get memory usage of a program from within Python. In this post I'll describe the different alternatives I've tested.

The psutil library

psutil is a python library that provides an interface for retrieving information on running processes. It …

Numerical optimizers for Logistic Regression

Category: misc
#machine learning #logistic regression #Python #SciPy

In this post I compar several implementations of Logistic Regression. The task was to implement a Logistic Regression model using standard optimization tools from scipy.optimize and compare them against state of the art implementations such as LIBLINEAR.

In this blog post I'll write down all the implementation details of …

Logistic Ordinal Regression

Category: misc
#machine learning #ordinal regression #Python #ranking

TL;DR: I've implemented a logistic ordinal regression or proportional odds model. Here is the Python code

The logistic ordinal regression model, also known as the proportional odds was introduced in the early 80s by McCullagh [1, 2] and is a generalized linear model specially tailored for the case of …

Isotonic Regression

Category: misc
#isotonic regression #machine learning #Python #scikit-learn

My latest contribution for scikit-learn is an implementation of the isotonic regression model that I coded with Nelle Varoquaux and Alexandre Gramfort. This model finds the best least squares fit to a set of points, given the constraint that the fit must be a non-decreasing function. The example on the …

Householder matrices

Category: misc
#linear algebra #householder #QR

Householder matrices are square matrices of the form

$$ P = I - \beta v v^T$$

where $\beta$ is a scalar and $v$ is a vector. It has the useful property that for suitable chosen $v$ and $\beta$ it makes the product $P x$ to zero out all of the coordinates but …

Loss Functions for Ordinal regression

Category: misc
#machine learning #ordinal regression #loss function

Note: this post contains a fair amount of LaTeX, if you don't visualize the math correctly come to its original location

In machine learning it is common to formulate the classification task as a minimization problem over a given loss function. Given data input data $(x_1, ..., x_n)$ and associated labels …