The idea for what was later called the Monte Carlo method occurred to me when I was playing solitaire during my illness.

Stanislaw Ulam, *Adventures of a Mathematician*

The Russian Roulette estimator offers a simple way to construct an unbiased estimator for the limit of a sequence. It allows us, for example, to …
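A minimal sketch of the idea, assuming geometric truncation probabilities (the names `russian_roulette` and `seq` are my own, not from the post): write the limit as a telescoping sum, truncate it at a random level, and reweight the surviving increments by their survival probability.

```python
import random

def russian_roulette(seq, p=0.5, rng=random):
    """One-sample Russian Roulette estimate of lim_n seq(n).

    seq(n) is the n-th term of a convergent sequence. The limit equals
    the telescoping sum seq(0) + sum_n (seq(n) - seq(n-1)). We truncate
    that sum at a random level N with P(N >= n) = (1 - p)**n and divide
    each surviving increment by P(N >= n), which keeps the estimator
    unbiased (assuming the reweighted increments are summable).
    """
    estimate = seq(0)
    n, survival = 1, 1.0
    while rng.random() >= p:        # survive this level with prob. 1 - p
        survival *= 1.0 - p         # survival == P(N >= n)
        estimate += (seq(n) - seq(n - 1)) / survival
        n += 1
    return estimate

# Averaging many one-sample estimates of seq(n) = 1 - 2**-n (limit 1)
# recovers the limit, even though each sample evaluates only
# finitely many terms of the sequence.
rng = random.Random(0)
mean = sum(russian_roulette(lambda n: 1 - 2.0 ** -n, rng=rng)
           for _ in range(20000)) / 20000
print(mean)   # close to the limit, 1
```

Each individual estimate is cheap but noisy; the point is that its expectation is exactly the limit, so averaging converges to it.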

When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the *implicit bias* of optimization algorithms. In this post we'll characterize the implicit bias of gradient-based methods on a class of regression problems that includes linear least squares and Huber …
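As a concrete instance of this phenomenon (a sketch of my own, not taken from the post): on an underdetermined linear least-squares problem, gradient descent started from zero converges to the minimum-Euclidean-norm interpolating solution, the one the pseudoinverse returns.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))     # 3 equations, 6 unknowns:
b = rng.standard_normal(3)          # infinitely many global minima

# Gradient descent on f(x) = 0.5 * ||A x - b||^2, started at zero.
x = np.zeros(6)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step: 1 / lambda_max(A^T A)
for _ in range(1_000_000):
    g = A.T @ (A @ x - b)
    if np.linalg.norm(g) < 1e-12:
        break
    x -= step * g

# Every iterate stays in the row space of A, so among all global
# minima gradient descent singles out the minimum-norm one,
# which is exactly what the pseudoinverse computes.
x_min_norm = np.linalg.pinv(A) @ b
print(np.allclose(x, x_min_norm))   # → True
```

Nothing in the objective prefers this solution over any other global minimum; the preference comes entirely from the algorithm, which is what "implicit bias" refers to.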

This is the first in a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). In this post I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.
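To preview the claim numerically (a toy sketch under assumptions of my own, not the post's proof): running constant-step SGD on a strongly convex quadratic with noisy gradients, the distance to the optimum shrinks geometrically until it plateaus at a noise floor proportional to the step size.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
x_star = rng.standard_normal(d)          # the unique minimizer

def stochastic_grad(x, noise=0.1):
    # Gradient of f(x) = 0.5 * ||x - x_star||^2 plus zero-mean noise,
    # standing in for the variance of minibatch gradients.
    return (x - x_star) + noise * rng.standard_normal(d)

x = np.zeros(d)
step = 0.1
errors = []
for _ in range(500):
    x -= step * stochastic_grad(x)
    errors.append(np.linalg.norm(x - x_star))

# Exponential decay at first, then a plateau: after 100 steps the
# error is a small fraction of the initial error, and from then on
# it stagnates near a noise floor of order step * noise.
print(errors[0], errors[100], errors[-1])
```

The two regimes, fast geometric decay followed by a noise-limited plateau, are exactly the "converges exponentially fast to a neighborhood" behavior the post proves.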