Optimization Nuggets: Implicit Bias of Gradient-based Methods

Losses with a Unique Finite Root.

When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post we'll characterize the implicit bias of gradient-based methods on a class of regression problems that includes linear least squares and Huber …
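To give a flavor of the phenomenon, here is a minimal sketch (not code from the post; the problem sizes, step size, and iteration count are arbitrary choices): on an underdetermined linear least-squares problem, gradient descent started at the origin converges to the minimum ℓ2-norm interpolating solution, one concrete instance of implicit bias.

```python
# Minimal sketch: implicit bias of gradient descent on underdetermined least squares.
# All constants below are illustrative, not taken from the post.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 10, 50          # more features than samples: many global minima
A = rng.standard_normal((n_samples, n_features))
b = rng.standard_normal(n_samples)

x = np.zeros(n_features)                # initialization at the origin
step_size = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
for _ in range(20_000):
    x -= step_size * A.T @ (A @ x - b)  # gradient step on 0.5 * ||Ax - b||^2

x_min_norm = np.linalg.pinv(A) @ b      # minimum ell_2-norm interpolating solution
print(np.linalg.norm(A @ x - b))        # ~0: the iterate interpolates the data
print(np.linalg.norm(x - x_min_norm))   # ~0: among all interpolators, GD found the min-norm one
```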

Optimization Nuggets: Exponential Convergence of SGD

This is the first post of a series on short and beautiful proofs in optimization (let me know what you think in the comments!). In it, I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.
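As a rough numerical illustration (not the proof from the post; the data, step size, and constants below are arbitrary choices), constant step-size SGD on a strongly convex least-squares objective shrinks the distance to the minimizer exponentially fast until it reaches a noise-dominated neighborhood, where it plateaus.

```python
# Minimal sketch: constant step-size SGD converges fast to a noise ball around the minimizer.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 200, 5
A = rng.standard_normal((n_samples, n_features))
b = A @ rng.standard_normal(n_features) + 0.5 * rng.standard_normal(n_samples)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)   # exact minimizer, for reference

x = np.zeros(n_features)
step_size = 0.005
for it in range(5_000):
    i = rng.integers(n_samples)                  # sample one data point uniformly
    x -= step_size * (A[i] @ x - b[i]) * A[i]    # stochastic gradient of 0.5 * (a_i^T x - b_i)^2
    if it % 500 == 0:
        print(it, np.linalg.norm(x - x_star))    # fast initial decay, then a plateau
```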