**Outline**

- 1. Introduction to optimization algorithms.
- 2. A unifying principle: Surrogate Minimization.
- 3. Gradient descent.
- 4. Newton's method.
- 5. Quasi-Newton.
- 6. Exploiting Structure: The Stochastic Gradient Method.
- 7. Software: scipy.optimize.

**Credits**. This material was created by Fabian Pedregosa for a lecture in Berkeley's EECS 127 / 227AT class.
Source code can be found here.
The template and the visualizations are adapted from the Distill article Why Momentum Really Works. Some parts of the introduction and the software examples are based on the Scipy Lecture Notes.