New Convergence Aspects of Stochastic Gradient Descent with Diminishing Learning Rate ---- Marten van Dijk

Marten van Dijk, from the University of Connecticut, is on sabbatical at CWI.
  • When Nov 28, 2019 from 11:00 AM to 12:00 PM (Europe/Amsterdam / UTC+100)
  • Where L120

The classical convergence analysis of Stochastic Gradient Descent (SGD) is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this may hold for some loss functions, it is violated when the objective function is strongly convex. We will discuss bounds on the expected convergence rate of SGD with a diminishing learning rate, based on an alternative convergence analysis that does not assume a bounded gradient. We show an upper bound and explain a corresponding tight lower bound, which is only a factor 32 smaller and is dimension independent. We also explain how our framework fits the asynchronous parallel setting, and prove convergence of the Hogwild! algorithm with a diminishing learning rate. So far we have discussed strongly convex objective functions; if time permits, we will introduce omega-convexity and explain how it can be used to analyse the convergence rates of SGD with a diminishing learning rate for objective functions ranging from plain convex to strongly convex.
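To fix ideas, here is a minimal sketch of SGD with a diminishing learning rate on a simple strongly convex objective, the mean of squared distances to data points. The schedule eta_t = eta_0 / (1 + t), the quadratic objective, and all parameter values are illustrative choices, not the specific setting analysed in the talk.

```python
import numpy as np

# Illustrative strongly convex objective: F(w) = E[(w - x)^2 / 2],
# whose minimizer is E[x]. Each sample x gives the stochastic
# gradient (w - x) of the per-sample loss (w - x)^2 / 2.
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=10_000)  # samples with mean 3.0

w = 0.0     # initial iterate (arbitrary starting point)
eta0 = 1.0  # base step size (illustrative choice)

for t, x in enumerate(rng.permutation(data)):
    grad = w - x                  # stochastic gradient at w
    w -= eta0 / (1 + t) * grad    # diminishing learning rate eta_t

print(w)  # close to the minimizer E[x] = 3.0
```

Note that the gradient norm |w - x| is unbounded over the domain, which is exactly why the classical bounded-gradient assumption fails for strongly convex objectives and a different analysis is needed.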

This is based on joint work with Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtarik and Katya Scheinberg (ICML 2018), joint work with Phuong Ha Nguyen and Lam M. Nguyen (NeurIPS 2019), and joint work with Lam M. Nguyen, Phuong Ha Nguyen and Dzung T. Phan (ICML 2019).