Stochastic Variance Reduced Methods Based on Sketching and Projecting
We present a new perspective on stochastic variance reduced (SVR) methods as methods that maintain an estimate of the Jacobian of an auxiliary vector-valued function. This auxiliary vector-valued function is formed by stacking the individual data functions from the empirical risk minimization problem. Through this observation we extend the class of SVR methods by updating the Jacobian estimate using randomized sparse sketches of the true Jacobian. By choosing different randomized sketches we recover known methods: the SAG and SAGA methods, their mini-batch variants, and even their non-uniform sampling variants. These new SVR methods all converge linearly, as dictated by a single convergence theorem. When specialized to known methods, our convergence theorem recovers the best known convergence results for SAGA; furthermore, we obtain new results for mini-batch and non-uniform sampling variants of SAGA. Thus our work unites all SAGA variants under one framework.
Keywords: Empirical risk minimization, stochastic gradient descent, stochastic variance reduced methods, randomized sketching, SAGA.
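The Jacobian-estimate view of SAGA described in the abstract can be illustrated with a short sketch. The least-squares problem, variable names, step size, and iteration count below are illustrative assumptions chosen for the demo, not taken from the paper: the matrix `J` stores one column per data function (the Jacobian estimate), and sampling a single column corresponds to a rank-one sketch of the true Jacobian.

```python
import numpy as np

# Hypothetical toy problem (an assumption for this demo):
# minimize (1/n) * sum_i f_i(x) with f_i(x) = 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star  # consistent system, so x_star is the minimizer

def grad_i(x, i):
    """Gradient of the i-th data function f_i."""
    return A[i] * (A[i] @ x - b[i])

# SAGA viewed through the Jacobian lens: column i of J stores the most
# recently computed gradient of f_i, so J estimates the true Jacobian
# [grad f_1(x), ..., grad f_n(x)] of the stacked vector-valued function.
x = np.zeros(d)
J = np.zeros((d, n))
step = 0.02  # illustrative step size, not a tuned or theoretical value

for _ in range(5000):
    i = rng.integers(n)
    g_new = grad_i(x, i)
    # Unbiased gradient estimate: swap in the fresh column, correct the mean.
    g = g_new - J[:, i] + J.mean(axis=1)
    J[:, i] = g_new  # update one column of the Jacobian estimate
    x -= step * g

print(np.linalg.norm(x - x_star))  # distance to the minimizer shrinks
```

Updating a single column is the simplest choice of sketch; the framework in the paper recovers mini-batch and non-uniform sampling variants by replacing this column selection with other randomized sketches.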