Solving Empirical Risk Minimization in the Current Matrix Multiplication Time

Yin Tat Lee; Zhao Song; Qiuyi Zhang

arXiv:1905.04447·cs.DS·May 14, 2019·22 cites

TL;DR

This paper gives an algorithm for a broad class of convex optimization problems that generalizes linear programming and includes many empirical risk minimization problems. The algorithm runs in the current matrix multiplication time, using a robust deterministic central path method and an efficient data structure.

Contribution

The paper presents a deterministic interior point method with a novel data structure, extending recent algorithms that solve linear programs in the current matrix multiplication time to a wider class of convex problems.

Findings

01

Solves empirical risk minimization in time $O^*((n^{\omega} + n^{2.5 - \alpha/2} + n^{2 + 1/6}) \log(n / \delta))$, where $\omega$ is the exponent of matrix multiplication.

02

Provides a robust deterministic central path algorithm.

03

Extends recent linear programming solutions to broader convex problems.

Abstract

Many convex problems in machine learning and computer science share the same form:
\begin{align*}
\min_{x} \sum_{i} f_i( A_i x + b_i),
\end{align*}
where $f_{i}$ are convex functions on $\mathbb{R}^{n_{i}}$ with constant $n_{i}$, $A_{i} \in \mathbb{R}^{n_{i} \times d}$, $b_{i} \in \mathbb{R}^{n_{i}}$ and $\sum_{i} n_{i} = n$. This problem generalizes linear programming and includes many problems in empirical risk minimization. In this paper, we give an algorithm that runs in time
\begin{align*}
O^* ( ( n^{\omega} + n^{2.5 - \alpha/2} + n^{2+ 1/6} ) \log (n / \delta) )
\end{align*}
where $\omega$ is the exponent of matrix multiplication, $\alpha$ is the dual exponent of matrix multiplication, and $\delta$ is the relative accuracy. Note that the runtime has only a log dependence on the condition numbers or other data-dependent parameters, and these are captured in $\delta$. For the current bound $\omega…
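To make the problem form in the abstract concrete, here is a minimal Python sketch that evaluates $\sum_{i} f_i(A_i x + b_i)$ for a toy instance. Logistic loss with $n_i = 1$ is used as one illustrative choice of $f_i$. The names `matvec`, `erm_objective`, `logistic_loss`, and the toy data are assumptions made for illustration and do not come from the paper. The sketch only evaluates the objective; it does not implement the paper's interior point method or its data structure.

```python
import math


def matvec(A, x):
    """Multiply a small dense matrix A (a list of rows) by a vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def erm_objective(blocks, x):
    """Evaluate sum_i f_i(A_i x + b_i) over blocks of (f_i, A_i, b_i)."""
    total = 0.0
    for f, A, b in blocks:
        # z = A_i x + b_i is a vector of length n_i
        z = [zi + bi for zi, bi in zip(matvec(A, x), b)]
        total += f(z)
    return total


def logistic_loss(z):
    """Convex f_i on R^1 (assumed example): log(1 + exp(-z))."""
    return math.log1p(math.exp(-z[0]))


# Hypothetical toy data: three samples in d = 2 dimensions with +/-1 labels.
features = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
labels = [1.0, -1.0, 1.0]

# Each block has n_i = 1: A_i is the 1 x d row y_i * a_i, and b_i = [0].
blocks = [
    (logistic_loss, [[y * a for a in row]], [0.0])
    for row, y in zip(features, labels)
]

# At x = 0 every term equals log(2), so the sum is 3 * log(2).
value_at_zero = erm_objective(blocks, [0.0, 0.0])
print(value_at_zero)
```

Here $n = \sum_i n_i = 3$ and $d = 2$. Other convex losses, or blocks with $n_i > 1$, fit the same `(f_i, A_i, b_i)` representation.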

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs · Sparse and Compressive Sensing Techniques