The Relational Data Borg is Learning

Dan Olteanu

arXiv:2008.07864 · cs.DB · August 19, 2020

TL;DR

This paper presents a database-inspired approach to machine learning over relational data, leveraging algebraic, combinatorial, and system techniques to significantly improve runtime performance.

Contribution

It introduces a set of techniques that exploit data structure knowledge to optimize machine learning tasks over relational data, combining theoretical and system-level innovations.

Findings

01. Runtime performance of machine learning tasks over relational data is significantly improved.

02. The techniques lower both the complexity and the constant factors of learning time.

03. The approach applies to a range of supervised and unsupervised learning methods.

Abstract

This paper gives an overview of an approach that addresses machine learning over relational data as a database problem. This is justified by two observations. First, the input to the learning task is commonly the result of a feature extraction query over the relational data. Second, the learning task requires the computation of group-by aggregates. This approach has already been investigated for a number of supervised and unsupervised learning tasks, including ridge linear regression, factorisation machines, support vector machines, decision trees, principal component analysis, and k-means, and also for linear algebra over data matrices. The main message of this work is that the runtime performance of machine learning can be dramatically boosted by a toolbox of techniques that exploit the knowledge of the underlying data. This includes theoretical development on the algebraic, combinatorial,…
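The two observations in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the relations `R(k, a)` and `S(k, b)`, the sample rows, and both function names are hypothetical. It assumes the feature extraction query is the join of `R` and `S` on `k`, and that the learning task needs the aggregate SUM(a * b) over that join, the kind of sum-product aggregate that arises, for example, in ridge linear regression. The second function computes the same value from per-key partial sums, so the join result is never materialized.

```python
from collections import defaultdict

# Hypothetical input relations: R(k, a) and S(k, b), joined on k.
R = [(1, 2), (1, 3), (2, 5)]
S = [(1, 10), (1, 20), (3, 7)]


def sum_product_materialized(R, S):
    """SUM(a * b) over R JOIN S, enumerating every joined tuple."""
    total = 0
    for k_r, a in R:
        for k_s, b in S:
            if k_r == k_s:
                total += a * b
    return total


def sum_product_pushed(R, S):
    """Same aggregate, with the sums pushed past the join.

    For each join key k, SUM(a * b) over the matching pairs equals
    SUM(a) over R's rows with key k times SUM(b) over S's rows with key k,
    so one group-by aggregate per relation is enough.
    """
    sum_a = defaultdict(int)
    for k, a in R:
        sum_a[k] += a
    sum_b = defaultdict(int)
    for k, b in S:
        sum_b[k] += b
    # Only keys present in both relations contribute to the join.
    return sum(sum_a[k] * sum_b[k] for k in sum_a if k in sum_b)


if __name__ == "__main__":
    print(sum_product_materialized(R, S))
    print(sum_product_pushed(R, S))
```

Both functions return the same value on any input. The first does work proportional to the number of tuple pairs it inspects, while the second makes a single pass over each relation, which is the flavour of saving the abstract attributes to exploiting the structure of the underlying data.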

Taxonomy

Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Data Mining Algorithms and Applications