Representation Independent Analytics Over Structured Data
Yodsawalai Chodpathumwan (1), Jose Picado (2), Arash Termehchy (2), Alan Fern (2), Yizhou Sun (3) ((1) University of Illinois at Urbana-Champaign, (2) Oregon State University, (3) Northeastern University)

TL;DR
This paper introduces the concept of representation independence in database analytics, arguing for algorithms that remain effective across different structural representations of the same data, and empirically evaluates how representation independent existing algorithms are.
Contribution
The paper formalizes the notion of representation independence and analyzes its properties for a range of data analytics algorithms, highlighting their limitations and potential for generalization.
Findings
Most algorithms are not inherently representation independent.
Certain heuristics exhibit greater representation independence under specific conditions.
Empirical analysis reveals the varying degrees of independence among popular algorithms.
Abstract
Database analytics algorithms leverage quantifiable structural properties of the data to predict interesting concepts and relationships. The same information, however, can be represented using many different structures and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, there is no guarantee that current database analytics algorithms will still provide the correct insights, no matter what structures are chosen to organize the database. Because these algorithms tend to be highly effective over some choices of structure, such as that of the databases used to validate them, but not so effective with others, database analytics has largely remained the province of experts who can find the desired forms for these algorithms. We argue that in order to make database analytics usable, we should use or develop algorithms…
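The abstract's central point, that a structural property observed over one representation need not hold over another, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the author and paper names, the two graph layouts, and the choice of a common-neighbor count as the structural score are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, a, b):
    """Count the nodes adjacent to both a and b (a simple structural score)."""
    return len(adj[a] & adj[b])


# Representation A (hypothetical): authors link directly to their papers.
rep_a = build_graph([
    ("alice", "p1"), ("bob", "p1"),
    ("alice", "p2"), ("carol", "p2"),
])

# Representation B (hypothetical): the same facts, but each authorship
# is stored as its own intermediate node between author and paper.
rep_b = build_graph([
    ("alice", "w1"), ("w1", "p1"),
    ("bob", "w2"), ("w2", "p1"),
    ("alice", "w3"), ("w3", "p2"),
    ("carol", "w4"), ("w4", "p2"),
])

score_a = common_neighbors(rep_a, "alice", "bob")
score_b = common_neighbors(rep_b, "alice", "bob")
print(score_a, score_b)
```

Both graphs record that alice and bob co-wrote p1, yet the score changes from one layout to the other, which is the kind of sensitivity to representation the abstract describes.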
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Data Mining Algorithms and Applications
