Table2Vec: Automated Universal Representation Learning to Encode All-round Data DNA for Benchmarkable and Explainable Enterprise Data Science
Longbing Cao, Chengzhang Zhu

TL;DR
Table2Vec is a neural encoder that automatically learns universal, benchmarkable representations of enterprise data, enabling comprehensive understanding and decision-making across heterogeneous data sources.
Contribution
The paper introduces Table2Vec, a novel neural encoder for automated universal enterprise data representation learning, integrating data quality analysis and supporting diverse enterprise data science tasks.
Findings
Outperforms existing shallow, boosting, and deep learning methods in enterprise analytics.
Provides all-round, representative customer data DNA for enterprise-wide tasks.
Supports automated, ethical, and whole-of-enterprise machine learning applications.
Abstract
Enterprise data typically involves multiple heterogeneous data sources and external data that respectively record business activities, transactions, customer demographics, status, behaviors, interactions and communications with the enterprise, and the consumption and feedback of its products, services, production, marketing, operations, and management, etc. A critical challenge in enterprise data science is to enable an effective whole-of-enterprise data understanding and data-driven discovery and decision-making on all-round enterprise DNA. We introduce a neural encoder Table2Vec for automated universal representation learning of entities such as customers from all-round enterprise DNA with automated data characteristics analysis and data quality augmentation. The learned universal representations serve as representative and benchmarkable enterprise data genomes and can be used for…
Taxonomy
Topics: Data Quality and Management · Big Data and Business Intelligence
