Answering Analytical Queries on Text Data with Temporal Term Histograms
Kai Lin, Subhasis Dasgupta, Amarnath Gupta

TL;DR
This paper introduces temporal term histograms (TTH) as a new primitive for analyzing time-stamped text data, enabling analytical operations that are not supported by current data management systems.
Contribution
The paper proposes a novel algebra and implementation for TTH, facilitating analytical queries on temporal text data within relational databases.
Findings
TTH effectively supports analytical queries on temporal text data.
The algebra provides a formal framework with operators and rules for TTH.
Implementation demonstrates practical integration with relational databases.
Abstract
Temporal text, i.e., time-stamped text data, is found abundantly in a variety of data sources such as newspapers, blogs, and social media posts. While today's data management systems provide facilities for searching full-text data, they do not provide any simple primitives for performing analytical operations on text. This paper proposes temporal term histograms (TTH) as an intermediate primitive that can be used for analytical tasks. We propose an algebra, with operators and equivalence rules, for TTH and present a reference implementation on a relational database system.
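The abstract does not spell out the data structure, so the following is only an illustrative sketch of what a temporal term histogram could look like: a mapping from time buckets to term counts. The function names `build_tth` and `merge_tth`, the monthly bucketing, the whitespace tokenizer, and the count-adding merge are all assumptions made for illustration; they are not the paper's algebra or its reference implementation.

```python
from collections import Counter, defaultdict
from datetime import date


def build_tth(docs, bucket=lambda d: (d.year, d.month)):
    """Build a hypothetical TTH: time bucket -> Counter of term frequencies.

    `docs` is an iterable of (timestamp, text) pairs. Bucketing by
    (year, month) and splitting on whitespace are assumed simplifications.
    """
    tth = defaultdict(Counter)
    for ts, text in docs:
        tth[bucket(ts)].update(text.lower().split())
    return dict(tth)


def merge_tth(a, b):
    """Hypothetical merge operator: add term counts bucket by bucket."""
    out = {k: Counter(v) for k, v in a.items()}
    for k, v in b.items():
        out.setdefault(k, Counter()).update(v)
    return out


# Made-up sample documents, for illustration only.
docs = [
    (date(2020, 1, 5), "storm hits coast"),
    (date(2020, 1, 20), "storm damage report"),
    (date(2020, 2, 3), "coast cleanup begins"),
]
tth = build_tth(docs)
merged = merge_tth(tth, tth)
```

With a structure like this, a question such as "how often did a term appear per month" becomes a lookup over buckets rather than a full-text scan, which is the kind of analytical use the abstract points to.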
Taxonomy
Topics: Advanced Database Systems and Queries · Advanced Text Analysis Techniques · Data Management and Algorithms
