Coresets for Data Discretization and Sine Wave Fitting
Alaa Maalouf, Murad Tukan, Eric Price, Daniel Kane, Dan Feldman

TL;DR
This paper introduces small, efficient coresets for streaming sine wave fitting and data discretization, enabling accurate approximation of sensor data with minimal memory, applicable to anomaly detection and other monitoring tasks.
Contribution
It presents the first coreset construction for sine wave fitting whose size is polylogarithmic in the data range, which carries over to streaming models and advances data approximation techniques.
Findings
Coresets of size O(log(N)^O(1)) approximate the sine wave fitting cost within a multiplicative factor of 1±ε.
Streaming algorithms require only polylogarithmic memory.
Experimental results validate the theoretical bounds.
Abstract
In the monitoring problem, the input is an unbounded stream $P=\{p_1,p_2,\cdots\}$ of integers in $[N]:=\{1,\cdots,N\}$ that are obtained from a sensor (such as GPS or the heartbeats of a human). The goal (e.g., for anomaly detection) is to approximate the $n$ points received so far in $P$ by a single frequency $\sin$, e.g. $\min_{c\in C} cost(P,c)+\lambda(c)$, where $cost(P,c)=\sum_{i=1}^n \sin^2\left(\frac{2\pi}{N} p_i c\right)$, $C\subseteq [N]$ is a feasible set of solutions, and $\lambda$ is a given regularization function. For any approximation error $\varepsilon>0$, we prove that every set $P$ of $n$ integers has a weighted subset $S\subseteq P$ (sometimes called core-set) of cardinality $|S|\in O(\log(N)^{O(1)})$ that approximates $cost(P,c)$ (for every $c\in [N]$) up to a multiplicative factor of $1\pm\varepsilon$. Using known coreset techniques, this implies streaming algorithms…
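To make the objective concrete, here is a minimal Python sketch. It assumes the cost of a candidate frequency c on points p_1, ..., p_n in {1, ..., N} is the sum of sin^2(2π·p_i·c/N), with an optional regularization term added when choosing the best c. The helper names (`sine_cost`, `uniform_weighted_sample`, `best_frequency`) are hypothetical. The reduction shown is plain uniform sampling with weights n/m, used only as a stand-in to show how a weighted subset is evaluated. It is not the paper's coreset construction and carries no 1±ε guarantee.

```python
import math
import random


def sine_cost(points, c, N, weights=None):
    """Weighted cost sum_i w_i * sin^2(2*pi*p_i*c / N); unit weights by default."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(w * math.sin(2 * math.pi * p * c / N) ** 2
               for p, w in zip(points, weights))


def uniform_weighted_sample(points, m, rng):
    """Stand-in reduction (NOT the paper's coreset): draw m points uniformly
    with replacement and give each the weight n/m."""
    n = len(points)
    sample = [rng.choice(points) for _ in range(m)]
    return sample, [n / m] * m


def best_frequency(points, N, candidates, weights=None, reg=lambda c: 0.0):
    """Return the candidate c minimizing cost(points, c) + reg(c)."""
    return min(candidates,
               key=lambda c: sine_cost(points, c, N, weights) + reg(c))


# Illustrative data: 5000 synthetic integer readings in {1, ..., N}.
rng = random.Random(0)
N = 1024
P = [rng.randint(1, N) for _ in range(5000)]
S, w = uniform_weighted_sample(P, 500, rng)

c = 7
full_cost = sine_cost(P, c, N)        # cost on all points
subset_cost = sine_cost(S, c, N, w)   # cost on the weighted subset
print(full_cost, subset_cost)
```

Evaluating the weighted subset instead of the full set is what a coreset makes safe. The paper's contribution is a subset of polylogarithmic size for which the two costs agree within 1±ε for every c, which uniform sampling does not promise.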
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques
Methods: Greedy Policy Search
