Learning MSO-definable hypotheses on string
Martin Grohe, Christof Löding, Martin Ritzert

TL;DR
This paper investigates the feasibility of efficiently learning MSO-definable hypotheses over string data, demonstrating both limitations and possibilities depending on data preprocessing and access methods.
Contribution
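It establishes when MSO-definable hypotheses over strings can be learned in time polynomial in the size of the training set and sublinear in, or independent of, the size of the whole data set, and shows that the answer depends on how the data is accessed and whether it is preprocessed.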
Findings
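With only local access to an unprocessed string, learning in sublinear time is impossible, even for hypotheses definable in a small fragment of first-order logic.
Linear-time preprocessing that builds an index data structure enables learning of MSO-definable hypotheses in time polynomial in the size of the training set, independent of the size of the whole data set.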
Results depend on data access models and preprocessing strategies.
Abstract
We study classification problems over string data for hypotheses specified by formulas of monadic second-order logic (MSO). The goal is to design learning algorithms that run in time polynomial in the size of the training set, independently of or at least sublinear in the size of the whole data set. We prove negative as well as positive results. If the data set is an unprocessed string to which our algorithms have local access, then learning in sublinear time is impossible even for hypotheses definable in a small fragment of first-order logic. If we allow for a linear-time preprocessing of the string data to build an index data structure, then learning of MSO-definable hypotheses is possible in time polynomial in the size of the training set, independently of the size of the whole data set.
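To make the setting concrete, here is a minimal sketch, not the paper's algorithm. It assumes a hypothesis is a formula phi(x; y) evaluated on a fixed background string w, where x is the position to classify and y is a parameter position the learner must choose. The formula `phi` and the function `brute_force_learn` are hypothetical names chosen for illustration. The learner tries every parameter position, so its running time grows with the length of w, which is exactly the dependence the abstract aims to avoid.

```python
from typing import List, Optional, Tuple


def phi(w: str, x: int, y: int) -> bool:
    """Illustrative first-order formula phi(x; y) (an assumption, not from the paper):
    y <= x, and every position from y to x carries the letter 'a'."""
    return y <= x and all(w[z] == "a" for z in range(y, x + 1))


def brute_force_learn(w: str, examples: List[Tuple[int, bool]]) -> Optional[int]:
    """Return the smallest parameter y such that phi(.; y) agrees with every
    labelled example (position, label), or None if no parameter is consistent.

    The outer loop scans all of w, so this naive search is at least linear
    in the size of the whole data set, not just in the training set."""
    for y in range(len(w)):
        if all(phi(w, x, y) == label for x, label in examples):
            return y
    return None


if __name__ == "__main__":
    w = "bbaaab"
    training_set = [(2, True), (4, True), (5, False)]
    print(brute_force_learn(w, training_set))
```

On the string "bbaaab" with the three examples above, the only consistent parameter is position 2, the start of the block of a's. The positive result in the abstract replaces this kind of full scan with queries to an index built once in linear time.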
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Semigroups and Automata Theory
