Extremal Fitting CQs do not Generalize
Balder ten Cate, Maurice Funk, Jean Christoph Jung, Carsten Lutz

TL;DR
This paper proves that, for conjunctive queries (CQs), a fitting algorithm that always returns a most-specific or a most-general fitting query cannot be a sample-efficient PAC learner: it cannot guarantee PAC-style generalization from a polynomial-size sample.
Contribution
It establishes an impossibility result: requiring a fitting algorithm to output a most-specific or a most-general fitting CQ is incompatible with PAC-style generalization from a polynomial sample.
Findings
No fitting algorithm that produces a most-specific fitting CQ is a sample-efficient PAC learning algorithm.
The same holds for fitting algorithms that produce a most-general fitting CQ (when one exists).
The proofs use polynomial constructions of relativized homomorphism dualities.
Abstract
A fitting algorithm for conjunctive queries (CQs) produces, given a set of positively and negatively labeled data examples, a CQ that fits these examples. In general, there may be many non-equivalent fitting CQs and thus the algorithm has some freedom in producing its output. Additional desirable properties of the produced CQ are that it generalizes well to unseen examples in the sense of PAC learning and that it is most general or most specific in the set of all fitting CQs. In this research note, we show that these desiderata are incompatible when we require PAC-style generalization from a polynomial sample: we prove that any fitting algorithm that produces a most-specific fitting CQ cannot be a sample-efficient PAC learning algorithm, and the same is true for fitting algorithms that produce a most-general fitting CQ (when it exists). Our proofs rely on a polynomial construction of…
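To make the notion of "fitting" concrete, here is a minimal Python sketch, not taken from the paper. It assumes Boolean CQs (no answer variables), represents both a query and a data example as a list of `(relation, arguments)` pairs, and uses the standard fact that a CQ holds in an example exactly when its atoms map homomorphically into the example's facts. The names `homomorphism_exists` and `fits` and the toy data are illustrative assumptions, and the search is brute force, so it is only suitable for tiny inputs.

```python
from itertools import product

def homomorphism_exists(atoms, facts):
    """Brute-force test: is there a mapping of the query variables to
    values of the example that turns every atom into a fact?"""
    variables = sorted({v for _, args in atoms for v in args})
    domain = sorted({c for _, args in facts for c in args})
    fact_set = set(facts)
    # Try every assignment of domain values to the query variables.
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all((rel, tuple(h[v] for v in args)) in fact_set
               for rel, args in atoms):
            return True
    return False

def fits(atoms, positives, negatives):
    """A CQ fits if it holds in every positive example and in no
    negative example."""
    return (all(homomorphism_exists(atoms, e) for e in positives)
            and not any(homomorphism_exists(atoms, e) for e in negatives))

# Hypothetical toy data over a single binary relation E.
positive = [("E", ("a", "b")), ("E", ("b", "a"))]   # a 2-cycle
negative = [("E", ("a", "b"))]                      # a single edge

two_cycle = [("E", ("x", "y")), ("E", ("y", "x"))]  # E(x,y) AND E(y,x)
single_edge = [("E", ("x", "y"))]                   # E(x,y)

print(fits(two_cycle, [positive], [negative]))
print(fits(single_edge, [positive], [negative]))
```

In this toy setting the two-atom query fits, while the single-edge query does not, because it also holds in the negative example. A fitting algorithm is free to return any query for which `fits` is true; the paper's result concerns algorithms that always pick an extremal one among them.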
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Data Management and Algorithms
