Cost-Effective Conceptual Design Using Taxonomies
Ali Vakilian, Yodsawalai Chodpathumwan, Arash Termehchy, Amir Nayyeri

TL;DR
This paper addresses the challenge of selecting cost-effective concepts from taxonomies to annotate in data sets, optimizing query answering effectiveness within limited resources, and provides algorithms with proven complexity bounds.
Contribution
It introduces a formal framework for cost-effective conceptual design using taxonomies, proves NP-hardness, and offers approximation and pseudo-polynomial algorithms with empirical validation.
Findings
Algorithms effectively quantify improvements in query answering.
Pseudo-polynomial algorithm outperforms approximation in practice.
Framework applicable to real-world data and taxonomies.
Abstract
It is known that annotating named entities in unstructured and semi-structured data sets by their concepts improves the effectiveness of answering queries over these data sets. As every enterprise has a limited budget of time or computational resources, it has to annotate a subset of concepts in a given domain whose costs of annotation do not exceed the budget. We call such a subset of concepts a conceptual design for the annotated data set. We focus on finding a conceptual design that provides the most effective answers to queries over the annotated data set, i.e., a cost-effective conceptual design. Since it is often less time-consuming and costly to annotate general concepts than specific concepts, we use information on superclass/subclass relationships between concepts in taxonomies to find a cost-effective conceptual design. We quantify the amount by which a conceptual…
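The budget constraint described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a flat set of concepts, each with an independent integer annotation cost and a standalone benefit score, and it ignores the superclass/subclass interactions that the paper exploits. Under those simplifying assumptions the selection reduces to 0/1 knapsack, which has a standard pseudo-polynomial dynamic program. The function name `pick_concepts` and the example numbers are hypothetical.

```python
from typing import List, Tuple


def pick_concepts(costs: List[int], benefits: List[float],
                  budget: int) -> Tuple[float, List[int]]:
    """Choose concepts maximizing total benefit with total cost <= budget.

    Simplified illustration (assumption): benefits are independent and
    additive, so this is plain 0/1 knapsack, not the taxonomy-aware
    problem studied in the paper.
    """
    n = len(costs)
    # table[i][b]: best benefit using only the first i concepts within budget b
    table = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, gain = costs[i - 1], benefits[i - 1]
        for b in range(budget + 1):
            # Option 1: skip concept i-1
            table[i][b] = table[i - 1][b]
            # Option 2: annotate concept i-1 if it fits and improves the total
            if cost <= b and table[i - 1][b - cost] + gain > table[i][b]:
                table[i][b] = table[i - 1][b - cost] + gain

    # Walk back through the table to recover which concepts were chosen
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if table[i][b] != table[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return table[n][budget], chosen


# Hypothetical example: three concepts, annotation budget of 6 units
best, design = pick_concepts(costs=[3, 4, 2], benefits=[4.0, 5.0, 3.0], budget=6)
print(best, design)
```

The running time is O(n · budget), which is polynomial in the numeric value of the budget rather than in its bit length; that is what "pseudo-polynomial" means in this setting.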
Taxonomy
Topics: Data Management and Algorithms · Semantic Web and Ontologies · Data Quality and Management
