Scientific Table Search Using Keyword Queries
Kyle Yingkai Gao, Jamie Callan

TL;DR
This paper introduces a novel approach for scientific table search that leverages table structure, semantic understanding, and external knowledge to improve ranking accuracy, supported by a new dataset and experimental validation.
Contribution
The paper proposes a probabilistic framework that combines structural and semantic information for scientific table search, and releases the TableArXiv dataset for benchmarking.
Findings
Significantly improved ranking accuracy over baselines
Effective use of external knowledge for query expansion
High-quality dataset for scientific table search evaluation
Abstract
Tables are common and important in scientific documents, yet most text-based document search systems do not capture structures and semantics specific to tables. How to bridge different types of mismatch between keyword queries and scientific tables, and what influences ranking quality, need to be carefully investigated. This paper considers the structure of tables and gives different emphasis to table components. On the query side, thanks to external knowledge such as knowledge bases and ontologies, key concepts are extracted and used to build structured queries, and target quantity types are identified and used to expand original queries. A probabilistic framework is proposed to incorporate structural and semantic information from both query and table sides. We also construct and release TableArXiv, a high-quality dataset with 105 queries and corresponding relevance judgements for…
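The idea of giving different emphasis to table components can be illustrated with a field-weighted query-likelihood score. The sketch below is a generic mixture-of-language-models ranker, not the paper's actual framework. The field names (`caption`, `header`, `body`), the weights, the smoothing parameter `lam`, and the toy tables are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def field_weighted_score(query_terms, table_fields, field_weights,
                         collection_counts, collection_len, lam=0.5):
    """Log query likelihood under a weighted mixture of per-field language models.

    Hypothetical helper: field_weights should sum to 1, and lam is the
    Jelinek-Mercer weight given to the background (collection) model.
    """
    vocab_size = len(collection_counts)
    score = 0.0
    for term in query_terms:
        # Add-one smoothed background probability, so unseen terms stay nonzero.
        p_bg = (collection_counts.get(term, 0) + 1) / (collection_len + vocab_size + 1)
        p_term = 0.0
        for field, weight in field_weights.items():
            tokens = table_fields.get(field, [])
            p_ml = tokens.count(term) / len(tokens) if tokens else 0.0
            # Interpolate the field model with the background model.
            p_term += weight * ((1 - lam) * p_ml + lam * p_bg)
        score += math.log(p_term)
    return score


# Toy tables (assumed data, not drawn from TableArXiv).
relevant_table = {
    "caption": ["imagenet", "classification", "accuracy"],
    "header": ["model", "accuracy"],
    "body": ["resnet", "76.1"],
}
other_table = {
    "caption": ["corpus", "statistics"],
    "header": ["corpus", "tokens"],
    "body": ["wiki", "2.1b"],
}

# Collection statistics over every token in every field of every table.
collection_counts = Counter(
    token
    for table in (relevant_table, other_table)
    for tokens in table.values()
    for token in tokens
)
collection_len = sum(collection_counts.values())

# Assumed emphasis: captions matter most, body cells least.
weights = {"caption": 0.5, "header": 0.3, "body": 0.2}
query = ["imagenet", "accuracy"]

score_relevant = field_weighted_score(query, relevant_table, weights,
                                      collection_counts, collection_len)
score_other = field_weighted_score(query, other_table, weights,
                                   collection_counts, collection_len)
```

With these assumed weights, a query term found in a caption contributes more to the score than the same term found only in body cells. Query expansion with quantity types could be approximated in this sketch by appending extra terms to `query`, though the abstract does not specify how the paper does it.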
Taxonomy
Topics: Advanced Text Analysis Techniques · Data Quality and Management · Semantic Web and Ontologies
