TL;DR
Bio-SODA is a question answering system over scientific knowledge graphs that requires no training data. It uses a graph-based approach to translate questions into candidate SPARQL queries, ranks those candidates using node centrality, and outperforms existing KGQA systems by at least 20% F1-score.
Contribution
The paper introduces Bio-SODA, a training-free natural language processing engine for scientific knowledge graphs. It translates questions into SPARQL using a graph-based method and uses node centrality to rank the resulting queries.
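To illustrate what ranking candidate queries by node centrality can look like, here is a minimal Python sketch. It is not Bio-SODA's actual algorithm. The toy schema graph, the `ex:` prefix, the candidate queries, the use of PageRank as the centrality measure, and the mean-score aggregation are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def rank_candidates(candidates, centrality):
    """Sort candidates by the mean centrality of the nodes they touch."""
    def score(candidate):
        nodes = candidate["nodes"]
        return sum(centrality.get(n, 0.0) for n in nodes) / len(nodes)
    return sorted(candidates, key=score, reverse=True)


# Hypothetical toy schema graph: class names and edges are made up.
schema_edges = [
    ("Disease", "Gene"),
    ("Tissue", "Gene"),
    ("Gene", "Protein"),
    ("Protein", "Gene"),
]

# Hypothetical candidate interpretations of one user question.
candidates = [
    {"sparql": "SELECT ?x WHERE { ?x a ex:Tissue . ?x ex:relatedTo ?d . ?d a ex:Disease }",
     "nodes": ["Tissue", "Disease"]},
    {"sparql": "SELECT ?x WHERE { ?x a ex:Gene . ?x ex:encodes ?p . ?p a ex:Protein }",
     "nodes": ["Gene", "Protein"]},
]

centrality = pagerank(schema_edges)
ranked = rank_candidates(candidates, centrality)
for candidate in ranked:
    print(candidate["sparql"])
```

In this toy graph, `Gene` and `Protein` receive incoming edges while `Tissue` and `Disease` receive none, so the candidate built from the more central nodes is listed first. A real system would also need to weigh how well each candidate matches the words in the question, which this sketch leaves out.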
Findings
Bio-SODA outperforms existing KGQA systems by at least 20% F1-score.
It effectively handles complex scientific datasets without training data.
The evaluation includes the bioinformatics QALD (Question Answering over Linked Data) challenge, on which Bio-SODA is reported to succeed.
Abstract
The problem of natural language processing over structured data has become a growing research field, both within the relational database and the Semantic Web community, with significant efforts involved in question answering over knowledge graphs (KGQA). However, many of these approaches are either specifically targeted at open-domain question answering using DBpedia, or require large training datasets to translate a natural language question to SPARQL in order to query the knowledge graph. Hence, these approaches often cannot be applied directly to complex scientific datasets where no prior training data is available. In this paper, we focus on the challenges of natural language processing over knowledge graphs of scientific datasets. In particular, we introduce Bio-SODA, a natural language processing engine that does not require training data in the form of question-answer pairs for…
