90% F1 Score in Relational Triple Extraction: Is it Real?
Pratik Saini, Samiran Pal, Tapas Nayak, Indrajit Bhattacharya

TL;DR
This paper critically evaluates state-of-the-art relational triple extraction models under realistic conditions, revealing significant performance drops and proposing a BERT-based classifier to improve results in more practical scenarios.
Contribution
It introduces a comprehensive benchmark including zero-triple sentences and proposes a simple BERT-based classifier to enhance model performance in realistic settings.
Findings
F1 scores decline by 6-15% in realistic settings
Including zero-triple sentences affects model evaluation
The BERT-based classifier improves extraction performance
Abstract
Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores (≥ 90%) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentences with zero triples (zero-cardinality), thereby simplifying the task. In this paper, we present a benchmark study of state-of-the-art joint entity and relation extraction models under a more realistic setting. We include sentences that lack any triples in our experiments, providing a comprehensive evaluation. Our findings reveal a significant decline (approximately 10-15% in one dataset and 6-14% in another dataset) in the models' F1 scores within this realistic experimental setup.…
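To make the effect of zero-cardinality sentences concrete, here is a minimal sketch of how triple-level F1 can change once such sentences are added to the test set. It assumes exact-match micro-averaged F1 over (subject, relation, object) triples, which is a common choice but may not be the exact metric used in the paper. The function name `micro_f1` and all toy sentences and triples are hypothetical illustrations, not data or code from the paper.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def micro_f1(gold: List[Set[Triple]], pred: List[Set[Triple]]) -> float:
    """Exact-match micro F1 over per-sentence triple sets (assumed metric)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correctly predicted triples
    n_pred = sum(len(p) for p in pred)                # all predicted triples
    n_gold = sum(len(g) for g in gold)                # all gold triples
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy sentences that each contain one gold triple (hypothetical data).
gold_with = [{("Paris", "capital_of", "France")},
             {("Bonn", "located_in", "Germany")}]
pred_with = [{("Paris", "capital_of", "France")},
             {("Bonn", "located_in", "Germany")}]

# Toy zero-triple sentences: gold is empty, but the model emits one spurious triple.
gold_zero = [set(), set()]
pred_zero = [{("Alice", "born_in", "Tuesday")}, set()]

# Restricted setting: zero-triple sentences are left out of the evaluation.
f1_restricted = micro_f1(gold_with, pred_with)

# Realistic setting: zero-triple sentences are included.
f1_realistic = micro_f1(gold_with + gold_zero, pred_with + pred_zero)

print(f"restricted F1: {f1_restricted:.2f}")
print(f"realistic  F1: {f1_realistic:.2f}")
```

In this toy case recall is unchanged, because zero-triple sentences add no gold triples, while every triple predicted on them counts as a false positive and lowers precision. That is one plausible mechanism behind the kind of decline the abstract reports.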
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
