Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models

Tong Zeng; Daniel E. Acuna

arXiv:2405.12206 · cs.CL · May 21, 2024 · Scientometrics

TL;DR

This paper introduces an attention-based deep learning model that identifies sentences requiring citations. Trained on a large new dataset, it demonstrates state-of-the-art performance and interpretability in citation worthiness detection.

Contribution

The authors develop an attention-based BiLSTM model trained on a large new dataset, achieving state-of-the-art performance and interpretable predictions for citation worthiness detection.
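As a rough illustration of how attention over BiLSTM outputs can yield both a prediction and per-token weights to inspect, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`attention_pool`, `cite_probability`), the toy hidden states standing in for BiLSTM outputs, and the scoring form (a context vector dotted with the tanh of each hidden state) are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Pool per-token vectors into one sentence vector.

    Assumed scoring form: score_t = context . tanh(h_t).
    Returns the pooled vector and the attention weights.
    """
    scores = [
        sum(c * math.tanh(x) for c, x in zip(context, h))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cite_probability(hidden_states, context, out_weights, bias=0.0):
    """Hypothetical output layer: sigmoid of a linear map of the pooled vector."""
    pooled, weights = attention_pool(hidden_states, context)
    logit = sum(w * p for w, p in zip(out_weights, pooled)) + bias
    return sigmoid(logit), weights


# Toy stand-ins for BiLSTM outputs of a three-token sentence (made-up values).
hidden_states = [[0.1, -0.2], [0.9, 0.4], [0.0, 0.3]]
context = [1.0, 0.5]        # hypothetical learned context vector
out_weights = [0.7, -0.3]   # hypothetical learned output weights

prob, weights = cite_probability(hidden_states, context, out_weights)
print("P(needs citation):", prob)
print("attention weights:", weights)
```

The attention weights sum to one, so they can be read as a rough indication of which tokens contributed most to the pooled representation. In a trained model the hidden states, context vector, and output weights would be learned rather than fixed by hand.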

Findings

01

State-of-the-art $F_1$ score of 0.507 on ACL-ARC dataset

02

High performance $F_1$ score of 0.856 on PMOA-CITE dataset

03

Sections and surrounding sentences are crucial for predictions
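For readers unfamiliar with the metric reported above, $F_1$ is the harmonic mean of precision and recall on the positive class (here, sentences that need a citation). The sketch below computes it from binary labels. The helper names and the example labels are hypothetical and are not taken from the paper's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives for label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2 * P * R / (P + R), defined as 0.0 when undefined."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = sentence needs a citation, 0 = it does not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

f1 = f1_score(y_true, y_pred)
print("F1:", f1)
```

Because $F_1$ ignores true negatives, it suits tasks where the positive class is the one of interest.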

Abstract

Scientists learn early on how to cite scientific sources to support their claims. Sometimes, however, scientists have challenges determining where a citation should be situated -- or, even worse, fail to cite a source altogether. Automatically detecting sentences that need a citation (i.e., citation worthiness) could solve both of these issues, leading to more robust and well-constructed scientific arguments. Previous researchers have applied machine learning to this task but have used small datasets and models that do not take advantage of recent algorithmic developments such as attention mechanisms in deep learning. We hypothesize that we can develop significantly accurate deep learning architectures that learn from large supervised datasets constructed from open access publications. In this work, we propose a Bidirectional Long Short-Term Memory (BiLSTM) network with attention…

Code & Models

Repositories

sciosci/cite-worthiness
PyTorch · Official

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM