Team LA at SCIDOCA shared task 2025: Citation Discovery via relation-based zero-shot retrieval
Trieu An, Long Nguyen, Minh Le Nguyen

TL;DR
This paper presents a relation-based zero-shot retrieval system that combines similarity-based candidate selection with LLMs to improve citation prediction accuracy in complex, high-similarity abstract scenarios.
Contribution
We introduce a novel two-stage framework that first retrieves candidate abstracts using relational features, then applies an LLM for precise citation identification.
Findings
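Retrieving the top-k most similar abstracts from relational features narrows the candidate pool before the LLM step
Applying an LLM to the retrieved subset is reported to improve citation prediction accuracy
The framework is evaluated on the training dataset provided by the SCIDOCA 2025 organizers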
Abstract
The Citation Discovery Shared Task focuses on predicting the correct citation from a given candidate pool for a given paragraph. The main challenges stem from the length of the abstract paragraphs and the high similarity among candidate abstracts, making it difficult to determine the exact paper to cite. To address this, we develop a system that first retrieves the top-k most similar abstracts based on extracted relational features from the given paragraph. From this subset, we leverage a Large Language Model (LLM) to accurately identify the most relevant citation. We evaluate our framework on the training dataset provided by the SCIDOCA 2025 organizers, demonstrating its effectiveness in citation prediction.
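The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how relational features are extracted or which LLM is used, so a bag-of-words cosine similarity stands in for the retrieval features, and the LLM step is represented by a hypothetical `pick` callable. The names `tokenize`, `cosine`, `retrieve_top_k`, and `predict_citation` are all assumptions introduced for this example.

```python
import math
import re
from collections import Counter
from typing import Callable, Sequence


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a stand-in for the paper's relational features."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(paragraph: str, candidates: Sequence[str], k: int) -> list[int]:
    """Stage 1: indices of the k candidate abstracts most similar to the paragraph."""
    query = tokenize(paragraph)
    scores = [cosine(query, tokenize(c)) for c in candidates]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def predict_citation(
    paragraph: str,
    candidates: Sequence[str],
    k: int,
    pick: Callable[[str, list[str]], int],
) -> int:
    """Stage 2: let `pick` (hypothetical LLM wrapper) choose from the shortlist.

    `pick` receives the paragraph and the shortlisted abstracts and returns a
    position within that shortlist; the result is mapped back to an index
    into the full candidate pool.
    """
    shortlist = retrieve_top_k(paragraph, candidates, k)
    choice = pick(paragraph, [candidates[i] for i in shortlist])
    return shortlist[choice]


# Toy data, invented for illustration only.
candidates = [
    "Graph neural networks for molecule property prediction.",
    "Dense passage retrieval for open-domain question answering.",
    "A survey of image segmentation methods.",
]
paragraph = "We build on dense retrieval methods for question answering."


def first_of_shortlist(para: str, shortlist: list[str]) -> int:
    """Placeholder for an LLM call: always takes the top-ranked candidate."""
    return 0


print(predict_citation(paragraph, candidates, k=2, pick=first_of_shortlist))  # prints 1
```

Keeping the selection step behind a callable means the retrieval stage can be tested on its own, and any LLM client could be substituted for `first_of_shortlist` without changing the retrieval code.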
