Team LA at SCIDOCA shared task 2025: Citation Discovery via relation-based zero-shot retrieval

Trieu An; Long Nguyen; Minh Le Nguyen

arXiv:2506.18316·cs.IR·June 24, 2025

TL;DR

This paper presents a relation-based zero-shot retrieval system for citation prediction. It shortlists candidate abstracts by similarity, then uses an LLM to choose among them, targeting cases where the candidate abstracts closely resemble one another.

Contribution

We introduce a two-stage framework that first retrieves the top-k candidate abstracts using relational features extracted from the input paragraph, then applies an LLM to identify the citation from that shortlist.

Findings

01 Effective retrieval of top-k similar abstracts

02 Improved citation prediction accuracy

03 Demonstrated effectiveness on the SCIDOCA 2025 training dataset

Abstract

The Citation Discovery Shared Task focuses on predicting the correct citation from a given candidate pool for a given paragraph. The main challenges stem from the length of the abstract paragraphs and the high similarity among candidate abstracts, making it difficult to determine the exact paper to cite. To address this, we develop a system that first retrieves the top-k most similar abstracts based on extracted relational features from the given paragraph. From this subset, we leverage a Large Language Model (LLM) to accurately identify the most relevant citation. We evaluate our framework on the training dataset provided by the SCIDOCA 2025 organizers, demonstrating its effectiveness in citation prediction.
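The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how relational features are extracted or which LLM is used, so the sketch substitutes a plain bag-of-words cosine similarity for the retrieval stage and a caller-supplied function, `choose`, as a placeholder for the LLM. All function names here are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Sequence


def tokenize(text: str) -> Counter:
    # Assumption: lowercase bag-of-words stands in for the paper's
    # relational features, which the abstract does not detail.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(paragraph: str, abstracts: Sequence[str], k: int) -> List[int]:
    # Stage 1: indices of the k candidate abstracts most similar to the paragraph.
    query = tokenize(paragraph)
    scores = [cosine(query, tokenize(a)) for a in abstracts]
    ranked = sorted(range(len(abstracts)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def predict_citation(
    paragraph: str,
    abstracts: Sequence[str],
    k: int,
    choose: Callable[[str, List[str]], int],
) -> int:
    # Stage 2: pass the shortlist to `choose` (hypothetical placeholder for an
    # LLM call) and map its pick back to an index in the full candidate pool.
    shortlist = retrieve_top_k(paragraph, abstracts, k)
    picked = choose(paragraph, [abstracts[i] for i in shortlist])
    return shortlist[picked]


# Toy data, invented for illustration only.
abstracts = [
    "Graph neural networks for molecular property prediction.",
    "Dense passage retrieval for open-domain question answering.",
    "Zero-shot citation retrieval with relation extraction from scientific text.",
]
paragraph = "We build on relation extraction for zero-shot retrieval of scientific citations."

shortlist = retrieve_top_k(paragraph, abstracts, k=2)
# Trivial chooser standing in for the LLM: always picks the top-ranked candidate.
best = predict_citation(paragraph, abstracts, k=2, choose=lambda p, cands: 0)
```

Keeping the second stage behind a plain callable means the retrieval step can be checked on its own, and any LLM client can be plugged in later without changing the shortlist logic.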
