Advancing Academic Knowledge Retrieval via LLM-enhanced Representation Similarity Fusion

Wei Dai; Peng Fu; Chunjing Gan

arXiv:2410.10455·cs.IR·October 15, 2024

Open Access

TL;DR

This paper presents LLM-KnowSimFuser, a retrieval model that combines fine-tuned, LLM-enhanced retrievers with weighted similarity fusion to improve academic knowledge retrieval. It placed 2nd in the KDD Cup 2024 AQA Challenge.

Contribution

It introduces an LLM-enhanced retrieval approach with a weighted similarity fusion mechanism and shows that it outperforms baseline models in the KDD Cup 2024 AQA Challenge.

Findings

01. Achieved a score of 0.20726 on the final leaderboard.

02. Outperformed baseline models in the KDD Cup 2024 AQA Challenge.

03. Validated the effectiveness of LLM-based similarity fusion in academic retrieval.

Abstract

In an era of rapid technological growth and fast information turnover, providing researchers and the public with high-quality, cutting-edge academic insights across domains has become an urgent need. The KDD Cup 2024 AQA Challenge aims to advance retrieval models that identify pertinent academic terminology from suitable papers for scientific questions. This paper introduces LLM-KnowSimFuser, proposed by Robo Space, which won 2nd place in the competition. Inspired by the strong performance of LLMs on multiple tasks, and after careful analysis of the provided datasets, we first perform fine-tuning and inference using LLM-enhanced pre-trained retrieval models to bring the language understanding and open-domain knowledge of LLMs into this task, followed by a weighted fusion based on the similarity matrix derived from…
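The abstract is cut off before it specifies how the fusion is computed, so the following is only a minimal sketch of one common reading: each retrieval model produces a query-by-paper similarity matrix, and the matrices are combined as a normalized weighted sum before ranking. The function names `fuse_similarities` and `top_k`, the weights, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # rows: queries, columns: candidate papers


def fuse_similarities(matrices: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Combine per-model similarity matrices with a normalized weighted sum.

    Assumption: a plain linear fusion; the paper's actual scheme may differ.
    """
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need exactly one weight per similarity matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]

    n_queries, n_papers = len(matrices[0]), len(matrices[0][0])
    fused = [[0.0] * n_papers for _ in range(n_queries)]
    for matrix, weight in zip(matrices, normalized):
        for i in range(n_queries):
            for j in range(n_papers):
                fused[i][j] += weight * matrix[i][j]
    return fused


def top_k(fused: Matrix, k: int) -> List[List[int]]:
    """Return, for each query, the indices of the k highest-scoring papers."""
    return [
        sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        for row in fused
    ]


# Toy scores for one query and three papers from two hypothetical retrievers.
sim_model_a = [[0.9, 0.2, 0.4]]
sim_model_b = [[0.1, 0.8, 0.7]]
fused = fuse_similarities([sim_model_a, sim_model_b], weights=[0.6, 0.4])
ranking = top_k(fused, k=2)  # [[0, 2]]
```

With these toy weights, paper 0 ranks first because the more heavily weighted model scores it highly, while paper 2 overtakes paper 1 because both models give it a moderate score. In practice the weights would be tuned on held-out data.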

Taxonomy

Topics: Educational Technology and Assessment · Natural Language Processing Techniques · Topic Modeling