Advancing Academic Knowledge Retrieval via LLM-enhanced Representation Similarity Fusion
Wei Dai, Peng Fu, Chunjing Gan

TL;DR
This paper presents LLM-KnowSimFuser, a retrieval model that combines fine-tuned LLM-enhanced retrievers with weighted similarity fusion to improve academic knowledge retrieval, placing 2nd in the KDD Cup 2024 AQA Challenge.
Contribution
It introduces an LLM-enhanced retrieval approach with a weighted similarity fusion mechanism and reports that it outperforms baseline models in the KDD Cup 2024 AQA Challenge.
Findings
Achieved a score of 0.20726 on the final leaderboard.
Outperformed baseline models in the KDD Cup 2024 AQA Challenge.
Validated the effectiveness of LLM-based similarity fusion in academic retrieval.
Abstract
In an era of rapid technological growth and fast-changing information, providing researchers and the public with high-quality, cutting-edge academic insights across domains has become an urgent need. The KDD Cup 2024 AQA Challenge aims to advance retrieval models that identify pertinent academic terminology from suitable papers for scientific questions. This paper introduces LLM-KnowSimFuser, proposed by Robo Space, which won 2nd place in the competition. Drawing inspiration from the strong performance of LLMs on many tasks, and after careful analysis of the provided datasets, we first perform fine-tuning and inference using LLM-enhanced pre-trained retrieval models to bring the language understanding and open-domain knowledge of LLMs into this task, followed by a weighted fusion based on the similarity matrix derived from…
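The weighted fusion step named in the abstract can be illustrated with a minimal sketch. The abstract is truncated before it says how the similarity matrices are derived or how the weights are chosen, so everything below is an assumption made for illustration: the function names `fuse_similarities` and `top_k`, the linear weighted sum, the weight normalization, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # rows: queries, columns: candidate papers


def fuse_similarities(matrices: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted element-wise sum of query-by-paper similarity matrices.

    Assumption: every matrix has the same shape, and the weights are
    normalized to sum to 1 before fusion.
    """
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need one weight per similarity matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]

    n_queries = len(matrices[0])
    n_papers = len(matrices[0][0])
    fused = [[0.0] * n_papers for _ in range(n_queries)]
    for weight, matrix in zip(normalized, matrices):
        for q in range(n_queries):
            for p in range(n_papers):
                fused[q][p] += weight * matrix[q][p]
    return fused


def top_k(fused: Matrix, k: int) -> List[List[int]]:
    """Return, for each query, the indices of the k highest-scoring papers."""
    return [
        sorted(range(len(row)), key=lambda p: row[p], reverse=True)[:k]
        for row in fused
    ]


# Toy scores from two hypothetical retrievers for one query and three papers.
sim_a = [[0.9, 0.2, 0.4]]
sim_b = [[0.1, 0.8, 0.5]]
ranking = top_k(fuse_similarities([sim_a, sim_b], [0.7, 0.3]), k=2)
```

In this sketch a larger weight lets one retriever dominate the fused ranking; swapping the weights to `[0.3, 0.7]` changes which papers rank first for the toy query.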
Taxonomy
Topics: Educational Technology and Assessment · Natural Language Processing Techniques · Topic Modeling
