How Reliable are LLMs for Reasoning on the Re-ranking task?

Nafis Tanveer Islam; Zhiming Zhao

arXiv:2508.18444 · cs.CL · August 27, 2025

TL;DR

This paper investigates the reliability of Large Language Models in re-ranking tasks, analyzing how different training methods influence their semantic understanding and explainability, especially with limited data.

Contribution

It provides an in-depth analysis of how training methods impact LLMs' reasoning and explainability in re-ranking, highlighting the gap between semantic understanding and evaluation optimization.

Findings

01. Some training methods yield better explainability.

02. LLMs often gain abstract knowledge rather than true semantic understanding.

03. Explainability varies significantly across training approaches.

Abstract

As the semantic understanding of Large Language Models (LLMs) improves, they exhibit greater awareness of and alignment with human values, but this comes at the cost of transparency. Although experimental analysis yields promising results, an in-depth understanding of an LLM's internal workings is necessary to comprehend the reasoning behind its re-ranking, so that end users receive an explanation that enables them to make an informed decision. Moreover, in newly developed systems with limited user engagement and insufficient ranking data, accurately re-ranking content remains a significant challenge. While various training methods affect how LLMs are trained and how they generate inferences, our analysis finds that some training methods exhibit better explainability than others, implying that an accurate semantic understanding has not been learned through all…
