Identifying Key Terms in Prompts for Relevance Evaluation with GPT Models

Jaekeol Choi

arXiv:2405.06931·cs.IR·May 14, 2024

Open Access

TL;DR

This paper investigates how specific terms in prompts influence the effectiveness of GPT models in relevance evaluation tasks, emphasizing the importance of prompt design and term selection for improved performance.

Contribution

It identifies key terms that enhance relevance evaluation with GPT models and analyzes the impact of prompt wording and examples on evaluation accuracy.

Findings

01. Using the term 'answer' in a prompt improves relevance assessment over using 'relevant'.

02. Balancing the scope of relevance is crucial for accuracy.

03. Few-shot examples refine the relevance criteria.

Abstract

Relevance evaluation of a query and a passage is essential in Information Retrieval (IR). Recently, numerous studies have been conducted on tasks related to relevance judgment using Large Language Models (LLMs) such as GPT-4, demonstrating significant improvements. However, the efficacy of LLMs is considerably influenced by the design of the prompt. The purpose of this paper is to identify which specific terms in prompts positively or negatively impact relevance evaluation with LLMs. We employed two types of prompts: those used in previous research and those generated automatically by LLMs. By comparing the performance of these prompts in both few-shot and zero-shot settings, we analyze the influence of specific terms in the prompts. We have observed two main findings from our study. First, we discovered that prompts using the term 'answer' lead to more effective relevance evaluations than those…
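To make the comparison described in the abstract concrete, the following is a minimal sketch of how a relevance prompt might be built with a swappable key term ('answer' or 'relevant') in either a zero-shot or a few-shot setting. The template wording, the `Example` class, the `TEMPLATES` dictionary, the `build_prompt` function, and the `Judgment:` output slot are all hypothetical names chosen for illustration; they are not the prompts or code used in the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Example:
    """A labeled demonstration for few-shot prompting (hypothetical structure)."""
    query: str
    passage: str
    label: str  # "Yes" or "No"


# Hypothetical instruction wordings that differ only in the key term.
# These are illustrative and are NOT the prompts evaluated in the paper.
TEMPLATES = {
    "answer": "Does the passage answer the query?",
    "relevant": "Is the passage relevant to the query?",
}


def build_prompt(term: str, query: str, passage: str,
                 shots: Sequence[Example] = ()) -> str:
    """Build a relevance-judgment prompt.

    With no `shots` the prompt is zero-shot; each Example passed in
    `shots` is prepended as a labeled demonstration (few-shot).
    """
    blocks = [f"{TEMPLATES[term]} Respond with 'Yes' or 'No'."]
    for ex in shots:
        blocks.append(
            f"Query: {ex.query}\nPassage: {ex.passage}\nJudgment: {ex.label}"
        )
    # The final block leaves the judgment open for the model to complete.
    blocks.append(f"Query: {query}\nPassage: {passage}\nJudgment:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = Example("capital of France", "Paris is the capital of France.", "Yes")
    print(build_prompt("answer", "tallest mountain",
                       "Mount Everest is 8,849 m high."))
    print("---")
    print(build_prompt("relevant", "tallest mountain",
                       "Mount Everest is 8,849 m high.", shots=[demo]))
```

Holding everything fixed except the key term and the presence of demonstrations is one way to isolate the effect of a single word, which is the kind of comparison the abstract describes. Sending the prompt to a model and scoring the judgments against human labels is left out here, since the excerpt does not specify those details.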

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling

Methods: Linear Layer · Multi-Head Attention · Dense Connections · Position-Wise Feed-Forward Layer · Dropout · Label Smoothing · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam