Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering
Hongyu Yang, Liyang He, Min Hou, Shuanghong Shen, Rui Li, Jiahui Hou, Jianhui Ma, Junda Zhao

TL;DR
This paper introduces ALMupQA, a novel framework that enhances programming question answering by aligning LLM responses with diverse user preferences using multi-perspective ranking and retrieval-augmented learning.
Contribution
The paper proposes a new multi-perspective preference ranking method and retrieval-augmented in-context learning to improve LLM responses in Code Community Question Answering (CCQA) tasks, where user preferences across candidate answers are diverse.
Findings
11% improvement in BLEU score
20% increase in BERTScore
17.5% increase in CodeBERTScore
Abstract
Code Community Question Answering (CCQA) seeks to tackle programming-related issues, thereby boosting productivity in both software engineering and academic research. Recent advancements in Reinforcement Learning from Human Feedback (RLHF) have transformed the fine-tuning process of Large Language Models (LLMs) to produce responses that closely mimic human behavior. Leveraging LLMs with RLHF for practical CCQA applications has thus emerged as a promising area of study. Unlike standard code question-answering tasks, CCQA involves multiple possible answers, with varying user preferences for each response. Additionally, code communities often show a preference for new APIs. These challenges prevent LLMs from generating responses that cater to the diverse preferences of users in CCQA tasks. To address these issues, we propose a novel framework called Aligning LLMs through Multi-perspective…
Taxonomy
Topics: Online Learning and Analytics · Service-Oriented Architecture and Web Services · Intelligent Tutoring Systems and Adaptive Learning
Methods: Balanced Selection
