Reinforcement Learning from Human Feedback: Whose Culture, Whose Values, Whose Perspectives?
Kristian González Barman, Simon Lohse, and Henk de Regt

TL;DR
This paper advocates for incorporating diverse human perspectives in Reinforcement Learning from Human Feedback to enhance ethical and epistemic responsiveness in Large Language Models, proposing practical steps for improvement.
Contribution
It introduces a pluralist approach to RLHF, emphasizing the importance of diverse cultural and ethical perspectives in shaping LLM development.
Findings
Highlights the epistemic and ethical benefits of pluralism in RLHF
Proposes actionable steps for more inclusive LLM training
Connects social epistemology with AI feedback mechanisms
Abstract
We argue for the epistemic and ethical advantages of pluralism in Reinforcement Learning from Human Feedback (RLHF) in the context of Large Language Models (LLMs). Drawing on social epistemology and pluralist philosophy of science, we suggest ways in which RLHF can be made more responsive to human needs and how we can address challenges along the way. The paper concludes with an agenda for change, i.e. concrete, actionable steps to improve LLM development.
