Integrating gender inclusivity into large language models via instruction tuning
Alina Wr\'oblewska, Bartosz \.Zuk

TL;DR
This paper proposes a method to reduce gender bias in large language models for Polish by instruction tuning with a gender-inclusive dataset and guidelines, aiming to promote fairer language generation.
Contribution
It introduces a systematic instruction tuning approach using the IPIS dataset and explicit guidelines to embed gender inclusivity into multilingual and Polish-specific LLMs.
Findings
Reduced gender bias in model outputs
Effective integration of gender-inclusive guidelines
Applicable to multiple LLM architectures
Abstract
Imagine a language with masculine, feminine, and neuter grammatical genders, yet, due to historical and political conventions, masculine forms are predominantly used to refer to men, women and mixed-gender groups. This is the reality of contemporary Polish. A social consequence of this unfair linguistic system is that large language models (LLMs) trained on Polish texts inherit and reinforce this masculine bias, generating gender-imbalanced outputs. This study addresses this issue by tuning LLMs using the IPIS dataset, a collection of human-crafted gender-inclusive proofreading in Polish and Polish-to-English translation instructions. Grounded in a theoretical linguistic framework, we design a system prompt with explicit gender-inclusive guidelines for Polish. In our experiments, we IPIS-tune multilingual LLMs (Llama-8B, Mistral-7B and Mistral-Nemo) and Polish-specific LLMs (Bielik and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
