TL;DR
This paper investigates how pruning large language models affects fairness in opinion summarisation, introducing a novel pruning method that improves fairness and maintains performance.
Contribution
The paper presents HGLA, a new pruning technique that preserves fairness in LLM-generated summaries better than existing methods.
Findings
The choice of pruning method affects fairness more than the choice of calibration set.
HGLA improves fairness over existing pruning methods.
Human evaluation confirms HGLA's fairness advantages.
Abstract
Model compression through post-training pruning offers a way to reduce model size and computational requirements without significantly impacting model performance. However, the effect of pruning on the fairness of LLM-generated summaries remains unexplored, particularly for opinion summarisation where biased outputs could influence public views. In this paper, we present a comprehensive empirical analysis of opinion summarisation, examining three state-of-the-art pruning methods and various calibration sets across three open-source LLMs using four fairness metrics. Our systematic analysis reveals that pruning methods have a greater impact on fairness than calibration sets. Building on these insights, we propose High Gradient Low Activation (HGLA) pruning, which identifies and removes parameters that are redundant for input processing but influential in output generation. Our experiments…
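The abstract does not give the HGLA scoring rule, so the following is only a minimal sketch of the general idea of flagging weights with a high gradient and a low input activation. The score used here (gradient magnitude divided by activation norm), the function name `hgla_style_mask`, and all of its parameters are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def hgla_style_mask(
    grad_mag: List[List[float]],
    act_norm: List[float],
    sparsity: float,
    eps: float = 1e-8,
) -> List[List[bool]]:
    """Return a keep-mask (True = keep) for one weight matrix.

    grad_mag[i][j]: magnitude of the loss gradient for weight (i, j),
                    assumed to be collected on a calibration set.
    act_norm[j]:    norm of input feature j over the same calibration set.

    ASSUMPTION: the score grad_mag / (act_norm + eps) is a hypothetical
    stand-in for "high gradient, low activation"; the paper's actual
    criterion may differ.
    """
    rows, cols = len(grad_mag), len(act_norm)
    scored = [
        (grad_mag[i][j] / (act_norm[j] + eps), i, j)
        for i in range(rows)
        for j in range(cols)
    ]
    # Highest score first: large gradient relative to a small activation.
    scored.sort(key=lambda t: t[0], reverse=True)

    n_prune = int(sparsity * len(scored))
    mask = [[True] * cols for _ in range(rows)]
    for _, i, j in scored[:n_prune]:
        mask[i][j] = False  # mark this weight for removal
    return mask


# Toy 2x2 example with made-up numbers.
example_mask = hgla_style_mask(
    grad_mag=[[0.9, 0.1], [0.2, 0.8]],
    act_norm=[0.1, 2.0],
    sparsity=0.5,
)
```

In this toy example, input feature 0 has a small activation norm, so both weights attached to it receive the highest scores and are the ones masked out at 50% sparsity. A real implementation would operate on tensors per layer and would need the exact criterion from the paper.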