TL;DR
This paper investigates how to evaluate whether argument retrieval systems expose different stances fairly, and shows that a highly relevant ranking does not necessarily expose PRO and CON arguments fairly or diversely.
Contribution
It analyzes a range of non-stochastic fairness-aware ranking and diversity metrics for argument retrieval systems, highlighting the gap between relevance and fair stance exposure.
Findings
Systems with high relevance are not always fair in argument stance exposure.
Fairness and diversity metrics can reveal biases in argument retrieval rankings.
Relationships between fairness and diversity metrics inform better evaluation practices.
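As a rough illustration of the first finding, the sketch below compares how much position-discounted exposure each stance receives in a ranking. It is a minimal sketch, not one of the paper's metrics: the logarithmic discount, the function names `stance_exposure` and `exposure_gap`, and the two toy rankings are all assumptions made for illustration.

```python
import math


def stance_exposure(ranking):
    """Return each stance's share of position-discounted exposure.

    `ranking` is a list of stance labels (e.g. "PRO", "CON") in rank order.
    Assumption: exposure at rank r is 1 / log2(r + 1), a common discount
    in ranking evaluation; the paper may use different exposure models.
    """
    totals = {}
    for rank, stance in enumerate(ranking, start=1):
        totals[stance] = totals.get(stance, 0.0) + 1.0 / math.log2(rank + 1)
    total = sum(totals.values())
    return {stance: value / total for stance, value in totals.items()}


def exposure_gap(ranking, a="PRO", b="CON"):
    """Absolute difference between the exposure shares of two stances."""
    shares = stance_exposure(ranking)
    return abs(shares.get(a, 0.0) - shares.get(b, 0.0))


# Hypothetical rankings with the same number of PRO and CON documents.
skewed = ["PRO", "PRO", "PRO", "CON", "CON", "CON"]
alternating = ["PRO", "CON", "PRO", "CON", "PRO", "CON"]

print("skewed gap:     ", round(exposure_gap(skewed), 3))
print("alternating gap:", round(exposure_gap(alternating), 3))
```

Both toy rankings contain three documents per stance, so a relevance measure that ignores stance could score them identically, while the exposure gap is larger for the ranking that places every PRO document first.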
Abstract
Existing commercial search engines often struggle to represent different perspectives of a search query. Argument retrieval systems address this limitation of search engines and provide both positive (PRO) and negative (CON) perspectives about a user's information need on a controversial topic (e.g., climate change). The effectiveness of such argument retrieval systems is typically evaluated based on topical relevance and argument quality, without taking into account the often differing number of documents shown for the argument stances (PRO or CON). Therefore, systems may retrieve relevant passages, but with a biased exposure of arguments. In this work, we analyze a range of non-stochastic fairness-aware ranking and diversity metrics to evaluate the extent to which argument stances are fairly exposed in argument retrieval systems. Using the official runs of the argument retrieval…
