Visualization for interactively adjusting the de-bias effect of word embedding
Arisa Sugino, Takayuki Itoh

TL;DR
This paper introduces an interactive visualization method to adjust gender bias in Japanese word embeddings, balancing bias removal with model performance preservation through user-guided parameter tuning.
Contribution
It proposes a novel visualization-based approach allowing users to control debiasing levels per category, addressing bias and performance trade-offs in word embeddings.
Findings
Debiasing effects vary across categories.
User-adjusted debiasing improves fairness.
Trade-offs between bias reduction and performance are manageable.
Abstract
Word embedding, which converts words into numerical values, is an important and widely used natural language processing technique. A serious problem with word embeddings is that if the dataset used for pre-training contains bias, that bias is learned and affects the model. On the other hand, indiscriminately removing bias from word embeddings can discard information, even when the bias is undesirable, so bias removal carries its own risk of degrading model performance. To address this problem, we focus on gender bias in Japanese and propose an interactive visualization method for adjusting the degree of debiasing for each word category. Specifically, we visualize the accuracy in a category classification task after debiasing, and allow the user to adjust the parameters based on the visualization results, so that the debiasing can be…
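The idea of a per-category debiasing degree can be illustrated with a minimal sketch. The abstract does not specify the debiasing algorithm, so the code below assumes a simple projection-based scheme: a fraction `strength` of each vector's component along a gender direction is removed. The function names, category names, strengths, and random toy vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np

def partial_debias(vectors, gender_direction, strength):
    """Remove a fraction `strength` (0 = none, 1 = all) of each row's
    component along the gender direction. Assumed scheme, not the paper's."""
    g = gender_direction / np.linalg.norm(gender_direction)
    projection = np.outer(vectors @ g, g)  # component of each row along g
    return vectors - strength * projection

def mean_abs_bias(vectors, gender_direction):
    """Mean absolute projection onto the gender direction (a crude bias score)."""
    g = gender_direction / np.linalg.norm(gender_direction)
    return float(np.mean(np.abs(vectors @ g)))

# Toy data: random vectors stand in for real Japanese word embeddings.
rng = np.random.default_rng(0)
dim = 8
gender_direction = rng.normal(size=dim)
embeddings = {
    "occupation": rng.normal(size=(5, dim)),  # hypothetical category
    "kinship": rng.normal(size=(5, dim)),     # hypothetical category
}

# Per-category strengths that a user might choose interactively.
strength_per_category = {"occupation": 1.0, "kinship": 0.2}

debiased = {
    category: partial_debias(vecs, gender_direction, strength_per_category[category])
    for category, vecs in embeddings.items()
}

for category in embeddings:
    before = mean_abs_bias(embeddings[category], gender_direction)
    after = mean_abs_bias(debiased[category], gender_direction)
    print(f"{category}: bias before={before:.3f}, after={after:.3f}")
```

In an interactive tool of the kind described, the strengths would be adjusted by the user while watching how classification accuracy changes for each category; this sketch covers only the adjustment step, not the accuracy visualization.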
Taxonomy
Topics: Online Learning and Analytics
