Anonymization and Information Loss

Ke Wu; Baozhong Yang; Zhenkun Ying; Dexin Zhou

arXiv:2511.15364 · q-fin.GN · November 20, 2025

TL;DR

Anonymizing financial texts obscures firm identity but causes significant information loss, weakening models' ability to extract economic signals and sentiment, with implications for financial NLP applications.

Contribution

The paper documents the information loss caused by anonymizing financial texts and compares its impact with that of look-ahead bias in sentiment extraction from earnings call transcripts.

Findings

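01. Anonymization reduces models' ability to extract economic signals from financial texts.
02. The information loss is particularly severe when numerical and object entities are removed, and it is amplified in texts with high linguistic uncertainty and firm specificity.
03. In sentiment extraction from earnings call transcripts, the information loss from anonymization is more pervasive and severe than the effects of look-ahead bias.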

Abstract

We show that while anonymization effectively obscures firm identity, it significantly reduces the power of textual understanding, thereby diminishing models' ability to extract meaningful economic signals from financial texts. This information loss is particularly severe when numerical and object entities are removed from texts and is amplified in texts characterized by high linguistic uncertainty and firm specificity. Importantly, in the setting of sentiment extraction from earnings call transcripts, we find that information loss induced by anonymization is more pervasive and severe than the effects of look-ahead bias, suggesting that the costs of anonymization may outweigh its benefits in certain financial applications.

Taxonomy

Topics: Auditing, Earnings Management, Governance · Financial Markets and Investment Strategies · Financial Reporting and XBRL