TopoLedgerBERT: Topological Learning of Ledger Description Embeddings using Siamese BERT-Networks
Sander Noels, Sébastien Viaene, Tijl De Bie

TL;DR
TopoLedgerBERT is a novel hierarchical sentence embedding method that leverages Siamese BERT networks and data augmentation to improve ledger account mapping accuracy.
Contribution
It introduces a new embedding approach that incorporates hierarchical chart information and data augmentation for better ledger account alignment.
Findings
Outperforms benchmark methods in accuracy
Achieves higher mean reciprocal rank
Effectively captures hierarchical and semantic information
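One way to picture how hierarchical information could become a training signal is to turn the distance between two accounts in the chart of accounts into a target similarity score for a Siamese pair. The sketch below is an illustration only, not the paper's formula: the toy chart, the `parent` mapping, and the `1 / (1 + distance)` conversion are all assumptions made for this example.

```python
# Illustrative sketch: derive a target similarity for a pair of accounts from
# their distance in the chart-of-accounts tree. The toy chart and the
# 1 / (1 + distance) conversion are assumptions, not the paper's method.

def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root (inclusive)."""
    path = [node]
    while parent.get(node) is not None:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a, b, parent):
    """Number of edges on the path between `a` and `b` in the tree."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b, parent))}
    for steps_from_a, n in enumerate(path_to_root(a, parent)):
        if n in steps_from_b:
            # `n` is the lowest common ancestor of `a` and `b`.
            return steps_from_a + steps_from_b[n]
    raise ValueError("accounts do not share a common ancestor")


def target_similarity(a, b, parent):
    """Map tree distance to a score in (0, 1]; identical accounts score 1.0."""
    return 1.0 / (1.0 + tree_distance(a, b, parent))


# Hypothetical toy chart of accounts: child -> parent (root has parent None).
parent = {
    "root": None,
    "assets": "root",
    "expenses": "root",
    "cash": "assets",
    "receivables": "assets",
    "rent": "expenses",
}

sibling_score = target_similarity("cash", "receivables", parent)
distant_score = target_similarity("cash", "rent", parent)
```

Here sibling accounts under the same parent receive a higher target score than accounts in different branches, which is the kind of signal a Siamese network could be trained to reproduce alongside semantic similarity.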
Abstract
This paper addresses a long-standing problem in the field of accounting: mapping company-specific ledger accounts to a standardized chart of accounts. We propose a novel solution, TopoLedgerBERT, a unique sentence embedding method devised specifically for ledger account mapping. This model integrates hierarchical information from the charts of accounts into the sentence embedding process, aiming to accurately capture both the semantic similarity and the hierarchical structure of the ledger accounts. In addition, we introduce a data augmentation strategy that enriches the training data and, as a result, increases the performance of our proposed model. Compared to benchmark methods, TopoLedgerBERT demonstrates superior performance in terms of accuracy and mean reciprocal rank.
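The two evaluation metrics named in the abstract can be computed from ranked predictions as in the minimal sketch below. It assumes each company ledger account comes with a ranked list of candidate standard accounts and one gold label; the function name `evaluate_rankings` and the account codes are hypothetical, and the paper's exact evaluation protocol may differ.

```python
# Minimal sketch of top-1 accuracy and mean reciprocal rank (MRR).
# Function name and sample account codes are hypothetical.

def evaluate_rankings(ranked_candidates, gold_labels):
    """Return (top-1 accuracy, MRR) over a set of ranked predictions.

    ranked_candidates: list of lists, best candidate first.
    gold_labels: the correct standard account for each query.
    A query whose gold label is missing from its ranking contributes 0 to MRR.
    """
    if len(ranked_candidates) != len(gold_labels) or not gold_labels:
        raise ValueError("inputs must be non-empty and of equal length")

    hits = 0
    reciprocal_rank_sum = 0.0
    for ranking, gold in zip(ranked_candidates, gold_labels):
        if ranking and ranking[0] == gold:
            hits += 1
        if gold in ranking:
            # Ranks are 1-based, so the first position contributes 1/1.
            reciprocal_rank_sum += 1.0 / (ranking.index(gold) + 1)

    n = len(gold_labels)
    return hits / n, reciprocal_rank_sum / n


rankings = [
    ["6100", "6000", "7000"],  # gold at rank 1
    ["7000", "6100", "6000"],  # gold at rank 2
    ["6000", "7000", "6100"],  # gold at rank 3
]
gold = ["6100", "6100", "6100"]

accuracy, mrr = evaluate_rankings(rankings, gold)
```

Accuracy rewards only an exact top-ranked match, while MRR also gives partial credit when the correct account appears further down the ranking.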
Taxonomy
Topics: Text and Document Classification Technologies · Wikis in Education and Collaboration · Natural Language Processing Techniques
