Positional Description for Numerical Normalization
Deepanshu Gupta, Javier Latorre

TL;DR
This paper introduces a Positional Description Scheme (PDS) for digit sequences that improves numerical normalization in language models, enhancing arithmetic accuracy and reducing errors with minimal training data.
Contribution
The paper proposes a novel PDS approach that simplifies number normalization, improves arithmetic capabilities, and addresses text normalization challenges in language models.
Findings
PDS yields a relative accuracy improvement of 23% to 51% on complex arithmetic tasks.
PDS reduces fatal numerical normalization errors.
PDS enables effective text normalization in TTS and speech recognition.
Abstract
We present a Positional Description Scheme (PDS) tailored for digit sequences, integrating placeholder value information for each digit. Given the structural limitations of subword tokenization algorithms, language models encounter critical Text Normalization (TN) challenges when handling numerical tasks. Our schema addresses this challenge through straightforward pre-processing, preserving the model architecture while significantly simplifying number normalization, rendering the problem tractable. This simplifies the task and facilitates more compact production-ready models capable of learning from smaller datasets. Furthermore, our investigations reveal that PDS enhances the arithmetic processing capabilities of language models, resulting in a relative accuracy improvement of 23% to 51% on complex arithmetic tasks. We demonstrate that PDS effectively mitigates fatal numerical…
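To make the pre-processing idea concrete, here is a minimal Python sketch of one way a positional description could be attached to digit sequences. The exact notation used in the paper is not given in this summary, so the `digit@place` format, the 0-based place index counted from the units digit, and the function names `describe_digits` and `preprocess` are all assumptions made for illustration only.

```python
import re


def describe_digits(number: str) -> str:
    """Tag each digit with its place index (0 = units digit).

    Hypothetical format: "305" becomes "3@2 0@1 5@0".
    This is an illustrative encoding, not the paper's actual scheme.
    """
    n = len(number)
    return " ".join(f"{digit}@{n - 1 - i}" for i, digit in enumerate(number))


def preprocess(text: str) -> str:
    """Rewrite every run of ASCII digits in `text` with its positional description."""
    return re.sub(r"[0-9]+", lambda m: describe_digits(m.group()), text)


print(preprocess("pay 305 now"))  # pay 3@2 0@1 5@0 now
```

Because the rewrite happens entirely on the input text, a sketch like this leaves the tokenizer and model architecture untouched, which matches the abstract's description of the approach as straightforward pre-processing.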
Taxonomy
Topics: Inertial Sensor and Navigation · Robotics and Sensor-Based Localization
