Representing Numbers in NLP: a Survey and a Vision
Avijit Thawani, Jay Pujara, Pedro A. Szekely, Filip Ilievski

TL;DR
This paper surveys NLP approaches to number representation: it arranges numeracy tasks and methods into a taxonomy, analyzes previously published number encoders and decoders, and articulates a vision for holistic numeracy in NLP systems.
Contribution
It provides a taxonomy of numeracy tasks and methods, analyzes the representational choices made by existing number encoders and decoders, and outlines a vision for holistic numeracy built on design trade-offs and a unified evaluation.
Findings
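Identified 7 subtasks of numeracy in NLP, arranged along two dimensions: granularity (exact vs approximate) and units (abstract vs grounded)
Analyzed the representational choices made by 18 previously published number encoders and decoders
Articulated a vision for holistic numeracy, including design trade-offs and a unified evaluation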
Abstract
NLP systems rarely give special consideration to numbers found in text. This starkly contrasts with the consensus in neuroscience that, in the brain, numbers are represented differently from words. We arrange recent NLP work on numeracy into a comprehensive taxonomy of tasks and methods. We break down the subjective notion of numeracy into 7 subtasks, arranged along two dimensions: granularity (exact vs approximate) and units (abstract vs grounded). We analyze the myriad representational choices made by 18 previously published number encoders and decoders. We synthesize best practices for representing numbers in text and articulate a vision for holistic numeracy in NLP, comprised of design trade-offs and a unified evaluation.
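To make the exact vs approximate distinction in the abstract concrete, here is a minimal Python sketch of two ways a number could be prepared for a model. It is an illustration only, not a method taken from the paper: the function names `digit_tokens` and `scientific_parts` and the `mantissa_digits` parameter are hypothetical, chosen for this example.

```python
import math


def digit_tokens(numeral: str) -> list[str]:
    """Exact view: keep every character of the numeral as its own token."""
    return list(numeral)


def scientific_parts(value: float, mantissa_digits: int = 2) -> tuple[float, int]:
    """Approximate view: return (mantissa, exponent) so that
    value is roughly mantissa * 10**exponent.

    Hypothetical helper for illustration; precision is deliberately lossy.
    """
    if value == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(value)))
    mantissa = round(value / 10 ** exponent, mantissa_digits)
    # Rounding can push the mantissa up to 10.0 (e.g. 9.999 -> 10.0); renormalize.
    if abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    return mantissa, exponent


if __name__ == "__main__":
    print(digit_tokens("1234"))      # exact: every digit preserved
    print(scientific_parts(1234.0))  # approximate: magnitude plus a few digits
```

The digit-level view preserves the exact value but leaves magnitude implicit, while the mantissa and exponent view exposes magnitude directly at the cost of precision. This is one example of the kind of design trade-off the abstract refers to.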
