Interleaving Text and Number Embeddings to Solve Mathemathics Problems
Marvin Alberts, Gianmarco Gabrieli, Irina Espejo Morales

TL;DR
This paper introduces a more expressive continuous numerical embedding that is interleaved with text embeddings, improving how large language models handle numbers by reducing numerical artifacts and covering a wide range of magnitudes without clipping.
Contribution
It uses an MLP to assign distinct directions in the embedding space to different numbers, and adds a routing layer that differentiates between numerical and text embeddings.
Findings
Achieves R^2=0.9988 on numerical tasks across magnitudes
Reduces numerical artifacts and biases compared to baselines
Operates with only 45M parameters in the encoder-decoder model
Abstract
Integrating text and numbers effectively is a crucial step towards enhancing the capabilities of Large Language Models (LLMs) in assisting with scientific tasks. While most current approaches rely on discrete tokenization of numbers, for instance, conversion to scientific notation or base-10 decomposition, a recent approach proposed a continuous numerical encoding as an inductive bias. In this paper, we build upon this approach by introducing more expressive numerical embeddings. Our method addresses key shortcomings, including the elimination of numerical artefacts and the ability to handle a wide range of magnitudes without clipping. Our work presents two key contributions. First, we employ an MLP to assign distinct directions in the embedding space to different numbers. Our second contribution is the introduction of a routing layer that differentiates between numerical and text embeddings.…
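To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It embeds a mixed sequence of words and numbers: words are looked up in a table, and numbers go through a small MLP. Everything specific here is an assumption for illustration: the names `NumberMLP` and `embed_sequence`, the layer sizes, the sign and log-magnitude input features, and the type check that stands in for routing. The paper's routing layer is learned and may operate differently.

```python
import math
import random

D_MODEL = 8  # hypothetical embedding width


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    """Apply a dense layer to a list of floats."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class NumberMLP:
    """Two-layer MLP mapping one scalar to a d_model-dimensional vector."""

    def __init__(self, d_model, hidden, rng):
        self.hidden_layer = make_linear(2, hidden, rng)
        self.output_layer = make_linear(hidden, d_model, rng)

    def __call__(self, value):
        # Assumed input features: sign and log-magnitude, so large values are not clipped.
        features = [math.copysign(1.0, value), math.log10(abs(value) + 1.0)]
        hidden = [math.tanh(h) for h in linear(features, self.hidden_layer)]
        return linear(hidden, self.output_layer)


def embed_sequence(items, vocab, text_table, number_mlp):
    """Embed a mixed sequence; return the vectors and a per-position is-number flag."""
    vectors, is_number = [], []
    for item in items:
        if isinstance(item, (int, float)):
            # Numeric position: use the continuous number embedding.
            vectors.append(number_mlp(float(item)))
            is_number.append(True)
        else:
            # Text position: ordinary embedding-table lookup.
            vectors.append(text_table[vocab[item]])
            is_number.append(False)
    return vectors, is_number


rng = random.Random(0)
vocab = {"mass": 0, "=": 1, "kg": 2}
text_table = [[rng.gauss(0.0, 1.0) for _ in range(D_MODEL)] for _ in vocab]
number_mlp = NumberMLP(D_MODEL, hidden=16, rng=rng)

vectors, is_number = embed_sequence(["mass", "=", 3.2e5, "kg"], vocab, text_table, number_mlp)
print(is_number)  # [False, False, True, False]
```

Every position ends up as a vector of the same width, so a downstream transformer can consume text and numbers in one interleaved sequence, while the `is_number` flags record which path produced each vector.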
Taxonomy
Topics: Mathematics Education and Teaching Techniques · Cognitive and developmental aspects of mathematical skills
Methods: Balanced Selection
