MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula

Sieun Hyeon; Kyudan Jung; Jaehee Won; Nam-Joon Kim; Hyun Gon Ryu; Hyuk-Jae Lee; Jaeyoung Do

arXiv:2412.15655 · cs.CL · April 14, 2025

TL;DR

MathSpeech is a pipeline that combines small language models with ASR to accurately convert spoken mathematical expressions into LaTeX, improving clarity and correctness over existing methods.

Contribution

We propose MathSpeech, a novel pipeline that uses small language models to improve the accuracy of mathematical speech-to-formula conversion, with results that rival large models such as GPT-4o.

Findings

01. Reduced character error rate (CER) from 0.390 to 0.298

02. Achieved higher ROUGE and BLEU scores than GPT-4o

03. Demonstrated effective LaTeX generation with small models
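
The CER figure above can be made concrete with a small sketch. The code below assumes the common definition of character error rate (Levenshtein edit distance divided by the number of characters in the reference); the exact normalization and any LaTeX preprocessing used by the authors are not stated here, so the function names `levenshtein` and `cer` and the sample strings are illustrative assumptions, not the paper's evaluation code.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from ref
                curr[j - 1] + 1,     # insert a character into ref
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example strings: a reference formula and an erroneous output.
    reference = r"e^{ix} = \cos(x) + i\sin(x)"
    hypothesis = r"e^{ix} = \cos(x) + i\side(x)"
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under this definition, lower is better, and a value of 0 means the generated LaTeX matches the reference character for character.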

Abstract

In various academic and professional settings, such as mathematics lectures or research presentations, it is often necessary to convey mathematical expressions orally. However, reading mathematical expressions aloud without accompanying visuals can significantly hinder comprehension, especially for those who are hearing-impaired or rely on subtitles due to language barriers. For instance, when a presenter reads Euler's Formula, current Automatic Speech Recognition (ASR) models often produce a verbose and error-prone textual description (e.g., "e to the power of i x equals cosine of x plus i side of x"), instead of the concise LaTeX format (i.e., $e^{ix} = \cos(x) + i \sin(x)$), which hampers clear understanding and communication. To address this issue, we introduce MathSpeech, a novel pipeline that integrates ASR models with small Language Models (sLMs) to correct errors…

Code & Models

Repositories

hyeonsieun/mathspeech (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques