Scaling Accessible Mathematics on arXiv: HTML Conversion and MathML 4
Deyan Ginev, Brian Caruso, Bruce Miller, Jeff Sank, Jacob Weiskoff

TL;DR
This paper reports on arXiv's HTML Papers conversion work in 2025 and early 2026: community-driven fixes to HTML fidelity and service health, corpus-scale conversion, initial MathML 4 Intent annotations for accessible speech, and an in-progress Rust port of LaTeXML.
Contribution
It describes improvements to HTML fidelity, initial MathML 4 Intent annotations, and an in-progress Rust port of LaTeXML intended to reduce compute costs and enable faster previews on submission.
Findings
Roughly half of 6,000 user reports resolved
75% of converted HTML currently error-free, with a target of 90%
Initial MathML 4 Intent annotations developed for accessible speech output
Abstract
We report on the ongoing development of arXiv's HTML Papers offering, available on every new TeX/LaTeX submission since its initial release in 2023. The main highlights from 2025 and early 2026 are: (i) community-driven improvements to HTML fidelity and service health, with roughly half of 6,000 user reports resolved; (ii) corpus-scale conversion work aimed at 90% error-free HTML (currently 75%); (iii) initial MathML 4 Intent annotations for accessible speech output; (iv) an in-progress Rust port of LaTeXML, reducing compute costs and enabling faster previews on submission. The arXiv HTML Papers project remains experimental, but is gradually maturing as we better understand the needs of arXiv's readers and the technical opportunities presented by new standards and by advances in programming languages and AI.
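To illustrate what an Intent annotation in item (iii) can look like, here is a minimal sketch in Python that builds a single annotated expression. It assumes the `intent` and `arg` attributes as described in the MathML 4 draft; the helper name `transpose_markup` is hypothetical, and the markup is an illustration rather than actual output of arXiv's converter.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def transpose_markup(symbol: str) -> str:
    """Return MathML for symbol^T, annotated so that speech tools can
    read it as a transpose rather than as "symbol to the power T".

    Hypothetical helper for illustration only.
    """
    math = ET.Element("math", {"xmlns": MATHML_NS})
    # `intent` names the concept; `$M` refers to the child marked arg="M".
    msup = ET.SubElement(math, "msup", {"intent": "transpose($M)"})
    base = ET.SubElement(msup, "mi", {"arg": "M"})
    base.text = symbol
    # The superscript T is presentation only, so it carries no `arg`.
    marker = ET.SubElement(msup, "mi")
    marker.text = "T"
    return ET.tostring(math, encoding="unicode")


if __name__ == "__main__":
    print(transpose_markup("A"))
```

Without the annotation, a screen reader has only the superscript layout to go on. With it, the intended reading is stated explicitly in the markup.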
