Towards Optimizing OCR for Accessibility

Peya Mowar; Tanuja Ganu; Saikat Guha

arXiv:2206.10254 · cs.CV · June 27, 2022 · 1 citation

Open Access

TL;DR

This paper explores how preserving visual cues such as structure and emphasis in OCR output can significantly improve the listening experience for print-disabled users. Current OCR and text-to-speech (TTS) software ignores these cues.

Contribution

It identifies four semantic goals for an enjoyable listening experience, along with the syntactic visual cues that support them, and demonstrates that preserving these cues in aural form enhances comprehension and enjoyment.

Findings

01 Preserving visual cues improves listening experience

02 Even one or two cues significantly help understanding

03 Semantic goals guide OCR optimization for accessibility

Abstract

Visual cues such as structure, emphasis, and icons play an important role in efficient information foraging by sighted individuals and make for a pleasurable reading experience. Blind, low-vision and other print-disabled individuals miss out on these cues since current OCR and text-to-speech software ignore them, resulting in a tedious reading experience. We identify four semantic goals for an enjoyable listening experience, and identify syntactic visual cues that help make progress towards these goals. Empirically, we find that preserving even one or two visual cues in aural form significantly enhances the experience for listening to print content.

Taxonomy

Topics: Digital Accessibility for Disabilities · Tactile and Sensory Interactions · Subtitles and Audiovisual Media