LIP: Lightweight Intelligent Preprocessor for meaningful text-to-speech
Harshvardhan Anand, Nansi Begam, Richa Verma, Sourav Ghosh, Harichandana B.S.S, Sumit Kumar

TL;DR
This paper introduces LIP, a lightweight preprocessor that improves text readability for TTS systems by handling emojis, punctuation, PII, and offensive words, enabling real-time, privacy-aware speech synthesis.
Contribution
The paper presents the first lightweight, real-time preprocessor for TTS that effectively manages emojis, punctuation, and PII, enhancing speech synthesis quality.
Findings
76.5% user preference for LIP-enabled TTS over standard TTS
Memory footprint of only 3.55 MB for the preprocessor
Inference time of 4 ms for 50-character text
Abstract
Existing Text-to-Speech (TTS) systems must read everything from emails, which may contain Personally Identifiable Information (PII), to text messages, which may contain streaks of emojis and punctuation. 92% of the world's online population uses emoji, with more than 10 billion emojis sent every day. Without a preprocessor, messages are read as-is, including punctuation and infographics such as emoticons. The problem worsens with continuous sequences of punctuation or emojis, which are common in real-world communication such as messaging and Social Networking Site (SNS) interactions. In this work, we introduce a lightweight intelligent preprocessor (LIP) that enhances the readability of a message before it is passed downstream to existing TTS systems. We propose multiple sub-modules, including expanding contractions, censoring swear words, and masking PII, as part of our…
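To make the sub-modules named in the abstract concrete, here is a minimal rule-based sketch in Python. It is not the authors' implementation, and the abstract does not say how LIP performs these steps. The lookup tables, regular expressions, replacement phrases, and function names below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lookup tables for illustration; not taken from the paper.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "i'm": "I am", "it's": "it is"}
SWEAR_WORDS = {"darn"}  # placeholder entry standing in for a real word list

# Simplified patterns (assumptions); real PII detection needs far more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")
CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b", re.IGNORECASE
)
SWEAR_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in SWEAR_WORDS) + r")\b", re.IGNORECASE
)
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")
# Covers common emoji blocks only; an assumption, not a full emoji definition.
REPEAT_EMOJI_RE = re.compile(r"([\U0001F300-\U0001FAFF\u2600-\u27BF])\1+")


def mask_pii(text: str) -> str:
    """Replace emails and phone-like digit runs with speakable placeholders."""
    text = EMAIL_RE.sub("email address", text)
    return PHONE_RE.sub("phone number", text)


def expand_contractions(text: str) -> str:
    """Expand the contractions listed in CONTRACTIONS (case-insensitive match)."""
    return CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def censor_swear_words(text: str) -> str:
    """Replace listed swear words with a neutral token."""
    return SWEAR_RE.sub("beep", text)


def collapse_repeats(text: str) -> str:
    """Collapse runs of the same punctuation mark or emoji to a single one."""
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    return REPEAT_EMOJI_RE.sub(r"\1", text)


def preprocess(text: str) -> str:
    """Run all steps before handing the text to a downstream TTS system."""
    for step in (mask_pii, expand_contractions, censor_swear_words, collapse_repeats):
        text = step(text)
    return text


print(preprocess("I can't wait!!!! 😂😂😂 mail me at bob@example.com"))
```

Masking PII runs first in this sketch so that later substitutions cannot alter an email address or phone number before it is detected. The order is a design choice of the sketch, not something stated in the source.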
Taxonomy
Topics: Digital Communication and Language · Natural Language Processing Techniques · Text Readability and Simplification
