On-Device Sentence Similarity for SMS Dataset
Arun D Prabhu, Nikhil Arora, Shubham Vatsal, Gopi Ramena, Sukumar Moharana, Naresh Purre

TL;DR
This paper presents a novel on-device pipeline for measuring sentence similarity in SMS texts, addressing challenges like incomplete structure and grammatical inconsistencies to improve mobile applications.
Contribution
It introduces a unique pipeline utilizing POS-based keyword extraction and statistical similarity measures tailored for SMS data on mobile devices.
Findings
Effective handling of semantic variations in SMS texts
On-device implementation suitable for mobile applications
Scalable approach adaptable to various SMS similarity tasks
Abstract
Determining the sentence similarity between Short Message Service (SMS) texts/sentences plays a significant role in the mobile device industry. Gauging the similarity between SMS data is thus necessary for various applications, such as enhanced searching and navigation, and grouping SMS of a similar type, irrespective of their sender, when a custom label or tag is provided by the user. The problem faced with SMS data is its incomplete structure and grammatical inconsistencies. In this paper, we propose a unique pipeline for evaluating the text similarity between SMS texts. We use a Part of Speech (POS) model for keyword extraction by taking advantage of the partial structure embedded in SMS texts, and similarity comparisons are carried out using statistical methods. The proposed pipeline deals with major semantic variations across SMS data as well as makes it effective for its application…
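The two stages named in the abstract (POS-based keyword extraction, then a statistical similarity comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which POS model or which statistical measure is used. The sketch assumes a tiny hand-written lexicon (`TOY_POS`, hypothetical) in place of a trained POS model, treats unknown words as nouns, keeps only noun-like tags, and uses cosine similarity over keyword counts as one plausible statistical measure.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a trained on-device POS model.
# Words not listed here are assumed to be nouns (an illustrative shortcut).
TOY_POS = {
    "your": "PRON", "is": "VERB", "use": "VERB", "for": "ADP",
    "to": "PART", "out": "ADP", "the": "DET", "a": "DET",
}

# Assumption: noun-like tags carry the keywords of an SMS.
KEEP_TAGS = {"NOUN", "PROPN"}


def extract_keywords(sms: str) -> list[str]:
    """Lowercase, tokenize on letters only, keep tokens with noun-like tags."""
    tokens = re.findall(r"[a-z]+", sms.lower())
    return [t for t in tokens if TOY_POS.get(t, "NOUN") in KEEP_TAGS]


def cosine_similarity(a: list[str], b: list[str]) -> float:
    """Cosine similarity between two keyword lists, using raw counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no keywords on one side: treat as dissimilar
    return dot / (norm_a * norm_b)


def sms_similarity(s1: str, s2: str) -> float:
    """Full sketch pipeline: keyword extraction, then cosine similarity."""
    return cosine_similarity(extract_keywords(s1), extract_keywords(s2))
```

With these assumptions, `sms_similarity("Your OTP for login is 4821", "Use OTP 9917 to login to your account")` scores high because both messages share the keywords `otp` and `login`, while comparing the first message with `"Your parcel is out for delivery"` gives 0.0 because no keywords overlap. Digits are dropped by the tokenizer, so differing codes do not affect the score.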
