On-Device Sentence Similarity for SMS Dataset

Arun D Prabhu; Nikhil Arora; Shubham Vatsal; Gopi Ramena; Sukumar Moharana; Naresh Purre

arXiv:2012.02819·cs.CL·January 3, 2022

TL;DR

This paper presents a novel on-device pipeline for measuring sentence similarity in SMS texts, addressing challenges like incomplete structure and grammatical inconsistencies to improve mobile applications.

Contribution

It introduces a unique pipeline utilizing POS-based keyword extraction and statistical similarity measures tailored for SMS data on mobile devices.

Findings

01 Effective handling of semantic variations in SMS texts

02 On-device implementation suitable for mobile applications

03 Scalable approach adaptable to various SMS similarity tasks

Abstract

Determining the similarity between Short Message Service (SMS) texts plays a significant role in the mobile device industry. Gauging the similarity between SMS data is necessary for various applications, such as enhanced searching and navigation, or clubbing together SMS of a similar type, irrespective of their sender, when a custom label or tag is provided by the user. The problem with SMS data is its incomplete structure and grammatical inconsistencies. In this paper, we propose a unique pipeline for evaluating the text similarity between SMS texts. We use a Part of Speech (POS) model for keyword extraction, taking advantage of the partial structure embedded in SMS texts, and similarity comparisons are carried out using statistical methods. The proposed pipeline deals with major semantic variations across SMS data as well as makes it effective for its application…
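The two stages named in the abstract, POS-based keyword extraction followed by a statistical similarity comparison, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which tagger or which statistical measure is used, so the toy lexicon `TOY_POS`, the rule of keeping only noun-like tokens, and the choice of cosine similarity over keyword counts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy POS lexicon (an assumption for this sketch).
# A real system would use a trained POS model instead.
TOY_POS = {
    "your": "PRON", "the": "DET", "a": "DET", "an": "DET",
    "is": "VERB", "has": "VERB", "been": "VERB", "will": "VERB", "be": "VERB",
    "for": "ADP", "to": "ADP", "of": "ADP", "on": "ADP", "at": "ADP",
    "from": "ADP", "with": "ADP", "and": "CONJ",
}

# Assumption: only noun-like tokens are treated as keywords.
KEEP_TAGS = {"NOUN"}


def tag(token: str) -> str:
    """Return a coarse POS tag for a lowercase token."""
    if token in TOY_POS:
        return TOY_POS[token]
    if any(ch.isdigit() for ch in token):
        return "NUM"  # order ids, amounts, OTP codes
    return "NOUN"  # common baseline: unknown words default to noun


def extract_keywords(text: str) -> list[str]:
    """Tokenize an SMS and keep only tokens whose tag is in KEEP_TAGS."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if tag(t) in KEEP_TAGS]


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-keyword count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no keywords in one message: treat as dissimilar
    return dot / (norm_a * norm_b)


def sms_similarity(sms_a: str, sms_b: str) -> float:
    """Similarity score in [0, 1] based on shared keywords."""
    return cosine_similarity(
        Counter(extract_keywords(sms_a)),
        Counter(extract_keywords(sms_b)),
    )
```

Calling `sms_similarity("Your order 4521 has been shipped", "Order shipped, your package is on the way")` compares the keyword sets `order, shipped` and `order, shipped, package, way`, while a message such as `"OTP for login is 998877"` shares no keywords with either and scores 0.0. Dropping numeric tokens is a design choice of this sketch: it lets messages of the same type match even when ids or amounts differ.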
