KS_JU@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages Using Multinomial Logistic Regression Model

Kamal Sarkar

arXiv:1612.08171·cs.CL·December 28, 2016·2 cites

TL;DR

This paper presents a paraphrase detection system for Indian languages that uses a multinomial logistic regression model with lexical and semantic similarity features, achieving F-measures of up to 0.95 in the FIRE 2016 DPIL shared task.

Contribution

The work introduces a multilingual paraphrase detection approach utilizing logistic regression with diverse features, demonstrating competitive performance across four Indian languages.

Findings

01

Achieved an F-measure of 0.95 on task 1 in Punjabi

02

System performed well across all four languages

03

Second highest overall F1-score among participating teams

Abstract

In this work, we describe a system that detects paraphrases in Indian languages, developed for our participation in the shared task on Detecting Paraphrases in Indian Languages (DPIL) organized by the Forum for Information Retrieval Evaluation (FIRE) in 2016. Our paraphrase detection method uses a multinomial logistic regression model trained on a variety of features, which are lexical- and semantic-level similarities between the two sentences in a pair. The performance of the system was evaluated on the test set released for the FIRE 2016 shared task on DPIL. Our system achieves the highest F-measure of 0.95 on task 1 in Punjabi. On task 1 in Hindi, it achieves an F-measure of 0.90. Of the 11 teams that participated in the shared task, only four participated in all four languages (Hindi, Punjabi, Malayalam and Tamil), but the remaining 7 teams…
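The abstract does not list the exact features or model settings, so the following is only a minimal sketch of the general idea: compute a few lexical similarity scores for a sentence pair and pass them through a softmax, as a multinomial logistic regression would. The function names, the choice of Jaccard and cosine similarity, the whitespace tokenizer, and the weights are all illustrative assumptions, not the author's implementation.

```python
import math
from collections import Counter


def tokens(sentence):
    # Assumption: plain lowercase whitespace tokenization; the paper's
    # actual preprocessing is not described in the abstract.
    return sentence.lower().split()


def jaccard(a, b):
    # Word-set overlap: |A ∩ B| / |A ∪ B|.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    # Cosine similarity between bag-of-words count vectors.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def softmax(scores):
    # Numerically stable softmax over per-class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(features, weights, biases):
    # Multinomial logistic regression forward pass:
    # one weight row and one bias per class (hypothetical values).
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def pair_features(a, b):
    # Feature vector for one sentence pair (illustrative feature set).
    return [jaccard(a, b), cosine(a, b)]
```

In a real system the weights would be learned from the labelled training pairs rather than set by hand, and the feature vector would also include the semantic similarity features the abstract mentions.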

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Logistic Regression