Methodology and Results for the Competition on Semantic Similarity Evaluation and Entailment Recognition for PROPOR 2016
Luciano Barbosa, Paulo R. Cavalin, Victor Guimaraes, Matthias Kormaksson

TL;DR
This paper details the methodology and results of the Blue Man Group team in ASSIN, a Portuguese-language semantic similarity and entailment recognition competition held at PROPOR 2016, emphasizing that low-dimensional features built from semantic word vectors proved more effective than deep learning approaches.
Contribution
The paper introduces a comparative evaluation of semantic word vector strategies, highlighting the superior performance of low-dimensional features in Portuguese language tasks.
Findings
Low-dimensional feature strategies outperformed deep learning methods.
Achieved best accuracy and F1 in entailment recognition for Brazilian Portuguese.
Ranked second in semantic similarity for Portuguese.
Abstract
In this paper, we present the methodology and the results obtained by our team, dubbed Blue Man Group, in the ASSIN (from the Portuguese Avaliação de Similaridade Semântica e Inferência Textual) competition, held at PROPOR 2016 (International Conference on the Computational Processing of the Portuguese Language, http://propor2016.di.fc.ul.pt/). Our team's strategy consisted of evaluating methods based on semantic word vectors, following two distinct directions: 1) low-dimensional, compact feature sets, and 2) deep learning-based strategies dealing with high-dimensional feature vectors. Evaluation results demonstrated that the first strategy was more promising, so the results from the second strategy were discarded. As a result, by considering the best run of each of the six teams, we have been able to achieve the best accuracy and F1…
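To make the first direction concrete, here is a minimal sketch of what a low-dimensional feature set built from word vectors could look like for a sentence pair. It is not the authors' feature set, which the abstract does not list. The function names (`mean_vector`, `cosine`, `pair_features`), the two chosen features, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math

# Toy 2-dimensional embeddings, invented for illustration only.
# A real system would load pretrained Portuguese word vectors instead.
TOY_EMBEDDINGS = {
    "gato": [1.0, 0.0],
    "felino": [0.9, 0.1],
    "dorme": [0.0, 1.0],
    "descansa": [0.1, 0.9],
}

def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding, or None."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def pair_features(sentence_a, sentence_b, embeddings):
    """Return a compact feature vector: [embedding cosine, Jaccard word overlap]."""
    tokens_a = sentence_a.lower().split()
    tokens_b = sentence_b.lower().split()

    mean_a = mean_vector(tokens_a, embeddings)
    mean_b = mean_vector(tokens_b, embeddings)
    if mean_a is None or mean_b is None:
        emb_sim = 0.0
    else:
        emb_sim = cosine(mean_a, mean_b)

    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    overlap = len(set_a & set_b) / len(union) if union else 0.0

    return [emb_sim, overlap]

features = pair_features("o gato dorme", "o felino descansa", TOY_EMBEDDINGS)
```

A two-value vector like this could then be fed to any standard classifier or regressor for the entailment and similarity tasks. The point of the sketch is only the contrast in the abstract: a handful of hand-built similarity scores per sentence pair, as opposed to passing high-dimensional vectors directly to a deep model.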
Taxonomy
Topics: Topic Modeling · Text and Document Classification Technologies · Knowledge Management and Technology
